[ https://issues.apache.org/jira/browse/IMPALA-9429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on IMPALA-9429 started by Kurt Deschler.
---------------------------------------------
> Unioned partition columns break partition pruning
> -------------------------------------------------
>
>                 Key: IMPALA-9429
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9429
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Planner
>    Affects Versions: Impala 3.2.0
>            Reporter: Max Mizikar
>            Assignee: Kurt Deschler
>            Priority: Critical
>
> We have different granularity of partitions on our landing tables vs our
> compacted tables. We use a view to union our landing and our compacted. After
> an upgrade from cdh5.15 (Impala v2.12.0) to cdh6.3 (Impala 3.2.0) we started
> having issues with our union-ed tables. I've come up with this as the
> smallest breaking example.
> {code:java}
> [:21000] debug> create table debug_with_partition (col1 int) partitioned by (col2 int, col3 int);
>
> Query: create table debug_with_partition (col1 int) partitioned by (col2 int, col3 int)
> +-------------------------+
> | summary                 |
> +-------------------------+
> | Table has been created. |
> +-------------------------+
> Fetched 1 row(s) in 0.09s
> [:21000] debug> create table debug_without_partition (col1 int) partitioned by (col2 int);
>
> Query: create table debug_without_partition (col1 int) partitioned by (col2 int)
> +-------------------------+
> | summary                 |
> +-------------------------+
> | Table has been created. |
> +-------------------------+
> Fetched 1 row(s) in 0.03s
> [:21000] debug> create view debug as select col1, col2, col3 from debug_with_partition union all select col1, col2, null from debug_without_partition;
>
> Query: create view debug as select col1, col2, col3 from debug_with_partition union all select col1, col2, null from debug_without_partition
> Query submitted at: 2020-02-26 17:04:58 (Coordinator: :25000)
> Query progress can be monitored at: :25000/query_plan?query_id=28453bdf5f919fe9:66fef22200000000
> +------------------------+
> | summary                |
> +------------------------+
> | View has been created. |
> +------------------------+
> Fetched 1 row(s) in 5.65s
> [:21000] debug> select * from debug where col2 = 0 or col3 = 0;
>
> Query: select * from debug where col2 = 0 or col3 = 0
> Query submitted at: 2020-02-26 17:05:21 (Coordinator: t:25000)
> ERROR: IllegalStateException: null
> {code}
> Here is what I find in the log
> {code:java}
> I0226 17:05:21.099532 129442 jni-util.cc:256] c34e2a72018579fe:3d7388e100000000] java.lang.IllegalStateException
>         at com.google.common.base.Preconditions.checkState(Preconditions.java:133)
>         at org.apache.impala.planner.HdfsPartitionPruner.canEvalUsingPartitionMd(HdfsPartitionPruner.java:196)
>         at org.apache.impala.planner.HdfsPartitionPruner.canEvalUsingPartitionMd(HdfsPartitionPruner.java:211)
>         at org.apache.impala.planner.HdfsPartitionPruner.prunePartitions(HdfsPartitionPruner.java:131)
>         at org.apache.impala.planner.SingleNodePlanner.createHdfsScanPlan(SingleNodePlanner.java:1257)
>         at org.apache.impala.planner.SingleNodePlanner.createScanNode(SingleNodePlanner.java:1348)
>         at org.apache.impala.planner.SingleNodePlanner.createTableRefNode(SingleNodePlanner.java:1535)
>         at org.apache.impala.planner.SingleNodePlanner.createTableRefsPlan(SingleNodePlanner.java:814)
>         at org.apache.impala.planner.SingleNodePlanner.createSelectPlan(SingleNodePlanner.java:650)
>         at org.apache.impala.planner.SingleNodePlanner.createQueryPlan(SingleNodePlanner.java:258)
>         at org.apache.impala.planner.SingleNodePlanner.createUnionPlan(SingleNodePlanner.java:1584)
>         at org.apache.impala.planner.SingleNodePlanner.createUnionPlan(SingleNodePlanner.java:1651)
>         at org.apache.impala.planner.SingleNodePlanner.createQueryPlan(SingleNodePlanner.java:280)
>         at org.apache.impala.planner.SingleNodePlanner.createInlineViewPlan(SingleNodePlanner.java:1088)
>         at org.apache.impala.planner.SingleNodePlanner.createTableRefNode(SingleNodePlanner.java:1546)
>         at org.apache.impala.planner.SingleNodePlanner.createTableRefsPlan(SingleNodePlanner.java:814)
>         at org.apache.impala.planner.SingleNodePlanner.createSelectPlan(SingleNodePlanner.java:650)
>         at org.apache.impala.planner.SingleNodePlanner.createQueryPlan(SingleNodePlanner.java:258)
>         at org.apache.impala.planner.SingleNodePlanner.createSingleNodePlan(SingleNodePlanner.java:148)
>         at org.apache.impala.planner.Planner.createPlan(Planner.java:103)
>         at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1171)
>         at org.apache.impala.service.Frontend.getPlannedExecRequest(Frontend.java:1466)
>         at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1345)
>         at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1252)
>         at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1222)
>         at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:167)
> I0226 17:05:21.099617 129442 status.cc:124] c34e2a72018579fe:3d7388e100000000] IllegalStateException: null
>     @           0xb4c459
>     @          0x114fe2e
>     @          0x102ab53
>     @          0x1052ba2
>     @          0x105e88c
>     @          0x109e5be
>     @          0x138fee4
>     @          0x138f39c
>     @           0xb18169
>     @           0xf2d1d8
>     @           0xf23c4e
>     @           0xf24ae1
>     @          0x11c5e0f
>     @          0x11c69b9
>     @          0x1840569
>     @     0x7f2ef82926b9
>     @     0x7f2ef7fc841c
> {code}
> I've done some level of debugging from the shell and I find that the
> following things work
> querying just on the null filled column
> {code:java}
> [:21000] debug> select * from debug where col3 = 0;
> Query: select * from debug where col3 = 0
> Query submitted at: 2020-02-26 17:07:07 (Coordinator: :25000)
> Query progress can be monitored at: :25000/query_plan?query_id=1b44157731b6f5ff:d052c2c600000000
> Fetched 0 row(s) in 0.11s
> {code}
> query with an and on the null filled column
> {code:java}
> [:21000] debug> select * from debug where col2 = 0 and col3 = 0;
> Query: select * from debug where col2 = 0 and col3 = 0
> Query submitted at: 2020-02-26 17:07:27 (Coordinator: :25000)
> Query progress can be monitored at: :25000/query_plan?query_id=334f7fbf2367a558:6ebe4d6100000000
> Fetched 0 row(s) in 0.11s
> {code}
> casting the null filled column
> {code:java}
> [:21000] debug> select * from debug where col2 = 0 or cast(col3 as int) = 0;
> Query: select * from debug where col2 = 0 or cast(col3 as int) = 0
> Query submitted at: 2020-02-26 17:08:26 (Coordinator: :25000)
> Query progress can be monitored at: :25000/query_plan?query_id=1a4d43d8fc9fc45d:662922b900000000
> Fetched 0 row(s) in 0.11s
> {code}
> Please let me know if there is anything else I can do to help!

--
This message was sent by Atlassian Jira (v8.3.4#803005)