[ https://issues.apache.org/jira/browse/HIVE-18695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16376402#comment-16376402 ]
Anthony Hsu edited comment on HIVE-18695 at 2/26/18 5:34 AM:
-------------------------------------------------------------

Hi [~kgyrtkirk],

I investigated this further and the change in HIVE-15680 that broke accumulo_queries.q is
{noformat}
// disable filter pushdown for mapreduce when there are more than one table aliases,
// since we don't clone jobConf per alias
if (mrwork != null && mrwork.getAliases() != null && mrwork.getAliases().size() > 1 &&
  jobConf.get(ConfVars.HIVE_EXECUTION_ENGINE.varname).equals("mr")) {
  return;
}{noformat}
In the Accumulo CliDriver test, the execution engine is set to "mr", so this early "return" is triggered and the subsequent code that sets the filter expressions on the TableScanDesc never runs.

However, although removing the check above fixes the test, I found a more serious problem that exists with or without it. If the same Accumulo table is referenced multiple times in the same query, you get very strange results. Here's an example:
{noformat}
DROP TABLE accumulo_test;
CREATE TABLE accumulo_test(key int, value int)
STORED BY 'org.apache.hadoop.hive.accumulo.AccumuloStorageHandler'
WITH SERDEPROPERTIES ("accumulo.columns.mapping" = ":rowID,cf:string")
TBLPROPERTIES ("accumulo.table.name" = "accumulo_table_0");

INSERT OVERWRITE TABLE accumulo_test VALUES (0,0), (1,1), (2,2), (3,3);

SELECT * from accumulo_test where key == 1 union all select * from accumulo_test where key == 2;{noformat}
The expected output is
{code:java}
1 1
2 2{code}
but the actual output is
{code:java}
1 0
1 1
1 2
1 3
2 0
2 1
2 2
2 3
{code}
I'll file a separate ticket for this issue. I think a fix for this issue would also fix HIVE-15680, but for now, you can revert HIVE-15680.
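To make the guard condition quoted above easier to reason about, here is a minimal standalone sketch of the same boolean logic. The class `PushdownGuardSketch` and the method `skipsFilterPushdown` are hypothetical names used only for illustration; they are not part of Hive, and the real check reads the aliases from the MapWork and the engine from the JobConf.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PushdownGuardSketch {
    // Hypothetical stand-in for the quoted guard, not Hive's actual API.
    // Returns true when filter pushdown would be skipped: more than one
    // table alias and the execution engine is "mr".
    static boolean skipsFilterPushdown(List<String> aliases, String engine) {
        // "mr".equals(engine) is used here so a missing engine setting
        // does not throw a NullPointerException in this sketch.
        return aliases != null && aliases.size() > 1 && "mr".equals(engine);
    }

    public static void main(String[] args) {
        List<String> one = Collections.singletonList("accumulo_test");
        List<String> two = Arrays.asList("a", "b");
        System.out.println(skipsFilterPushdown(one, "mr"));  // false
        System.out.println(skipsFilterPushdown(two, "mr"));  // true
        System.out.println(skipsFilterPushdown(two, "tez")); // false
    }
}
```

Under these assumptions, any multi-alias query run with the "mr" engine takes the early return, which matches the behavior described for the Accumulo CliDriver test.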
> fix TestAccumuloCliDriver.testCliDriver[accumulo_queries]
> ---------------------------------------------------------
>
>                 Key: HIVE-18695
>                 URL: https://issues.apache.org/jira/browse/HIVE-18695
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Zoltan Haindrich
>            Priority: Major
>
> seems to be broken by HIVE-15680