[ 
https://issues.apache.org/jira/browse/HIVE-18695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16376402#comment-16376402
 ] 

Anthony Hsu edited comment on HIVE-18695 at 2/26/18 5:34 AM:
-------------------------------------------------------------

Hi [~kgyrtkirk], I investigated this further and the change in HIVE-15680 that 
broke accumulo_queries.q is
{noformat}
// disable filter pushdown for mapreduce when there are more than one table aliases,
// since we don't clone jobConf per alias
if (mrwork != null && mrwork.getAliases() != null && mrwork.getAliases().size() > 1 &&
  jobConf.get(ConfVars.HIVE_EXECUTION_ENGINE.varname).equals("mr")) {
  return;
}{noformat}
In the Accumulo CliDriver test, the execution engine is set to "mr", so this 
early "return" is taken and the subsequent code that sets the filter 
expressions on the TableScanDesc never runs.
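
To make the condition easier to reason about, here is a minimal sketch of the guard as a standalone predicate. The class and method names (PushdownGuardSketch, skipsFilterPushdown) are hypothetical and are not part of Hive; the sketch drops the mrwork null check and takes the alias collection and engine name as plain arguments instead of reading them from MapWork and JobConf.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

// Hypothetical stand-in for the guard quoted above; not Hive's actual class.
public class PushdownGuardSketch {

  // Returns true when the quoted check would take its early "return",
  // i.e. more than one table alias under the "mr" execution engine.
  public static boolean skipsFilterPushdown(Collection<String> aliases, String engine) {
    // "mr".equals(engine) is null-safe, unlike engine.equals("mr").
    return aliases != null && aliases.size() > 1 && "mr".equals(engine);
  }

  public static void main(String[] args) {
    // Two aliases on MapReduce: filter pushdown is skipped.
    System.out.println(skipsFilterPushdown(Arrays.asList("a", "b"), "mr"));
    // One alias on MapReduce: filter pushdown proceeds.
    System.out.println(skipsFilterPushdown(Collections.singletonList("a"), "mr"));
    // Two aliases on another engine: filter pushdown proceeds.
    System.out.println(skipsFilterPushdown(Arrays.asList("a", "b"), "tez"));
  }
}
```

Only the first call satisfies the condition, which corresponds to the case where the filter expressions are never set on the TableScanDesc.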

However, although removing this check fixes the test, I found a more serious 
problem that exists with or without it. If the same Accumulo table is 
referenced multiple times in the same query, the query returns incorrect 
results. Here's an example:
{noformat}
DROP TABLE accumulo_test;
CREATE TABLE accumulo_test(key int, value int)
STORED BY 'org.apache.hadoop.hive.accumulo.AccumuloStorageHandler'
WITH SERDEPROPERTIES ("accumulo.columns.mapping" = ":rowID,cf:string")
TBLPROPERTIES ("accumulo.table.name" = "accumulo_table_0");

INSERT OVERWRITE TABLE accumulo_test VALUES (0,0), (1,1), (2,2), (3,3);

SELECT * FROM accumulo_test WHERE key == 1
UNION ALL
SELECT * FROM accumulo_test WHERE key == 2;{noformat}
The expected output is
{code:java}
1 1
2 2{code}
but the actual output is
{code:java}
1  0
1  1
1  2
1  3
2  0
2  1
2  2
2  3
{code}
I'll file a separate ticket for this issue. I think a fix for this issue would 
also fix HIVE-15680, but for now, you can revert HIVE-15680.


> fix TestAccumuloCliDriver.testCliDriver[accumulo_queries]
> ---------------------------------------------------------
>
>                 Key: HIVE-18695
>                 URL: https://issues.apache.org/jira/browse/HIVE-18695
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Zoltan Haindrich
>            Priority: Major
>
> seems to be broken by HIVE-15680



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
