Can anyone confirm that virtual columns (INPUT__FILE__NAME, in particular)
cannot be used when map-side joins are enabled?  I'm currently working
with Hive 0.7.1 and CDH3u3.

Example:
SELECT
    COUNT(*)
FROM table_a --small table that will be placed in Distributed Cache
    JOIN table_b ON
        table_a.key = table_b.key
WHERE
    table_a.INPUT__FILE__NAME LIKE "%ABCD%";

Here is the definition of the small table (table_a in the example above)
that will be placed in Distributed Cache:

CREATE EXTERNAL TABLE surveys (
    surveyid  STRING,
    date_time STRING,
    visid     STRING
)...

Here is the stack trace that is printed out for my query:

java.lang.RuntimeException: cannot find field input__file__name from [0:surveyid, 1:date_time, 2:visid]
        at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:321)
        at org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.getStructFieldRef(LazySimpleStructObjectInspector.java:146)
        at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57)
        at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:77)
        at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:77)
        at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:77)
        at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:80)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:744)
        at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:78)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
        at org.apache.hadoop.hive.ql.exec.MapredLocalTask.startForward(MapredLocalTask.java:313)
        at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:260)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:1087)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapredLocalTask
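
For comparison, here is a minimal sketch of the same query with automatic
map-join conversion switched off. It assumes the conversion is driven by
hive.auto.convert.join, which is only a guess since the session settings are
not shown here. If the virtual column resolves in this form, that would
suggest the failure is specific to the local map-join task (MapredLocalTask)
rather than to INPUT__FILE__NAME itself:

```sql
-- Assumption: the join is being converted automatically via
-- hive.auto.convert.join. If an explicit /*+ MAPJOIN(...) */ hint were used
-- instead, the hint would need to be removed rather than this setting changed.
SET hive.auto.convert.join=false;

-- Same query as above, now expected to run as a regular reduce-side join.
SELECT
    COUNT(*)
FROM table_a
    JOIN table_b ON
        table_a.key = table_b.key
WHERE
    table_a.INPUT__FILE__NAME LIKE "%ABCD%";
```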

Thanks!

Matt Tucker
