Josh Spiegel created HIVE-4567:
----------------------------------
Summary: RuntimeException when two external tables have the same location
Key: HIVE-4567
URL: https://issues.apache.org/jira/browse/HIVE-4567
Project: Hive
Issue Type: Bug
Affects Versions: 0.10.0, 0.7.1
Reporter: Josh Spiegel
I am working with a custom InputFormat and a custom SerDe where it sometimes
makes sense to have two external tables with different schemas and properties
but the same location. When such tables are used in the same query, a
RuntimeException may occur.
I realize that with Hive's built-in adapters, it may not ever be useful to
create two external tables with the same location. The following example is
nonsensical but it can be used to easily reproduce the error:
CREATE EXTERNAL TABLE f (fk STRING, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ' '
LOCATION '/local/data/';
CREATE EXTERNAL TABLE IF NOT EXISTS p (pk STRING)
LOCATION '/local/data/';
SELECT p.pk
FROM p LEFT OUTER JOIN f
ON p.pk = f.fk;
In /local/data, put file data.txt:
k1 apple
k2 orange
k2 pear
Produces the following error:
Caused by: java.lang.RuntimeException: cannot find field fk from [0:pk]
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:346)
at org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.getStructFieldRef(LazySimpleStructObjectInspector.java:168)
...
Ideally this should be supported by Hive as it is useful for semi-structured
documents (e.g. JSON, XML) where multiple big "relations" may be contained in
the same file. However, if adding support is infeasible, it would be nice to
detect this condition statically and raise a more meaningful error from the
client process.
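
To illustrate the kind of static check suggested above, here is a minimal sketch in Python. It is not Hive code and does not use any Hive API: the function names (find_shared_locations, check_query_tables) and the plain dict mapping table names to locations are assumptions made for illustration only. It treats two locations as equal if they differ only by trailing slashes.

```python
from collections import defaultdict


def find_shared_locations(table_locations):
    """Group table names by location and keep locations used by two or more tables.

    table_locations is a hypothetical {table_name: location} mapping.
    Trailing slashes are ignored when comparing locations.
    """
    by_location = defaultdict(list)
    for table, location in table_locations.items():
        normalized = location.rstrip("/") or "/"
        by_location[normalized].append(table)
    return {
        loc: sorted(tables)
        for loc, tables in by_location.items()
        if len(tables) > 1
    }


def check_query_tables(query_tables, table_locations):
    """Raise ValueError if any tables referenced by one query share a location."""
    referenced = {t: table_locations[t] for t in query_tables}
    shared = find_shared_locations(referenced)
    if shared:
        details = "; ".join(
            f"{loc}: {', '.join(tables)}" for loc, tables in shared.items()
        )
        raise ValueError(
            f"tables in the same query share a location ({details})"
        )
```

With the tables from the example, check_query_tables(["p", "f"], {"f": "/local/data/", "p": "/local/data/"}) would raise a ValueError naming /local/data and both tables, rather than failing later with "cannot find field".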