[
https://issues.apache.org/jira/browse/HIVE-6115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13863510#comment-13863510
]
Sushanth Sowmyan commented on HIVE-6115:
----------------------------------------
There are two purposes served - one, to check that hbase-default.xml and
hbase-site.xml are accessible, which HiveHBaseStorageHandler.addHBaseResources
achieves, and the other is to add those as requisite resources for the current
job, which is achieved by the inner call directly to HBaseConfiguration on the
jobconf.
From an HCat perspective, if I remember correctly, the second is needed to set
up and ship the job correctly; otherwise we'd wind up failing with errors
indicating that we are unable to talk to ZooKeeper or the master.
Per your contention, the problem is that if you do have a local override
hbase-site.xml, it still winds up pulling in a default
hbase-default.xml/hbase-site.xml and thus fails? I'm a little confused as to
how this might be a problem, since when those resources are added, they're
added by name, without any associated path, and thus, would need to be present
as resolved in the classpath anyway.
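To make the "added by name" point concrete: Hadoop's
Configuration.addResource(String) treats its argument as a classpath resource
name, so a bare name such as hbase-site.xml only resolves if some classpath
entry contains it. The sketch below is a rough stand-in that uses a plain JDK
class loader lookup rather than Hadoop itself; the class name
ClasspathResourceCheck and the method locate are hypothetical and exist only
for illustration.

```java
import java.net.URL;

/** Hypothetical helper for illustration; not part of Hive, HCatalog, or HBase. */
final class ClasspathResourceCheck {

    /** Returns where a bare resource name resolves on the classpath, or null if absent. */
    static URL locate(String name) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            // Fall back to the loader that loaded this class.
            cl = ClasspathResourceCheck.class.getClassLoader();
        }
        return cl.getResource(name);
    }

    public static void main(String[] args) {
        // The two resource names discussed in this comment.
        for (String name : new String[] {"hbase-default.xml", "hbase-site.xml"}) {
            URL url = locate(name);
            System.out.println(name + " -> " + (url == null ? "not on classpath" : url.toString()));
        }
    }
}
```

Running this with the same classpath as the client would show which copy of
each file, if any, a by-name lookup would pick up.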
Or perhaps I was barking up the wrong tree with that interpretation, and the
problem is that the update semantics that
HiveHBaseStorageHandler.addHBaseResources takes care of are being abused: we
wind up nuking other conf values by replacing them, rather than strictly
setting only the values that do not already exist. In that case it makes sense
to have a segment there which goes something like this:
{code}
tundra:hive sush$ git diff hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java
diff --git a/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java b/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java
index fc63970..d76abe8 100644
--- a/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java
+++ b/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java
@@ -333,7 +333,11 @@ public void configureTableJobProperties(
     // check to see if this an input job or an outputjob
     if (this.configureInputJobProps) {
       try {
-        HBaseConfiguration.addHbaseResources(jobConf);
+        for (String k : jobProperties.keySet()){
+          jobConf.set(k, jobProperties.get(k));
+        }
+        jobConf.addResource("hbase-default.xml");
+        jobConf.addResource("hbase-site.xml");
         addHBaseDelegationToken(jobConf);
       }//try
       catch (IOException e) {
{code}
This, then, would be functionally equivalent and satisfy the need for those
resources to be present, and not pollute jobconf with the rest of the
parameters?
This would, however, expose HBase's internals here, and it looks hacky. Which
parameters get overridden by HBase's resource import that should not be
overridden? This might be something to fix on
HBaseConfiguration.addHbaseResources' end instead.
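The two merge behaviours contrasted above (replacing existing values versus
setting only the keys that are missing) can be sketched with plain maps. This
is only an illustration under assumptions: the maps stand in for jobConf and
for the values an HBase resource import would bring in, the class
ConfMergeSketch and its methods are hypothetical, the example values are made
up, and it does not claim to describe what HBaseConfiguration actually does.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration; plain maps stand in for jobConf and imported HBase values. */
final class ConfMergeSketch {

    /** Replace semantics: every incoming entry overwrites whatever is already set. */
    static Map<String, String> mergeReplacing(Map<String, String> existing,
                                              Map<String, String> incoming) {
        Map<String, String> out = new LinkedHashMap<>(existing);
        out.putAll(incoming);
        return out;
    }

    /** Update-if-absent semantics: only keys with no existing value are filled in. */
    static Map<String, String> mergeIfAbsent(Map<String, String> existing,
                                             Map<String, String> incoming) {
        Map<String, String> out = new LinkedHashMap<>(existing);
        for (Map.Entry<String, String> e : incoming.entrySet()) {
            out.putIfAbsent(e.getKey(), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> jobConf = new LinkedHashMap<>();
        jobConf.put("hbase.zookeeper.quorum", "zk1.example.com"); // made-up local override

        Map<String, String> imported = new LinkedHashMap<>();
        imported.put("hbase.zookeeper.quorum", "localhost");      // made-up default
        imported.put("hbase.client.retries.number", "10");        // made-up default

        System.out.println("replacing: " + mergeReplacing(jobConf, imported));
        System.out.println("if absent: " + mergeIfAbsent(jobConf, imported));
    }
}
```

With replace semantics the local quorum setting is lost; with update-if-absent
semantics it survives and only the missing key is added.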
> Remove redundant code in HiveHBaseStorageHandler
> ------------------------------------------------
>
> Key: HIVE-6115
> URL: https://issues.apache.org/jira/browse/HIVE-6115
> Project: Hive
> Issue Type: Improvement
> Affects Versions: 0.12.0
> Reporter: Brock Noland
> Assignee: Brock Noland
> Attachments: HIVE-6115.patch
>
>
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)