[ 
https://issues.apache.org/jira/browse/HIVE-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835504#action_12835504
 ] 

Carl Steinbach commented on HIVE-1096:
--------------------------------------

bq. Philosophically I agree. In actuality, the Hive/Hadoop conf is easily 
manipulated by changing your hadoop-site.xml or hive-site.xml. Users do have 
unprotected access to the namespace; that is the nature of Hadoop. Users of Hive 
are setting variables all the time.

True, but I think we should try to improve the situation. As a start we can add 
code to throw an error if hive-default.xml or hive-site.xml sets a hive.* 
configuration property that is not defined in HiveConf. This would protect the 
hive.* namespace and at the same time make it easy to track down cases where 
folks misspell a hive.* property name.
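
A minimal sketch of such a check, under stated assumptions: the class name 
HivePropertyValidator is hypothetical, and the set of known names passed to it 
stands in for the properties defined in HiveConf rather than using the real 
HiveConf API. The loaded map represents the key/value pairs read from 
hive-default.xml and hive-site.xml.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class HivePropertyValidator {
    // Stand-in for the property names defined in HiveConf (assumption:
    // the caller supplies them; this sketch does not read HiveConf itself).
    private final Set<String> knownProperties;

    public HivePropertyValidator(Set<String> knownProperties) {
        this.knownProperties = knownProperties;
    }

    /** Returns every hive.* key in the loaded config that is not a known property. */
    public List<String> findUnknown(Map<String, String> loaded) {
        List<String> unknown = new ArrayList<>();
        for (String key : loaded.keySet()) {
            // Only the hive.* namespace is protected; other keys pass through.
            if (key.startsWith("hive.") && !knownProperties.contains(key)) {
                unknown.add(key);
            }
        }
        return unknown;
    }

    /** Throws if the loaded config sets any undefined hive.* property. */
    public void validate(Map<String, String> loaded) {
        List<String> unknown = findUnknown(loaded);
        if (!unknown.isEmpty()) {
            throw new IllegalArgumentException(
                "Unknown hive.* properties: " + unknown);
        }
    }
}
```

With a check like this, a misspelled name such as hive.exec.paralel would be 
reported by name at startup instead of being silently ignored.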

bq. The only true difference in implementation is that you're doing it with 
properties and I am doing it with HiveConf vars. If we support both I think we 
are both happy. Any ideas?

I agree that we should support access to both system properties and hiveconf 
properties, but if we do, how will we resolve cases where the user references 
{{${foo.bar}}} and both the system properties and the hiveconf define a property 
named foo.bar? Also, another problem I see with using the hiveconf namespace for user 
variable definitions is that user variables cease to have any meaning past the 
client-side query preprocessing step, yet since they're part of the hiveconf 
they will get included in the jobconf and sent to datanodes. 

Here's a proposal:

* Allow users to reference variables in QL statements using the syntax 
{{${namespace:variable_name}}}.
* Users can define variables on the command line using a new "{{-hivevar x=y}}" 
switch. Values defined in this manner become part of the user namespace, which 
is the default namespace. They can be referenced as either 
{{${default:variable_name}}} or {{${variable_name}}}.
* Hive configuration properties are part of the "hiveconf" namespace, and can 
be referenced as {{${hiveconf:property_name}}}.
* System properties are part of the "system" namespace, and can be referenced 
as {{${system:property_name}}}.
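
The lookup rules in this proposal can be sketched roughly as follows. This is 
an illustration only, not the attached patch: the class name 
VariableSubstitution, the regular expression, and the choice to leave 
unresolved references untouched are all assumptions. The two maps stand in for 
values collected from "{{-hivevar x=y}}" switches and for the Hive 
configuration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VariableSubstitution {
    // Matches ${variable_name} or ${namespace:variable_name}.
    private static final Pattern REF =
        Pattern.compile("\\$\\{(?:([A-Za-z]+):)?([^}:]+)\\}");

    private final Map<String, Map<String, String>> namespaces = new HashMap<>();

    public VariableSubstitution(Map<String, String> userVars,
                                Map<String, String> hiveConf) {
        namespaces.put("default", userVars);   // values from -hivevar x=y
        namespaces.put("hiveconf", hiveConf);  // Hive configuration properties
    }

    /** Replaces every resolvable reference in the query text. */
    public String substitute(String query) {
        Matcher m = REF.matcher(query);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // No namespace prefix means the default (user) namespace.
            String ns = m.group(1) == null ? "default" : m.group(1);
            String value = lookup(ns, m.group(2));
            // Assumption: unresolved references are left as written.
            String replacement = value != null ? value : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    private String lookup(String ns, String name) {
        if ("system".equals(ns)) {
            return System.getProperty(name);
        }
        Map<String, String> vars = namespaces.get(ns);
        return vars == null ? null : vars.get(name);
    }
}
```

Because every reference names its namespace, explicitly or by default, a 
property called foo.bar can exist in both the system properties and the user 
variables without ambiguity: {{${foo.bar}}} reads the user value and 
{{${system:foo.bar}}} reads the system one. User variables also stay out of the 
hiveconf, so they are not copied into the jobconf.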

What do you think?



> Hive Variables
> --------------
>
>                 Key: HIVE-1096
>                 URL: https://issues.apache.org/jira/browse/HIVE-1096
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: Edward Capriolo
>            Assignee: Edward Capriolo
>         Attachments: 1096-9.diff, hive-1096-2.diff, hive-1096-7.diff, 
> hive-1096-8.diff, hive-1096.diff
>
>
> From mailing list:
> --Amazon Elastic MapReduce version of Hive seems to have a nice feature 
> called "Variables." Basically you can define a variable via command-line 
> while invoking hive with -d DT=2009-12-09 and then refer to the variable via 
> ${DT} within the hive queries. This could be extremely useful. I can't seem 
> to find this feature even on trunk. Is this feature currently anywhere in the 
> roadmap?--
> This could be implemented in many places.
> A simple place to put this is in Driver.compile or Driver.run; we can do 
> string substitutions at that level, and further downstream need not be 
> affected. 
> There could be some benefits to doing this further downstream (parser, plan), 
> but based on the simple needs we may not need to overthink this.
> I will get started on implementing in compile unless someone wants to discuss 
> this more.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
