[jira] [Commented] (SPARK-28672) [UDF] Duplicate function creation should not allow

ABHISHEK KUMAR GUPTA (Jira) Tue, 20 Aug 2019 21:15:48 -0700


    [ 
https://issues.apache.org/jira/browse/SPARK-28672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911937#comment-16911937
 ]


ABHISHEK KUMAR GUPTA commented on SPARK-28672:
----------------------------------------------

Thanks for the update.
This JIRA is about disallowing duplicate function names.
Hive behaves as shown below.
Its SHOW FUNCTIONS output does not list the permanent function separately:
0: jdbc:hive2://10.18.98.147:21066/> create function mul3  AS 
'com.huawei.bigdata.hive.example.udf.multiply'  using jar 
'hdfs://hacluster/user/Multiply.jar';
INFO  : Compiling 
command(queryId=omm_20190821115530_03819dba-4c28-46c9-92a6-461cc2762f94): 
create function mul3  AS 'com.huawei.bigdata.hive.example.udf.multiply'  using 
jar 'hdfs://hacluster/user/Multiply.jar'--0; Current 
sessionId=8d2e1845-5254-4021-935e-4e1beb484a72
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO  : Completed compiling 
command(queryId=omm_20190821115530_03819dba-4c28-46c9-92a6-461cc2762f94); Time 
taken: 0.699 seconds
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Executing 
command(queryId=omm_20190821115530_03819dba-4c28-46c9-92a6-461cc2762f94): 
create function mul3  AS 'com.huawei.bigdata.hive.example.udf.multiply'  using 
jar 'hdfs://hacluster/user/Multiply.jar'--0; Current 
sessionId=8d2e1845-5254-4021-935e-4e1beb484a72
INFO  : Starting task [Stage-0:FUNC] in serial mode
INFO  : Added 
[/opt/huawei/Bigdata/tmp/hivelocaltmp/session_resources/8d2e1845-5254-4021-935e-4e1beb484a72_resources/Multiply.jar]
 to class path
INFO  : Added resources: [hdfs://hacluster/user/Multiply.jar]
INFO  : Completed executing 
command(queryId=omm_20190821115530_03819dba-4c28-46c9-92a6-461cc2762f94); Time 
taken: 0.043 seconds
INFO  : OK
INFO  : Concurrency mode is disabled, not creating a lock manager
No rows affected (0.785 seconds)
0: jdbc:hive2://10.18.98.147:21066/> create temporary function mul3  AS 
'com.huawei.bigdata.hive.example.udf.multiply'  using jar 
'hdfs://hacluster/user/Multiply.jar';
INFO  : Compiling 
command(queryId=omm_20190821115600_26c3076d-857b-45d8-aef2-00118edbb14e): 
create temporary function mul3  AS 
'com.huawei.bigdata.hive.example.udf.multiply'  using jar 
'hdfs://hacluster/user/Multiply.jar'--0; Current 
sessionId=8d2e1845-5254-4021-935e-4e1beb484a72
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO  : Completed compiling 
command(queryId=omm_20190821115600_26c3076d-857b-45d8-aef2-00118edbb14e); Time 
taken: 0.754 seconds
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Executing 
command(queryId=omm_20190821115600_26c3076d-857b-45d8-aef2-00118edbb14e): 
create temporary function mul3  AS 
'com.huawei.bigdata.hive.example.udf.multiply'  using jar 
'hdfs://hacluster/user/Multiply.jar'--0; Current 
sessionId=8d2e1845-5254-4021-935e-4e1beb484a72
INFO  : Starting task [Stage-0:FUNC] in serial mode
INFO  : Added 
[/opt/huawei/Bigdata/tmp/hivelocaltmp/session_resources/8d2e1845-5254-4021-935e-4e1beb484a72_resources/Multiply.jar]
 to class path
INFO  : Added resources: [hdfs://hacluster/user/Multiply.jar]
INFO  : Completed executing 
command(queryId=omm_20190821115600_26c3076d-857b-45d8-aef2-00118edbb14e); Time 
taken: 0.004 seconds
INFO  : OK
INFO  : Concurrency mode is disabled, not creating a lock manager
No rows affected (0.834 seconds)
0: jdbc:hive2://10.18.98.147:21066/> show functions like mul3;
INFO  : Compiling 
command(queryId=omm_20190821115614_3ae7fdd3-04da-4d7b-8fdd-4a50e22491ca): show 
functions like mul3--0; Current sessionId=8d2e1845-5254-4021-935e-4e1beb484a72
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, 
type:string, comment:from deserializer)], properties:null)
INFO  : Completed compiling 
command(queryId=omm_20190821115614_3ae7fdd3-04da-4d7b-8fdd-4a50e22491ca); Time 
taken: 0.136 seconds
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Executing 
command(queryId=omm_20190821115614_3ae7fdd3-04da-4d7b-8fdd-4a50e22491ca): show 
functions like mul3--0; Current sessionId=8d2e1845-5254-4021-935e-4e1beb484a72
INFO  : Starting task [Stage-0:DDL] in serial mode
INFO  : Completed executing 
command(queryId=omm_20190821115614_3ae7fdd3-04da-4d7b-8fdd-4a50e22491ca); Time 
taken: 0.003 seconds
INFO  : OK
INFO  : Concurrency mode is disabled, not creating a lock manager
+-----------+
| tab_name  |
+-----------+
| mul3      |
+-----------+

My concern is this: if a user creates both a permanent and a temporary 
function with the same name, the temporary function overrides the permanent 
one in the current session, so the end user should be made aware that a 
function with the same name already exists.
My suggested solution is one of the following:
1. Do not allow duplicate function names, whether temporary or permanent.
Or
2. Show the end user a message such as "A permanent function with the same 
name already exists; if you create this one, it will be overridden by the 
temporary function."
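
To make the two options concrete, here is a minimal sketch of a toy function 
catalog in Python. It is only an illustration of the shadowing behaviour and 
of the two suggestions, not Spark's or Hive's actual catalog code. The names 
ToyFunctionCatalog, DuplicateFunctionError and on_duplicate are hypothetical 
and exist only for this example.

```python
# Toy model only: this is NOT how Spark or Hive implement their catalogs.
# All class, method and parameter names here are hypothetical.
import warnings


class DuplicateFunctionError(Exception):
    """Raised when a function name is already taken (suggestion 1)."""


class ToyFunctionCatalog:
    def __init__(self, on_duplicate="warn"):
        # "reject" models suggestion 1, "warn" models suggestion 2.
        if on_duplicate not in ("reject", "warn"):
            raise ValueError("on_duplicate must be 'reject' or 'warn'")
        self.on_duplicate = on_duplicate
        self.permanent = {}  # (database, name) -> implementing class name
        self.temporary = {}  # name -> implementing class name (per session)

    def _handle_duplicate(self, message):
        # Either refuse the creation or let it proceed with a warning.
        if self.on_duplicate == "reject":
            raise DuplicateFunctionError(message)
        warnings.warn(message)

    def create_permanent(self, database, name, class_name):
        if name in self.temporary:
            self._handle_duplicate(
                f"Temporary function {name} already exists and will "
                f"override {database}.{name} in this session"
            )
        self.permanent[(database, name)] = class_name

    def create_temporary(self, name, class_name, current_db="default"):
        if (current_db, name) in self.permanent:
            self._handle_duplicate(
                f"Permanent function {current_db}.{name} already exists; "
                f"temporary function {name} will override it in this session"
            )
        self.temporary[name] = class_name

    def lookup(self, name, current_db="default"):
        # Unqualified lookup: a temporary function shadows a permanent one.
        if name in self.temporary:
            return ("temporary", self.temporary[name])
        return ("permanent", self.permanent[(current_db, name)])
```

With on_duplicate="reject", creating a temporary function named mul3 after a 
permanent default.mul3 raises DuplicateFunctionError. With on_duplicate="warn", 
the creation succeeds, a warning is emitted, and an unqualified lookup of mul3 
returns the temporary function.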


> [UDF] Duplicate function creation should not allow 
> ---------------------------------------------------
>
>                 Key: SPARK-28672
>                 URL: https://issues.apache.org/jira/browse/SPARK-28672
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: ABHISHEK KUMAR GUPTA
>            Priority: Minor
>
> {code}
> 0: jdbc:hive2://10.18.18.214:23040/default> create function addm_3  AS 
> 'com.huawei.bigdata.hive.example.udf.multiply' using jar 
> 'hdfs://hacluster/user/Multiply.jar';
> +---------+--+
> | Result  |
> +---------+--+
> +---------+--+
> No rows selected (0.084 seconds)
> {code}
> {code}
> 0: jdbc:hive2://10.18.18.214:23040/default> create temporary function addm_3  
> AS 'com.huawei.bigdata.hive.example.udf.multiply' using jar 
> 'hdfs://hacluster/user/Multiply.jar';
> INFO  : converting to local hdfs://hacluster/user/Multiply.jar
> INFO  : Added 
> [/tmp/8a396308-41f8-4335-9de4-8268ce5c70fe_resources/Multiply.jar] to class 
> path
> INFO  : Added resources: [hdfs://hacluster/user/Multiply.jar]
> +---------+--+
> | Result  |
> +---------+--+
> +---------+--+
> No rows selected (0.134 seconds)
> {code}
> {code}
> 0: jdbc:hive2://10.18.18.214:23040/default> show functions like addm_3;
> +-----------------+--+
> |    function     |
> +-----------------+--+
> | addm_3          |
> | default.addm_3  |
> +-----------------+--+
> 2 rows selected (0.047 seconds)
> {code}
> When SHOW FUNCTIONS is executed, it lists both functions. But which database 
> does the permanent function belong to when the user has not specified one?
> Creating a temporary function with the same name as an existing function 
> should not be allowed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

