[jira] [Comment Edited] (SPARK-26602) Insert into table fails after querying the UDF which is loaded with wrong hdfs path

2019-03-05 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784611#comment-16784611
 ] 

Ajith S edited comment on SPARK-26602 at 3/5/19 4:15 PM:
-

# I have a question about this issue in the thrift-server case. If an admin runs 
ADD JAR with a non-existent jar (perhaps through human error), all the ongoing 
beeline sessions fail (even queries that do not need the jar at all), and the 
only way to recover is to restart the thrift-server.
 # As you said, "If a user adds something to the classpath, it matters to the 
whole classpath. If it's missing, I think it's surprising to ignore that fact" 
- but unless the user refers to the jar, is it OK to fail all of their 
operations? (This is like the JVM's behaviour: we get a ClassNotFoundException 
only when the missing class is actually referenced; until then the JVM runs 
happily.)
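
To illustrate the JVM behaviour referred to above, here is a minimal sketch in plain Java, outside Spark and Hive. It assumes that the jar path /tmp/notexist-demo.jar does not exist and that com.example.MissingUdf is a hypothetical class name that is not on the classpath; neither comes from this issue. It only shows how a URLClassLoader treats a missing jar, not how Spark's session resource loader works.

```java
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

class LazyJarDemo {
    // Hypothetical path, assumed not to exist on the local file system.
    static final String MISSING_JAR = "/tmp/notexist-demo.jar";

    /** Returns true if className loads through a loader whose only jar is missing. */
    static boolean canLoad(String className) throws IOException {
        URL jarUrl = new File(MISSING_JAR).toURI().toURL();
        // Constructing the loader does not open the jar, so nothing fails here.
        try (URLClassLoader loader = new URLClassLoader(new URL[] {jarUrl})) {
            loader.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            // Raised only when a class that is not available elsewhere is requested.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // A class that does not depend on the missing jar still loads.
        System.out.println("java.lang.String loads: " + canLoad("java.lang.String"));
        // Only the class that would have come from the jar fails to load.
        System.out.println("com.example.MissingUdf loads: " + canLoad("com.example.MissingUdf"));
    }
}
```

In this sketch the missing jar affects only the lookup that needs it, which is the behaviour the question above contrasts with failing every operation in the session.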

Please correct me if I am wrong. cc [~srowen]


> Insert into table fails after querying the UDF which is loaded with wrong 
> hdfs path
> ---
>
> Key: SPARK-26602
> URL: https://issues.apache.org/jira/browse/SPARK-26602
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.4.0
> Reporter: Haripriya
> Priority: Major
> Attachments: beforeFixUdf.txt
>
>
> In SQL:
> 1. Query an existing UDF (say myFunc1).
> 2. Create and select a UDF registered with an incorrect path (say myFunc2).
> 3. Query the existing UDF again in the same session. It will throw an 
> exception stating that the resource at myFunc2's path couldn't be read.
> 4. Even basic operations like insert and select will fail with the same 
> error.
> Result: 
> java.lang.RuntimeException: Failed to read external resource hdfs:///tmp/hari_notexists1/two_udfs.jar
>  at org.apache.hadoop.hive.ql.session.SessionState.downloadResource(SessionState.java:1288)
>  at org.apache.hadoop.hive.ql.session.SessionState.resolveAndDownload(SessionState.java:1242)
>  at org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1163)
>  at org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1149)
>  at org.apache.hadoop.hive.ql.processors.AddResourceProcessor.run(AddResourceProcessor.java:67)
>  at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$runHive$1.apply(HiveClientImpl.scala:737)
>  at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$runHive$1.apply(HiveClientImpl.scala:706)
>  at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:275)
>  at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:213)
>  at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:212)
>  at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:258)
>  at org.apache.spark.sql.hive.client.HiveClientImpl.runHive(HiveClientImpl.scala:706)
>  at org.apache.spark.sql.hive.client.HiveClientImpl.runSqlHive(HiveClientImpl.scala:696)
>  at org.apache.spark.sql.hive.client.HiveClientImpl.addJar(HiveClientImpl.scala:841)
>  at org.apache.spark.sql.hive.HiveSessionResourceLoader.addJar(HiveSessionStateBuilder.scala:112)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-26602) Insert into table fails after querying the UDF which is loaded with wrong hdfs path

2019-03-05 Thread Chakravarthi (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784134#comment-16784134
 ] 

Chakravarthi edited comment on SPARK-26602 at 3/5/19 2:31 PM:
--

Hi [~srowen], this issue is not a duplicate of SPARK-26560. The issue here is 
that insert into a table fails after querying a UDF that was loaded with a 
wrong HDFS path.

Below are the steps to reproduce this issue:

1) Create a table.
sql("create table table1(I int)")

2) Create a UDF using an invalid HDFS path.
sql("CREATE FUNCTION before_fix AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDFLastDayTest' USING JAR 'hdfs:///tmp/notexist.jar'")

3) Run a select on the UDF; you will get the exception "Failed to read external 
resource".
sql("select before_fix('2018-03-09')")

4) Perform an insert or a select on any table. It will fail.
sql("insert into table1 values(1)").show
sql("select * from table1").show

Here, the insert should work, but it fails.


[jira] [Comment Edited] (SPARK-26602) Insert into table fails after querying the UDF which is loaded with wrong hdfs path

2019-03-05 Thread Chakravarthi (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784134#comment-16784134
 ] 

Chakravarthi edited comment on SPARK-26602 at 3/5/19 1:52 PM:
--

Hi [~srowen], this issue is not a duplicate of SPARK-26560. The issue here is 
that insert into a table fails after querying a UDF that was loaded with a 
wrong HDFS path.

Below are the steps to reproduce this issue:

1) Create a table.
sql("create table table1(I int)")

2) Create a UDF using an invalid HDFS path.
sql("CREATE FUNCTION before_fix AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDFLastDayTest' USING JAR 'hdfs:///tmp/notexist.jar'")

3) Run a select on the UDF; you will get the exception "Failed to read external 
resource".
sql("select before_fix('2018-03-09')")

4) Perform an insert into the table.
sql("insert into table1 values(1)").show

Here, the insert should work, but it fails.
