[jira] [Comment Edited] (SPARK-26602) Insert into table fails after querying the UDF which is loaded with wrong hdfs path
[ https://issues.apache.org/jira/browse/SPARK-26602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784611#comment-16784611 ]

Ajith S edited comment on SPARK-26602 at 3/5/19 4:15 PM:
-
# I have a question about this issue in the thrift-server case. If an admin does an ADD JAR with a non-existent jar (perhaps a human error), it causes all ongoing beeline sessions to fail, even for queries that do not need the jar at all, and the only way to recover is to restart the thrift-server.
# As you said, "If a user adds something to the classpath, it matters to the whole classpath. If it's missing, I think it's surprising to ignore that fact" - but unless the user actually refers to the jar, is it OK to fail all of their operations? (This is just like JVM behaviour: we get a ClassNotFoundException only when the missing class is actually referenced; until then the JVM keeps running happily.)

Please correct me if I am wrong. cc [~srowen]

was (Author: ajithshetty):
# I have a question about this issue in the thrift-server case. If an admin does an ADD JAR with a non-existent jar (perhaps a human error), it causes all ongoing beeline sessions to fail, even for queries that do not need the jar at all, and the only way to recover is to restart the thrift-server.
# As you said, "If a user adds something to the classpath, it matters to the whole classpath. If it's missing, I think it's surprising to ignore that fact" - but unless the user actually refers to the jar, is it OK to fail all of their operations? (just like JVM behaviour)

Please correct me if I am wrong. cc [~srowen]

> Insert into table fails after querying the UDF which is loaded with wrong
> hdfs path
> ---
>
> Key: SPARK-26602
> URL: https://issues.apache.org/jira/browse/SPARK-26602
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.4.0
> Reporter: Haripriya
> Priority: Major
> Attachments: beforeFixUdf.txt
>
> In SQL:
> 1. Query the existing UDF (say myFunc1).
> 2. Create and select the UDF registered with an incorrect path (say myFunc2).
> 3. Now query the existing UDF again in the same session - it will throw an exception stating that the resource at myFunc2's path couldn't be read.
> 4. Even basic operations like insert and select will fail with the same error.
>
> Result:
> java.lang.RuntimeException: Failed to read external resource hdfs:///tmp/hari_notexists1/two_udfs.jar
> at org.apache.hadoop.hive.ql.session.SessionState.downloadResource(SessionState.java:1288)
> at org.apache.hadoop.hive.ql.session.SessionState.resolveAndDownload(SessionState.java:1242)
> at org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1163)
> at org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1149)
> at org.apache.hadoop.hive.ql.processors.AddResourceProcessor.run(AddResourceProcessor.java:67)
> at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$runHive$1.apply(HiveClientImpl.scala:737)
> at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$runHive$1.apply(HiveClientImpl.scala:706)
> at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:275)
> at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:213)
> at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:212)
> at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:258)
> at org.apache.spark.sql.hive.client.HiveClientImpl.runHive(HiveClientImpl.scala:706)
> at org.apache.spark.sql.hive.client.HiveClientImpl.runSqlHive(HiveClientImpl.scala:696)
> at org.apache.spark.sql.hive.client.HiveClientImpl.addJar(HiveClientImpl.scala:841)
> at org.apache.spark.sql.hive.HiveSessionResourceLoader.addJar(HiveSessionStateBuilder.scala:112)

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
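The JVM analogy in the comment above can be made concrete: class loading is lazy, so a missing class breaks nothing until the moment it is first referenced. A minimal sketch (the class name com.example.Missing is a made-up placeholder, not anything from the issue):

```scala
object LazyLoadDemo {
  def main(args: Array[String]): Unit = {
    // The JVM starts and runs happily even though "com.example.Missing"
    // is nowhere on the classpath ...
    println("JVM running fine so far")

    // ... and fails only at the point where the missing class is
    // actually referenced.
    try {
      Class.forName("com.example.Missing")
    } catch {
      case e: ClassNotFoundException =>
        println(s"Only now do we see: ${e.getMessage}")
    }
  }
}
```

The comment is arguing that ADD JAR could plausibly behave the same way: record the resource, and surface the failure only when a query actually needs it, rather than failing every subsequent statement in the session.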
[jira] [Comment Edited] (SPARK-26602) Insert into table fails after querying the UDF which is loaded with wrong hdfs path
[ https://issues.apache.org/jira/browse/SPARK-26602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784134#comment-16784134 ]

Chakravarthi edited comment on SPARK-26602 at 3/5/19 2:31 PM:
--
Hi [~srowen], this issue is not a duplicate of SPARK-26560. The issue here is that an insert into a table fails after querying a UDF that was loaded with a wrong hdfs path. Steps to reproduce:

1) Create a table:
sql("create table table1(I int)")
2) Create a UDF using an invalid hdfs path:
sql("CREATE FUNCTION before_fix AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDFLastDayTest' USING JAR 'hdfs:///tmp/notexist.jar'")
3) Select with the UDF; you will get the exception "Failed to read external resource":
sql("select before_fix('2018-03-09')")
4) Perform an insert into, or a select on, any table - it will fail:
sql("insert into table1 values(1)").show
sql("select * from table1").show

The insert should work here, but it fails.

was (Author: chakravarthi):
Hi [~srowen], this issue is not a duplicate of SPARK-26560. The issue here is that an insert into a table fails after querying a UDF that was loaded with a wrong hdfs path. Steps to reproduce:

1) Create a table:
sql("create table table1(I int)")
2) Create a UDF using an invalid hdfs path:
sql("CREATE FUNCTION before_fix AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDFLastDayTest' USING JAR 'hdfs:///tmp/notexist.jar'")
3) Select with the UDF; you will get the exception "Failed to read external resource":
sql("select before_fix('2018-03-09')")
4) Perform an insert into the table - it will fail:
sql("insert into table1 values(1)").show

The insert should work here, but it fails.
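For convenience, the four steps above can be collected into one spark-shell sequence. This is only a sketch built from the reporter's own sql() calls: it assumes a Hive-enabled SparkSession named `spark` on an affected 2.4.0 build, and that hdfs:///tmp/notexist.jar genuinely does not exist.

```scala
// Repro sketch for SPARK-26602, assuming a Hive-enabled SparkSession `spark`
// (e.g. inside spark-shell on an affected Spark 2.4.0 deployment).

// 1) Create a plain table.
spark.sql("create table table1(i int)")

// 2) Register a UDF whose jar path does not exist on HDFS.
spark.sql("CREATE FUNCTION before_fix AS " +
  "'org.apache.hadoop.hive.ql.udf.generic.GenericUDFLastDayTest' " +
  "USING JAR 'hdfs:///tmp/notexist.jar'")

// 3) The first use of the UDF fails with "Failed to read external resource".
//    That failure is expected, so catch it and continue the session.
try {
  spark.sql("select before_fix('2018-03-09')").show()
} catch {
  case e: Exception => println(s"Expected failure: ${e.getMessage}")
}

// 4) The bug: after step 3, even unrelated statements on table1 fail with
//    the same "Failed to read external resource" error for the missing jar.
spark.sql("insert into table1 values(1)")
spark.sql("select * from table1").show()
```

Step 4 is the core of the report: neither statement references before_fix or its jar, yet both fail until the thrift-server/session is restarted.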
[jira] [Comment Edited] (SPARK-26602) Insert into table fails after querying the UDF which is loaded with wrong hdfs path
[ https://issues.apache.org/jira/browse/SPARK-26602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784134#comment-16784134 ]

Chakravarthi edited comment on SPARK-26602 at 3/5/19 1:52 PM:
--
Hi [~srowen], this issue is not a duplicate of SPARK-26560. The issue here is that an insert into a table fails after querying a UDF that was loaded with a wrong hdfs path. Steps to reproduce:

1) Create a table:
sql("create table table1(I int)")
2) Create a UDF using an invalid hdfs path:
sql("CREATE FUNCTION before_fix AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDFLastDayTest' USING JAR 'hdfs:///tmp/notexist.jar'")
3) Select with the UDF; you will get the exception "Failed to read external resource":
sql("select before_fix('2018-03-09')")
4) Perform an insert into the table - it will fail:
sql("insert into table1 values(1)").show

The insert should work here, but it fails.

was (Author: chakravarthi):
Hi [~srowen], this issue is not a duplicate of SPARK-26560. The issue here is that an insert into a table fails after querying a UDF that was loaded with a wrong hdfs path. Steps to reproduce:

1) Create a table:
sql("create table check_udf(I int)")
2) Create a UDF using an invalid hdfs path:
sql("CREATE FUNCTION before_fix AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDFLastDayTest' USING JAR 'hdfs:///tmp/notexist.jar'")
3) Select with the UDF; you will get the exception "Failed to read external resource":
sql("select before_fix('2018-03-09')")
4) Perform an insert into the table - it will fail:
sql("insert into check_udf values(1)").show

The insert should work here, but it fails.