[jira] [Commented] (SPARK-26602) Insert into table fails after querying the UDF which is loaded with wrong hdfs path
[ https://issues.apache.org/jira/browse/SPARK-26602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784611#comment-16784611 ]

Ajith S commented on SPARK-26602:
---------------------------------

# I have a question about this issue in the thrift-server case. If an admin runs add jar with a non-existent jar (possibly a human error), it will cause all the ongoing beeline sessions to fail (even a query where the jar is not needed at all), and the only way to recover is to restart the thrift-server.
# As you said, "If a user adds something to the classpath, it matters to the whole classpath. If it's missing, I think it's surprising to ignore that fact". But unless the user refers to the jar, is it OK to fail all of his operations? (Failing only on reference would be just like JVM behaviour.)

Please correct me if I am wrong. cc [~srowen]

> Insert into table fails after querying the UDF which is loaded with wrong hdfs path
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-26602
>                 URL: https://issues.apache.org/jira/browse/SPARK-26602
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Haripriya
>            Priority: Major
>         Attachments: beforeFixUdf.txt
>
>
> In sql,
> 1. Query the existing udf (say myFunc1).
> 2. Create and select the udf registered with an incorrect path (say myFunc2).
> 3. Now query the existing udf again in the same session - it will throw an exception stating that it couldn't read the resource at myFunc2's path.
> 4. Even basic operations like insert and select will fail, giving the same error.
> Result:
> java.lang.RuntimeException: Failed to read external resource hdfs:///tmp/hari_notexists1/two_udfs.jar
>  at org.apache.hadoop.hive.ql.session.SessionState.downloadResource(SessionState.java:1288)
>  at org.apache.hadoop.hive.ql.session.SessionState.resolveAndDownload(SessionState.java:1242)
>  at org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1163)
>  at org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1149)
>  at org.apache.hadoop.hive.ql.processors.AddResourceProcessor.run(AddResourceProcessor.java:67)
>  at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$runHive$1.apply(HiveClientImpl.scala:737)
>  at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$runHive$1.apply(HiveClientImpl.scala:706)
>  at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:275)
>  at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:213)
>  at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:212)
>  at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:258)
>  at org.apache.spark.sql.hive.client.HiveClientImpl.runHive(HiveClientImpl.scala:706)
>  at org.apache.spark.sql.hive.client.HiveClientImpl.runSqlHive(HiveClientImpl.scala:696)
>  at org.apache.spark.sql.hive.client.HiveClientImpl.addJar(HiveClientImpl.scala:841)
>  at org.apache.spark.sql.hive.HiveSessionResourceLoader.addJar(HiveSessionStateBuilder.scala:112)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
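
The comparison to JVM behaviour made in the comment above can be checked in isolation. The sketch below is plain Java, not Spark or Hive code: it builds a `URLClassLoader` whose only entry is a jar that does not exist, and shows that class loading fails only for a class that would have to come from that jar. The local jar path and the class name `com.example.MyFunc2` are hypothetical, chosen purely for illustration.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;

final class MissingJarDemo {

    /** Builds a class loader whose only classpath entry is a jar that does not exist. */
    static URLClassLoader loaderWithMissingJar() {
        // Hypothetical local path; it is assumed not to exist on this machine.
        File missing = new File("/nonexistent-dir/missing-udfs.jar");
        try {
            return new URLClassLoader(new URL[] { missing.toURI().toURL() });
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns true if the named class can be loaded through the given loader. */
    static boolean canLoad(ClassLoader loader, String className) {
        try {
            loader.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        try (URLClassLoader loader = loaderWithMissingJar()) {
            // A class unrelated to the missing jar resolves via the parent loader: prints true.
            System.out.println(canLoad(loader, "java.util.ArrayList"));
            // A (hypothetical) class that could only come from the missing jar: prints false.
            System.out.println(canLoad(loader, "com.example.MyFunc2"));
        }
    }
}
```

Creating the loader does not throw, and unrelated classes keep working; the missing entry only surfaces as a `ClassNotFoundException` when something from it is requested. Whether a SQL session should behave the same way is the question under discussion in this thread.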
[jira] [Commented] (SPARK-26602) Insert into table fails after querying the UDF which is loaded with wrong hdfs path
[ https://issues.apache.org/jira/browse/SPARK-26602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784528#comment-16784528 ]

Sean Owen commented on SPARK-26602:
-----------------------------------

If a user adds something to the classpath, it matters to the whole classpath. If it's missing, I think it's surprising to ignore that fact. Something else will fail eventually. I understand you're saying, what if it doesn't affect some other UDFs? but I'm not sure we can know that. I would not make this change.
[jira] [Commented] (SPARK-26602) Insert into table fails after querying the UDF which is loaded with wrong hdfs path
[ https://issues.apache.org/jira/browse/SPARK-26602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784526#comment-16784526 ]

Chakravarthi commented on SPARK-26602:
--------------------------------------

[~srowen] Agreed, but it should not make any other subsequent query (at least a query which does not refer to that UDF) fail, right? Any insert or select on the existing table itself is failing.

[~ajithshetty] Yes, it makes all subsequent queries fail, not only the queries which refer to that UDF.
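
To make the two positions in this exchange concrete, here is a toy model in Java. It is not Spark or Hive code, and the thread itself does not establish the actual failure mechanism; the class `SessionResourceModel`, its `Policy` enum, and the `hdfs:///udfs/...` paths are all hypothetical. It only contrasts a session that checks every registered resource on each statement with one that checks only the resources a statement refers to.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model only: NOT how Spark or Hive is implemented. */
final class SessionResourceModel {

    /** Which resources are validated when a statement runs. */
    enum Policy { CHECK_ALL_REGISTERED, CHECK_REFERENCED_ONLY }

    private final Policy policy;
    private final Set<String> readablePaths;               // paths that "exist" in this toy file system
    private final List<String> registered = new ArrayList<>();

    SessionResourceModel(Policy policy, Set<String> readablePaths) {
        this.policy = policy;
        this.readablePaths = readablePaths;
    }

    /** Records a resource path in the session without validating it. */
    void register(String path) {
        registered.add(path);
    }

    /** Returns "OK", or a "FAILED ..." message naming the first unreadable resource. */
    String run(String statement, String... referencedResources) {
        List<String> toCheck = (policy == Policy.CHECK_ALL_REGISTERED)
                ? registered
                : Arrays.asList(referencedResources);
        for (String path : toCheck) {
            if (!readablePaths.contains(path)) {
                return "FAILED (" + statement + "): Failed to read external resource " + path;
            }
        }
        return "OK";
    }

    public static void main(String[] args) {
        Set<String> fs = new HashSet<>(Arrays.asList("hdfs:///udfs/good.jar"));
        for (Policy policy : Policy.values()) {
            SessionResourceModel session = new SessionResourceModel(policy, fs);
            session.register("hdfs:///udfs/good.jar");
            session.register("hdfs:///udfs/missing.jar");   // the mistyped path
            // A plain insert refers to no UDF resource at all.
            System.out.println(policy + ": " + session.run("INSERT INTO t VALUES (1)"));
        }
    }
}
```

Under `CHECK_ALL_REGISTERED` the plain insert fails because of the unrelated missing jar, which matches the symptom reported here; under `CHECK_REFERENCED_ONLY` it succeeds and only a statement that names the missing jar fails.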
[jira] [Commented] (SPARK-26602) Insert into table fails after querying the UDF which is loaded with wrong hdfs path
[ https://issues.apache.org/jira/browse/SPARK-26602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784506#comment-16784506 ]

Ajith S commented on SPARK-26602:
---------------------------------

[~chakravarthi] Hi, thanks for reporting the issue. From your example, it looks like a missing jar will also cause any subsequent SQL statements (*statements which do not refer to the UDF*) to fail in this session. Right? cc [~srowen]
[jira] [Commented] (SPARK-26602) Insert into table fails after querying the UDF which is loaded with wrong hdfs path
[ https://issues.apache.org/jira/browse/SPARK-26602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784446#comment-16784446 ]

Sean Owen commented on SPARK-26602:
-----------------------------------

That sounds like user error. I'd close this as NotAProblem. It will cause a less-clear error later anyway