[jira] [Commented] (SPARK-29604) SessionState is initialized with isolated classloader for Hive if spark.sql.hive.metastore.jars is being set
[ https://issues.apache.org/jira/browse/SPARK-29604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964137#comment-16964137 ] Dongjoon Hyun commented on SPARK-29604: --- Thank you for keeping working on this. Yes. It passed locally. That's the reason why I didn't revert this patch until now. But, we are on 3.0.0-preview voting. In the worst case, we need to revert this. > SessionState is initialized with isolated classloader for Hive if > spark.sql.hive.metastore.jars is being set > > > Key: SPARK-29604 > URL: https://issues.apache.org/jira/browse/SPARK-29604 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.4.4, 3.0.0 >Reporter: Jungtaek Lim >Assignee: Jungtaek Lim >Priority: Major > Fix For: 2.4.5, 3.0.0 > > > I've observed the issue that external listeners cannot be loaded properly > when we run spark-sql with "spark.sql.hive.metastore.jars" configuration > being used. > {noformat} > Exception in thread "main" java.lang.IllegalArgumentException: Error while > instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder': > at > org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1102) > at > org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:154) > at > org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:153) > at scala.Option.getOrElse(Option.scala:121) > at > org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:153) > at > org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:150) > at > org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$2.apply(SparkSession.scala:104) > at > org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$2.apply(SparkSession.scala:104) > at scala.Option.map(Option.scala:146) > at > org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:104) > at > org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:103) > at org.apache.spark.sql.internal.SQLConf$.get(SQLConf.scala:149) > at > org.apache.spark.sql.hive.client.HiveClientImpl.org$apache$spark$sql$hive$client$HiveClientImpl$$client(HiveClientImpl.scala:282) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:306) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:247) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:246) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:296) > at > org.apache.spark.sql.hive.client.HiveClientImpl.databaseExists(HiveClientImpl.scala:386) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97) > at > org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:214) > at > org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:114) > at > org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:102) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:53) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.(SparkSQLCLIDriver.scala:315) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:166) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) > at > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:847) > at > org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161) > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184) >
[jira] [Commented] (SPARK-29604) SessionState is initialized with isolated classloader for Hive if spark.sql.hive.metastore.jars is being set
[ https://issues.apache.org/jira/browse/SPARK-29604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963737#comment-16963737 ] Jungtaek Lim commented on SPARK-29604: -- I've manually ran the test suite locally (single run) and it passed 3 times sequentially. {code} $ java -version openjdk version "11.0.2" 2019-01-15 OpenJDK Runtime Environment 18.9 (build 11.0.2+9) OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode) {code} {code} $ build/sbt "hive-thriftserver/testOnly *.SparkSQLEnvSuite" -Phadoop-3.2 -Phive-thriftserver ... [info] SparkSQLEnvSuite: WARNING: An illegal reflective access operation has occurred WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/Users/jlim/WorkArea/ScalaProjects/spark/common/unsafe/target/scala-2.12/spark-unsafe_2.12-3.0.0-SNAPSHOT.jar) to constructor java.nio.DirectByteBuffer(long,int) WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations WARNING: All illegal access operations will be denied in a future release [info] - SPARK-29604 external listeners should be initialized with Spark classloader (2 minutes, 26 seconds) [info] ScalaTest [info] Run completed in 2 minutes, 30 seconds. [info] Total number of tests run: 1 [info] Suites: completed 1, aborted 0 [info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0 [info] All tests passed. [info] Passed: Total 1, Failed 0, Errors 0, Passed 1 [success] Total time: 319 s, completed Oct 31, 2019, 4:30:34 PM {code} Maybe I should add this suite to `testsWhichShouldRunInTheirOwnDedicatedJvm` - I cannot find any other way to isolate the test suite. > SessionState is initialized with isolated classloader for Hive if > spark.sql.hive.metastore.jars is being set > > > Key: SPARK-29604 > URL: https://issues.apache.org/jira/browse/SPARK-29604 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.4.4, 3.0.0 >Reporter: Jungtaek Lim >Assignee: Jungtaek Lim >Priority: Major > Fix For: 2.4.5, 3.0.0 > > > I've observed the issue that external listeners cannot be loaded properly > when we run spark-sql with "spark.sql.hive.metastore.jars" configuration > being used. > {noformat} > Exception in thread "main" java.lang.IllegalArgumentException: Error while > instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder': > at > org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1102) > at > org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:154) > at > org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:153) > at scala.Option.getOrElse(Option.scala:121) > at > org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:153) > at > org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:150) > at > org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$2.apply(SparkSession.scala:104) > at > org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$2.apply(SparkSession.scala:104) > at scala.Option.map(Option.scala:146) > at > org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:104) > at > org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:103) > at org.apache.spark.sql.internal.SQLConf$.get(SQLConf.scala:149) > at > org.apache.spark.sql.hive.client.HiveClientImpl.org$apache$spark$sql$hive$client$HiveClientImpl$$client(HiveClientImpl.scala:282) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:306) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:247) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:246) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:296) > at > org.apache.spark.sql.hive.client.HiveClientImpl.databaseExists(HiveClientImpl.scala:386) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97) > at >
[jira] [Commented] (SPARK-29604) SessionState is initialized with isolated classloader for Hive if spark.sql.hive.metastore.jars is being set
[ https://issues.apache.org/jira/browse/SPARK-29604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963693#comment-16963693 ] Dongjoon Hyun commented on SPARK-29604: --- I don't know. It makes sense. In addition to that, according to the log, if some test fails, this test suite seems to fail together. - https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-3.2/682/testReport/junit/org.apache.spark.sql.hive.thriftserver/SparkSQLEnvSuite/SPARK_29604_external_listeners_should_be_initialized_with_Spark_classloader/history/ please see `SBT Hadoop-3.2` Jenkins job. > SessionState is initialized with isolated classloader for Hive if > spark.sql.hive.metastore.jars is being set > > > Key: SPARK-29604 > URL: https://issues.apache.org/jira/browse/SPARK-29604 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.4.4, 3.0.0 >Reporter: Jungtaek Lim >Assignee: Jungtaek Lim >Priority: Major > Fix For: 2.4.5, 3.0.0 > > > I've observed the issue that external listeners cannot be loaded properly > when we run spark-sql with "spark.sql.hive.metastore.jars" configuration > being used. > {noformat} > Exception in thread "main" java.lang.IllegalArgumentException: Error while > instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder': > at > org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1102) > at > org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:154) > at > org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:153) > at scala.Option.getOrElse(Option.scala:121) > at > org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:153) > at > org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:150) > at > org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$2.apply(SparkSession.scala:104) > at > org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$2.apply(SparkSession.scala:104) > at scala.Option.map(Option.scala:146) > at > org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:104) > at > org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:103) > at org.apache.spark.sql.internal.SQLConf$.get(SQLConf.scala:149) > at > org.apache.spark.sql.hive.client.HiveClientImpl.org$apache$spark$sql$hive$client$HiveClientImpl$$client(HiveClientImpl.scala:282) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:306) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:247) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:246) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:296) > at > org.apache.spark.sql.hive.client.HiveClientImpl.databaseExists(HiveClientImpl.scala:386) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97) > at > org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:214) > at > org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:114) > at > org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:102) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:53) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.(SparkSQLCLIDriver.scala:315) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:166) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) > at >
[jira] [Commented] (SPARK-29604) SessionState is initialized with isolated classloader for Hive if spark.sql.hive.metastore.jars is being set
[ https://issues.apache.org/jira/browse/SPARK-29604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963476#comment-16963476 ] Jungtaek Lim commented on SPARK-29604: -- [~dongjoon] Do we have any annotation/trait to "isolate" running test suite? I'm suspecting session, or listeners in session is being modified from other tests running concurrently. > SessionState is initialized with isolated classloader for Hive if > spark.sql.hive.metastore.jars is being set > > > Key: SPARK-29604 > URL: https://issues.apache.org/jira/browse/SPARK-29604 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.4.4, 3.0.0 >Reporter: Jungtaek Lim >Assignee: Jungtaek Lim >Priority: Major > Fix For: 2.4.5, 3.0.0 > > > I've observed the issue that external listeners cannot be loaded properly > when we run spark-sql with "spark.sql.hive.metastore.jars" configuration > being used. > {noformat} > Exception in thread "main" java.lang.IllegalArgumentException: Error while > instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder': > at > org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1102) > at > org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:154) > at > org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:153) > at scala.Option.getOrElse(Option.scala:121) > at > org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:153) > at > org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:150) > at > org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$2.apply(SparkSession.scala:104) > at > org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$2.apply(SparkSession.scala:104) > at scala.Option.map(Option.scala:146) > at > org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:104) > at > org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:103) > at org.apache.spark.sql.internal.SQLConf$.get(SQLConf.scala:149) > at > org.apache.spark.sql.hive.client.HiveClientImpl.org$apache$spark$sql$hive$client$HiveClientImpl$$client(HiveClientImpl.scala:282) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:306) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:247) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:246) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:296) > at > org.apache.spark.sql.hive.client.HiveClientImpl.databaseExists(HiveClientImpl.scala:386) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97) > at > org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:214) > at > org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:114) > at > org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:102) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:53) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.(SparkSQLCLIDriver.scala:315) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:166) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) > at > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:847) > at > org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161) > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184) > at
[jira] [Commented] (SPARK-29604) SessionState is initialized with isolated classloader for Hive if spark.sql.hive.metastore.jars is being set
[ https://issues.apache.org/jira/browse/SPARK-29604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963264#comment-16963264 ] Dongjoon Hyun commented on SPARK-29604: --- Thank you so much for confirming, [~kabhwan]! The newly added test case seems to be flaky in `SBT Hadoop 3.2` build. Could you check that? - https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-3.2/676/testReport/org.apache.spark.sql.hive.thriftserver/SparkSQLEnvSuite/SPARK_29604_external_listeners_should_be_initialized_with_Spark_classloader/history/ > SessionState is initialized with isolated classloader for Hive if > spark.sql.hive.metastore.jars is being set > > > Key: SPARK-29604 > URL: https://issues.apache.org/jira/browse/SPARK-29604 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.4.4, 3.0.0 >Reporter: Jungtaek Lim >Assignee: Jungtaek Lim >Priority: Major > Fix For: 2.4.5, 3.0.0 > > > I've observed the issue that external listeners cannot be loaded properly > when we run spark-sql with "spark.sql.hive.metastore.jars" configuration > being used. > {noformat} > Exception in thread "main" java.lang.IllegalArgumentException: Error while > instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder': > at > org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1102) > at > org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:154) > at > org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:153) > at scala.Option.getOrElse(Option.scala:121) > at > org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:153) > at > org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:150) > at > org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$2.apply(SparkSession.scala:104) > at > org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$2.apply(SparkSession.scala:104) > at scala.Option.map(Option.scala:146) > at > org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:104) > at > org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:103) > at org.apache.spark.sql.internal.SQLConf$.get(SQLConf.scala:149) > at > org.apache.spark.sql.hive.client.HiveClientImpl.org$apache$spark$sql$hive$client$HiveClientImpl$$client(HiveClientImpl.scala:282) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:306) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:247) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:246) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:296) > at > org.apache.spark.sql.hive.client.HiveClientImpl.databaseExists(HiveClientImpl.scala:386) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97) > at > org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:214) > at > org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:114) > at > org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:102) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:53) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.(SparkSQLCLIDriver.scala:315) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:166) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) > at >
[jira] [Commented] (SPARK-29604) SessionState is initialized with isolated classloader for Hive if spark.sql.hive.metastore.jars is being set
[ https://issues.apache.org/jira/browse/SPARK-29604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963000#comment-16963000 ] Jungtaek Lim commented on SPARK-29604: -- I think it doesn't apply to branch-2.3 as the root issue is more alike lazy initialization of streaming query listeners and there's no configuration for registering streaming query listeners in Spark 2.3. (only API) > SessionState is initialized with isolated classloader for Hive if > spark.sql.hive.metastore.jars is being set > > > Key: SPARK-29604 > URL: https://issues.apache.org/jira/browse/SPARK-29604 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.4.4, 3.0.0 >Reporter: Jungtaek Lim >Assignee: Jungtaek Lim >Priority: Major > Fix For: 2.4.5, 3.0.0 > > > I've observed the issue that external listeners cannot be loaded properly > when we run spark-sql with "spark.sql.hive.metastore.jars" configuration > being used. > {noformat} > Exception in thread "main" java.lang.IllegalArgumentException: Error while > instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder': > at > org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1102) > at > org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:154) > at > org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:153) > at scala.Option.getOrElse(Option.scala:121) > at > org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:153) > at > org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:150) > at > org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$2.apply(SparkSession.scala:104) > at > org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$2.apply(SparkSession.scala:104) > at scala.Option.map(Option.scala:146) > at > org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:104) > at > org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:103) > at org.apache.spark.sql.internal.SQLConf$.get(SQLConf.scala:149) > at > org.apache.spark.sql.hive.client.HiveClientImpl.org$apache$spark$sql$hive$client$HiveClientImpl$$client(HiveClientImpl.scala:282) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:306) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:247) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:246) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:296) > at > org.apache.spark.sql.hive.client.HiveClientImpl.databaseExists(HiveClientImpl.scala:386) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97) > at > org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:214) > at > org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:114) > at > org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:102) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:53) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.(SparkSQLCLIDriver.scala:315) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:166) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) > at > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:847) > at > org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161) > at
[jira] [Commented] (SPARK-29604) SessionState is initialized with isolated classloader for Hive if spark.sql.hive.metastore.jars is being set
[ https://issues.apache.org/jira/browse/SPARK-29604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962859#comment-16962859 ] Dongjoon Hyun commented on SPARK-29604: --- BTW, [~kabhwan]. Could you check the old version (at least `2.3.x`) and update `Affects Version/s:`, too? > SessionState is initialized with isolated classloader for Hive if > spark.sql.hive.metastore.jars is being set > > > Key: SPARK-29604 > URL: https://issues.apache.org/jira/browse/SPARK-29604 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.4.4, 3.0.0 >Reporter: Jungtaek Lim >Assignee: Jungtaek Lim >Priority: Major > Fix For: 2.4.5, 3.0.0 > > > I've observed the issue that external listeners cannot be loaded properly > when we run spark-sql with "spark.sql.hive.metastore.jars" configuration > being used. > {noformat} > Exception in thread "main" java.lang.IllegalArgumentException: Error while > instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder': > at > org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1102) > at > org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:154) > at > org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:153) > at scala.Option.getOrElse(Option.scala:121) > at > org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:153) > at > org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:150) > at > org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$2.apply(SparkSession.scala:104) > at > org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$2.apply(SparkSession.scala:104) > at scala.Option.map(Option.scala:146) > at > org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:104) > at > org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:103) > at org.apache.spark.sql.internal.SQLConf$.get(SQLConf.scala:149) > at > org.apache.spark.sql.hive.client.HiveClientImpl.org$apache$spark$sql$hive$client$HiveClientImpl$$client(HiveClientImpl.scala:282) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:306) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:247) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:246) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:296) > at > org.apache.spark.sql.hive.client.HiveClientImpl.databaseExists(HiveClientImpl.scala:386) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97) > at > org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:214) > at > org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:114) > at > org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:102) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:53) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.(SparkSQLCLIDriver.scala:315) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:166) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) > at > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:847) > at > org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161) > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184) > at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86) > at >
[jira] [Commented] (SPARK-29604) SessionState is initialized with isolated classloader for Hive if spark.sql.hive.metastore.jars is being set
[ https://issues.apache.org/jira/browse/SPARK-29604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962851#comment-16962851 ] Dongjoon Hyun commented on SPARK-29604: --- This lands at `branch-2.4` via https://github.com/apache/spark/pull/26316 . > SessionState is initialized with isolated classloader for Hive if > spark.sql.hive.metastore.jars is being set > > > Key: SPARK-29604 > URL: https://issues.apache.org/jira/browse/SPARK-29604 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.4.4, 3.0.0 >Reporter: Jungtaek Lim >Assignee: Jungtaek Lim >Priority: Major > Fix For: 2.4.5, 3.0.0 > > > I've observed the issue that external listeners cannot be loaded properly > when we run spark-sql with "spark.sql.hive.metastore.jars" configuration > being used. > {noformat} > Exception in thread "main" java.lang.IllegalArgumentException: Error while > instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder': > at > org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1102) > at > org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:154) > at > org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:153) > at scala.Option.getOrElse(Option.scala:121) > at > org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:153) > at > org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:150) > at > org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$2.apply(SparkSession.scala:104) > at > org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$2.apply(SparkSession.scala:104) > at scala.Option.map(Option.scala:146) > at > org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:104) > at > org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:103) > at org.apache.spark.sql.internal.SQLConf$.get(SQLConf.scala:149) > at > org.apache.spark.sql.hive.client.HiveClientImpl.org$apache$spark$sql$hive$client$HiveClientImpl$$client(HiveClientImpl.scala:282) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:306) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:247) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:246) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:296) > at > org.apache.spark.sql.hive.client.HiveClientImpl.databaseExists(HiveClientImpl.scala:386) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97) > at > org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:214) > at > org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:114) > at > org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:102) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:53) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.(SparkSQLCLIDriver.scala:315) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:166) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) > at > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:847) > at > org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161) > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184) > at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86) > at >
[jira] [Commented] (SPARK-29604) SessionState is initialized with isolated classloader for Hive if spark.sql.hive.metastore.jars is being set
[ https://issues.apache.org/jira/browse/SPARK-29604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959711#comment-16959711 ] Jungtaek Lim commented on SPARK-29604: -- I've figured out the root cause and have a patch. Will submit a patch soon. I may need some more time to craft a relevant test. > SessionState is initialized with isolated classloader for Hive if > spark.sql.hive.metastore.jars is being set > > > Key: SPARK-29604 > URL: https://issues.apache.org/jira/browse/SPARK-29604 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.4.4, 3.0.0 >Reporter: Jungtaek Lim >Priority: Major > > I've observed the issue that external listeners cannot be loaded properly > when we run spark-sql with "spark.sql.hive.metastore.jars" configuration > being used. > {noformat} > Exception in thread "main" java.lang.IllegalArgumentException: Error while > instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder': > at > org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1102) > at > org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:154) > at > org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:153) > at scala.Option.getOrElse(Option.scala:121) > at > org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:153) > at > org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:150) > at > org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$2.apply(SparkSession.scala:104) > at > org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$2.apply(SparkSession.scala:104) > at scala.Option.map(Option.scala:146) > at > org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:104) > at > org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:103) > at org.apache.spark.sql.internal.SQLConf$.get(SQLConf.scala:149) > at > org.apache.spark.sql.hive.client.HiveClientImpl.org$apache$spark$sql$hive$client$HiveClientImpl$$client(HiveClientImpl.scala:282) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:306) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:247) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:246) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:296) > at > org.apache.spark.sql.hive.client.HiveClientImpl.databaseExists(HiveClientImpl.scala:386) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215) > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97) > at > org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:214) > at > org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:114) > at > org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:102) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:53) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.(SparkSQLCLIDriver.scala:315) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:166) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) > at > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:847) > at > org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161) > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184) > at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86) > at >