[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16929916#comment-16929916 ] feiwang commented on SPARK-29037: - [~advancedxy] Hi, I found that even with dynamicPartitionOverwrite, Spark may still give duplicate results for the case below, and I have created a pull request: https://github.com/apache/spark/pull/25795 Case: Application appA runs an insert overwrite of table table_a with static partition overwrite, but it is killed while committing tasks because one task hangs, and part of its committed task output is left under /path/table_a/_temporary/0/. Then we run application appB, which runs an insert overwrite of table table_a with dynamic partition overwrite. It executes successfully, but it also commits the leftover data under /path/table_a/_temporary/0/ to the destination dir. > [Core] Spark gives duplicate result when an application was killed and rerun > > > Key: SPARK-29037 > URL: https://issues.apache.org/jira/browse/SPARK-29037 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.1.0, 2.3.3 >Reporter: feiwang >Priority: Major > Attachments: screenshot-1.png > > > When we insert overwrite a partition of a table: > for a stage whose tasks commit output, each task first saves its output to a staging dir; when a task completes, it saves its output to committedTaskPath; when all tasks of the stage succeed, all task output under committedTaskPath > is moved to the destination dir. > However, when we kill an application that is committing tasks' output, part of the tasks' results is left in committedTaskPath and is not > cleaned up gracefully. > Then we rerun this application, and the new application reuses this > committedTaskPath dir. > When the task commit stage of the new application succeeds, all task output under this committedTaskPath, which contains part of the old application's task output, is moved to the destination dir and the result is duplicated. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
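To make the failure mode concrete, here is a minimal reproduction sketch (hedged: the table name, partition column, and staging path are illustrative, and it assumes a partitioned table whose commit protocol stages output under /path/table_a/_temporary/0/):
{code:scala}
// appA: static partition overwrite; the application is killed mid-commit,
// leaving committed task output under /path/table_a/_temporary/0/.
spark.sql("INSERT OVERWRITE TABLE table_a PARTITION (dt='2019-09-15') SELECT value FROM src")

// appB: dynamic partition overwrite on the same table; it completes successfully.
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")
spark.sql("INSERT OVERWRITE TABLE table_a PARTITION (dt) SELECT value, dt FROM src")
// On job commit, appB also moves appA's leftover files under _temporary/0/
// to the destination dir, so the partition ends up with duplicated rows.
{code}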
[jira] [Comment Edited] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928674#comment-16928674 ] feiwang edited comment on SPARK-29037 at 9/15/19 4:41 AM: -- [~advancedxy] Thanks for your reply. I will learn more about dynamic partitioning. Thanks for your suggestion. was (Author: hzfeiwang): [~advancedxy] Thanks for your reply. I just checked the code, as shown below. https://github.com/apache/spark/blob/c56a012bc839cd2f92c2be41faea91d1acfba4eb/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala#L105-L106 {code:java} val dynamicPartitionOverwrite = enableDynamicOverwrite && mode == SaveMode.Overwrite && staticPartitions.size < partitionColumns.length {code} When partitionColumns.length == 1 and the partition is specified statically in an insert overwrite, staticPartitions.size equals partitionColumns.length, so dynamicPartitionOverwrite is always false even when dynamic overwrite is enabled. I will learn more about dynamic partitioning. Thanks for your suggestion. > [Core] Spark gives duplicate result when an application was killed and rerun > > > Key: SPARK-29037 > URL: https://issues.apache.org/jira/browse/SPARK-29037 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.1.0, 2.3.3 >Reporter: feiwang >Priority: Major > Attachments: screenshot-1.png > > > When we insert overwrite a partition of a table: > for a stage whose tasks commit output, each task first saves its output to a staging dir; when a task completes, it saves its output to committedTaskPath; when all tasks of the stage succeed, all task output under committedTaskPath > is moved to the destination dir. > However, when we kill an application that is committing tasks' output, part of the tasks' results is left in committedTaskPath and is not > cleaned up gracefully. > Then we rerun this application, and the new application reuses this > committedTaskPath dir. > When the task commit stage of the new application succeeds, all task output under this committedTaskPath, which contains part of the old application's task output, is moved to the destination dir and the result is duplicated. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
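A quick hedged walk-through of that condition for a table with a single partition column (the variable values below are illustrative; only the final expression comes from InsertIntoHadoopFsRelationCommand):
{code:scala}
import org.apache.spark.sql.SaveMode

// Assumed inputs for INSERT OVERWRITE ... PARTITION (dt='2019-09-15') on a
// table partitioned only by dt, with partitionOverwriteMode = DYNAMIC.
val enableDynamicOverwrite = true
val mode = SaveMode.Overwrite
val staticPartitions = Map("dt" -> "2019-09-15") // fully static partition spec
val partitionColumns = Seq("dt")

val dynamicPartitionOverwrite = enableDynamicOverwrite &&
  mode == SaveMode.Overwrite &&
  staticPartitions.size < partitionColumns.length // 1 < 1 => false
// => false: the write falls back to the static overwrite commit path.
{code}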
[jira] [Updated] (SPARK-29080) Support R file extension case-insensitively
[ https://issues.apache.org/jira/browse/SPARK-29080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-29080: -- Summary: Support R file extension case-insensitively (was: Make r file extension check case insensitive) > Support R file extension case-insensitively > --- > > Key: SPARK-29080 > URL: https://issues.apache.org/jira/browse/SPARK-29080 > Project: Spark > Issue Type: Improvement > Components: SparkR >Affects Versions: 3.0.0 >Reporter: Dongjoon Hyun >Priority: Minor > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-29087) Use DelegatingServletContextHandler to avoid CCE
[ https://issues.apache.org/jira/browse/SPARK-29087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-29087: -- Component/s: Spark Core > Use DelegatingServletContextHandler to avoid CCE > > > Key: SPARK-29087 > URL: https://issues.apache.org/jira/browse/SPARK-29087 > Project: Spark > Issue Type: Improvement > Components: Spark Core, Structured Streaming >Affects Versions: 2.4.0, 3.0.0 >Reporter: Dongjoon Hyun >Priority: Major > > SPARK-27122 fixed a `ClassCastException` in the `yarn` module by using > `DelegatingServletContextHandler`. Initially, this was discovered with JDK9+, > but the class path issue affects JDK8, too. This issue aims to fix the > `streaming` module in the same way. > {code} > $ build/mvn test -pl streaming > ... > UISeleniumSuite: > - attaching and detaching a Streaming tab *** FAILED *** > java.lang.ClassCastException: > org.sparkproject.jetty.servlet.ServletContextHandler cannot be cast to > org.eclipse.jetty.servlet.ServletContextHandler > ... > Tests: succeeded 337, failed 1, canceled 0, ignored 1, pending 0 > *** 1 TEST FAILED *** > [INFO] > > [INFO] BUILD FAILURE > [INFO] > > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-29087) Use DelegatingServletContextHandler to avoid CCE
[ https://issues.apache.org/jira/browse/SPARK-29087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-29087: -- Component/s: (was: Structured Streaming) DStreams > Use DelegatingServletContextHandler to avoid CCE > > > Key: SPARK-29087 > URL: https://issues.apache.org/jira/browse/SPARK-29087 > Project: Spark > Issue Type: Improvement > Components: DStreams, Spark Core >Affects Versions: 2.4.0, 3.0.0 >Reporter: Dongjoon Hyun >Priority: Major > > SPARK-27122 fixed a `ClassCastException` in the `yarn` module by using > `DelegatingServletContextHandler`. Initially, this was discovered with JDK9+, > but the class path issue affects JDK8, too. This issue aims to fix the > `streaming` module in the same way. > {code} > $ build/mvn test -pl streaming > ... > UISeleniumSuite: > - attaching and detaching a Streaming tab *** FAILED *** > java.lang.ClassCastException: > org.sparkproject.jetty.servlet.ServletContextHandler cannot be cast to > org.eclipse.jetty.servlet.ServletContextHandler > ... > Tests: succeeded 337, failed 1, canceled 0, ignored 1, pending 0 > *** 1 TEST FAILED *** > [INFO] > > [INFO] BUILD FAILURE > [INFO] > > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-29087) Use DelegatingServletContextHandler to avoid CCE
Dongjoon Hyun created SPARK-29087: - Summary: Use DelegatingServletContextHandler to avoid CCE Key: SPARK-29087 URL: https://issues.apache.org/jira/browse/SPARK-29087 Project: Spark Issue Type: Improvement Components: Structured Streaming Affects Versions: 2.4.0, 3.0.0 Reporter: Dongjoon Hyun SPARK-27122 fixed a `ClassCastException` in the `yarn` module by using `DelegatingServletContextHandler`. Initially, this was discovered with JDK9+, but the class path issue affects JDK8, too. This issue aims to fix the `streaming` module in the same way. {code} $ build/mvn test -pl streaming ... UISeleniumSuite: - attaching and detaching a Streaming tab *** FAILED *** java.lang.ClassCastException: org.sparkproject.jetty.servlet.ServletContextHandler cannot be cast to org.eclipse.jetty.servlet.ServletContextHandler ... Tests: succeeded 337, failed 1, canceled 0, ignored 1, pending 0 *** 1 TEST FAILED *** [INFO] [INFO] BUILD FAILURE [INFO] {code} -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
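For context, a hedged sketch of the delegation pattern (the idea only, with illustrative method names — not Spark's actual class): instead of casting a handler across the shaded/unshaded Jetty boundary, keep it behind a wrapper whose type is stable for callers.
{code:scala}
// The cast that fails once the handler comes from the shaded package:
//   org.sparkproject.jetty.servlet.ServletContextHandler cannot be cast to
//   org.eclipse.jetty.servlet.ServletContextHandler
import org.sparkproject.jetty.servlet.ServletContextHandler

class DelegatingServletContextHandler(handler: ServletContextHandler) {
  // Callers go through the wrapper instead of asInstanceOf casts,
  // so no ClassCastException can occur.
  def getContextPath: String = handler.getContextPath
  def filterCount(): Int = handler.getServletHandler.getFilters.length
}
{code}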
[jira] [Updated] (SPARK-26989) Flaky test:DAGSchedulerSuite.Barrier task failures from the same stage attempt don't trigger multiple stage retries
[ https://issues.apache.org/jira/browse/SPARK-26989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26989: -- Affects Version/s: 2.4.0 2.4.1 2.4.2 2.4.3 2.4.4 > Flaky test:DAGSchedulerSuite.Barrier task failures from the same stage > attempt don't trigger multiple stage retries > --- > > Key: SPARK-26989 > URL: https://issues.apache.org/jira/browse/SPARK-26989 > Project: Spark > Issue Type: Bug > Components: Spark Core, Tests >Affects Versions: 2.4.0, 2.4.1, 2.4.2, 2.4.3, 2.4.4, 3.0.0 >Reporter: Marcelo Vanzin >Assignee: Jungtaek Lim >Priority: Major > Fix For: 3.0.0 > > > https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/102761/testReport/junit/org.apache.spark.scheduler/DAGSchedulerSuite/Barrier_task_failures_from_the_same_stage_attempt_don_t_trigger_multiple_stage_retries/ > {noformat} > org.apache.spark.scheduler.DAGSchedulerSuite.Barrier task failures from the > same stage attempt don't trigger multiple stage retries > Error Message > org.scalatest.exceptions.TestFailedException: ArrayBuffer() did not equal > List(0) > Stacktrace > sbt.ForkMain$ForkError: org.scalatest.exceptions.TestFailedException: > ArrayBuffer() did not equal List(0) > at > org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:528) > at > org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:527) > at > org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1560) > at > org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:501) > at > org.apache.spark.scheduler.DAGSchedulerSuite.$anonfun$new$144(DAGSchedulerSuite.scala:2644) > at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85) > at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83) > at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) > at org.scalatest.Transformer.apply(Transformer.scala:22) > at org.scalatest.Transformer.apply(Transformer.scala:20) > at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186) > at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:104) > at > org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184) > at org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196) > at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289) > at org.scalatest.FunSuiteLike.runTest(FunSuiteLike.scala:196) > at org.scalatest.FunSuiteLike.runTest$(FunSuiteLike.scala:178) > at > org.apache.spark.scheduler.DAGSchedulerSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(DAGSchedulerSuite.scala:122) > {noformat} > - > https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109303/consoleFull > {code} > - Barrier task failures from the same stage attempt don't trigger multiple > stage retries *** FAILED *** > ArrayBuffer(0) did not equal List(0) (DAGSchedulerSuite.scala:2656) > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
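As a hedged illustration of how this kind of assertion can flake (names are illustrative, not the suite's exact code): the test asserts on a collection that the scheduler's event loop fills asynchronously, so the assertion can observe an empty buffer if it runs before the failure event has been processed.
{code:scala}
import scala.collection.mutable.ArrayBuffer

val failedStages = ArrayBuffer[Int]() // appended to by the scheduler event loop

// ... simulate a barrier task failure here ...

// Flaky: may observe ArrayBuffer() instead of List(0) unless the test first
// drains the event queue, e.g. sc.listenerBus.waitUntilEmpty(timeoutMillis).
assert(failedStages.toList == List(0))
{code}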
[jira] [Updated] (SPARK-29045) Test failed due to table already exists in SQLMetricsSuite
[ https://issues.apache.org/jira/browse/SPARK-29045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-29045: -- Affects Version/s: 2.3.0 2.4.0 > Test failed due to table already exists in SQLMetricsSuite > -- > > Key: SPARK-29045 > URL: https://issues.apache.org/jira/browse/SPARK-29045 > Project: Spark > Issue Type: Bug > Components: SQL, Tests >Affects Versions: 2.3.0, 2.4.0, 3.0.0 >Reporter: Lantao Jin >Assignee: Lantao Jin >Priority: Minor > Fix For: 3.0.0 > > > In method {{SQLMetricsTestUtils.testMetricsDynamicPartition()}}, there is a > CREATE TABLE statement without a {{withTable}} block. This causes test > failures if the same table name is used in other unit tests. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
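A hedged sketch of the fix pattern (table name and schema are illustrative; {{withTable}} is the SQLTestUtils helper that drops the table when the block exits, even on failure):
{code:scala}
// Before: a bare CREATE TABLE with no cleanup, so any later test reusing the
// name fails with "table already exists". After: scope the table to the test.
withTable("metrics_tbl") {
  sql("CREATE TABLE metrics_tbl (value STRING, key INT) USING parquet PARTITIONED BY (key)")
  sql("INSERT INTO metrics_tbl VALUES ('a', 1)")
  // ... collect and assert on the write metrics ...
} // metrics_tbl is dropped here even if an assertion above failed
{code}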
[jira] [Commented] (SPARK-29045) Test failed due to table already exists in SQLMetricsSuite
[ https://issues.apache.org/jira/browse/SPARK-29045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16929829#comment-16929829 ] Dongjoon Hyun commented on SPARK-29045: --- This is backported to `branch-2.4` via https://github.com/apache/spark/commit/339b0f2a0c4043fca9cca52797936c8654910fc9 > Test failed due to table already exists in SQLMetricsSuite > -- > > Key: SPARK-29045 > URL: https://issues.apache.org/jira/browse/SPARK-29045 > Project: Spark > Issue Type: Bug > Components: SQL, Tests >Affects Versions: 2.3.0, 2.4.0, 3.0.0 >Reporter: Lantao Jin >Assignee: Lantao Jin >Priority: Minor > Fix For: 2.4.5, 3.0.0 > > > In method {{SQLMetricsTestUtils.testMetricsDynamicPartition()}}, there is a > CREATE TABLE statement without a {{withTable}} block. This causes test > failures if the same table name is used in other unit tests. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-29045) Test failed due to table already exists in SQLMetricsSuite
[ https://issues.apache.org/jira/browse/SPARK-29045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-29045: -- Fix Version/s: 2.4.5 > Test failed due to table already exists in SQLMetricsSuite > -- > > Key: SPARK-29045 > URL: https://issues.apache.org/jira/browse/SPARK-29045 > Project: Spark > Issue Type: Bug > Components: SQL, Tests >Affects Versions: 2.3.0, 2.4.0, 3.0.0 >Reporter: Lantao Jin >Assignee: Lantao Jin >Priority: Minor > Fix For: 2.4.5, 3.0.0 > > > In method {{SQLMetricsTestUtils.testMetricsDynamicPartition()}}, there is a > CREATE TABLE statement without a {{withTable}} block. This causes test > failures if the same table name is used in other unit tests. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-24663) Flaky test: StreamingContextSuite "stop slow receiver gracefully"
[ https://issues.apache.org/jira/browse/SPARK-24663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-24663: -- Fix Version/s: 2.4.5 Affects Version/s: 2.4.1 2.4.2 2.4.3 2.4.4 > Flaky test: StreamingContextSuite "stop slow receiver gracefully" > - > > Key: SPARK-24663 > URL: https://issues.apache.org/jira/browse/SPARK-24663 > Project: Spark > Issue Type: Bug > Components: Tests >Affects Versions: 2.4.0, 2.4.1, 2.4.2, 2.4.3, 2.4.4, 3.0.0 >Reporter: Marcelo Vanzin >Assignee: Jungtaek Lim >Priority: Minor > Fix For: 2.4.5, 3.0.0 > > > This is another test that sometimes fails on our build machines, although I > can't find failures on the riselab jenkins servers. Failure looks like: > {noformat} > org.scalatest.exceptions.TestFailedException: 0 was not greater than 0 > at > org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:500) > at > org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555) > at > org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:466) > at > org.apache.spark.streaming.StreamingContextSuite$$anonfun$24.apply$mcV$sp(StreamingContextSuite.scala:356) > at > org.apache.spark.streaming.StreamingContextSuite$$anonfun$24.apply(StreamingContextSuite.scala:335) > at > org.apache.spark.streaming.StreamingContextSuite$$anonfun$24.apply(StreamingContextSuite.scala:335) > {noformat} > The test fails in about 2s, while a successful run generally takes 15s. > Looking at the logs, the receiver hasn't even started when things fail, which > points at a race during test initialization. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-24663) Flaky test: StreamingContextSuite "stop slow receiver gracefully"
[ https://issues.apache.org/jira/browse/SPARK-24663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16929824#comment-16929824 ] Dongjoon Hyun commented on SPARK-24663: --- This is backported to branch-2.4 via https://github.com/apache/spark/commit/637a6c2750be8d4f42b1fd11c4cca8d0067e80d8 > Flaky test: StreamingContextSuite "stop slow receiver gracefully" > - > > Key: SPARK-24663 > URL: https://issues.apache.org/jira/browse/SPARK-24663 > Project: Spark > Issue Type: Bug > Components: Tests >Affects Versions: 2.4.0, 2.4.1, 2.4.2, 2.4.3, 2.4.4, 3.0.0 >Reporter: Marcelo Vanzin >Assignee: Jungtaek Lim >Priority: Minor > Fix For: 2.4.5, 3.0.0 > > > This is another test that sometimes fails on our build machines, although I > can't find failures on the riselab jenkins servers. Failure looks like: > {noformat} > org.scalatest.exceptions.TestFailedException: 0 was not greater than 0 > at > org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:500) > at > org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555) > at > org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:466) > at > org.apache.spark.streaming.StreamingContextSuite$$anonfun$24.apply$mcV$sp(StreamingContextSuite.scala:356) > at > org.apache.spark.streaming.StreamingContextSuite$$anonfun$24.apply(StreamingContextSuite.scala:335) > at > org.apache.spark.streaming.StreamingContextSuite$$anonfun$24.apply(StreamingContextSuite.scala:335) > {noformat} > The test fails in about 2s, while a successful run generally takes 15s. > Looking at the logs, the receiver hasn't even started when things fail, which > points at a race during test initialization. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
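A hedged sketch of the suspected race (illustrative only, not the suite's exact code): the test starts the context, stops it gracefully, and asserts that some records were processed; if the slow receiver has not even registered yet, the processed count is still 0 and the assertion fails with "0 was not greater than 0".
{code:scala}
ssc.start()
// Missing step that would deflake it: wait until the receiver is actually up,
// e.g. eventually(timeout(streamingTimeout)) { assert(receiverStarted) }.
ssc.stop(stopSparkContext = false, stopGracefully = true)
assert(totalProcessedRecords > 0) // observed failure: 0 was not greater than 0
{code}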
[jira] [Resolved] (SPARK-28372) Document Spark WEB UI
[ https://issues.apache.org/jira/browse/SPARK-28372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-28372. - Fix Version/s: 3.0.0 Resolution: Fixed > Document Spark WEB UI > - > > Key: SPARK-28372 > URL: https://issues.apache.org/jira/browse/SPARK-28372 > Project: Spark > Issue Type: Umbrella > Components: Documentation, Web UI >Affects Versions: 3.0.0 >Reporter: Xiao Li >Priority: Major > Fix For: 3.0.0 > > > Spark web UIs are used to monitor the status and resource consumption of > Spark applications and clusters. However, we do not have corresponding > documentation, which makes the UIs hard for end users to use and understand. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-28373) Document JDBC/ODBC Server page
[ https://issues.apache.org/jira/browse/SPARK-28373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-28373. - Fix Version/s: 3.0.0 Assignee: Pablo Langa Blanco Resolution: Fixed > Document JDBC/ODBC Server page > -- > > Key: SPARK-28373 > URL: https://issues.apache.org/jira/browse/SPARK-28373 > Project: Spark > Issue Type: Sub-task > Components: Documentation, Web UI >Affects Versions: 3.0.0 >Reporter: Xiao Li >Assignee: Pablo Langa Blanco >Priority: Major > Fix For: 3.0.0 > > > !https://user-images.githubusercontent.com/5399861/60809590-9dcf2500-a1bd-11e9-826e-33729bb97daf.png|width=1720,height=503! > > [https://github.com/apache/spark/pull/25062] added the new columns CLOSE TIME > and EXECUTION TIME. It is hard to understand the difference between them. We > need to document them; otherwise, it is hard for end users to understand them. > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-28927) ArrayIndexOutOfBoundsException and Not-stable AUC metrics in ALS for datasets with 12 billion instances
[ https://issues.apache.org/jira/browse/SPARK-28927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh reassigned SPARK-28927: --- Assignee: Liang-Chi Hsieh > ArrayIndexOutOfBoundsException and Not-stable AUC metrics in ALS for datasets > with 12 billion instances > --- > > Key: SPARK-28927 > URL: https://issues.apache.org/jira/browse/SPARK-28927 > Project: Spark > Issue Type: Bug > Components: ML >Affects Versions: 2.2.1 >Reporter: Qiang Wang >Assignee: Liang-Chi Hsieh >Priority: Major > Attachments: image-2019-09-02-11-55-33-596.png > > > The stack trace is below: > {quote}19/08/28 07:00:40 WARN Executor task launch worker for task 325074 > BlockManager: Block rdd_10916_493 could not be removed as it was not found on > disk or in memory 19/08/28 07:00:41 ERROR Executor task launch worker for > task 325074 Executor: Exception in task 3.0 in stage 347.1 (TID 325074) > java.lang.ArrayIndexOutOfBoundsException: 6741 at > org.apache.spark.dpshade.recommendation.ALS$$anonfun$org$apache$spark$ml$recommendation$ALS$$computeFactors$1.apply(ALS.scala:1460) > at > org.apache.spark.dpshade.recommendation.ALS$$anonfun$org$apache$spark$ml$recommendation$ALS$$computeFactors$1.apply(ALS.scala:1440) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$40$$anonfun$apply$41.apply(PairRDDFunctions.scala:760) > at > org.apache.spark.rdd.PairRDDFunctions$$anonfun$mapValues$1$$anonfun$apply$40$$anonfun$apply$41.apply(PairRDDFunctions.scala:760) > at scala.collection.Iterator$$anon$11.next(Iterator.scala:409) at > org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:216) > at > org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1041) > at > org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1032) > at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:972) at > org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1032) > at > org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:763) > at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:334) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:285) at > org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$2.apply(CoGroupedRDD.scala:141) > at > org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$2.apply(CoGroupedRDD.scala:137) > at > scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733) > at scala.collection.immutable.List.foreach(List.scala:381) at > scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732) > at org.apache.spark.rdd.CoGroupedRDD.compute(CoGroupedRDD.scala:137) at > org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at > org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at > org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at > org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at > org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96) at > 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53) at > org.apache.spark.scheduler.Task.run(Task.scala:108) at > org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:358) at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {quote} > This exception happens intermittently. We also found that the AUC metric was > not stable when evaluating the inner product of the user factors and the item > factors with the same dataset and configuration: AUC varied from 0.60 to 0.67, > which is not stable enough for a production environment. > Dataset capacity: ~12 billion ratings > Here is our code: > val trainData = predataUser.flatMap(x => x._1._2.map(y => (x._2.toInt, y._1, > y._2.toFloat))) > .setName(trainDataName).persist(StorageLevel.MEMORY_AND_DISK_SER) > case class ALSData(user: Int, item: Int, rating: Float) extends Serializable > val ratingData = trainData.map(x => ALSData(x._1, x._2, x._3)).toDF() > val als = new ALS > val paramMap = ParamM
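One hedged mitigation sketch for this class of instability (our assumption, not a confirmed fix for this ticket: it presumes the trouble begins when the input RDD is recomputed non-deterministically on task retry, so different attempts see different data): materialize the ratings before fitting.
{code:scala}
import org.apache.spark.storage.StorageLevel

// Names follow the reporter's snippet; ratingData is the ALS input DataFrame.
ratingData.persist(StorageLevel.MEMORY_AND_DISK_SER)
ratingData.count() // force materialization so retries replay the same rows
// ... then run als.fit(ratingData) as before ...
{code}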
[jira] [Updated] (SPARK-29046) Possible NPE on SQLConf.get when SparkContext is stopping in another thread
[ https://issues.apache.org/jira/browse/SPARK-29046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-29046: -- Fix Version/s: (was: 3.0.0) > Possible NPE on SQLConf.get when SparkContext is stopping in another thread > --- > > Key: SPARK-29046 > URL: https://issues.apache.org/jira/browse/SPARK-29046 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Jungtaek Lim >Assignee: Jungtaek Lim >Priority: Minor > > We encountered an NPE in listener code that deals with the query plan. According to the stack trace below, the only possible cause of the NPE is > SparkContext._dagScheduler being null, which is only possible while the SparkContext is stopping (unless null is set from outside). > > {code:java} > 19/09/11 00:22:24 INFO server.AbstractConnector: Stopped > Spark@49d8c117{HTTP/1.1,[http/1.1]}{0.0.0.0:0} > 19/09/11 00:22:24 INFO > server.AbstractConnector: Stopped > Spark@49d8c117{HTTP/1.1,[http/1.1]}{0.0.0.0:0} > 19/09/11 00:22:24 INFO > ui.SparkUI: Stopped Spark web UI at http://:32770 > 19/09/11 00:22:24 INFO > cluster.YarnClusterSchedulerBackend: Shutting down all executors > 19/09/11 > 00:22:24 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Asking each > executor to shut down > 19/09/11 00:22:24 INFO > cluster.SchedulerExtensionServices: Stopping > SchedulerExtensionServices(serviceOption=None, services=List(), > started=false) > 19/09/11 00:22:24 WARN sql.SparkExecutionPlanProcessor: Caught > exception during parsing event > java.lang.NullPointerException at > org.apache.spark.sql.internal.SQLConf$$anonfun$15.apply(SQLConf.scala:133) at > org.apache.spark.sql.internal.SQLConf$$anonfun$15.apply(SQLConf.scala:133) at > scala.Option.map(Option.scala:146) at > org.apache.spark.sql.internal.SQLConf$.get(SQLConf.scala:133) at > org.apache.spark.sql.types.StructType.simpleString(StructType.scala:352) at > com.hortonworks.spark.atlas.types.internal$.sparkTableToEntity(internal.scala:102) > at > com.hortonworks.spark.atlas.types.AtlasEntityUtils$class.tableToEntity(AtlasEntityUtils.scala:62) > at > com.hortonworks.spark.atlas.sql.CommandsHarvester$.tableToEntity(CommandsHarvester.scala:45) > at > com.hortonworks.spark.atlas.sql.CommandsHarvester$$anonfun$com$hortonworks$spark$atlas$sql$CommandsHarvester$$discoverInputsEntities$1.apply(CommandsHarvester.scala:240) > at > com.hortonworks.spark.atlas.sql.CommandsHarvester$$anonfun$com$hortonworks$spark$atlas$sql$CommandsHarvester$$discoverInputsEntities$1.apply(CommandsHarvester.scala:239) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at > scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241) at > scala.collection.AbstractTraversable.flatMap(Traversable.scala:104) at > com.hortonworks.spark.atlas.sql.CommandsHarvester$.com$hortonworks$spark$atlas$sql$CommandsHarvester$$discoverInputsEntities(CommandsHarvester.scala:239) > at > com.hortonworks.spark.atlas.sql.CommandsHarvester$CreateDataSourceTableAsSelectHarvester$.harvest(CommandsHarvester.scala:104) > at > com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor$$anonfun$2.apply(SparkExecutionPlanProcessor.scala:138) > at > 
com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor$$anonfun$2.apply(SparkExecutionPlanProcessor.scala:89) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at > scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241) at > scala.collection.AbstractTraversable.flatMap(Traversable.scala:104) at > com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor.process(SparkExecutionPlanProcessor.scala:89) > at > com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor.process(SparkExecutionPlanProcessor.scala:63) > at > com.hortonworks.spark.atlas.AbstractEventProcessor$$anonfun$eventProcess$1.apply(AbstractEventProcessor.scala:72) > at > com.hortonworks.spark.atlas.AbstractEventProcessor$$anonfun$eventProcess$1.apply(AbstractEventProcessor.scala:71) > at scala.Option.foreach(Option.scala:257) at > com.hortonworks.spark.atlas.AbstractEventProcessor.eventProcess(AbstractEventProcessor.scala:71) > at > com.hortonworks.spark.atla
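For readers tracing this, a hedged sketch approximating the code path at SQLConf.scala:133 (simplified; the exact expression in Spark may differ): the active SparkContext is wrapped in an Option, but its dagScheduler field can already be null while the context is stopping, so dereferencing it inside the map callback throws the NPE.
{code:scala}
// Inside SQLConf.get's check for the scheduler event loop thread:
val schedulerEventLoopThread =
  SparkContext.getActive.map(_.dagScheduler.eventProcessLoop.eventThread)
//                            ^ NPE here if dagScheduler was already nulled
//                              out by SparkContext.stop() in another thread.

// A null-safe variant goes through Option first:
//   SparkContext.getActive.flatMap(sc => Option(sc.dagScheduler))
//     .map(_.eventProcessLoop.eventThread)
{code}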
[jira] [Reopened] (SPARK-29046) Possible NPE on SQLConf.get when SparkContext is stopping in another thread
[ https://issues.apache.org/jira/browse/SPARK-29046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reopened SPARK-29046: --- > Possible NPE on SQLConf.get when SparkContext is stopping in another thread > --- > > Key: SPARK-29046 > URL: https://issues.apache.org/jira/browse/SPARK-29046 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Jungtaek Lim >Assignee: Jungtaek Lim >Priority: Minor > Fix For: 3.0.0 > > > We encountered an NPE in listener code that deals with the query plan. According to the stack trace below, the only possible cause of the NPE is > SparkContext._dagScheduler being null, which is only possible while the SparkContext is stopping (unless null is set from outside). > > {code:java} > 19/09/11 00:22:24 INFO server.AbstractConnector: Stopped > Spark@49d8c117{HTTP/1.1,[http/1.1]}{0.0.0.0:0} > 19/09/11 00:22:24 INFO > server.AbstractConnector: Stopped > Spark@49d8c117{HTTP/1.1,[http/1.1]}{0.0.0.0:0} > 19/09/11 00:22:24 INFO > ui.SparkUI: Stopped Spark web UI at http://:32770 > 19/09/11 00:22:24 INFO > cluster.YarnClusterSchedulerBackend: Shutting down all executors > 19/09/11 > 00:22:24 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Asking each > executor to shut down > 19/09/11 00:22:24 INFO > cluster.SchedulerExtensionServices: Stopping > SchedulerExtensionServices(serviceOption=None, services=List(), > started=false) > 19/09/11 00:22:24 WARN sql.SparkExecutionPlanProcessor: Caught > exception during parsing event > java.lang.NullPointerException at > org.apache.spark.sql.internal.SQLConf$$anonfun$15.apply(SQLConf.scala:133) at > org.apache.spark.sql.internal.SQLConf$$anonfun$15.apply(SQLConf.scala:133) at > scala.Option.map(Option.scala:146) at > org.apache.spark.sql.internal.SQLConf$.get(SQLConf.scala:133) at > org.apache.spark.sql.types.StructType.simpleString(StructType.scala:352) at > com.hortonworks.spark.atlas.types.internal$.sparkTableToEntity(internal.scala:102) > at > com.hortonworks.spark.atlas.types.AtlasEntityUtils$class.tableToEntity(AtlasEntityUtils.scala:62) > at > com.hortonworks.spark.atlas.sql.CommandsHarvester$.tableToEntity(CommandsHarvester.scala:45) > at > com.hortonworks.spark.atlas.sql.CommandsHarvester$$anonfun$com$hortonworks$spark$atlas$sql$CommandsHarvester$$discoverInputsEntities$1.apply(CommandsHarvester.scala:240) > at > com.hortonworks.spark.atlas.sql.CommandsHarvester$$anonfun$com$hortonworks$spark$atlas$sql$CommandsHarvester$$discoverInputsEntities$1.apply(CommandsHarvester.scala:239) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at > scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241) at > scala.collection.AbstractTraversable.flatMap(Traversable.scala:104) at > com.hortonworks.spark.atlas.sql.CommandsHarvester$.com$hortonworks$spark$atlas$sql$CommandsHarvester$$discoverInputsEntities(CommandsHarvester.scala:239) > at > com.hortonworks.spark.atlas.sql.CommandsHarvester$CreateDataSourceTableAsSelectHarvester$.harvest(CommandsHarvester.scala:104) > at > com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor$$anonfun$2.apply(SparkExecutionPlanProcessor.scala:138) > at > 
com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor$$anonfun$2.apply(SparkExecutionPlanProcessor.scala:89) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at > scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241) at > scala.collection.AbstractTraversable.flatMap(Traversable.scala:104) at > com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor.process(SparkExecutionPlanProcessor.scala:89) > at > com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor.process(SparkExecutionPlanProcessor.scala:63) > at > com.hortonworks.spark.atlas.AbstractEventProcessor$$anonfun$eventProcess$1.apply(AbstractEventProcessor.scala:72) > at > com.hortonworks.spark.atlas.AbstractEventProcessor$$anonfun$eventProcess$1.apply(AbstractEventProcessor.scala:71) > at scala.Option.foreach(Option.scala:257) at > com.hortonworks.spark.atlas.AbstractEventProcessor.eventProcess(AbstractEventProcessor.scala:71) > at > com.hortonworks.spark.atlas.A
[jira] [Commented] (SPARK-29046) Possible NPE on SQLConf.get when SparkContext is stopping in another thread
[ https://issues.apache.org/jira/browse/SPARK-29046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16929765#comment-16929765 ] Dongjoon Hyun commented on SPARK-29046: --- This is reverted in order to recover the Jenkins jobs. > Possible NPE on SQLConf.get when SparkContext is stopping in another thread > --- > > Key: SPARK-29046 > URL: https://issues.apache.org/jira/browse/SPARK-29046 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Jungtaek Lim >Assignee: Jungtaek Lim >Priority: Minor > Fix For: 3.0.0 > > > We encountered an NPE in listener code that deals with the query plan. According to the stack trace below, the only possible cause of the NPE is > SparkContext._dagScheduler being null, which is only possible while the SparkContext is stopping (unless null is set from outside). > > {code:java} > 19/09/11 00:22:24 INFO server.AbstractConnector: Stopped > Spark@49d8c117{HTTP/1.1,[http/1.1]}{0.0.0.0:0} > 19/09/11 00:22:24 INFO > server.AbstractConnector: Stopped > Spark@49d8c117{HTTP/1.1,[http/1.1]}{0.0.0.0:0} > 19/09/11 00:22:24 INFO > ui.SparkUI: Stopped Spark web UI at http://:32770 > 19/09/11 00:22:24 INFO > cluster.YarnClusterSchedulerBackend: Shutting down all executors > 19/09/11 > 00:22:24 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Asking each > executor to shut down > 19/09/11 00:22:24 INFO > cluster.SchedulerExtensionServices: Stopping > SchedulerExtensionServices(serviceOption=None, services=List(), > started=false) > 19/09/11 00:22:24 WARN sql.SparkExecutionPlanProcessor: Caught > exception during parsing event > java.lang.NullPointerException at > org.apache.spark.sql.internal.SQLConf$$anonfun$15.apply(SQLConf.scala:133) at > org.apache.spark.sql.internal.SQLConf$$anonfun$15.apply(SQLConf.scala:133) at > scala.Option.map(Option.scala:146) at > org.apache.spark.sql.internal.SQLConf$.get(SQLConf.scala:133) at > org.apache.spark.sql.types.StructType.simpleString(StructType.scala:352) at > com.hortonworks.spark.atlas.types.internal$.sparkTableToEntity(internal.scala:102) > at > com.hortonworks.spark.atlas.types.AtlasEntityUtils$class.tableToEntity(AtlasEntityUtils.scala:62) > at > com.hortonworks.spark.atlas.sql.CommandsHarvester$.tableToEntity(CommandsHarvester.scala:45) > at > com.hortonworks.spark.atlas.sql.CommandsHarvester$$anonfun$com$hortonworks$spark$atlas$sql$CommandsHarvester$$discoverInputsEntities$1.apply(CommandsHarvester.scala:240) > at > com.hortonworks.spark.atlas.sql.CommandsHarvester$$anonfun$com$hortonworks$spark$atlas$sql$CommandsHarvester$$discoverInputsEntities$1.apply(CommandsHarvester.scala:239) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at > scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241) at > scala.collection.AbstractTraversable.flatMap(Traversable.scala:104) at > com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor.process(SparkExecutionPlanProcessor.scala:89) > at > com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor.process(SparkExecutionPlanProcessor.scala:63) > at > com.hortonworks.spark.atlas.AbstractEventProcessor$$anonfun$eventProcess$1.apply(AbstractEventProcessor.scala:72) > at > com.hortonworks.spark.atlas.AbstractEventProcessor$$anonfun$eventProcess$1.apply(AbstractEventProcessor.scala:71) > at scala.Option.foreach(Option.scala:257) at > com.hortonworks.spark.atlas.AbstractEventProcessor.eventProcess(AbstractEventProcessor.scala:71) > at > com.hortonworks.spark.atlas.
com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor$$anonfun$2.apply(SparkExecutionPlanProcessor.scala:89) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at > scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241) at > scala.collection.AbstractTraversable.flatMap(Traversable.scala:104) at > com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor.process(SparkExecutionPlanProcessor.scala:89) > at > com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor.process(SparkExecutionPlanProcessor.scala:63) > at > com.hortonworks.spark.atlas.AbstractEventProcessor$$anonfun$eventProcess$1.apply(AbstractEventProcessor.scala:72) > at > com.hortonworks.spark.atlas.AbstractEventProcessor$$anonfun$eventProcess$1.apply(AbstractEventProcessor.scala:71) > at scala.Option.foreach(Option.scala:257) at > com.hortonworks.spark.atlas.
[jira] [Updated] (SPARK-29086) Use added jar's class as Serde class, SparkGetColumnsOperation return empty columns
[ https://issues.apache.org/jira/browse/SPARK-29086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] angerszhu updated SPARK-29086: -- Summary: Use added jar's class as Serde class, SparkGetColumnsOperation return empty columns (was: In jdk11, SparkGetColumnsOperation return empty columns) > Use added jar's class as Serde class, SparkGetColumnsOperation return empty > columns > --- > > Key: SPARK-29086 > URL: https://issues.apache.org/jira/browse/SPARK-29086 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: angerszhu >Priority: Major > > In JDK11: > we create a table using a SerDe class from a jar added via the 'ADD JAR' SQL command; > after the Thrift server is restarted, !columns table_name returns an empty sequence of columns. The original session: > {code:java} > 0: jdbc:hive2://localhost:1/default> add jar > /Users/angerszhu/.m2/repository/org/apache/hive/hcatalog/hive-hcatalog-core/2.3.6/hive-hcatalog-core-2.3.6.jar; > INFO : Added > [/Users/angerszhu/.m2/repository/org/apache/hive/hcatalog/hive-hcatalog-core/2.3.6/hive-hcatalog-core-2.3.6.jar] > to class path > INFO : Added resources: > [/Users/angerszhu/.m2/repository/org/apache/hive/hcatalog/hive-hcatalog-core/2.3.6/hive-hcatalog-core-2.3.6.jar] > +-+ > | result | > +-+ > +-+ > No rows selected (0.268 seconds) > 0: jdbc:hive2://localhost:1/default> CREATE TABLE addJar18(key string) > ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'; > +-+ > | Result | > +-+ > +-+ > No rows selected (0.444 seconds) > 0: jdbc:hive2://localhost:1/default> !columns addJar18 > ++--+-+--+++--++-+-+---+--+-++---++---+--++---+--+---++ > | TABLE_CAT | TABLE_SCHEM | TABLE_NAME | COLUMN_NAME | DATA_TYPE | > TYPE_NAME | COLUMN_SIZE | BUFFER_LENGTH | DECIMAL_DIGITS | NUM_PREC_RADIX > | NULLABLE | REMARKS | COLUMN_DEF | SQL_DATA_TYPE | SQL_DATETIME_SUB | > CHAR_OCTET_LENGTH | ORDINAL_POSITION | IS_NULLABLE | SCOPE_CATALOG | > SCOPE_SCHEMA | SCOPE_TABLE | SOURCE_DATA_TYPE | IS_AUTO_INCREMENT | > ++--+-+--+++--++-+-+---+--+-++---++---+--++---+--+---++ > | NULL | default | addjar18| key | 12 | > STRING | NULL | NULL | NULL| NULL > | 1 | | NULL| NULL | NULL | > NULL | NULL | YES | NULL | NULL > | NULL | NULL | NO | > ++--+-+--+++--++-+-+---+--+-++---++---+--++---+--+---++ > 0: jdbc:hive2://localhost:1/default> exit {code} > Then we restart the Spark Thrift server and reconnect to it: > {code:java} > 0: jdbc:hive2://localhost:1/default> select * from addJar18; > Error: Error running query: java.lang.RuntimeException: > java.lang.ClassNotFoundException: org.apache.hive.hcatalog.data.JsonSerDe > (state=,code=0) > 0: jdbc:hive2://localhost:1/default> !columns addJar18 > ++--+-+--+++--++-+-+---+--+-++---++---+--++---+--+---++ > | TABLE_CAT | TABLE_SCHEM | TABLE_NAME | COLUMN_NAME | DATA_TYPE | > TYPE_NAME | COLUMN_SIZE | BUFFER_LENGTH | DECIMAL_DIGITS | NUM_PREC_RADIX > | NULLABLE | REMARKS | COLUMN_DEF | SQL_DATA_TYPE | SQL_DATETIME_SUB | > CHAR_OCTET_LENGTH | ORDINAL_POSITION | IS_NULLABLE | SCOPE_CATALOG | > SCOPE_SCHEMA | SCOPE_TABLE | SOURCE_DATA_TYPE | IS_AUTO_INCREMENT | > ++--+-+--+++--+---
[jira] [Updated] (SPARK-29086) In jdk11, SparkGetColumnsOperation return empty columns
[ https://issues.apache.org/jira/browse/SPARK-29086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] angerszhu updated SPARK-29086: -- Description: In JDK11: we create a table using a SerDe class from a jar added via the 'ADD JAR' SQL command; after the Thrift server is restarted, !columns table_name returns an empty sequence of columns. The original session: {code:java} 0: jdbc:hive2://localhost:1/default> add jar /Users/angerszhu/.m2/repository/org/apache/hive/hcatalog/hive-hcatalog-core/2.3.6/hive-hcatalog-core-2.3.6.jar; INFO : Added [/Users/angerszhu/.m2/repository/org/apache/hive/hcatalog/hive-hcatalog-core/2.3.6/hive-hcatalog-core-2.3.6.jar] to class path INFO : Added resources: [/Users/angerszhu/.m2/repository/org/apache/hive/hcatalog/hive-hcatalog-core/2.3.6/hive-hcatalog-core-2.3.6.jar] +-+ | result | +-+ +-+ No rows selected (0.268 seconds) 0: jdbc:hive2://localhost:1/default> CREATE TABLE addJar18(key string) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'; +-+ | Result | +-+ +-+ No rows selected (0.444 seconds) 0: jdbc:hive2://localhost:1/default> !columns addJar18 ++--+-+--+++--++-+-+---+--+-++---++---+--++---+--+---++ | TABLE_CAT | TABLE_SCHEM | TABLE_NAME | COLUMN_NAME | DATA_TYPE | TYPE_NAME | COLUMN_SIZE | BUFFER_LENGTH | DECIMAL_DIGITS | NUM_PREC_RADIX | NULLABLE | REMARKS | COLUMN_DEF | SQL_DATA_TYPE | SQL_DATETIME_SUB | CHAR_OCTET_LENGTH | ORDINAL_POSITION | IS_NULLABLE | SCOPE_CATALOG | SCOPE_SCHEMA | SCOPE_TABLE | SOURCE_DATA_TYPE | IS_AUTO_INCREMENT | ++--+-+--+++--++-+-+---+--+-++---++---+--++---+--+---++ | NULL | default | addjar18| key | 12 | STRING | NULL | NULL | NULL| NULL| 1 | | NULL| NULL | NULL | NULL | NULL | YES | NULL | NULL | NULL | NULL | NO | ++--+-+--+++--++-+-+---+--+-++---++---+--++---+--+---++ 0: jdbc:hive2://localhost:1/default> exit {code} Then we restart the Spark Thrift server and reconnect to it: {code:java} 0: jdbc:hive2://localhost:1/default> select * from addJar18; Error: Error running query: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hive.hcatalog.data.JsonSerDe (state=,code=0) 0: jdbc:hive2://localhost:1/default> !columns addJar18 ++--+-+--+++--++-+-+---+--+-++---++---+--++---+--+---++ | TABLE_CAT | TABLE_SCHEM | TABLE_NAME | COLUMN_NAME | DATA_TYPE | TYPE_NAME | COLUMN_SIZE | BUFFER_LENGTH | DECIMAL_DIGITS | NUM_PREC_RADIX | NULLABLE | REMARKS | COLUMN_DEF | SQL_DATA_TYPE | SQL_DATETIME_SUB | CHAR_OCTET_LENGTH | ORDINAL_POSITION | IS_NULLABLE | SCOPE_CATALOG | SCOPE_SCHEMA | SCOPE_TABLE | SOURCE_DATA_TYPE | IS_AUTO_INCREMENT | ++--+-+--+++--++-+-+---+--+-++---++---+--++---+--+---++ ++--+-+--+++--++-+-+---+--+-++---++---+--++---+--+---++ 0: jdbc:hive2://localhost:1/default> add jar /Users/angerszhu/.m2/reposi
[jira] [Updated] (SPARK-29086) In jdk11, SparkGetColumnsOperation return empty columns
[ https://issues.apache.org/jira/browse/SPARK-29086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] angerszhu updated SPARK-29086: -- Description: In JDK11: we create a table using a SerDe class from a jar added via the 'ADD JAR' SQL command; after the Thrift server is restarted, !columns table_name returns an empty sequence of columns. The original session: {code:java} 0: jdbc:hive2://localhost:1/default> add jar /Users/angerszhu/.m2/repository/org/apache/hive/hcatalog/hive-hcatalog-core/2.3.6/hive-hcatalog-core-2.3.6.jar; INFO : Added [/Users/angerszhu/.m2/repository/org/apache/hive/hcatalog/hive-hcatalog-core/2.3.6/hive-hcatalog-core-2.3.6.jar] to class path INFO : Added resources: [/Users/angerszhu/.m2/repository/org/apache/hive/hcatalog/hive-hcatalog-core/2.3.6/hive-hcatalog-core-2.3.6.jar] +-+ | result | +-+ +-+ No rows selected (0.268 seconds) 0: jdbc:hive2://localhost:1/default> CREATE TABLE addJar18(key string) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'; +-+ | Result | +-+ +-+ No rows selected (0.444 seconds) 0: jdbc:hive2://localhost:1/default> !columns addJar18 ++--+-+--+++--++-+-+---+--+-++---++---+--++---+--+---++ | TABLE_CAT | TABLE_SCHEM | TABLE_NAME | COLUMN_NAME | DATA_TYPE | TYPE_NAME | COLUMN_SIZE | BUFFER_LENGTH | DECIMAL_DIGITS | NUM_PREC_RADIX | NULLABLE | REMARKS | COLUMN_DEF | SQL_DATA_TYPE | SQL_DATETIME_SUB | CHAR_OCTET_LENGTH | ORDINAL_POSITION | IS_NULLABLE | SCOPE_CATALOG | SCOPE_SCHEMA | SCOPE_TABLE | SOURCE_DATA_TYPE | IS_AUTO_INCREMENT | ++--+-+--+++--++-+-+---+--+-++---++---+--++---+--+---++ | NULL | default | addjar18| key | 12 | STRING | NULL | NULL | NULL| NULL| 1 | | NULL| NULL | NULL | NULL | NULL | YES | NULL | NULL | NULL | NULL | NO | ++--+-+--+++--++-+-+---+--+-++---++---+--++---+--+---++ 0: jdbc:hive2://localhost:1/default> exit {code} Then we restart the Spark Thrift server and reconnect to it: {code:java} 0: jdbc:hive2://localhost:1/default> select * from addJar18; Error: Error running query: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hive.hcatalog.data.JsonSerDe (state=,code=0) 0: jdbc:hive2://localhost:1/default> !columns addJar18 ++--+-+--+++--++-+-+---+--+-++---++---+--++---+--+---++ | TABLE_CAT | TABLE_SCHEM | TABLE_NAME | COLUMN_NAME | DATA_TYPE | TYPE_NAME | COLUMN_SIZE | BUFFER_LENGTH | DECIMAL_DIGITS | NUM_PREC_RADIX | NULLABLE | REMARKS | COLUMN_DEF | SQL_DATA_TYPE | SQL_DATETIME_SUB | CHAR_OCTET_LENGTH | ORDINAL_POSITION | IS_NULLABLE | SCOPE_CATALOG | SCOPE_SCHEMA | SCOPE_TABLE | SOURCE_DATA_TYPE | IS_AUTO_INCREMENT | ++--+-+--+++--++-+-+---+--+-++---++---+--++---+--+---++ ++--+-+--+++--++-+-+---+--+-++---++---+--++---+--+---++ 0: jdbc:hive2://localhost:1/default> add jar /Users/angerszhu/.m2/reposi
[jira] [Created] (SPARK-29086) In jdk11, SparkGetColumnsOperation return empty columns
angerszhu created SPARK-29086: - Summary: In jdk11, SparkGetColumnsOperation return empty columns Key: SPARK-29086 URL: https://issues.apache.org/jira/browse/SPARK-29086 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.0.0 Reporter: angerszhu -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org