[jira] [Updated] (SPARK-26710) ImageSchemaSuite has some errors when running it on a local laptop

[ https://issues.apache.org/jira/browse/SPARK-26710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

xubo245 updated SPARK-26710:
----------------------------
    Attachment: wx20190124-192...@2x.png
                wx20190124-192...@2x.png

> ImageSchemaSuite has some errors when running it on a local laptop
> -------------------------------------------------------------------
>
>                 Key: SPARK-26710
>                 URL: https://issues.apache.org/jira/browse/SPARK-26710
>             Project: Spark
>          Issue Type: Bug
>          Components: Tests
>    Affects Versions: 2.4.0
>            Reporter: xubo245
>            Priority: Major
>         Attachments: wx20190124-192...@2x.png, wx20190124-192...@2x.png
>
> ImageSchemaSuite and org.apache.spark.ml.source.image.ImageFileFormatSuite
> have some errors when running them on a local laptop:
> !wx20190124-192...@2x.png! !wx20190124-192...@2x.png!
> {code:java}
> execute, tree:
> Exchange SinglePartition
> +- *(1) HashAggregate(keys=[], functions=[partial_count(1)], output=[count#17L])
>    +- *(1) Project
>       +- *(1) Scan ExistingRDD[image#10]
>
> org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:
> Exchange SinglePartition
> +- *(1) HashAggregate(keys=[], functions=[partial_count(1)], output=[count#17L])
>    +- *(1) Project
>       +- *(1) Scan ExistingRDD[image#10]
>   at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56)
>   at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.doExecute(ShuffleExchangeExec.scala:129)
>   at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:131)
>   at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:155)
>   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>   at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
>   at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
>   at org.apache.spark.sql.execution.InputAdapter.inputRDD(WholeStageCodegenExec.scala:488)
>   at org.apache.spark.sql.execution.InputRDDCodegen.inputRDDs(WholeStageCodegenExec.scala:429)
>   at org.apache.spark.sql.execution.InputRDDCodegen.inputRDDs$(WholeStageCodegenExec.scala:428)
>   at org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:472)
>   at org.apache.spark.sql.execution.aggregate.HashAggregateExec.inputRDDs(HashAggregateExec.scala:154)
>   at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:719)
>   at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:131)
>   at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:155)
>   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>   at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
>   at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
>   at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:247)
>   at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:296)
>   at org.apache.spark.sql.Dataset.$anonfun$count$1(Dataset.scala:2756)
>   at org.apache.spark.sql.Dataset.$anonfun$count$1$adapted(Dataset.scala:2755)
>   at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3291)
>   at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:87)
>   at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:147)
>   at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:74)
>   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3287)
>   at org.apache.spark.sql.Dataset.count(Dataset.scala:2755)
>   at org.apache.spark.ml.image.ImageSchemaSuite.$anonfun$new$2(ImageSchemaSuite.scala:53)
>   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
>   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
>   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>   at org.scalatest.Transformer.apply(Transformer.scala:22)
>   at org.scalatest.Transformer.apply(Transformer.scala:20)
>   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
>   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:104)
>   at org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184)
>   at org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196)
>   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
>   at org.scalatest.FunSuiteLike.runTest(FunSuiteLike.scala:196)
>   at org.scalatest.FunSuiteLike.runTest$(FunSuiteLike.scala:178)
>   at org.scalatest.FunSuite.runTest(FunSuite.scala:1560)
>   at org.scalatest.FunSuiteLike.$anonfun$runTests$1(FunSuiteLike.scala:229)
>   at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:396)
>   at scala.collection.immutable.List.foreach(List.scala:392)
>   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
>   at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:379)
>   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
>   at org.scalatest.FunSuiteLike.runTests(FunSuiteLike.scala:229)
>   at org.scalatest.FunSuiteLike.runTests$(FunSuiteLike.scala:228)
>   at
> {code}
[jira] [Created] (SPARK-26710) ImageSchemaSuite has some errors when running it on a local laptop

xubo245 created SPARK-26710:
----------------------------

             Summary: ImageSchemaSuite has some errors when running it on a local laptop
                 Key: SPARK-26710
                 URL: https://issues.apache.org/jira/browse/SPARK-26710
             Project: Spark
          Issue Type: Bug
          Components: Tests
    Affects Versions: 2.4.0
            Reporter: xubo245

ImageSchemaSuite has some errors when running it on a local laptop:

{code:java}
execute, tree:
Exchange SinglePartition
+- *(1) HashAggregate(keys=[], functions=[partial_count(1)], output=[count#17L])
   +- *(1) Project
      +- *(1) Scan ExistingRDD[image#10]

org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:
Exchange SinglePartition
+- *(1) HashAggregate(keys=[], functions=[partial_count(1)], output=[count#17L])
   +- *(1) Project
      +- *(1) Scan ExistingRDD[image#10]
  at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56)
  at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.doExecute(ShuffleExchangeExec.scala:129)
  at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:131)
  at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:155)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
  at org.apache.spark.sql.execution.InputAdapter.inputRDD(WholeStageCodegenExec.scala:488)
  at org.apache.spark.sql.execution.InputRDDCodegen.inputRDDs(WholeStageCodegenExec.scala:429)
  at org.apache.spark.sql.execution.InputRDDCodegen.inputRDDs$(WholeStageCodegenExec.scala:428)
  at org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:472)
  at org.apache.spark.sql.execution.aggregate.HashAggregateExec.inputRDDs(HashAggregateExec.scala:154)
  at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:719)
  at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:131)
  at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:155)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
  at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:247)
  at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:296)
  at org.apache.spark.sql.Dataset.$anonfun$count$1(Dataset.scala:2756)
  at org.apache.spark.sql.Dataset.$anonfun$count$1$adapted(Dataset.scala:2755)
  at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3291)
  at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:87)
  at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:147)
  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:74)
  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3287)
  at org.apache.spark.sql.Dataset.count(Dataset.scala:2755)
  at org.apache.spark.ml.image.ImageSchemaSuite.$anonfun$new$2(ImageSchemaSuite.scala:53)
  at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
  at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
  at org.scalatest.Transformer.apply(Transformer.scala:22)
  at org.scalatest.Transformer.apply(Transformer.scala:20)
  at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
  at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:104)
  at org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184)
  at org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196)
  at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
  at org.scalatest.FunSuiteLike.runTest(FunSuiteLike.scala:196)
  at org.scalatest.FunSuiteLike.runTest$(FunSuiteLike.scala:178)
  at org.scalatest.FunSuite.runTest(FunSuite.scala:1560)
  at org.scalatest.FunSuiteLike.$anonfun$runTests$1(FunSuiteLike.scala:229)
  at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:396)
  at scala.collection.immutable.List.foreach(List.scala:392)
  at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
  at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:379)
  at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
  at
{code}
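For context, a minimal sketch of the pattern the failing suite exercises: reading a directory of test images and counting them. This is illustrative, not the suite's exact code, and the path below is an assumption (the suites resolve their test data relative to the mllib module, which is one reason local/IDE runs can break):

{code:scala}
import org.apache.spark.ml.image.ImageSchema
import org.apache.spark.sql.SparkSession

object ImageCountRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("image-schema-repro")
      .getOrCreate()

    // readImages returns a DataFrame with a single "image" struct column;
    // the failing assertion in the suite is a count() over such a DataFrame.
    val images = ImageSchema.readImages("data/mllib/images")  // hypothetical path
    println(s"image count: ${images.count()}")

    spark.stop()
  }
}
{code}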
[jira] [Commented] (SPARK-21866) SPIP: Image support in Spark
[ https://issues.apache.org/jira/browse/SPARK-21866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360417#comment-16360417 ]

xubo245 commented on SPARK-21866:
---------------------------------

Is there a summary of the remaining TODO work for the image feature? When is it planned to be finished?

> SPIP: Image support in Spark
> ----------------------------
>
>                 Key: SPARK-21866
>                 URL: https://issues.apache.org/jira/browse/SPARK-21866
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 2.2.0
>            Reporter: Timothy Hunter
>            Assignee: Ilya Matiach
>            Priority: Major
>              Labels: SPIP
>             Fix For: 2.3.0
>
>         Attachments: SPIP - Image support for Apache Spark V1.1.pdf
>
>
> h2. Background and motivation
> As Apache Spark is being used more and more in the industry, some new use cases are emerging for different data formats beyond the traditional SQL types or the numerical types (vectors and matrices). Deep Learning applications commonly deal with image processing. A number of projects add some Deep Learning capabilities to Spark (see list below), but they struggle to communicate with each other or with MLlib pipelines because there is no standard way to represent an image in Spark DataFrames. We propose to federate efforts for representing images in Spark by defining a representation that caters to the most common needs of users and library developers.
> This SPIP proposes a specification to represent images in Spark DataFrames and Datasets (based on existing industrial standards), and an interface for loading sources of images. It is not meant to be a full-fledged image processing library, but rather the core description that other libraries and users can rely on. Several packages already offer various processing facilities for transforming images or doing more complex operations, and each has various design tradeoffs that make them better as standalone solutions.
> This project is a joint collaboration between Microsoft and Databricks, which have been testing this design in two open source packages: MMLSpark and Deep Learning Pipelines.
> The proposed image format is an in-memory, decompressed representation that targets low-level applications. It is significantly more liberal in memory usage than compressed image representations such as JPEG, PNG, etc., but it allows easy communication with popular image processing libraries and has no decoding overhead.
> h2. Target users and personas
> Data scientists, data engineers, library developers.
> The following libraries define primitives for loading and representing images, and will gain from a common interchange format (in alphabetical order):
> * BigDL
> * DeepLearning4J
> * Deep Learning Pipelines
> * MMLSpark
> * TensorFlow (Spark connector)
> * TensorFlowOnSpark
> * TensorFrames
> * Thunder
> h2. Goals
> * Simple representation of images in Spark DataFrames, based on pre-existing industrial standards (OpenCV)
> * This format should eventually allow the development of high-performance integration points with image processing libraries such as libOpenCV, Google TensorFlow, CNTK, and other C libraries.
> * The reader should be able to read popular formats of images from distributed sources.
> h2. Non-Goals
> Images are a versatile medium and encompass a very wide range of formats and representations. This SPIP explicitly aims at the most common use case in the industry currently: multi-channel matrices of binary, int32, int64, float or double data that can fit comfortably in the heap of the JVM:
> * the total size of an image should be restricted to less than 2GB (roughly)
> * the meaning of color channels is application-specific and is not mandated by the standard (in line with the OpenCV standard)
> * specialized formats used in meteorology, the medical field, etc. are not supported
> * this format is specialized to images and does not attempt to solve the more general problem of representing n-dimensional tensors in Spark
> h2. Proposed API changes
> We propose to add a new package in the package structure, under the MLlib project: {{org.apache.spark.image}}
> h3. Data format
> We propose to add the following structure:
> imageSchema = StructType([
> * StructField("mode", StringType(), False),
> ** The exact representation of the data.
> ** The values are described in the following OpenCV convention. Basically, the type has both "depth" and "number of channels" info: in particular, type "CV_8UC3" means "3 channel unsigned bytes". BGRA format would be CV_8UC4 (value 32 in the table) with the channel order specified by convention.
> ** The exact channel ordering and meaning of each channel is dictated by convention. By default, the order is RGB (3 channels) and BGRA
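The quoted proposal above is truncated. For reference, a sketch of the image row schema this design converged on; the field list follows the ImageSchema that was eventually merged into org.apache.spark.ml.image (where "mode" ended up as an OpenCV integer type code rather than the string shown in the early draft), so treat the details as indicative rather than normative:

{code:scala}
import org.apache.spark.sql.types._

// Per-image struct, one row per image.
val columnSchema: StructType = StructType(
  StructField("origin", StringType, nullable = true) ::       // source URI of the image
  StructField("height", IntegerType, nullable = false) ::     // height in pixels
  StructField("width", IntegerType, nullable = false) ::      // width in pixels
  StructField("nChannels", IntegerType, nullable = false) ::  // e.g. 3 for BGR
  StructField("mode", IntegerType, nullable = false) ::       // OpenCV type, e.g. CV_8UC3
  StructField("data", BinaryType, nullable = false) :: Nil)   // decompressed pixel bytes

// Images travel in DataFrames as a single struct column named "image".
val imageSchema: StructType = StructType(
  StructField("image", columnSchema, nullable = true) :: Nil)
{code}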
[jira] [Commented] (SPARK-22666) Spark reader source for image format
[ https://issues.apache.org/jira/browse/SPARK-22666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360402#comment-16360402 ]

xubo245 commented on SPARK-22666:
---------------------------------

Is this finished, or still TODO?

> Spark reader source for image format
> -------------------------------------
>
>                 Key: SPARK-22666
>                 URL: https://issues.apache.org/jira/browse/SPARK-22666
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 2.3.0
>            Reporter: Timothy Hunter
>            Priority: Major
>
> The current API for the new image format is implemented as a standalone feature, in order to make it reside within the mllib package. As discussed in SPARK-21866, users should be able to load images through the more common Spark source reader interface.
> This ticket is concerned with adding image reading support in the Spark source API, through either of the following interfaces:
> * {{spark.read.format("image")...}}
> * {{spark.read.image}}
> The output is a DataFrame that contains images (and the file names, for example), following the semantics discussed already in SPARK-21866.
> A few technical notes:
> * Since the functionality is implemented in {{mllib}}, calling this function may fail at runtime if users have not imported the {{spark-mllib}} dependency.
> * How to deal with very flat directories? It is common to have millions of files in a single "directory" (like in S3), which seems to have caused some issues for some users. If this issue is too complex to handle in this ticket, it can be dealt with separately.
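A minimal usage sketch of the first interface listed above, as it eventually shipped in Spark 2.4; the directory path below is a placeholder:

{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("image-source-example")
  .getOrCreate()

// Each row carries one "image" struct (origin, height, width, nChannels, mode, data).
val df = spark.read.format("image").load("/path/to/images")  // placeholder path

df.printSchema()
df.select("image.origin", "image.width", "image.height").show(5, truncate = false)
{code}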
[jira] [Created] (SPARK-23393) Path is wrong when running tests on a local machine

xubo245 created SPARK-23393:
----------------------------

             Summary: Path is wrong when running tests on a local machine
                 Key: SPARK-23393
                 URL: https://issues.apache.org/jira/browse/SPARK-23393
             Project: Spark
          Issue Type: Bug
          Components: Tests
    Affects Versions: 2.3.0
            Reporter: xubo245

The path is wrong when running tests on a local machine, e.g. in ImageSchemaSuite.
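Failures like this usually come from test data addressed with a path relative to the repo root, which breaks when a suite is launched from a different working directory (for example, an IDE running inside a module). A hedged sketch of a defensive resolution helper; the name and fallback strategy are illustrative, not Spark's actual fix:

{code:scala}
import java.io.File

// Resolve a test-data path against the current working directory, falling back
// to the parent directory so the same relative path works both from the repo
// root and from inside a module.
def resolveTestFile(relative: String): File = {
  val direct = new File(relative)
  if (direct.exists()) direct else new File("..", relative)
}
{code}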
[jira] [Commented] (SPARK-23392) Add some test cases for the image feature

[ https://issues.apache.org/jira/browse/SPARK-23392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360284#comment-16360284 ]

xubo245 commented on SPARK-23392:
---------------------------------

Sorry, OK. [~hyukjin.kwon]

> Add some test cases for the image feature
> ------------------------------------------
>
>                 Key: SPARK-23392
>                 URL: https://issues.apache.org/jira/browse/SPARK-23392
>             Project: Spark
>          Issue Type: Test
>          Components: ML, Tests
>    Affects Versions: 2.3.0
>            Reporter: xubo245
>            Priority: Major
>
> Add some test cases for the image feature: SPARK-21866
[jira] [Created] (SPARK-23392) Add some test cases for the image feature

xubo245 created SPARK-23392:
----------------------------

             Summary: Add some test cases for the image feature
                 Key: SPARK-23392
                 URL: https://issues.apache.org/jira/browse/SPARK-23392
             Project: Spark
          Issue Type: Test
          Components: MLlib, Tests
    Affects Versions: 2.3.0
            Reporter: xubo245
             Fix For: 2.4.0

Add some test cases for the image feature: SPARK-21866
[jira] [Commented] (SPARK-22624) Expose range partitioning shuffle introduced by SPARK-22614
[ https://issues.apache.org/jira/browse/SPARK-22624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16326707#comment-16326707 ]

xubo245 commented on SPARK-22624:
---------------------------------

[~smilegator] OK, I will finish it.

> Expose range partitioning shuffle introduced by SPARK-22614
> -----------------------------------------------------------
>
>                 Key: SPARK-22624
>                 URL: https://issues.apache.org/jira/browse/SPARK-22624
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 2.3.0
>            Reporter: Adrian Ionescu
>            Priority: Major
[jira] [Updated] (SPARK-23035) Fix improper information of TempTableAlreadyExistsException
[ https://issues.apache.org/jira/browse/SPARK-23035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

xubo245 updated SPARK-23035:
----------------------------
    Description:

Problem: it throws TempTableAlreadyExistsException with the message "Temporary table '$table' already exists" when we create a temp view via org.apache.spark.sql.catalyst.catalog.GlobalTempViewManager#create, which is improper.

{code:java}
  /**
   * Creates a global temp view, or issue an exception if the view already exists and
   * `overrideIfExists` is false.
   */
  def create(
      name: String,
      viewDefinition: LogicalPlan,
      overrideIfExists: Boolean): Unit = synchronized {
    if (!overrideIfExists && viewDefinitions.contains(name)) {
      throw new TempTableAlreadyExistsException(name)
    }
    viewDefinitions.put(name, viewDefinition)
  }
{code}

No need to fix: the warning "TEMPORARY TABLE ... USING ... is deprecated", and using TempViewAlreadyExistsException when creating a temp view.

There is a warning when running the test test("rename temporary view - destination table with database name"):

02:11:38.136 WARN org.apache.spark.sql.execution.SparkSqlAstBuilder: CREATE TEMPORARY TABLE ... USING ... is deprecated, please use CREATE TEMPORARY VIEW ... USING ... instead

Other test cases also produce this warning.

  was:

Problem: it throws TempTableAlreadyExistsException with the message "Temporary table '$table' already exists" when we create a temp view via org.apache.spark.sql.catalyst.catalog.GlobalTempViewManager#create, which is improper.

{code:java}
  /**
   * Creates a global temp view, or issue an exception if the view already exists and
   * `overrideIfExists` is false.
   */
  def create(
      name: String,
      viewDefinition: LogicalPlan,
      overrideIfExists: Boolean): Unit = synchronized {
    if (!overrideIfExists && viewDefinitions.contains(name)) {
      throw new TempTableAlreadyExistsException(name)
    }
    viewDefinitions.put(name, viewDefinition)
  }

No need to fix: the warning "TEMPORARY TABLE ... USING ... is deprecated", and using TempViewAlreadyExistsException when creating a temp view
{code}

There is a warning when running the test test("rename temporary view - destination table with database name"):

02:11:38.136 WARN org.apache.spark.sql.execution.SparkSqlAstBuilder: CREATE TEMPORARY TABLE ... USING ... is deprecated, please use CREATE TEMPORARY VIEW ... USING ... instead

Other test cases also produce this warning.

> Fix improper information of TempTableAlreadyExistsException
> ------------------------------------------------------------
>
>                 Key: SPARK-23035
>                 URL: https://issues.apache.org/jira/browse/SPARK-23035
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.1
>            Reporter: xubo245
>            Assignee: xubo245
>            Priority: Major
>             Fix For: 2.3.0
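A hedged sketch of the kind of change the ticket asks for: raising an error that says "view" when a view-name collision happens. This mirrors the create() method quoted above with a simplified stand-in class; it illustrates the idea, not necessarily the merged patch:

{code:scala}
import scala.collection.mutable
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan

// Hypothetical exception with view-appropriate wording, per the ticket.
class TempViewAlreadyExistsException(name: String)
  extends Exception(s"Temporary view '$name' already exists")

// Simplified stand-in for GlobalTempViewManager, showing only the message fix.
class GlobalTempViews {
  private val viewDefinitions = mutable.HashMap.empty[String, LogicalPlan]

  def create(name: String, viewDefinition: LogicalPlan, overrideIfExists: Boolean): Unit =
    synchronized {
      if (!overrideIfExists && viewDefinitions.contains(name)) {
        throw new TempViewAlreadyExistsException(name)  // "view", not "table"
      }
      viewDefinitions.put(name, viewDefinition)
    }
}
{code}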
[jira] [Updated] (SPARK-23035) Fix improper information of TempTableAlreadyExistsException
[ https://issues.apache.org/jira/browse/SPARK-23035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

xubo245 updated SPARK-23035:
----------------------------
    Description:

Problem: it throws TempTableAlreadyExistsException with the message "Temporary table '$table' already exists" when we create a temp view via org.apache.spark.sql.catalyst.catalog.GlobalTempViewManager#create, which is improper.

{code:java}
  /**
   * Creates a global temp view, or issue an exception if the view already exists and
   * `overrideIfExists` is false.
   */
  def create(
      name: String,
      viewDefinition: LogicalPlan,
      overrideIfExists: Boolean): Unit = synchronized {
    if (!overrideIfExists && viewDefinitions.contains(name)) {
      throw new TempTableAlreadyExistsException(name)
    }
    viewDefinitions.put(name, viewDefinition)
  }
{code}

*No need to fix:* the warning "TEMPORARY TABLE ... USING ... is deprecated", and using TempViewAlreadyExistsException when creating a temp view.

There is a warning when running the test test("rename temporary view - destination table with database name"):

02:11:38.136 WARN org.apache.spark.sql.execution.SparkSqlAstBuilder: CREATE TEMPORARY TABLE ... USING ... is deprecated, please use CREATE TEMPORARY VIEW ... USING ... instead

Other test cases also produce this warning.

  was:

Fix the warning "TEMPORARY TABLE ... USING ... is deprecated" and use TempViewAlreadyExistsException when creating a temp view.

There is a warning when running the test test("rename temporary view - destination table with database name"):

{code:java}
02:11:38.136 WARN org.apache.spark.sql.execution.SparkSqlAstBuilder: CREATE TEMPORARY TABLE ... USING ... is deprecated, please use CREATE TEMPORARY VIEW ... USING ... instead
{code}

Other test cases also produce this warning.

Another problem: it throws TempTableAlreadyExistsException with the message "Temporary table '$table' already exists" when we create a temp view via org.apache.spark.sql.catalyst.catalog.GlobalTempViewManager#create, which is improper.

{code:java}
  /**
   * Creates a global temp view, or issue an exception if the view already exists and
   * `overrideIfExists` is false.
   */
  def create(
      name: String,
      viewDefinition: LogicalPlan,
      overrideIfExists: Boolean): Unit = synchronized {
    if (!overrideIfExists && viewDefinitions.contains(name)) {
      throw new TempTableAlreadyExistsException(name)
    }
    viewDefinitions.put(name, viewDefinition)
  }
{code}

> Fix improper information of TempTableAlreadyExistsException
> ------------------------------------------------------------
>
>                 Key: SPARK-23035
>                 URL: https://issues.apache.org/jira/browse/SPARK-23035
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.1
>            Reporter: xubo245
>            Assignee: xubo245
>            Priority: Major
>             Fix For: 2.3.0
[jira] [Updated] (SPARK-23035) Fix improper information of TempTableAlreadyExistsException
[ https://issues.apache.org/jira/browse/SPARK-23035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

xubo245 updated SPARK-23035:
----------------------------
    Summary: Fix improper information of TempTableAlreadyExistsException  (was: Fix warning: TEMPORARY TABLE ... USING ... is deprecated and use TempViewAlreadyExistsException when create temp view)

> Fix improper information of TempTableAlreadyExistsException
> ------------------------------------------------------------
>
>                 Key: SPARK-23035
>                 URL: https://issues.apache.org/jira/browse/SPARK-23035
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.1
>            Reporter: xubo245
>            Assignee: xubo245
>            Priority: Major
>             Fix For: 2.3.0
[jira] [Updated] (SPARK-23039) Fix the bug in alter table set location.
[ https://issues.apache.org/jira/browse/SPARK-23039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

xubo245 updated SPARK-23039:
----------------------------
    Description:

TODO work: fix the bug in ALTER TABLE ... SET LOCATION, flagged in org.apache.spark.sql.execution.command.DDLSuite#testSetLocation:

{code:java}
    // TODO(gatorsmile): fix the bug in alter table set location.
    // if (isUsingHiveMetastore) {
    //   assert(storageFormat.properties.get("path") === expected)
    // }
{code}

Analysis: the assertion fails because restoring the table sets locationUri and erases the "path" storage property by passing

{code:java}
newPath = None
{code}

in org.apache.spark.sql.hive.HiveExternalCatalog#restoreDataSourceTable:

{code:java}
    val storageWithLocation = {
      val tableLocation = getLocationFromStorageProps(table)
      // We pass None as `newPath` here, to remove the path option in storage properties.
      updateLocationInStorageProps(table, newPath = None).copy(
        locationUri = tableLocation.map(CatalogUtils.stringToURI(_)))
    }
{code}

=> newPath = None

  was:

TODO work: fix the bug in ALTER TABLE ... SET LOCATION, flagged in org.apache.spark.sql.execution.command.DDLSuite#testSetLocation:

{code:java}
    // TODO(gatorsmile): fix the bug in alter table set location.
    // if (isUsingHiveMetastore) {
    //   assert(storageFormat.properties.get("path") === expected)
    // }
{code}

Analysis: the assertion fails because restoring the table sets locationUri and erases the "path" storage property by passing

{code:java}
newPath = None
{code}

in org.apache.spark.sql.hive.HiveExternalCatalog#restoreDataSourceTable:

{code:java}
    val storageWithLocation = {
      val tableLocation = getLocationFromStorageProps(table)
      // We pass None as `newPath` here, to remove the path option in storage properties.
      updateLocationInStorageProps(table, newPath = None).copy(
        locationUri = tableLocation.map(CatalogUtils.stringToURI(_)))
    }
{code}

because "We pass None as `newPath` here, to remove the path option in storage properties", and locationUri is obtained from the "path" storage property:

{code:java}
  private def getLocationFromStorageProps(table: CatalogTable): Option[String] = {
    CaseInsensitiveMap(table.storage.properties).get("path")
  }
{code}

So we can use locationUri instead of path.

> Fix the bug in alter table set location.
> -----------------------------------------
>
>                 Key: SPARK-23039
>                 URL: https://issues.apache.org/jira/browse/SPARK-23039
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.1
>            Reporter: xubo245
>            Priority: Critical
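Following the analysis above — Hive-backed tables come back from restoreDataSourceTable with the "path" property stripped and locationUri populated — a hedged sketch of how the commented-out assertion could be phrased against locationUri instead. The helper name is illustrative; the types come from the snippets above:

{code:scala}
import java.net.URI
import org.apache.spark.sql.catalyst.catalog.CatalogStorageFormat

// Under the Hive metastore, restoreDataSourceTable removes the "path" storage
// property and records the location in locationUri, so assert on that instead.
def assertSetLocation(storageFormat: CatalogStorageFormat, expected: URI): Unit = {
  assert(storageFormat.locationUri == Some(expected),
    s"expected location $expected, got ${storageFormat.locationUri}")
}
{code}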
[jira] [Comment Edited] (SPARK-23059) Correct some improper usage of view-related methods

[ https://issues.apache.org/jira/browse/SPARK-23059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324138#comment-16324138 ]

xubo245 edited comment on SPARK-23059 at 1/12/18 3:41 PM:
----------------------------------------------------------

Split from https://github.com/apache/spark/pull/20228#issuecomment-357266852, according to committer [~dongjoon]'s review.

was (Author: xubo245):
Split from https://github.com/apache/spark/pull/20228#issuecomment-357266852

> Correct some improper usage of view-related methods
> ----------------------------------------------------
>
>                 Key: SPARK-23059
>                 URL: https://issues.apache.org/jira/browse/SPARK-23059
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL, Tests
>    Affects Versions: 2.2.1
>            Reporter: xubo245
>            Priority: Minor
[jira] [Commented] (SPARK-23059) Correct some improper usage of view-related methods

[ https://issues.apache.org/jira/browse/SPARK-23059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324138#comment-16324138 ]

xubo245 commented on SPARK-23059:
---------------------------------

Split from https://github.com/apache/spark/pull/20228#issuecomment-357266852

> Correct some improper usage of view-related methods
> ----------------------------------------------------
>
>                 Key: SPARK-23059
>                 URL: https://issues.apache.org/jira/browse/SPARK-23059
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL, Tests
>    Affects Versions: 2.2.1
>            Reporter: xubo245
>            Priority: Minor
[jira] [Updated] (SPARK-23036) Add withGlobalTempView for testing
[ https://issues.apache.org/jira/browse/SPARK-23036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

xubo245 updated SPARK-23036:
----------------------------
    Summary: Add withGlobalTempView for testing  (was: Add withGlobalTempView for testing and correct some improper with view related method usage)

> Add withGlobalTempView for testing
> -----------------------------------
>
>                 Key: SPARK-23036
>                 URL: https://issues.apache.org/jira/browse/SPARK-23036
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.2.1
>            Reporter: xubo245
>            Priority: Minor
>
> Add a withGlobalTempView helper for tests that create global temp views, like the existing withTempView and withView.
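A minimal sketch of the proposed helper, modeled on the existing withTempView pattern in the SQL test utilities; the merged version may differ in detail, but spark.catalog.dropGlobalTempView is the real public API:

{code:scala}
import org.apache.spark.sql.SparkSession

// Run a test body, then always drop the named global temp views afterwards,
// mirroring how withTempView cleans up session-local views.
def withGlobalTempView(spark: SparkSession)(viewNames: String*)(f: => Unit): Unit = {
  try f finally {
    viewNames.foreach(name => spark.catalog.dropGlobalTempView(name))
  }
}
{code}

Usage would look like withGlobalTempView(spark)("v1") { ... }, guaranteeing cleanup even when the body throws.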
[jira] [Created] (SPARK-23059) Correct some improper usage of view-related methods
xubo245 created SPARK-23059: --- Summary: Correct some improper usage of view-related methods Key: SPARK-23059 URL: https://issues.apache.org/jira/browse/SPARK-23059 Project: Spark Issue Type: Bug Components: SQL, Tests Affects Versions: 2.2.1 Reporter: xubo245 Priority: Minor Correct some improper usage, like: {code:java} test("list global temp views") { try { sql("CREATE GLOBAL TEMP VIEW v1 AS SELECT 3, 4") sql("CREATE TEMP VIEW v2 AS SELECT 1, 2") checkAnswer(sql(s"SHOW TABLES IN $globalTempDB"), Row(globalTempDB, "v1", true) :: Row("", "v2", true) :: Nil) assert(spark.catalog.listTables(globalTempDB).collect().toSeq.map(_.name) == Seq("v1", "v2")) } finally { spark.catalog.dropTempView("v1") spark.catalog.dropGlobalTempView("v2") } } {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-23036) Add withGlobalTempView for testing and correct some improper usage of view-related methods
[ https://issues.apache.org/jira/browse/SPARK-23036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xubo245 updated SPARK-23036: Description: Add withGlobalTempView when creating a global temp view, like withTempView and withView. was: Add withGlobalTempView when creating a global temp view, like withTempView and withView. And correct some improper usage, like: {code:java} test("list global temp views") { try { sql("CREATE GLOBAL TEMP VIEW v1 AS SELECT 3, 4") sql("CREATE TEMP VIEW v2 AS SELECT 1, 2") checkAnswer(sql(s"SHOW TABLES IN $globalTempDB"), Row(globalTempDB, "v1", true) :: Row("", "v2", true) :: Nil) assert(spark.catalog.listTables(globalTempDB).collect().toSeq.map(_.name) == Seq("v1", "v2")) } finally { spark.catalog.dropTempView("v1") spark.catalog.dropGlobalTempView("v2") } } {code} > Add withGlobalTempView for testing and correct some improper usage of > view-related methods > --- > > Key: SPARK-23036 > URL: https://issues.apache.org/jira/browse/SPARK-23036 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.2.1 >Reporter: xubo245 >Priority: Minor > > Add withGlobalTempView when creating a global temp view, like withTempView and > withView. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-23057) SET LOCATION should change the path of partition in table
[ https://issues.apache.org/jira/browse/SPARK-23057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xubo245 updated SPARK-23057: Summary: SET LOCATION should change the path of partition in table (was: SET LOCATION should change the path of partition in tabl) > SET LOCATION should change the path of partition in table > - > > Key: SPARK-23057 > URL: https://issues.apache.org/jira/browse/SPARK-23057 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.2.1 >Reporter: xubo245 >Priority: Minor > > According to https://issues.apache.org/jira/browse/SPARK-19235 and > https://github.com/apache/spark/pull/16592#pullrequestreview-88085571, > {code:java} > When porting these test cases, a bug of SET LOCATION is found. path is not > set when the location is changed. > {code} > in org.apache.spark.sql.execution.command.DDLSuite#testSetLocation: > {code:java} > // TODO(gatorsmile): fix the bug in alter table set location. > // if (isUsingHiveMetastore) { > // assert(storageFormat.properties.get("path") === expected) > // } > {code} > So test it: > There is an error in: > {code:java} > // set table partition location > sql("ALTER TABLE dbx.tab1 PARTITION (a='1', b='2') SET LOCATION > '/path/to/part/ways'") > verifyLocation(new URI("/path/to/part/ways"), Some(partSpec)) > {code} > I also added test cases: > {code:java} > test("SET LOCATION should change the path of partition in tabl") { > withTable("boxes") { > sql("CREATE TABLE boxes (height INT, length INT) PARTITIONED BY (width > INT) LOCATION '/new'") > sql("INSERT OVERWRITE TABLE boxes PARTITION (width=4) SELECT 4, 4") > val expected = "/path/to/part/ways" > sql(s"ALTER TABLE boxes PARTITION (width=4) SET LOCATION '$expected'") > val catalog = spark.sessionState.catalog > val partSpec = Map("width" -> "4") > val spec = Some(partSpec) > val tableIdent = TableIdentifier("boxes", Some("default")) > val storageFormat = spec > .map { s => catalog.getPartition(tableIdent, s).storage } > .getOrElse { > catalog.getTableMetadata(tableIdent).storage > } > assert(storageFormat.properties.get("path").get === expected) > } > } > {code} > Error: > {code:java} > 05:46:48.213 WARN org.apache.hadoop.hive.metastore.ObjectStore: Failed to get > database global_temp, returning NoSuchObjectException > None.get > java.util.NoSuchElementException: None.get > at scala.None$.get(Option.scala:347) > at scala.None$.get(Option.scala:345) > at > org.apache.spark.sql.hive.execution.HiveDDLSuite$$anonfun$32$$anonfun$apply$mcV$sp$22.apply$mcV$sp(HiveDDLSuite.scala:768) > at > org.apache.spark.sql.test.SQLTestUtilsBase$class.withTable(SQLTestUtils.scala:273) > at > org.apache.spark.sql.hive.execution.HiveDDLSuite.withTable(HiveDDLSuite.scala:261) > at > org.apache.spark.sql.hive.execution.HiveDDLSuite$$anonfun$32.apply$mcV$sp(HiveDDLSuite.scala:754) > at > org.apache.spark.sql.hive.execution.HiveDDLSuite$$anonfun$32.apply(HiveDDLSuite.scala:754) > at > org.apache.spark.sql.hive.execution.HiveDDLSuite$$anonfun$32.apply(HiveDDLSuite.scala:754) > at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) > at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) > at org.scalatest.Transformer.apply(Transformer.scala:22) > at org.scalatest.Transformer.apply(Transformer.scala:20) > at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186) > at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:68) > at > org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:183) > at > 
org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196) > at > org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196) > at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289) > at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:196) > at > org.apache.spark.sql.hive.execution.HiveDDLSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(HiveDDLSuite.scala:261) > at > org.scalatest.BeforeAndAfterEach$class.runTest(BeforeAndAfterEach.scala:221) > at > org.apache.spark.sql.hive.execution.HiveDDLSuite.runTest(HiveDDLSuite.scala:261) > at > org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229) > at > org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229) > at > org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:396) > at > org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:384) > at
[jira] [Updated] (SPARK-23057) SET LOCATION should change the path of partition in tabl
[ https://issues.apache.org/jira/browse/SPARK-23057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xubo245 updated SPARK-23057: Description: According to https://issues.apache.org/jira/browse/SPARK-19235 and https://github.com/apache/spark/pull/16592#pullrequestreview-88085571, {code:java} When porting these test cases, a bug of SET LOCATION is found. path is not set when the location is changed. {code} in org.apache.spark.sql.execution.command.DDLSuite#testSetLocation: {code:java} // TODO(gatorsmile): fix the bug in alter table set location. // if (isUsingHiveMetastore) { // assert(storageFormat.properties.get("path") === expected) // } {code} So test it: There is a error in : {code:java} // set table partition location sql("ALTER TABLE dbx.tab1 PARTITION (a='1', b='2') SET LOCATION '/path/to/part/ways'") verifyLocation(new URI("/path/to/part/ways"), Some(partSpec)) {code} I also add test cases: {code:java} test("SET LOCATION should change the path of partition in tabl") { withTable("boxes") { sql("CREATE TABLE boxes (height INT, length INT) PARTITIONED BY (width INT) LOCATION '/new'") sql("INSERT OVERWRITE TABLE boxes PARTITION (width=4) SELECT 4, 4") val expected = "/path/to/part/ways" sql(s"ALTER TABLE boxes PARTITION (width=4) SET LOCATION '$expected'") val catalog = spark.sessionState.catalog val partSpec = Map("width" -> "4") val spec = Some(partSpec) val tableIdent = TableIdentifier("boxes", Some("default")) val storageFormat = spec .map { s => catalog.getPartition(tableIdent, s).storage } .getOrElse { catalog.getTableMetadata(tableIdent).storage } assert(storageFormat.properties.get("path").get === expected) } } {code} Error: {code:java} 05:46:48.213 WARN org.apache.hadoop.hive.metastore.ObjectStore: Failed to get database global_temp, returning NoSuchObjectException None.get java.util.NoSuchElementException: None.get at scala.None$.get(Option.scala:347) at scala.None$.get(Option.scala:345) at org.apache.spark.sql.hive.execution.HiveDDLSuite$$anonfun$32$$anonfun$apply$mcV$sp$22.apply$mcV$sp(HiveDDLSuite.scala:768) at org.apache.spark.sql.test.SQLTestUtilsBase$class.withTable(SQLTestUtils.scala:273) at org.apache.spark.sql.hive.execution.HiveDDLSuite.withTable(HiveDDLSuite.scala:261) at org.apache.spark.sql.hive.execution.HiveDDLSuite$$anonfun$32.apply$mcV$sp(HiveDDLSuite.scala:754) at org.apache.spark.sql.hive.execution.HiveDDLSuite$$anonfun$32.apply(HiveDDLSuite.scala:754) at org.apache.spark.sql.hive.execution.HiveDDLSuite$$anonfun$32.apply(HiveDDLSuite.scala:754) at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) at org.scalatest.Transformer.apply(Transformer.scala:22) at org.scalatest.Transformer.apply(Transformer.scala:20) at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186) at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:68) at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:183) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196) at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289) at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:196) at org.apache.spark.sql.hive.execution.HiveDDLSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(HiveDDLSuite.scala:261) at org.scalatest.BeforeAndAfterEach$class.runTest(BeforeAndAfterEach.scala:221) at 
org.apache.spark.sql.hive.execution.HiveDDLSuite.runTest(HiveDDLSuite.scala:261) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:396) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:384) at scala.collection.immutable.List.foreach(List.scala:381) at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384) at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:379) at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461) at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:229) at org.scalatest.FunSuite.runTests(FunSuite.scala:1560) at org.scalatest.Suite$class.run(Suite.scala:1147) at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560) at
[jira] [Updated] (SPARK-23057) SET LOCATION should change the path of partition in tabl
[ https://issues.apache.org/jira/browse/SPARK-23057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xubo245 updated SPARK-23057: Description: According to https://issues.apache.org/jira/browse/SPARK-19235 and https://github.com/apache/spark/pull/16592#pullrequestreview-88085571, {code:java} When porting these test cases, a bug of SET LOCATION is found. path is not set when the location is changed. {code} in org.apache.spark.sql.execution.command.DDLSuite#testSetLocation: {code:java} // TODO(gatorsmile): fix the bug in alter table set location. // if (isUsingHiveMetastore) { // assert(storageFormat.properties.get("path") === expected) // } {code} So test it: Test case: {code:java} test("SET LOCATION should change the path of partition in tabl") { withTable("boxes") { sql("CREATE TABLE boxes (height INT, length INT) PARTITIONED BY (width INT) LOCATION '/new'") sql("INSERT OVERWRITE TABLE boxes PARTITION (width=4) SELECT 4, 4") val expected = "/path/to/part/ways" sql(s"ALTER TABLE boxes PARTITION (width=4) SET LOCATION '$expected'") val catalog = spark.sessionState.catalog val partSpec = Map("width" -> "4") val spec = Some(partSpec) val tableIdent = TableIdentifier("boxes", Some("default")) val storageFormat = spec .map { s => catalog.getPartition(tableIdent, s).storage } .getOrElse { catalog.getTableMetadata(tableIdent).storage } assert(storageFormat.properties.get("path").get === expected) } } {code} Error: {code:java} 05:46:48.213 WARN org.apache.hadoop.hive.metastore.ObjectStore: Failed to get database global_temp, returning NoSuchObjectException None.get java.util.NoSuchElementException: None.get at scala.None$.get(Option.scala:347) at scala.None$.get(Option.scala:345) at org.apache.spark.sql.hive.execution.HiveDDLSuite$$anonfun$32$$anonfun$apply$mcV$sp$22.apply$mcV$sp(HiveDDLSuite.scala:768) at org.apache.spark.sql.test.SQLTestUtilsBase$class.withTable(SQLTestUtils.scala:273) at org.apache.spark.sql.hive.execution.HiveDDLSuite.withTable(HiveDDLSuite.scala:261) at org.apache.spark.sql.hive.execution.HiveDDLSuite$$anonfun$32.apply$mcV$sp(HiveDDLSuite.scala:754) at org.apache.spark.sql.hive.execution.HiveDDLSuite$$anonfun$32.apply(HiveDDLSuite.scala:754) at org.apache.spark.sql.hive.execution.HiveDDLSuite$$anonfun$32.apply(HiveDDLSuite.scala:754) at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) at org.scalatest.Transformer.apply(Transformer.scala:22) at org.scalatest.Transformer.apply(Transformer.scala:20) at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186) at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:68) at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:183) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196) at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289) at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:196) at org.apache.spark.sql.hive.execution.HiveDDLSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(HiveDDLSuite.scala:261) at org.scalatest.BeforeAndAfterEach$class.runTest(BeforeAndAfterEach.scala:221) at org.apache.spark.sql.hive.execution.HiveDDLSuite.runTest(HiveDDLSuite.scala:261) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229) at 
org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:396) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:384) at scala.collection.immutable.List.foreach(List.scala:381) at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384) at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:379) at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461) at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:229) at org.scalatest.FunSuite.runTests(FunSuite.scala:1560) at org.scalatest.Suite$class.run(Suite.scala:1147) at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233) at org.scalatest.SuperEngine.runImpl(Engine.scala:521) at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:233) at
[jira] [Created] (SPARK-23057) SET LOCATION should change the path of partition in tabl
xubo245 created SPARK-23057: --- Summary: SET LOCATION should change the path of partition in tabl Key: SPARK-23057 URL: https://issues.apache.org/jira/browse/SPARK-23057 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 2.2.1 Reporter: xubo245 Priority: Minor Test case: {code:java} test("SET LOCATION should change the path of partition in tabl") { withTable("boxes") { sql("CREATE TABLE boxes (height INT, length INT) PARTITIONED BY (width INT) LOCATION '/new'") sql("INSERT OVERWRITE TABLE boxes PARTITION (width=4) SELECT 4, 4") val expected = "/path/to/part/ways" sql(s"ALTER TABLE boxes PARTITION (width=4) SET LOCATION '$expected'") val catalog = spark.sessionState.catalog val partSpec = Map("width" -> "4") val spec = Some(partSpec) val tableIdent = TableIdentifier("boxes", Some("default")) val storageFormat = spec .map { s => catalog.getPartition(tableIdent, s).storage } .getOrElse { catalog.getTableMetadata(tableIdent).storage } assert(storageFormat.properties.get("path").get === expected) } } {code} Error: {code:java} 05:46:48.213 WARN org.apache.hadoop.hive.metastore.ObjectStore: Failed to get database global_temp, returning NoSuchObjectException None.get java.util.NoSuchElementException: None.get at scala.None$.get(Option.scala:347) at scala.None$.get(Option.scala:345) at org.apache.spark.sql.hive.execution.HiveDDLSuite$$anonfun$32$$anonfun$apply$mcV$sp$22.apply$mcV$sp(HiveDDLSuite.scala:768) at org.apache.spark.sql.test.SQLTestUtilsBase$class.withTable(SQLTestUtils.scala:273) at org.apache.spark.sql.hive.execution.HiveDDLSuite.withTable(HiveDDLSuite.scala:261) at org.apache.spark.sql.hive.execution.HiveDDLSuite$$anonfun$32.apply$mcV$sp(HiveDDLSuite.scala:754) at org.apache.spark.sql.hive.execution.HiveDDLSuite$$anonfun$32.apply(HiveDDLSuite.scala:754) at org.apache.spark.sql.hive.execution.HiveDDLSuite$$anonfun$32.apply(HiveDDLSuite.scala:754) at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) at org.scalatest.Transformer.apply(Transformer.scala:22) at org.scalatest.Transformer.apply(Transformer.scala:20) at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186) at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:68) at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:183) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196) at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289) at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:196) at org.apache.spark.sql.hive.execution.HiveDDLSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(HiveDDLSuite.scala:261) at org.scalatest.BeforeAndAfterEach$class.runTest(BeforeAndAfterEach.scala:221) at org.apache.spark.sql.hive.execution.HiveDDLSuite.runTest(HiveDDLSuite.scala:261) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:396) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:384) at scala.collection.immutable.List.foreach(List.scala:381) at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384) at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:379) at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461) at 
org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:229) at org.scalatest.FunSuite.runTests(FunSuite.scala:1560) at org.scalatest.Suite$class.run(Suite.scala:1147) at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233) at org.scalatest.SuperEngine.runImpl(Engine.scala:521) at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:233) at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:31) at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:213) at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:210) at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:31)
[jira] [Created] (SPARK-23039) Fix the bug in alter table set location.
xubo245 created SPARK-23039: --- Summary: Fix the bug in alter table set location. Key: SPARK-23039 URL: https://issues.apache.org/jira/browse/SPARK-23039 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 2.2.1 Reporter: xubo245 Priority: Critical TODO work: Fix the bug in alter table set location. org.apache.spark.sql.execution.command.DDLSuite#testSetLocation {code:java} // TODO(gatorsmile): fix the bug in alter table set location. //if (isUsingHiveMetastore) { //assert(storageFormat.properties.get("path") === expected) // } {code} Analysis: the path property is erased because restoreDataSourceTable sets locationUri and removes the path option by passing {code:java} newPath = None {code} in org.apache.spark.sql.hive.HiveExternalCatalog#restoreDataSourceTable: {code:java} val storageWithLocation = { val tableLocation = getLocationFromStorageProps(table) // We pass None as `newPath` here, to remove the path option in storage properties. updateLocationInStorageProps(table, newPath = None).copy( locationUri = tableLocation.map(CatalogUtils.stringToURI(_))) } {code} As the comment says, "We pass None as `newPath` here, to remove the path option in storage properties." And locationUri is obtained from the path in storage properties: {code:java} private def getLocationFromStorageProps(table: CatalogTable): Option[String] = { CaseInsensitiveMap(table.storage.properties).get("path") } {code} So we can use locationUri instead of path. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
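Following that analysis, the commented-out assertion in testSetLocation can compare locationUri rather than the erased path property. A minimal sketch, reusing the names catalog, tableIdent, and expected from the SPARK-23057 test case quoted earlier (assumptions here, not a committed fix):

{code:java}
// locationUri now carries the value that used to live in the "path" storage
// property, so assert against it directly instead of properties("path").
val storage = catalog.getTableMetadata(tableIdent).storage
assert(storage.locationUri === Some(CatalogUtils.stringToURI(expected)))
{code}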
[jira] [Updated] (SPARK-23036) Add withGlobalTempView for testing and correct some improper usage of view-related methods
[ https://issues.apache.org/jira/browse/SPARK-23036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xubo245 updated SPARK-23036: Summary: Add withGlobalTempView for testing and correct some improper usage of view-related methods (was: Add withGlobalTempView for testing) > Add withGlobalTempView for testing and correct some improper usage of > view-related methods > --- > > Key: SPARK-23036 > URL: https://issues.apache.org/jira/browse/SPARK-23036 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.2.1 >Reporter: xubo245 > > Add withGlobalTempView when creating a global temp view, like withTempView and > withView. > And correct some improper usage, like: > {code:java} > test("list global temp views") { > try { > sql("CREATE GLOBAL TEMP VIEW v1 AS SELECT 3, 4") > sql("CREATE TEMP VIEW v2 AS SELECT 1, 2") > checkAnswer(sql(s"SHOW TABLES IN $globalTempDB"), > Row(globalTempDB, "v1", true) :: > Row("", "v2", true) :: Nil) > > assert(spark.catalog.listTables(globalTempDB).collect().toSeq.map(_.name) == > Seq("v1", "v2")) > } finally { > spark.catalog.dropTempView("v1") > spark.catalog.dropGlobalTempView("v2") > } > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-23036) Add withGlobalTempView for testing
xubo245 created SPARK-23036: --- Summary: Add withGlobalTempView for testing Key: SPARK-23036 URL: https://issues.apache.org/jira/browse/SPARK-23036 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 2.2.1 Reporter: xubo245 Add withGlobalTempView when creating a global temp view, like withTempView and withView. And correct some improper usage, like: {code:java} test("list global temp views") { try { sql("CREATE GLOBAL TEMP VIEW v1 AS SELECT 3, 4") sql("CREATE TEMP VIEW v2 AS SELECT 1, 2") checkAnswer(sql(s"SHOW TABLES IN $globalTempDB"), Row(globalTempDB, "v1", true) :: Row("", "v2", true) :: Nil) assert(spark.catalog.listTables(globalTempDB).collect().toSeq.map(_.name) == Seq("v1", "v2")) } finally { spark.catalog.dropTempView("v1") spark.catalog.dropGlobalTempView("v2") } } {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-23035) Fix warning: TEMPORARY TABLE ... USING ... is deprecated and use TempViewAlreadyExistsException when creating a temp view
xubo245 created SPARK-23035: --- Summary: Fix warning: TEMPORARY TABLE ... USING ... is deprecated and use TempViewAlreadyExistsException when creating a temp view Key: SPARK-23035 URL: https://issues.apache.org/jira/browse/SPARK-23035 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 2.2.1 Reporter: xubo245 Fix warning: TEMPORARY TABLE ... USING ... is deprecated, and use TempViewAlreadyExistsException when creating a temp view. There is a warning when running the test: test("rename temporary view - destination table with database name") {code:java} 02:11:38.136 WARN org.apache.spark.sql.execution.SparkSqlAstBuilder: CREATE TEMPORARY TABLE ... USING ... is deprecated, please use CREATE TEMPORARY VIEW ... USING ... instead {code} Other test cases also produce this warning. Another problem: when we create a global temp view via org.apache.spark.sql.catalyst.catalog.GlobalTempViewManager#create, it throws TempTableAlreadyExistsException and outputs "Temporary table '$table' already exists", which is improper for a view. {code:java} /** * Creates a global temp view, or issue an exception if the view already exists and * `overrideIfExists` is false. */ def create( name: String, viewDefinition: LogicalPlan, overrideIfExists: Boolean): Unit = synchronized { if (!overrideIfExists && viewDefinitions.contains(name)) { throw new TempTableAlreadyExistsException(name) } viewDefinitions.put(name, viewDefinition) } {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
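Both halves of the fix are mechanical. A minimal sketch, where the table name tab1 and its path are hypothetical, and TempViewAlreadyExistsException is the view-specific exception the issue proposes, not an existing Spark class:

{code:java}
// 1) Use the statement form the warning itself recommends.
// Deprecated, triggers the SparkSqlAstBuilder warning quoted above:
sql("CREATE TEMPORARY TABLE tab1 USING parquet OPTIONS (path '/tmp/tab1')")
// Recommended replacement:
sql("CREATE TEMPORARY VIEW tab1 USING parquet OPTIONS (path '/tmp/tab1')")

// 2) In GlobalTempViewManager#create, throw a view-specific exception instead
// of TempTableAlreadyExistsException, so the message talks about views:
if (!overrideIfExists && viewDefinitions.contains(name)) {
  throw new TempViewAlreadyExistsException(name)
}
{code}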
[jira] [Created] (SPARK-22972) Couldn't find corresponding Hive SerDe for data source provider org.apache.spark.sql.hive.orc.
xubo245 created SPARK-22972: --- Summary: Couldn't find corresponding Hive SerDe for data source provider org.apache.spark.sql.hive.orc. Key: SPARK-22972 URL: https://issues.apache.org/jira/browse/SPARK-22972 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 2.2.1 Reporter: xubo245 *There is an error when running the test code:* {code:java} test("create orc table") { spark.sql( s"""CREATE TABLE normal_orc_as_source_hive |USING org.apache.spark.sql.hive.orc |OPTIONS ( | PATH '${new File(orcTableAsDir.getAbsolutePath).toURI}' |) """.stripMargin) val df = spark.sql("select * from normal_orc_as_source_hive") spark.sql("desc formatted normal_orc_as_source_hive").show() } {code} *warning:* {code:java} 05:00:44.038 WARN org.apache.spark.sql.hive.test.TestHiveExternalCatalog: Couldn't find corresponding Hive SerDe for data source provider org.apache.spark.sql.hive.orc. Persisting data source table `default`.`normal_orc_as_source_hive` into Hive metastore in Spark SQL specific format, which is NOT compatible with Hive. {code} Root cause analysis: the ORC-related mapping in HiveSerDe is incorrect: {code:java} org.apache.spark.sql.internal.HiveSerDe#sourceToSerDe {code} {code:java} def sourceToSerDe(source: String): Option[HiveSerDe] = { val key = source.toLowerCase(Locale.ROOT) match { case s if s.startsWith("org.apache.spark.sql.parquet") => "parquet" case s if s.startsWith("org.apache.spark.sql.orc") => "orc" case s if s.equals("orcfile") => "orc" case s if s.equals("parquetfile") => "parquet" case s if s.equals("avrofile") => "avro" case s => s } {code} Solution: change "org.apache.spark.sql.orc" to "org.apache.spark.sql.hive.orc" -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
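A self-contained sketch of that one-line change, with the method renamed to sourceKey since the surrounding HiveSerDe lookup is omitted (the rename and the omission are editorial, not part of the proposal):

{code:java}
import java.util.Locale

def sourceKey(source: String): String = source.toLowerCase(Locale.ROOT) match {
  case s if s.startsWith("org.apache.spark.sql.parquet") => "parquet"
  // Changed case: the Hive ORC provider lives under ...sql.hive.orc, so the
  // old prefix "org.apache.spark.sql.orc" never matched it and the SerDe
  // lookup fell through to the raw class name.
  case s if s.startsWith("org.apache.spark.sql.hive.orc") => "orc"
  case s if s.equals("orcfile") => "orc"
  case s if s.equals("parquetfile") => "parquet"
  case s if s.equals("avrofile") => "avro"
  case s => s
}

// sourceKey("org.apache.spark.sql.hive.orc") now yields "orc", so HiveSerDe
// can resolve a proper Hive SerDe and the warning above goes away.
{code}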
[jira] [Created] (SPARK-22857) Optimize code by inspecting code
xubo245 created SPARK-22857: --- Summary: Optimize code by inspecting code Key: SPARK-22857 URL: https://issues.apache.org/jira/browse/SPARK-22857 Project: Spark Issue Type: Improvement Components: Tests Affects Versions: 2.2.1 Reporter: xubo245 Priority: Minor Optimize code by inspecting code -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-22423) Scala test source files like TestHiveSingleton.scala should be in the scala source root
[ https://issues.apache.org/jira/browse/SPARK-22423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237176#comment-16237176 ] xubo245 commented on SPARK-22423: - OK, I will fix it. > Scala test source files like TestHiveSingleton.scala should be in the scala > source root > --- > > Key: SPARK-22423 > URL: https://issues.apache.org/jira/browse/SPARK-22423 > Project: Spark > Issue Type: Test > Components: Tests >Affects Versions: 2.2.0 >Reporter: xubo245 >Priority: Minor > > The TestHiveSingleton.scala file should be in the scala directory, not in the java > directory -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-22423) The TestHiveSingleton.scala file should be in the scala directory
xubo245 created SPARK-22423: --- Summary: The TestHiveSingleton.scala file should be in the scala directory Key: SPARK-22423 URL: https://issues.apache.org/jira/browse/SPARK-22423 Project: Spark Issue Type: Test Components: Tests Affects Versions: 2.2.0 Reporter: xubo245 Priority: Minor The TestHiveSingleton.scala file should be in the scala directory, not in the java directory -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Closed] (SPARK-18435) The requested URL /docs/latest/api/scala/org/apache/spark/AccumulatorV2.html was not found on this server.
[ https://issues.apache.org/jira/browse/SPARK-18435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xubo245 closed SPARK-18435. --- > The requested URL /docs/latest/api/scala/org/apache/spark/AccumulatorV2.html > was not found on this server. > -- > > Key: SPARK-18435 > URL: https://issues.apache.org/jira/browse/SPARK-18435 > Project: Spark > Issue Type: Bug > Components: Documentation >Affects Versions: 2.0.1 > Environment: spark-2.0.1 >Reporter: xubo245 >Priority: Trivial > > When requesting the AccumulatorV2 link in > http://spark.apache.org/docs/latest/programming-guide.html, there is an error: > The requested URL /docs/latest/api/scala/org/apache/spark/AccumulatorV2.html > was not found on this server. > The link is broken. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Closed] (SPARK-18435) The requested URL /docs/latest/api/scala/org/apache/spark/AccumulatorV2.html was not found on this server.
[ https://issues.apache.org/jira/browse/SPARK-18435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xubo245 closed SPARK-18435. --- Resolution: Fixed > The requested URL /docs/latest/api/scala/org/apache/spark/AccumulatorV2.html > was not found on this server. > -- > > Key: SPARK-18435 > URL: https://issues.apache.org/jira/browse/SPARK-18435 > Project: Spark > Issue Type: Bug > Components: Documentation >Affects Versions: 2.0.1 > Environment: spark-2.0.1 >Reporter: xubo245 > > When requesting the AccumulatorV2 link in > http://spark.apache.org/docs/latest/programming-guide.html, there is an error: > The requested URL /docs/latest/api/scala/org/apache/spark/AccumulatorV2.html > was not found on this server. > The link is broken. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18435) The requested URL /docs/latest/api/scala/org/apache/spark/AccumulatorV2.html was not found on this server.
[ https://issues.apache.org/jira/browse/SPARK-18435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15663957#comment-15663957 ] xubo245 commented on SPARK-18435: - It has been fixed in the latest code: https://github.com/apache/spark/blob/master/docs/programming-guide.md > The requested URL /docs/latest/api/scala/org/apache/spark/AccumulatorV2.html > was not found on this server. > -- > > Key: SPARK-18435 > URL: https://issues.apache.org/jira/browse/SPARK-18435 > Project: Spark > Issue Type: Bug > Components: Documentation >Affects Versions: 2.0.1 > Environment: spark-2.0.1 >Reporter: xubo245 > > When requesting the AccumulatorV2 link in > http://spark.apache.org/docs/latest/programming-guide.html, there is an error: > The requested URL /docs/latest/api/scala/org/apache/spark/AccumulatorV2.html > was not found on this server. > The link is broken. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-18435) The requested URL /docs/latest/api/scala/org/apache/spark/AccumulatorV2.html was not found on this server.
xubo245 created SPARK-18435: --- Summary: The requested URL /docs/latest/api/scala/org/apache/spark/AccumulatorV2.html was not found on this server. Key: SPARK-18435 URL: https://issues.apache.org/jira/browse/SPARK-18435 Project: Spark Issue Type: Bug Components: Documentation Affects Versions: 2.0.1 Environment: spark-2.0.1 Reporter: xubo245 When requesting the AccumulatorV2 link in http://spark.apache.org/docs/latest/programming-guide.html, there is an error: The requested URL /docs/latest/api/scala/org/apache/spark/AccumulatorV2.html was not found on this server. The link is broken. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-18275) Why not use an ordered queue in takeOrdered?
xubo245 created SPARK-18275: --- Summary: Why not use an ordered queue in takeOrdered? Key: SPARK-18275 URL: https://issues.apache.org/jira/browse/SPARK-18275 Project: Spark Issue Type: Question Components: Spark Core Affects Versions: 2.0.1 Reporter: xubo245 Priority: Minor Every partition in mapRDDs is represented by a BoundedPriorityQueue object: val queue = new BoundedPriorityQueue[T](num)(ord.reverse) in org.apache.spark.rdd.RDD#takeOrdered. After mapRDDs.reduce, only one queue is returned, and it is a BoundedPriorityQueue; so after toArray, is it necessary to call sorted in takeOrdered? If we keep the queue ordered, we only need a reverse. Similarly, in org.apache.spark.util.collection.Utils#takeOrdered, the leastOf method also uses an unordered buffer; why not use an ordered queue? We can insert an element in O(log k), while the traditional quickselect algorithm takes O(k) time, and we would not need a sort after selecting, saving O(k * log k). The leastOf called here is com.google.common.collect.Ordering#leastOf(java.util.Iterator, int). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
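To make the trade-off concrete, here is a self-contained sketch using scala.collection.mutable.PriorityQueue as a stand-in for Spark's internal BoundedPriorityQueue (the function name and structure are illustrative, not Spark's actual implementation):

{code:java}
import scala.collection.mutable.PriorityQueue

def takeOrderedSketch[T](items: Iterator[T], k: Int)(implicit ord: Ordering[T]): Seq[T] = {
  if (k <= 0) return Seq.empty
  // Max-heap on ord: the head is the largest of the k smallest seen so far,
  // so each insert or replace costs O(log k).
  val heap = PriorityQueue.empty[T](ord)
  items.foreach { x =>
    if (heap.size < k) heap.enqueue(x)
    else if (ord.lt(x, heap.head)) { heap.dequeue(); heap.enqueue(x) }
  }
  // dequeueAll yields largest-first, so one reverse gives ascending order;
  // no separate select-then-sort pass is needed.
  heap.dequeueAll.reverse
}

// Example: takeOrderedSketch(Iterator(5, 1, 4, 2, 3), 3) == Seq(1, 2, 3)
{code}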
[jira] [Commented] (SPARK-15575) Remove breeze from dependencies?
[ https://issues.apache.org/jira/browse/SPARK-15575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410923#comment-15410923 ] xubo245 commented on SPARK-15575: - If we remove the Breeze dependency, do we need to rewrite a similar project? I think we can build an mllib linalg project which depends on Breeze or another library. We can also update that project if Breeze cannot support Scala 2.12. > Remove breeze from dependencies? > > > Key: SPARK-15575 > URL: https://issues.apache.org/jira/browse/SPARK-15575 > Project: Spark > Issue Type: Improvement > Components: ML >Reporter: Joseph K. Bradley > > This JIRA is for discussing whether we should remove Breeze from the > dependencies of MLlib. The main issues with Breeze are Scala 2.12 support > and performance issues. > There are a few paths: > # Keep dependency. This could be OK, especially if the Scala version issues > are fixed within Breeze. > # Remove dependency > ## Implement our own linear algebra operators as needed > ## Design a way to build Spark using custom linalg libraries of the user's > choice. E.g., you could build MLlib using Breeze, or any other library > supporting the required operations. This might require significant work. > See [SPARK-6442] for related discussion. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
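A minimal sketch of the second path the issue describes: hide the linear algebra backend behind a small trait so MLlib code does not reference Breeze types directly. All names here are hypothetical, not Spark or Breeze proposals:

{code:java}
trait LinalgBackend {
  def dot(x: Array[Double], y: Array[Double]): Double
  def axpy(a: Double, x: Array[Double], y: Array[Double]): Unit // y += a * x
}

// One possible implementation, delegating to Breeze; swapping in another
// library would only require another LinalgBackend instance.
object BreezeBackend extends LinalgBackend {
  import breeze.linalg.DenseVector
  def dot(x: Array[Double], y: Array[Double]): Double =
    DenseVector(x) dot DenseVector(y)
  def axpy(a: Double, x: Array[Double], y: Array[Double]): Unit =
    breeze.linalg.axpy(a, DenseVector(x), DenseVector(y))
}
{code}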