[GitHub] [spark] dongjoon-hyun commented on pull request #36863: [SPARK-39459][CORE] `local*HostName*` methods should support `IPv6`

2022-06-13 Thread GitBox
dongjoon-hyun commented on PR #36863: URL: https://github.com/apache/spark/pull/36863#issuecomment-1154793223 Just FYI, `scala.io.Source.fromURL` seems to be unable to support IPv6 for now. I'm digging that part. -- This is an automated message from the Apache Git Service. To respond to

[GitHub] [spark] cxzl25 commented on a diff in pull request #36740: [SPARK-39355][SQL] Avoid UnresolvedAttribute.apply throwing ParseException

2022-06-13 Thread GitBox
cxzl25 commented on code in PR #36740: URL: https://github.com/apache/spark/pull/36740#discussion_r896444180 ## sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala: ## @@ -2176,4 +2176,32 @@ class SubquerySuite extends QueryTest with SharedSparkSession with Adaptiv

[GitHub] [spark] dongjoon-hyun commented on pull request #36863: [SPARK-39459][CORE] `local*HostName*` methods should support `IPv6`

2022-06-13 Thread GitBox
dongjoon-hyun commented on PR #36863: URL: https://github.com/apache/spark/pull/36863#issuecomment-1154791746 They are totally independent of this patch. I'm working on them, too, @LuciferYang. :)

[GitHub] [spark] LuciferYang commented on pull request #36863: [SPARK-39459][CORE] `local*HostName*` methods should support `IPv6`

2022-06-13 Thread GitBox
LuciferYang commented on PR #36863: URL: https://github.com/apache/spark/pull/36863#issuecomment-1154784120 - RocksDBBackendHistoryServerSuite ``` - application list json *** FAILED *** java.net.ConnectException: Connection refused (Connection refused) at java.net.PlainSoc

[GitHub] [spark] LuciferYang commented on pull request #36863: [SPARK-39459][CORE] `local*HostName*` methods should support `IPv6`

2022-06-13 Thread GitBox
LuciferYang commented on PR #36863: URL: https://github.com/apache/spark/pull/36863#issuecomment-1154782012 I found other failed cases, like - UISeleniumSuite ``` - effects of unpersist() / persist() should be reflected *** FAILED *** java.net.ConnectException: Connecti

[GitHub] [spark] MaxGekk commented on a diff in pull request #36855: [SPARK-39432][SQL] element_at(*, 0) does not return INVALID_ARRAY_INDEX_IN_ELEMENT_AT

2022-06-13 Thread GitBox
MaxGekk commented on code in PR #36855: URL: https://github.com/apache/spark/pull/36855#discussion_r896426977 ## sql/core/src/test/resources/sql-tests/results/ansi/array.sql.out: ## @@ -191,8 +191,8 @@ select element_at(array(1, 2, 3), 0) -- !query schema struct<> -- !query o

[GitHub] [spark] dongjoon-hyun commented on pull request #36864: [SPARK-39463][CORE][TESTS] Use `UUID` for test database location in `JavaJdbcRDDSuite`

2022-06-13 Thread GitBox
dongjoon-hyun commented on PR #36864: URL: https://github.com/apache/spark/pull/36864#issuecomment-1154773786 Thank you, @huaxingao .

[GitHub] [spark] beliefer commented on pull request #36726: [SPARK-39339][SQL] Support TimestampNTZ type in JDBC data source

2022-06-13 Thread GitBox
beliefer commented on PR #36726: URL: https://github.com/apache/spark/pull/36726#issuecomment-1154758012 @sadikovi You can run the test case I added above.

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-06-13 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r896408996 ## core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala: ## @@ -1069,25 +1084,56 @@ private[spark] class TaskSetManager( * Check if the task associate

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-06-13 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r896407476 ## core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala: ## @@ -1069,25 +1084,56 @@ private[spark] class TaskSetManager( * Check if the task associate

[GitHub] [spark] sadikovi commented on pull request #36726: [SPARK-39339][SQL] Support TimestampNTZ type in JDBC data source

2022-06-13 Thread GitBox
sadikovi commented on PR #36726: URL: https://github.com/apache/spark/pull/36726#issuecomment-1154752252 @beliefer Maybe we can address your concerns in the follow-up work, what do you think? We can open a follow-up ticket and try to polish the implementation - it is not perfect by any mean

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-06-13 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r896402450 ## core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala: ## @@ -1069,25 +1084,56 @@ private[spark] class TaskSetManager( * Check if the task associate

[GitHub] [spark] sadikovi commented on pull request #36726: [SPARK-39339][SQL] Support TimestampNTZ type in JDBC data source

2022-06-13 Thread GitBox
sadikovi commented on PR #36726: URL: https://github.com/apache/spark/pull/36726#issuecomment-1154747718 I think we can. JDBC dialects can configure how they map TimestampNTZ type. In the case you mentioned, both timestamps will be read as timestamp_ntz in MySQL and Postgres. In fact, the

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-06-13 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r896400772 ## core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala: ## @@ -1069,25 +1084,56 @@ private[spark] class TaskSetManager( * Check if the task associate

[GitHub] [spark] ivoson commented on pull request #36716: [SPARK-39062][CORE] Add stage level resource scheduling support for standalone cluster

2022-06-13 Thread GitBox
ivoson commented on PR #36716: URL: https://github.com/apache/spark/pull/36716#issuecomment-1154747061 > I kind of disagree because it doesn't work as expected compared to other resource manager. This to me is very confusing. I kind of hate to add more features on what I would consider a br

[GitHub] [spark] beliefer commented on pull request #36726: [SPARK-39339][SQL] Support TimestampNTZ type in JDBC data source

2022-06-13 Thread GitBox
beliefer commented on PR #36726: URL: https://github.com/apache/spark/pull/36726#issuecomment-1154742731 > I updated the test case as you suggested and it passes on my machine. Can you share the error message? It also passed the build. ``` == Results == !== Correct Answer - 1 ==

[GitHub] [spark] beliefer commented on pull request #36726: [SPARK-39339][SQL] Support TimestampNTZ type in JDBC data source

2022-06-13 Thread GitBox
beliefer commented on PR #36726: URL: https://github.com/apache/spark/pull/36726#issuecomment-1154742475 I think we can't support timestamp ntz with the option. We should let the JDBC dialect decide how to support timestamp ntz. If one table has ts1 as timestamp and ts2 as timestamp n

[GitHub] [spark] MaxGekk closed pull request #36852: [SPARK-38700][SQL][3.3] Use error classes in the execution errors of save mode

2022-06-13 Thread GitBox
MaxGekk closed pull request #36852: [SPARK-38700][SQL][3.3] Use error classes in the execution errors of save mode URL: https://github.com/apache/spark/pull/36852

[GitHub] [spark] sadikovi commented on pull request #36726: [SPARK-39339][SQL] Support TimestampNTZ type in JDBC data source

2022-06-13 Thread GitBox
sadikovi commented on PR #36726: URL: https://github.com/apache/spark/pull/36726#issuecomment-1154740681 I updated the test case as you suggested and it passes on my machine. Can you share the error message? It also passed the build.

[GitHub] [spark] MaxGekk commented on pull request #36852: [SPARK-38700][SQL][3.3] Use error classes in the execution errors of save mode

2022-06-13 Thread GitBox
MaxGekk commented on PR #36852: URL: https://github.com/apache/spark/pull/36852#issuecomment-1154740365 +1, LGTM. Merging to 3.3. Thank you, @panbingkun and @dongjoon-hyun for review.

[GitHub] [spark] gengliangwang commented on pull request #36726: [SPARK-39339][SQL] Support TimestampNTZ type in JDBC data source

2022-06-13 Thread GitBox
gengliangwang commented on PR #36726: URL: https://github.com/apache/spark/pull/36726#issuecomment-1154737984 @beliefer Your test case is testing ORC, while this PR is about JDBC...

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #36864: [SPARK-39463][CORE][TESTS] Use `UUID` for test database location in `JavaJdbcRDDSuite`

2022-06-13 Thread GitBox
dongjoon-hyun commented on code in PR #36864: URL: https://github.com/apache/spark/pull/36864#discussion_r896391807 ## core/src/test/java/org/apache/spark/JavaJdbcRDDSuite.java: ## @@ -32,6 +33,7 @@ import org.junit.Test; public class JavaJdbcRDDSuite implements Serializable

[GitHub] [spark] beliefer commented on pull request #36726: [SPARK-39339][SQL] Support TimestampNTZ type in JDBC data source

2022-06-13 Thread GitBox
beliefer commented on PR #36726: URL: https://github.com/apache/spark/pull/36726#issuecomment-1154735759 > Thanks, merging to master I updated this test case and it will fail! ``` test("SPARK-37463: read/write Timestamp ntz to Orc with different time zone") { DateTimeTe

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #36864: [SPARK-39463][CORE][TESTS] Use `UUID` for test database location in `JavaJdbcRDDSuite`

2022-06-13 Thread GitBox
dongjoon-hyun commented on code in PR #36864: URL: https://github.com/apache/spark/pull/36864#discussion_r896389409 ## core/src/test/java/org/apache/spark/JavaJdbcRDDSuite.java: ## @@ -32,6 +33,7 @@ import org.junit.Test; public class JavaJdbcRDDSuite implements Serializable

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36864: [SPARK-39463][CORE][TESTS] Use `UUID` for test database location in `JavaJdbcRDDSuite`

2022-06-13 Thread GitBox
HyukjinKwon commented on code in PR #36864: URL: https://github.com/apache/spark/pull/36864#discussion_r896378643 ## core/src/test/java/org/apache/spark/JavaJdbcRDDSuite.java: ## @@ -32,6 +33,7 @@ import org.junit.Test; public class JavaJdbcRDDSuite implements Serializable {

[GitHub] [spark] Ngone51 commented on pull request #36716: [SPARK-39062][CORE] Add stage level resource scheduling support for standalone cluster

2022-06-13 Thread GitBox
Ngone51 commented on PR #36716: URL: https://github.com/apache/spark/pull/36716#issuecomment-1154714325 @tgravescs Thanks for your feedback. I agree it's a kind of dirty way to build new features upon features with known issues, which hides the issues even deeper. I'm also +1 on t

[GitHub] [spark] AmplabJenkins commented on pull request #36855: [SPARK-39432][SQL] element_at(*, 0) does not return INVALID_ARRAY_INDEX_IN_ELEMENT_AT

2022-06-13 Thread GitBox
AmplabJenkins commented on PR #36855: URL: https://github.com/apache/spark/pull/36855#issuecomment-1154710152 Can one of the admins verify this patch?

[GitHub] [spark] ulysses-you commented on a diff in pull request #36856: [SPARK-39455][SQL] Improve expression non-codegen code path performance by cache data type matching

2022-06-13 Thread GitBox
ulysses-you commented on code in PR #36856: URL: https://github.com/apache/spark/pull/36856#discussion_r896371376 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -53,6 +53,17 @@ case class UnaryMinus( override def toString: Str

[GitHub] [spark] gengliangwang closed pull request #36726: [SPARK-39339][SQL] Support TimestampNTZ type in JDBC data source

2022-06-13 Thread GitBox
gengliangwang closed pull request #36726: [SPARK-39339][SQL] Support TimestampNTZ type in JDBC data source URL: https://github.com/apache/spark/pull/36726

[GitHub] [spark] gengliangwang commented on pull request #36726: [SPARK-39339][SQL] Support TimestampNTZ type in JDBC data source

2022-06-13 Thread GitBox
gengliangwang commented on PR #36726: URL: https://github.com/apache/spark/pull/36726#issuecomment-1154692745 Thanks, merging to master

[GitHub] [spark] dongjoon-hyun opened a new pull request, #36864: [SPARK-39463][CORE][TESTS] Use `UUID` for test database location in `JavaJdbcRDDSuite`

2022-06-13 Thread GitBox
dongjoon-hyun opened a new pull request, #36864: URL: https://github.com/apache/spark/pull/36864 ### What changes were proposed in this pull request? This PR aims to use UUID instead of a fixed test database location in `JavaJdbcRDDSuite`. ### Why are the changes needed?

[GitHub] [spark] LuciferYang commented on pull request #36856: [SPARK-39455][SQL] Improve expression non-codegen code path performance by cache data type matching

2022-06-13 Thread GitBox
LuciferYang commented on PR #36856: URL: https://github.com/apache/spark/pull/36856#issuecomment-1154677592 Are there any other similar cases?

[GitHub] [spark] dongjoon-hyun commented on pull request #36863: [SPARK-39459][CORE] `localHostName*` methods should support `IPv6`

2022-06-13 Thread GitBox
dongjoon-hyun commented on PR #36863: URL: https://github.com/apache/spark/pull/36863#issuecomment-1154672837 cc @dbtsai

[GitHub] [spark] srowen commented on a diff in pull request #36856: [SPARK-39455][SQL] Improve expression non-codegen code path performance by cache data type matching

2022-06-13 Thread GitBox
srowen commented on code in PR #36856: URL: https://github.com/apache/spark/pull/36856#discussion_r896341536 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -53,6 +53,17 @@ case class UnaryMinus( override def toString: String =

[GitHub] [spark] dongjoon-hyun opened a new pull request, #36863: [SPARK-39459][CORE] localHostName* methods should support IPv6

2022-06-13 Thread GitBox
dongjoon-hyun opened a new pull request, #36863: URL: https://github.com/apache/spark/pull/36863 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### H

[GitHub] [spark] ulysses-you commented on a diff in pull request #36856: [SPARK-39455][SQL] Improve expression non-codegen code path performance by cache data type matching

2022-06-13 Thread GitBox
ulysses-you commented on code in PR #36856: URL: https://github.com/apache/spark/pull/36856#discussion_r89692 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -53,6 +53,17 @@ case class UnaryMinus( override def toString: Str

[GitHub] [spark] ulysses-you commented on a diff in pull request #36856: [SPARK-39455][SQL] Improve expression non-codegen code path performance by cache data type matching

2022-06-13 Thread GitBox
ulysses-you commented on code in PR #36856: URL: https://github.com/apache/spark/pull/36856#discussion_r89692 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -53,6 +53,17 @@ case class UnaryMinus( override def toString: Str

[GitHub] [spark] dongjoon-hyun closed pull request #36862: [SPARK-39461][INFRA] Print `SPARK_LOCAL_(HOSTNAME|IP)` in `build/(mvn|sbt)`

2022-06-13 Thread GitBox
dongjoon-hyun closed pull request #36862: [SPARK-39461][INFRA] Print `SPARK_LOCAL_(HOSTNAME|IP)` in `build/(mvn|sbt)` URL: https://github.com/apache/spark/pull/36862

[GitHub] [spark] dongjoon-hyun commented on pull request #36862: [SPARK-39461][INFRA] Print `SPARK_LOCAL_(HOSTNAME|IP)` in `build/(mvn|sbt)`

2022-06-13 Thread GitBox
dongjoon-hyun commented on PR #36862: URL: https://github.com/apache/spark/pull/36862#issuecomment-1154655021 Thank you so much, @sunchao ! Merged to master.

[GitHub] [spark] mridulm commented on pull request #35683: [SPARK-30835][SPARK-39018][CORE][YARN] Add support for YARN decommissioning when ESS is disabled

2022-06-13 Thread GitBox
mridulm commented on PR #35683: URL: https://github.com/apache/spark/pull/35683#issuecomment-1154652812 Merged to master. Thanks for working on this @abhishekd0907 ! Thanks for the review @attilapiros :-)

[GitHub] [spark] mridulm closed pull request #35683: [SPARK-30835][SPARK-39018][CORE][YARN] Add support for YARN decommissioning when ESS is disabled

2022-06-13 Thread GitBox
mridulm closed pull request #35683: [SPARK-30835][SPARK-39018][CORE][YARN] Add support for YARN decommissioning when ESS is disabled URL: https://github.com/apache/spark/pull/35683

[GitHub] [spark] srowen commented on a diff in pull request #36856: [SPARK-39455][SQL] Improve expression non-codegen code path performance by cache data type matching

2022-06-13 Thread GitBox
srowen commented on code in PR #36856: URL: https://github.com/apache/spark/pull/36856#discussion_r896325472 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -53,6 +53,17 @@ case class UnaryMinus( override def toString: String =

[GitHub] [spark] beliefer commented on pull request #36830: [SPARK-39453][SQL] DS V2 supports push down misc non-aggregate functions(non ANSI)

2022-06-13 Thread GitBox
beliefer commented on PR #36830: URL: https://github.com/apache/spark/pull/36830#issuecomment-1154641204 > So the statement here [#36039 (comment)](https://github.com/apache/spark/pull/36039#issuecomment-1089567136) is not true any more, we will push down both ANSI functions and commonly us

[GitHub] [spark] huaxingao commented on pull request #36830: [SPARK-39453][SQL] DS V2 supports push down misc non-aggregate functions(non ANSI)

2022-06-13 Thread GitBox
huaxingao commented on PR #36830: URL: https://github.com/apache/spark/pull/36830#issuecomment-1154634939 So the statement here https://github.com/apache/spark/pull/36039#issuecomment-1089567136 is not true any more, we will push down both ANSI functions and commonly used non-ANSI function

[GitHub] [spark] AngersZhuuuu commented on pull request #36564: [SPARK-39195][SQL] Spark OutputCommitCoordinator should abort stage when committed file not consistent with task status

2022-06-13 Thread GitBox
AngersZhuuuu commented on PR #36564: URL: https://github.com/apache/spark/pull/36564#issuecomment-1154633468 > LGTM if tests pass GA failed not related to this PR ``` __w/spark/spark/docs/_plugins/copy_api_dirs.rb:130:in `': Python doc generation failed (RuntimeError) 202

[GitHub] [spark] AngersZhuuuu commented on a diff in pull request #36564: [SPARK-39195][SQL] Spark OutputCommitCoordinator should abort stage when committed file not consistent with task status

2022-06-13 Thread GitBox
AngersZhuuuu commented on code in PR #36564: URL: https://github.com/apache/spark/pull/36564#discussion_r896311302 ## core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala: ## @@ -270,6 +263,16 @@ class OutputCommitCoordinatorSuite extends SparkFunSui

[GitHub] [spark] ulysses-you commented on a diff in pull request #36785: [SPARK-39397][SQL] Relax AliasAwareOutputExpression to support alias with expression

2022-06-13 Thread GitBox
ulysses-you commented on code in PR #36785: URL: https://github.com/apache/spark/pull/36785#discussion_r896303277 ## sql/core/src/main/scala/org/apache/spark/sql/execution/AliasAwareOutputExpression.scala: ## @@ -25,15 +25,15 @@ import org.apache.spark.sql.catalyst.plans.physic

[GitHub] [spark] ulysses-you commented on a diff in pull request #36856: [SPARK-39455][SQL] Improve expression non-codegen code path performance by cache data type matching

2022-06-13 Thread GitBox
ulysses-you commented on code in PR #36856: URL: https://github.com/apache/spark/pull/36856#discussion_r896301945 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -53,6 +53,17 @@ case class UnaryMinus( override def toString: Str

[GitHub] [spark] xiuzhu9527 commented on pull request #36784: [SPARK-39396][SQL] Fix LDAP login exception 'error code 49 - invalid credentials'

2022-06-13 Thread GitBox
xiuzhu9527 commented on PR #36784: URL: https://github.com/apache/spark/pull/36784#issuecomment-1154621166 @wangyum @pan3793 Hi, what should we do next? I am now very confused about whether this problem is allowed to be fixed.

[GitHub] [spark] AmplabJenkins commented on pull request #36859: DTW: new distance measure for clustering

2022-06-13 Thread GitBox
AmplabJenkins commented on PR #36859: URL: https://github.com/apache/spark/pull/36859#issuecomment-1154620125 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins commented on pull request #36861: [SPARK-38796][SQL] Update to_number and try_to_number functions to allow PR with positive numbers

2022-06-13 Thread GitBox
AmplabJenkins commented on PR #36861: URL: https://github.com/apache/spark/pull/36861#issuecomment-1154620110 Can one of the admins verify this patch?

[GitHub] [spark] cloud-fan commented on a diff in pull request #36740: [SPARK-39355][SQL] Avoid UnresolvedAttribute.apply throwing ParseException

2022-06-13 Thread GitBox
cloud-fan commented on code in PR #36740: URL: https://github.com/apache/spark/pull/36740#discussion_r896290278 ## sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala: ## @@ -2176,4 +2176,32 @@ class SubquerySuite extends QueryTest with SharedSparkSession with Adap

[GitHub] [spark] cloud-fan commented on a diff in pull request #36641: [SPARK-39263][SQL] Make GetTable, TableExists and DatabaseExists be compatible with 3 layer namespace

2022-06-13 Thread GitBox
cloud-fan commented on code in PR #36641: URL: https://github.com/apache/spark/pull/36641#discussion_r896288305 ## sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala: ## @@ -287,16 +294,44 @@ class CatalogImpl(sparkSession: SparkSession) extends Catalog {

[GitHub] [spark] panbingkun commented on pull request #36852: [SPARK-38700][SQL][3.3] Use error classes in the execution errors of save mode

2022-06-13 Thread GitBox
panbingkun commented on PR #36852: URL: https://github.com/apache/spark/pull/36852#issuecomment-1154604668 > @panbingkun Could you fix the test failure, please: > > ``` > QueryExecutionErrorsSuite.UNSUPPORTED_SAVE_MODE: unsupported null saveMode whether the path exists or not >

[GitHub] [spark] zhouyejoe commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-13 Thread GitBox
zhouyejoe commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r896287031 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -317,22 +353,24 @@ public void applicationRemoved(String a

[GitHub] [spark] zhouyejoe commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-13 Thread GitBox
zhouyejoe commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r896286905 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -655,6 +742,206 @@ public void registerExecutor(String app

[GitHub] [spark] cloud-fan commented on a diff in pull request #36641: [SPARK-39263][SQL] Make GetTable, TableExists and DatabaseExists be compatible with 3 layer namespace

2022-06-13 Thread GitBox
cloud-fan commented on code in PR #36641: URL: https://github.com/apache/spark/pull/36641#discussion_r896286734 ## sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala: ## @@ -250,8 +251,18 @@ class CatalogImpl(sparkSession: SparkSession) extends Catalog {

[GitHub] [spark] zhouyejoe commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-13 Thread GitBox
zhouyejoe commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r896286777 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -643,8 +725,13 @@ public void registerExecutor(String appI

[GitHub] [spark] zhouyejoe commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-13 Thread GitBox
zhouyejoe commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r896286599 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -342,6 +380,33 @@ void closeAndDeletePartitionFilesIfNeede

[GitHub] [spark] cloud-fan commented on a diff in pull request #36641: [SPARK-39263][SQL] Make GetTable, TableExists and DatabaseExists be compatible with 3 layer namespace

2022-06-13 Thread GitBox
cloud-fan commented on code in PR #36641: URL: https://github.com/apache/spark/pull/36641#discussion_r896286460 ## sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala: ## @@ -250,8 +251,18 @@ class CatalogImpl(sparkSession: SparkSession) extends Catalog {

[GitHub] [spark] zhouyejoe commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-13 Thread GitBox
zhouyejoe commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r896286443 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -655,6 +742,206 @@ public void registerExecutor(String app

[GitHub] [spark] zhouyejoe commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-13 Thread GitBox
zhouyejoe commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r896286390 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -655,6 +742,206 @@ public void registerExecutor(String app

[GitHub] [spark] cloud-fan commented on pull request #36861: [SPARK-38796][SQL] Update to_number and try_to_number functions to allow PR with positive numbers

2022-06-13 Thread GitBox
cloud-fan commented on PR #36861: URL: https://github.com/apache/spark/pull/36861#issuecomment-1154596720 @dtenedor for bug fix, we should create a new JIRA ticket instead of reusing the original one...

[GitHub] [spark] cloud-fan closed pull request #36861: [SPARK-38796][SQL] Update to_number and try_to_number functions to allow PR with positive numbers

2022-06-13 Thread GitBox
cloud-fan closed pull request #36861: [SPARK-38796][SQL] Update to_number and try_to_number functions to allow PR with positive numbers URL: https://github.com/apache/spark/pull/36861

[GitHub] [spark] cloud-fan commented on pull request #36861: [SPARK-38796][SQL] Update to_number and try_to_number functions to allow PR with positive numbers

2022-06-13 Thread GitBox
cloud-fan commented on PR #36861: URL: https://github.com/apache/spark/pull/36861#issuecomment-1154596074 thanks, merging to master/3.3!

[GitHub] [spark] zhouyejoe commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-13 Thread GitBox
zhouyejoe commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r896281336 ## common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java: ## @@ -230,11 +241,14 @@ protected void serviceInit(Configuration externalC

[GitHub] [spark] zhouyejoe commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-13 Thread GitBox
zhouyejoe commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r896279364 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -350,15 +415,27 @@ void closeAndDeletePartitionFilesIfNeed

[GitHub] [spark] zhouyejoe commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-13 Thread GitBox
zhouyejoe commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r896274975 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -317,22 +353,24 @@ public void applicationRemoved(String a

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36841: [SPARK-39444][SQL] Add OptimizeSubqueries into nonExcludableRules list

2022-06-13 Thread GitBox
HyukjinKwon commented on code in PR #36841: URL: https://github.com/apache/spark/pull/36841#discussion_r896274952 ## sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala: ## @@ -4456,6 +4456,24 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with Ad

[GitHub] [spark] srowen closed pull request #36273: [SPARK-38960][CORE]Spark should fail fast if initial memory too large…

2022-06-13 Thread GitBox
srowen closed pull request #36273: [SPARK-38960][CORE]Spark should fail fast if initial memory too large… URL: https://github.com/apache/spark/pull/36273

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36841: [SPARK-39444][SQL] Add OptimizeSubqueries into nonExcludableRules list

2022-06-13 Thread GitBox
HyukjinKwon commented on code in PR #36841: URL: https://github.com/apache/spark/pull/36841#discussion_r896274142 ## sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala: ## @@ -4456,6 +4456,24 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with Ad

[GitHub] [spark] github-actions[bot] closed pull request #35256: [SPARK-37933][SQL] Limit push down for parquet vectorized reader

2022-06-13 Thread GitBox
github-actions[bot] closed pull request #35256: [SPARK-37933][SQL] Limit push down for parquet vectorized reader URL: https://github.com/apache/spark/pull/35256

[GitHub] [spark] github-actions[bot] closed pull request #35719: [SPARK-38401][SQL][CORE] Unify get preferred locations for shuffle in AQE

2022-06-13 Thread GitBox
github-actions[bot] closed pull request #35719: [SPARK-38401][SQL][CORE] Unify get preferred locations for shuffle in AQE URL: https://github.com/apache/spark/pull/35719

[GitHub] [spark] HyukjinKwon commented on pull request #36729: [SPARK-39295][PYTHON][DOCS] Improve documentation of pandas API support list

2022-06-13 Thread GitBox
HyukjinKwon commented on PR #36729: URL: https://github.com/apache/spark/pull/36729#issuecomment-1154570503 Sure, thanks for tracking this.

[GitHub] [spark] cloud-fan commented on pull request #36837: [SPARK-39441][SQL] Speed up DeduplicateRelations

2022-06-13 Thread GitBox
cloud-fan commented on PR #36837: URL: https://github.com/apache/spark/pull/36837#issuecomment-1154563272 thanks, merging to master!

[GitHub] [spark] cloud-fan closed pull request #36837: [SPARK-39441][SQL] Speed up DeduplicateRelations

2022-06-13 Thread GitBox
cloud-fan closed pull request #36837: [SPARK-39441][SQL] Speed up DeduplicateRelations URL: https://github.com/apache/spark/pull/36837

[GitHub] [spark] dongjoon-hyun commented on pull request #36862: [SPARK-39461][INFRA] Print `SPARK_LOCAL_(HOSTNAME|IP)` in `build/(mvn|sbt)`

2022-06-13 Thread GitBox
dongjoon-hyun commented on PR #36862: URL: https://github.com/apache/spark/pull/36862#issuecomment-1154557430 Hi, @sunchao . Could you review this when you have some time?

[GitHub] [spark] dongjoon-hyun closed pull request #36860: [SPARK-39460][CORE][TESTS] Fix `CoarseGrainedSchedulerBackendSuite` to handle fast allocations

2022-06-13 Thread GitBox
dongjoon-hyun closed pull request #36860: [SPARK-39460][CORE][TESTS] Fix `CoarseGrainedSchedulerBackendSuite` to handle fast allocations URL: https://github.com/apache/spark/pull/36860

[GitHub] [spark] dongjoon-hyun commented on pull request #36860: [SPARK-39460][CORE][TESTS] Fix `CoarseGrainedSchedulerBackendSuite` to handle fast allocations

2022-06-13 Thread GitBox
dongjoon-hyun commented on PR #36860: URL: https://github.com/apache/spark/pull/36860#issuecomment-1154551586 Thank you so much, @huaxingao !

[GitHub] [spark] huaxingao commented on pull request #36860: [SPARK-39460][CORE][TESTS] Fix `CoarseGrainedSchedulerBackendSuite` to handle fast allocations

2022-06-13 Thread GitBox
huaxingao commented on PR #36860: URL: https://github.com/apache/spark/pull/36860#issuecomment-1154541511 LGTM. Thanks for pinging me!

[GitHub] [spark] srowen commented on a diff in pull request #36843: [SPARK-39446][MLLIB] Add relevance score for nDCG evaluation

2022-06-13 Thread GitBox
srowen commented on code in PR #36843: URL: https://github.com/apache/spark/pull/36843#discussion_r896243825 ## mllib/src/main/scala/org/apache/spark/mllib/evaluation/RankingMetrics.scala: ## @@ -35,8 +35,16 @@ import org.apache.spark.rdd.RDD * @param predictionAndLabels an RD

[GitHub] [spark] allisonwang-db commented on pull request #36837: [SPARK-39441][SQL] Speed up DeduplicateRelations

2022-06-13 Thread GitBox
allisonwang-db commented on PR #36837: URL: https://github.com/apache/spark/pull/36837#issuecomment-1154536909 cc @cloud-fan

[GitHub] [spark] allisonwang-db commented on pull request #36837: [SPARK-39441][SQL] Speed up DeduplicateRelations

2022-06-13 Thread GitBox
allisonwang-db commented on PR #36837: URL: https://github.com/apache/spark/pull/36837#issuecomment-1154536858 @LuciferYang I am running on M1 as well. Indeed, the runtime of the TPCDSQuerySuite can vary across runs.

[GitHub] [spark] srowen closed pull request #35290: [SPARK-37865][SQL][3.0]Fix union bug when the first child of union has duplicate columns

2022-06-13 Thread GitBox
srowen closed pull request #35290: [SPARK-37865][SQL][3.0]Fix union bug when the first child of union has duplicate columns URL: https://github.com/apache/spark/pull/35290

[GitHub] [spark] srowen closed pull request #36334: Refactor tests

2022-06-13 Thread GitBox
srowen closed pull request #36334: Refactor tests URL: https://github.com/apache/spark/pull/36334

[GitHub] [spark] dongjoon-hyun opened a new pull request, #36862: [SPARK-39461][INFRA] Print `SPARK_LOCAL_(HOSTNAME|IP)` in `build/{mvn|sbt}`

2022-06-13 Thread GitBox
dongjoon-hyun opened a new pull request, #36862: URL: https://github.com/apache/spark/pull/36862 ### What changes were proposed in this pull request? This PR aims to print `SPARK_LOCAL_(HOSTNAME|IP)` during building and testing at `build/{mvn|sbt}`. ### Why are the changes need

[GitHub] [spark] srowen closed pull request #35586: [SPARK-38265][DOCS][CORE] Update comments of ExecutorAllocationClient

2022-06-13 Thread GitBox
srowen closed pull request #35586: [SPARK-38265][DOCS][CORE] Update comments of ExecutorAllocationClient URL: https://github.com/apache/spark/pull/35586

[GitHub] [spark] dongjoon-hyun commented on pull request #36860: [SPARK-39460][CORE][TESTS] Fix `CoarseGrainedSchedulerBackendSuite` to handle fast allocations

2022-06-13 Thread GitBox
dongjoon-hyun commented on PR #36860: URL: https://github.com/apache/spark/pull/36860#issuecomment-1154505057 Hi, @huaxingao . Could you review this test PR?

[GitHub] [spark] dtenedor commented on pull request #36861: [SPARK-38796][SQL] Update to_number and try_to_number functions to allow PR with positive numbers

2022-06-13 Thread GitBox
dtenedor commented on PR #36861: URL: https://github.com/apache/spark/pull/36861#issuecomment-1154503075 Hi @cloud-fan, can you take a look at this when you have time? It is a bug fix for the `to_number` and `try_to_number` functions.

[GitHub] [spark] dtenedor opened a new pull request, #36861: [SPARK-38796][SQL] Update to_number and try_to_number functions to allow PR with positive numbers

2022-06-13 Thread GitBox
dtenedor opened a new pull request, #36861: URL: https://github.com/apache/spark/pull/36861 ### What changes were proposed in this pull request? Update `to_number` and `try_to_number` functions to allow the `PR` format token with input strings comprising positive numbers. Befor

[GitHub] [spark] dongjoon-hyun opened a new pull request, #36860: [SPARK-39460][CORE][TESTS] Fix CoarseGrainedSchedulerBackendSuite to handle fast allocations

2022-06-13 Thread GitBox
dongjoon-hyun opened a new pull request, #36860: URL: https://github.com/apache/spark/pull/36860 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

[GitHub] [spark] amaliujia commented on pull request #36641: [SPARK-39263][SQL] Make GetTable, TableExists and DatabaseExists be compatible with 3 layer namespace

2022-06-13 Thread GitBox
amaliujia commented on PR #36641: URL: https://github.com/apache/spark/pull/36641#issuecomment-1154465576 @cloud-fan comments addressed and there is one that we can discuss.

[GitHub] [spark] amaliujia commented on a diff in pull request #36641: [SPARK-39263][SQL] Make GetTable, TableExists and DatabaseExists be compatible with 3 layer namespace

2022-06-13 Thread GitBox
amaliujia commented on code in PR #36641: URL: https://github.com/apache/spark/pull/36641#discussion_r896166516 ## sql/core/src/test/scala/org/apache/spark/sql/internal/CatalogSuite.scala: ## @@ -681,4 +681,60 @@ class CatalogSuite extends SharedSparkSession with AnalysisTest w

[GitHub] [spark] amaliujia commented on a diff in pull request #36641: [SPARK-39263][SQL] Make GetTable, TableExists and DatabaseExists be compatible with 3 layer namespace

2022-06-13 Thread GitBox
amaliujia commented on code in PR #36641: URL: https://github.com/apache/spark/pull/36641#discussion_r896163957 ## sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala: ## @@ -287,16 +294,44 @@ class CatalogImpl(sparkSession: SparkSession) extends Catalog {

[GitHub] [spark] amaliujia commented on a diff in pull request #36641: [SPARK-39263][SQL] Make GetTable, TableExists and DatabaseExists be compatible with 3 layer namespace

2022-06-13 Thread GitBox
amaliujia commented on code in PR #36641: URL: https://github.com/apache/spark/pull/36641#discussion_r896161200 ## sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala: ## @@ -250,8 +251,14 @@ class CatalogImpl(sparkSession: SparkSession) extends Catalog {

[GitHub] [spark] amaliujia commented on a diff in pull request #36641: [SPARK-39263][SQL] Make GetTable, TableExists and DatabaseExists be compatible with 3 layer namespace

2022-06-13 Thread GitBox
amaliujia commented on code in PR #36641: URL: https://github.com/apache/spark/pull/36641#discussion_r896156559 ## sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala: ## @@ -250,8 +251,14 @@ class CatalogImpl(sparkSession: SparkSession) extends Catalog {

[GitHub] [spark] dongjoon-hyun closed pull request #36858: [SPARK-39458][CORE][TESTS] Fix `UISuite` for IPv6

2022-06-13 Thread GitBox
dongjoon-hyun closed pull request #36858: [SPARK-39458][CORE][TESTS] Fix `UISuite` for IPv6 URL: https://github.com/apache/spark/pull/36858

[GitHub] [spark] dongjoon-hyun commented on pull request #36858: [SPARK-39458][CORE][TESTS] Fix `UISuite` for IPv6

2022-06-13 Thread GitBox
dongjoon-hyun commented on PR #36858: URL: https://github.com/apache/spark/pull/36858#issuecomment-1154394229 Merged to master.

[GitHub] [spark] EnricoMi commented on a diff in pull request #36150: [SPARK-38864][SQL] Add melt / unpivot to Dataset

2022-06-13 Thread GitBox
EnricoMi commented on code in PR #36150: URL: https://github.com/apache/spark/pull/36150#discussion_r893842370 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -1382,6 +1417,12 @@ class Analyzer(override val catalogManager: CatalogMana

[GitHub] [spark] polkadot21 opened a new pull request, #36859: DTW: new distance measure for clustering

2022-06-13 Thread GitBox
polkadot21 opened a new pull request, #36859: URL: https://github.com/apache/spark/pull/36859 ### What changes were proposed in this pull request? In this pull request, I propose a new distance measure for clustering, namely, dynamic time warping. ### Why are the chang
