[GitHub] [spark] sadikovi commented on pull request #36726: [SPARK-39339][SQL] Support TimestampNTZ type in JDBC data source

2022-05-30 Thread GitBox
sadikovi commented on PR #36726: URL: https://github.com/apache/spark/pull/36726#issuecomment-1141738160 @beliefer Can you review this PR from JDBC perspective? I think you have contributed extensively to this part of the code. Also, cc @gengliangwang. -- This is an automated message from

[GitHub] [spark] sadikovi commented on a diff in pull request #36726: [SPARK-39339][SQL] Support TimestampNTZ type in JDBC data source

2022-05-30 Thread GitBox
sadikovi commented on code in PR #36726: URL: https://github.com/apache/spark/pull/36726#discussion_r885267600 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala: ## @@ -472,6 +481,15 @@ object JdbcUtils extends Logging with SQLConfHelper

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-05-30 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r885265232 ## core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala: ## @@ -109,6 +116,13 @@ private[spark] class TaskSetManager( private val executorDecommissionK

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-05-30 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r885264474 ## core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala: ## @@ -80,12 +82,17 @@ private[spark] class TaskSetManager( val copiesRunning = new Array[Int]

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-05-30 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r885262650 ## core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala: ## @@ -103,6 +104,9 @@ private[spark] class TaskSchedulerImpl( // of tasks that are very sh

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-05-30 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r885252781 ## core/src/main/scala/org/apache/spark/internal/config/package.scala: ## @@ -2073,6 +2073,41 @@ package object config { .timeConf(TimeUnit.MILLISECONDS) .

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-05-30 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r885249138 ## core/src/main/scala/org/apache/spark/internal/config/package.scala: ## @@ -2073,6 +2073,41 @@ package object config { .timeConf(TimeUnit.MILLISECONDS) .

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-05-30 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r885247661 ## core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala: ## @@ -1217,6 +1289,61 @@ private[spark] class TaskSetManager( def executorAdded(): Unit = {

[GitHub] [spark] LuciferYang commented on a diff in pull request #36732: [SPARK-39345][CORE][SQL][DSTREAM][ML][MESOS][SS] Replace `filter(!condition)` with `filterNot(condition)`

2022-05-30 Thread GitBox
LuciferYang commented on code in PR #36732: URL: https://github.com/apache/spark/pull/36732#discussion_r885245947 ## sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListener.scala: ## @@ -413,7 +413,7 @@ class SQLAppStatusListener( if (other != exec)

[GitHub] [spark] LuciferYang opened a new pull request, #36732: [SPARK-39345][CORE][SQL][DSTREAM][MLLIB] Replace `filter(!condition)` with `filterNot(condition)`

2022-05-30 Thread GitBox
LuciferYang opened a new pull request, #36732: URL: https://github.com/apache/spark/pull/36732 ### What changes were proposed in this pull request? This PR replaces `filter(!condition)` with `filterNot(condition)`. ### Why are the changes needed? Use the appropriate API.
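For illustration only, the rewrite this PR describes can be sketched on a plain Scala collection. The object name, the sample data and the `isEven` predicate below are hypothetical and are not taken from the Spark code base; the sketch only shows that `filter` with a negated predicate and `filterNot` with the original predicate select the same elements.

```scala
object FilterNotExample {
  // Hypothetical sample data; not taken from the Spark code base.
  val taskIds: Seq[Int] = Seq(1, 2, 3, 4, 5, 6)

  // Hypothetical predicate standing in for `condition`.
  def isEven(n: Int): Boolean = n % 2 == 0

  // Before: negate the predicate inside `filter`.
  val viaFilter: Seq[Int] = taskIds.filter(n => !isEven(n))

  // After: `filterNot` keeps the elements for which the predicate is false.
  val viaFilterNot: Seq[Int] = taskIds.filterNot(isEven)

  def main(args: Array[String]): Unit = {
    // Both lines print the same sequence of odd numbers.
    println(viaFilter)
    println(viaFilterNot)
  }
}
```

Both forms return the same result; `filterNot` simply avoids the explicit negation at the call site.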

[GitHub] [spark] sunchao commented on pull request #36721: [SPARK-39334][BUILD] Exclude `slf4j-reload4j` from `hadoop-minikdc` test dependency

2022-05-30 Thread GitBox
sunchao commented on PR #36721: URL: https://github.com/apache/spark/pull/36721#issuecomment-1141704120 LGTM too, thanks @LuciferYang!

[GitHub] [spark] sunchao commented on a diff in pull request #36697: [SPARK-39313][SQL] `toCatalystOrdering` should fail if V2Expression can not be translated

2022-05-30 Thread GitBox
sunchao commented on code in PR #36697: URL: https://github.com/apache/spark/pull/36697#discussion_r885234555 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanPartitioning.scala: ## @@ -32,15 +32,15 @@ import org.apache.spark.util.collection.Utils.

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885232931 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/decimalExpressions.scala: ## @@ -232,3 +216,33 @@ case class CheckOverflowInSum( override p

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885232378 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -575,7 +707,31 @@ case class Divide( override def symbol: String =

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885231906 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -373,11 +457,24 @@ case class Subtract( override def decimalMetho

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885231224 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -373,11 +457,24 @@ case class Subtract( override def decimalMetho

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885230648 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -521,6 +651,7 @@ trait DivModLike extends BinaryArithmetic {

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885229342 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -490,10 +599,31 @@ trait DivModLike extends BinaryArithmetic {

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885228623 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -208,6 +210,76 @@ case class Abs(child: Expression, failOnError: Boole

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885227041 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -208,6 +210,76 @@ case class Abs(child: Expression, failOnError: Boole

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885226749 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -208,6 +210,76 @@ case class Abs(child: Expression, failOnError: Boole

[GitHub] [spark] weixiuli commented on a diff in pull request #36724: [SPARK-39338][SQL] Remove dynamic pruning subquery if pruningKey's references is empty

2022-05-30 Thread GitBox
weixiuli commented on code in PR #36724: URL: https://github.com/apache/spark/pull/36724#discussion_r885224233 ## sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/CleanupDynamicPruningFilters.scala: ## @@ -54,7 +54,8 @@ object CleanupDynamicPruningFilters ex

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885221860 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -208,6 +210,76 @@ case class Abs(child: Expression, failOnError: Boole

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885220637 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -422,9 +520,20 @@ case class Multiply( override def symbol: String

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885220389 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -373,11 +457,24 @@ case class Subtract( override def decimalMetho

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885220020 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -373,11 +457,24 @@ case class Subtract( override def decimalMetho

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885219727 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -323,11 +390,27 @@ case class Add( override def decimalMethod: St

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885218585 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -208,6 +210,76 @@ case class Abs(child: Expression, failOnError: Boole

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885218377 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -244,8 +311,7 @@ abstract class BinaryArithmetic extends BinaryOperato

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885213632 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala: ## @@ -57,10 +53,13 @@ import org.apache.spark.sql.types._ * - LONG get

[GitHub] [spark] AngersZhuuuu opened a new pull request, #36731: [SPARK-39343][SQL] DescribeTableExec should redact properties

2022-05-30 Thread GitBox
AngersZhuuuu opened a new pull request, #36731: URL: https://github.com/apache/spark/pull/36731 ### What changes were proposed in this pull request? DescribeTableExec should redact properties ### Why are the changes needed? DescribeTableExec should redact properties

[GitHub] [spark] JoshRosen commented on pull request #36680: [SPARK-39283][CORE] Fix deadlock between TaskMemoryManager and UnsafeExternalSorter.SpillableIterator

2022-05-30 Thread GitBox
JoshRosen commented on PR #36680: URL: https://github.com/apache/spark/pull/36680#issuecomment-1141664634 I'll do a final sign off and merge mid-day tomorrow (Tuesday May 31st), since today is a US holiday.

[GitHub] [spark] AngersZhuuuu opened a new pull request, #36730: [SPARK-39342][SQL] ShowTablePropertiesCommand/ShowTablePropertiesExec should redact properties.

2022-05-30 Thread GitBox
AngersZhuuuu opened a new pull request, #36730: URL: https://github.com/apache/spark/pull/36730 ### What changes were proposed in this pull request? ShowTablePropertiesCommand/ShowTablePropertiesExec should redact sensitive properties. ### Why are the changes needed? Show table

[GitHub] [spark] ulysses-you commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
ulysses-you commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885199407 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala: ## @@ -60,7 +56,7 @@ import org.apache.spark.sql.types._ */ // scala

[GitHub] [spark] LuciferYang commented on pull request #36616: [SPARK-39231][SQL] Use `ConstantColumnVector` instead of `On/OffHeapColumnVector` to store partition columns in `VectorizedParquetRecordR

2022-05-30 Thread GitBox
LuciferYang commented on PR #36616: URL: https://github.com/apache/spark/pull/36616#issuecomment-1141663964 cc @wangyum

[GitHub] [spark] cloud-fan commented on pull request #36727: [SPARK-39340][SQL][3.2] DS v2 agg pushdown should allow dots in the name of top-level columns

2022-05-30 Thread GitBox
cloud-fan commented on PR #36727: URL: https://github.com/apache/spark/pull/36727#issuecomment-1141660590 Probably pyspark is broken in branch 3.2, cc @HyukjinKwon

[GitHub] [spark] pan3793 commented on a diff in pull request #36697: [SPARK-39313][SQL] `toCatalystOrdering` should fail if V2Expression can not be translated

2022-05-30 Thread GitBox
pan3793 commented on code in PR #36697: URL: https://github.com/apache/spark/pull/36697#discussion_r885194978 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanPartitioning.scala: ## @@ -32,15 +32,15 @@ import org.apache.spark.util.collection.Utils.

[GitHub] [spark] wangyum commented on a diff in pull request #36625: [SPARK-39203][SQL] Rewrite table location to absolute location based on database location

2022-05-30 Thread GitBox
wangyum commented on code in PR #36625: URL: https://github.com/apache/spark/pull/36625#discussion_r885194597 ## sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala: ## @@ -753,7 +760,10 @@ private[hive] class HiveClientImpl( assert(s.values.fo

[GitHub] [spark] wangyum commented on a diff in pull request #36625: [SPARK-39203][SQL] Rewrite table location to absolute location based on database location

2022-05-30 Thread GitBox
wangyum commented on code in PR #36625: URL: https://github.com/apache/spark/pull/36625#discussion_r885194296 ## sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala: ## @@ -518,7 +520,12 @@ private[hive] class HiveClientImpl( createTime = h.getTT

[GitHub] [spark] huaxingao commented on pull request #36644: [SPARK-37523][SQL] Re-optimize partitions in Distribution and Ordering if numPartitions is not specified

2022-05-30 Thread GitBox
huaxingao commented on PR #36644: URL: https://github.com/apache/spark/pull/36644#issuecomment-1141651608 Thank you all!

[GitHub] [spark] huaxingao commented on pull request #36727: [SPARK-39340][SQL][3.2] DS v2 agg pushdown should allow dots in the name of top-level columns

2022-05-30 Thread GitBox
huaxingao commented on PR #36727: URL: https://github.com/apache/spark/pull/36727#issuecomment-1141650961 Looks good. I am puzzled why the tests failed.

[GitHub] [spark] LuciferYang commented on a diff in pull request #36726: [SPARK-39339][SQL] Support TimestampNTZ type in JDBC data source

2022-05-30 Thread GitBox
LuciferYang commented on code in PR #36726: URL: https://github.com/apache/spark/pull/36726#discussion_r885186409 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala: ## @@ -472,6 +481,15 @@ object JdbcUtils extends Logging with SQLConfHelp

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885179955 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala: ## @@ -60,7 +56,7 @@ import org.apache.spark.sql.types._ */ // scalast

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885179844 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala: ## @@ -60,7 +56,7 @@ import org.apache.spark.sql.types._ */ // scalast

[GitHub] [spark] cloud-fan commented on a diff in pull request #36150: [SPARK-38864][SQL] Add melt / unpivot to Dataset

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36150: URL: https://github.com/apache/spark/pull/36150#discussion_r885178644 ## sql/core/src/main/scala/org/apache/spark/sql/Melt.scala: ## @@ -0,0 +1,143 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contrib

[GitHub] [spark] cloud-fan commented on a diff in pull request #36150: [SPARK-38864][SQL] Add melt / unpivot to Dataset

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36150: URL: https://github.com/apache/spark/pull/36150#discussion_r885178329 ## sql/core/src/main/scala/org/apache/spark/sql/Melt.scala: ## @@ -0,0 +1,143 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contrib

[GitHub] [spark] cloud-fan commented on a diff in pull request #36150: [SPARK-38864][SQL] Add melt / unpivot to Dataset

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36150: URL: https://github.com/apache/spark/pull/36150#discussion_r885177916 ## sql/core/src/main/scala/org/apache/spark/sql/Melt.scala: ## @@ -0,0 +1,143 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contrib

[GitHub] [spark] cloud-fan commented on a diff in pull request #36150: [SPARK-38864][SQL] Add melt / unpivot to Dataset

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36150: URL: https://github.com/apache/spark/pull/36150#discussion_r885175634 ## sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala: ## @@ -2012,6 +2012,152 @@ class Dataset[T] private[sql]( @scala.annotation.varargs def agg(expr:

[GitHub] [spark] beobest2 commented on pull request #36729: [SPARK-39295][PYTHON][DOCS] Improve documentation of pandas API suppo…

2022-05-30 Thread GitBox
beobest2 commented on PR #36729: URL: https://github.com/apache/spark/pull/36729#issuecomment-1141632078 @HyukjinKwon The current 'supported API generation' function dynamically compares the modules of `pyspark.pandas` and `pandas` to find the differences. At this time, the inherited class i

[GitHub] [spark] zuston commented on pull request #35683: [SPARK-30835][SPARK-39018][CORE][YARN] Add support for YARN decommissioning when ESS is disabled

2022-05-30 Thread GitBox
zuston commented on PR #35683: URL: https://github.com/apache/spark/pull/35683#issuecomment-1141631786 Thanks to @abhishekd0907 for submitting this PR. I think this is a good improvement for the stability of Spark jobs, especially in the scenario where we implement elastic YARN NMs on K8s.

[GitHub] [spark] ulysses-you commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
ulysses-you commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885174282 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -323,11 +389,24 @@ case class Add( override def decimalMethod:

[GitHub] [spark] cloud-fan commented on a diff in pull request #36714: [SPARK-39320][SQL] Support aggregate function `MEDIAN`

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36714: URL: https://github.com/apache/spark/pull/36714#discussion_r885172773 ## sql/core/src/test/resources/sql-tests/inputs/group-by.sql: ## @@ -273,3 +273,16 @@ SELECT FROM aggr GROUP BY k ORDER BY k; + +-- SPARK-39320: Add the MEDIAN() fu

[GitHub] [spark] cloud-fan commented on a diff in pull request #36714: [SPARK-39320][SQL] Support aggregate function `MEDIAN`

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36714: URL: https://github.com/apache/spark/pull/36714#discussion_r885172628 ## sql/core/src/test/resources/sql-tests/inputs/group-by.sql: ## @@ -273,3 +273,16 @@ SELECT FROM aggr GROUP BY k ORDER BY k; + +-- SPARK-39320: Add the MEDIAN() fu

[GitHub] [spark] cloud-fan commented on a diff in pull request #36714: [SPARK-39320][SQL] Support aggregate function `MEDIAN`

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36714: URL: https://github.com/apache/spark/pull/36714#discussion_r885172461 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/percentiles.scala: ## @@ -359,6 +359,32 @@ case class Percentile( ) } +// scal

[GitHub] [spark] beobest2 opened a new pull request, #36729: [SPARK-39295][PYTHON][DOCS] Improve documentation of pandas API suppo…

2022-05-30 Thread GitBox
beobest2 opened a new pull request, #36729: URL: https://github.com/apache/spark/pull/36729 ### What changes were proposed in this pull request? The description provided in the supported pandas API list document or the code comment needs improvement. Also, there are ca

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r885169100 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -323,11 +389,24 @@ case class Add( override def decimalMethod: St

[GitHub] [spark] beliefer commented on pull request #36714: [SPARK-39320][SQL] Support aggregate function `MEDIAN`

2022-05-30 Thread GitBox
beliefer commented on PR #36714: URL: https://github.com/apache/spark/pull/36714#issuecomment-1141623396 ping @MaxGekk cc @cloud-fan

[GitHub] [spark] zhengruifeng commented on pull request #36660: [SPARK-39284][PS] Implement Groupby.mad

2022-05-30 Thread GitBox
zhengruifeng commented on PR #36660: URL: https://github.com/apache/spark/pull/36660#issuecomment-1141607627 cc @HyukjinKwon I think this PR is ready

[GitHub] [spark] cloud-fan commented on pull request #36722: [SPARK-39335][SQL] DescribeTableCommand should redact properties

2022-05-30 Thread GitBox
cloud-fan commented on PR #36722: URL: https://github.com/apache/spark/pull/36722#issuecomment-1141592045 thanks, merging to master/3.3!

[GitHub] [spark] cloud-fan closed pull request #36722: [SPARK-39335][SQL] DescribeTableCommand should redact properties

2022-05-30 Thread GitBox
cloud-fan closed pull request #36722: [SPARK-39335][SQL] DescribeTableCommand should redact properties URL: https://github.com/apache/spark/pull/36722

[GitHub] [spark] HyukjinKwon closed pull request #36640: [SPARK-39262][PYTHON] Correct the behavior of creating DataFrame from an RDD

2022-05-30 Thread GitBox
HyukjinKwon closed pull request #36640: [SPARK-39262][PYTHON] Correct the behavior of creating DataFrame from an RDD URL: https://github.com/apache/spark/pull/36640

[GitHub] [spark] HyukjinKwon commented on pull request #36640: [SPARK-39262][PYTHON] Correct the behavior of creating DataFrame from an RDD

2022-05-30 Thread GitBox
HyukjinKwon commented on PR #36640: URL: https://github.com/apache/spark/pull/36640#issuecomment-1141576890 Merged to master.

[GitHub] [spark] github-actions[bot] closed pull request #35379: [SPARK-38091][SQL] fix bugs in AvroSerializer

2022-05-30 Thread GitBox
github-actions[bot] closed pull request #35379: [SPARK-38091][SQL] fix bugs in AvroSerializer URL: https://github.com/apache/spark/pull/35379

[GitHub] [spark] sadikovi commented on a diff in pull request #36726: [SPARK-39339][SQL] Support TimestampNTZ type in JDBC data source

2022-05-30 Thread GitBox
sadikovi commented on code in PR #36726: URL: https://github.com/apache/spark/pull/36726#discussion_r885115376 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala: ## @@ -226,6 +226,9 @@ class JDBCOptions( // The prefix that is added t

[GitHub] [spark] AngersZhuuuu commented on pull request #36564: [SPARK-39195][SQL] Spark OutputCommitCoordinator should help keep file consistent with task status.

2022-05-30 Thread GitBox
AngersZhuuuu commented on PR #36564: URL: https://github.com/apache/spark/pull/36564#issuecomment-1141543181 ping @cloud-fan Could you take a look?

[GitHub] [spark] sadikovi commented on pull request #36672: [SPARK-39265][SQL] Support vectorized Parquet scans with DEFAULT values

2022-05-30 Thread GitBox
sadikovi commented on PR #36672: URL: https://github.com/apache/spark/pull/36672#issuecomment-1141516421 Also, can we add a test to check that the DEFAULT values work? Thanks.

[GitHub] [spark] sadikovi commented on a diff in pull request #36672: [SPARK-39265][SQL] Support vectorized Parquet scans with DEFAULT values

2022-05-30 Thread GitBox
sadikovi commented on code in PR #36672: URL: https://github.com/apache/spark/pull/36672#discussion_r885096588 ## sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedParquetRecordReader.java: ## @@ -270,13 +271,40 @@ private void initBatch(

[GitHub] [spark] dongjoon-hyun closed pull request #36728: [SPARK-39341][K8S] KubernetesExecutorBackend should allow IPv6 pod IP

2022-05-30 Thread GitBox
dongjoon-hyun closed pull request #36728: [SPARK-39341][K8S] KubernetesExecutorBackend should allow IPv6 pod IP URL: https://github.com/apache/spark/pull/36728

[GitHub] [spark] williamhyun opened a new pull request, #36728: [SPARK-39341][K8S] KubernetesExecutorBackend should allow IPv6 pod IP

2022-05-30 Thread GitBox
williamhyun opened a new pull request, #36728: URL: https://github.com/apache/spark/pull/36728 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

[GitHub] [spark] AmplabJenkins commented on pull request #36716: [SPARK-39062][CORE] Add stage level resource scheduling support for standalone cluster

2022-05-30 Thread GitBox
AmplabJenkins commented on PR #36716: URL: https://github.com/apache/spark/pull/36716#issuecomment-1141366692 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] AmplabJenkins commented on pull request #36717: [SPARK-33274][SS] Stop query in cp mode when total cores less than total kafka partition

2022-05-30 Thread GitBox
AmplabJenkins commented on PR #36717: URL: https://github.com/apache/spark/pull/36717#issuecomment-114136 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #36723: [SPARK-39337][SQL] Refactor DescribeTableExec to remove duplicate filters

2022-05-30 Thread GitBox
dongjoon-hyun commented on code in PR #36723: URL: https://github.com/apache/spark/pull/36723#discussion_r884993022 ## sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala: ## @@ -147,10 +147,10 @@ class DataSourceV2SQLSuite Array("", "", ""),

[GitHub] [spark] cloud-fan commented on pull request #36727: [SPARK-39340][SQL][3.2] DS v2 agg pushdown should allow dots in the name of top-level columns

2022-05-30 Thread GitBox
cloud-fan commented on PR #36727: URL: https://github.com/apache/spark/pull/36727#issuecomment-1141335784 cc @huaxingao @beliefer -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific com

[GitHub] [spark] cloud-fan opened a new pull request, #36727: [SPARK-39340][SQL][3.2] DS v2 agg pushdown should allow dots in the name of top-level columns

2022-05-30 Thread GitBox
cloud-fan opened a new pull request, #36727: URL: https://github.com/apache/spark/pull/36727 ### What changes were proposed in this pull request? In the first version of DS v2 aggregate pushdown, we don't want to support nested fields and we picked `PushableColumnWithoutNested

[GitHub] [spark] xinrong-databricks commented on a diff in pull request #36640: [SPARK-39262][PYTHON] Correct the behavior of creating DataFrame from an RDD

2022-05-30 Thread GitBox
xinrong-databricks commented on code in PR #36640: URL: https://github.com/apache/spark/pull/36640#discussion_r884941255 ## python/pyspark/sql/session.py: ## @@ -611,8 +611,8 @@ def _inferSchema( :class:`pyspark.sql.types.StructType` """ first = rdd.fi

[GitHub] [spark] cloud-fan commented on a diff in pull request #32959: [SPARK-35780][SQL] Support DATE/TIMESTAMP literals across the full range

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #32959: URL: https://github.com/apache/spark/pull/32959#discussion_r884916407 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala: ## @@ -518,44 +516,55 @@ object DateTimeUtils { * The return type is [[Optio

[GitHub] [spark] cloud-fan commented on a diff in pull request #36708: [SPARK-37623][SQL] Support ANSI Aggregate Function: regr_intercept

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36708: URL: https://github.com/apache/spark/pull/36708#discussion_r884912975 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/linearRegression.scala: ## @@ -291,3 +291,52 @@ case class RegrSlope(left: Expressio

[GitHub] [spark] cxzl25 commented on a diff in pull request #32959: [SPARK-35780][SQL] Support DATE/TIMESTAMP literals across the full range

2022-05-30 Thread GitBox
cxzl25 commented on code in PR #32959: URL: https://github.com/apache/spark/pull/32959#discussion_r884911323 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala: ## @@ -518,44 +516,55 @@ object DateTimeUtils { * The return type is [[Option]]

[GitHub] [spark] cloud-fan commented on a diff in pull request #32959: [SPARK-35780][SQL] Support DATE/TIMESTAMP literals across the full range

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #32959: URL: https://github.com/apache/spark/pull/32959#discussion_r884900931 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala: ## @@ -518,44 +516,55 @@ object DateTimeUtils { * The return type is [[Optio

[GitHub] [spark] cloud-fan commented on a diff in pull request #32959: [SPARK-35780][SQL] Support DATE/TIMESTAMP literals across the full range

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #32959: URL: https://github.com/apache/spark/pull/32959#discussion_r884899219 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala: ## @@ -518,44 +516,55 @@ object DateTimeUtils { * The return type is [[Optio

[GitHub] [spark] cxzl25 commented on a diff in pull request #32959: [SPARK-35780][SQL] Support DATE/TIMESTAMP literals across the full range

2022-05-30 Thread GitBox
cxzl25 commented on code in PR #32959: URL: https://github.com/apache/spark/pull/32959#discussion_r884894581 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala: ## @@ -518,44 +516,55 @@ object DateTimeUtils { * The return type is [[Option]]

[GitHub] [spark] cloud-fan commented on a diff in pull request #36625: [SPARK-39203][SQL] Rewrite table location to absolute location based on database location

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36625: URL: https://github.com/apache/spark/pull/36625#discussion_r884892701 ## sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala: ## @@ -842,10 +843,11 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, had

[GitHub] [spark] cloud-fan commented on a diff in pull request #36625: [SPARK-39203][SQL] Rewrite table location to absolute location based on database location

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36625: URL: https://github.com/apache/spark/pull/36625#discussion_r884891864 ## sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala: ## @@ -753,7 +760,10 @@ private[hive] class HiveClientImpl( assert(s.values.

[GitHub] [spark] cloud-fan commented on a diff in pull request #36625: [SPARK-39203][SQL] Rewrite table location to absolute location based on database location

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36625: URL: https://github.com/apache/spark/pull/36625#discussion_r884890053 ## sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala: ## @@ -518,7 +520,12 @@ private[hive] class HiveClientImpl( createTime = h.get

[GitHub] [spark] cloud-fan commented on a diff in pull request #36593: [SPARK-39139][SQL] DS V2 push-down framework supports DS V2 UDF

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36593: URL: https://github.com/apache/spark/pull/36593#discussion_r884887211 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala: ## @@ -304,6 +310,11 @@ abstract class JdbcDialect extends Serializable with Logging{ */

[GitHub] [spark] cloud-fan commented on a diff in pull request #36593: [SPARK-39139][SQL] DS V2 push-down framework supports DS V2 UDF

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36593: URL: https://github.com/apache/spark/pull/36593#discussion_r884880249 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/util/V2ExpressionSQLBuilder.java: ## @@ -241,6 +246,11 @@ protected String visitSQLFunction(String funcNam

[GitHub] [spark] cloud-fan commented on a diff in pull request #36593: [SPARK-39139][SQL] DS V2 push-down framework supports DS V2 UDF

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36593: URL: https://github.com/apache/spark/pull/36593#discussion_r884875139 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/aggregate/UserDefinedAggregateFunc.java: ## @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache So

[GitHub] [spark] cloud-fan commented on a diff in pull request #36593: [SPARK-39139][SQL] DS V2 push-down framework supports DS V2 UDF

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36593: URL: https://github.com/apache/spark/pull/36593#discussion_r884874417 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/aggregate/UserDefinedAggregateFunc.java: ## @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache So

[GitHub] [spark] cloud-fan commented on a diff in pull request #36593: [SPARK-39139][SQL] DS V2 push-down framework supports DS V2 UDF

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36593: URL: https://github.com/apache/spark/pull/36593#discussion_r884872763 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/GeneralScalarExpression.java: ## @@ -235,8 +235,8 @@ public String toString() { try {

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r884783994 ## sql/core/src/test/resources/tpcds-plan-stability/approved-plans-modified/q65.sf100/explain.txt: ## @@ -151,106 +152,110 @@ Functions [1]: [avg(revenue#21)] Aggrega

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-05-30 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r884783483 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Average.scala: ## @@ -75,18 +80,17 @@ abstract class AverageBase ) protected

[GitHub] [spark] AngersZhuuuu commented on a diff in pull request #36723: [SPARK-39337][SQL] Refactor DescribeTableExec to remove duplicate filters

2022-05-30 Thread GitBox
AngersZhuuuu commented on code in PR #36723: URL: https://github.com/apache/spark/pull/36723#discussion_r884771145 ## sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala: ## @@ -147,10 +147,10 @@ class DataSourceV2SQLSuite Array("", "", ""),

[GitHub] [spark] 1104056452 commented on pull request #36447: [SPARK-38807][CORE] Fix the startup error of spark shell on Windows S…

2022-05-30 Thread GitBox
1104056452 commented on PR #36447: URL: https://github.com/apache/spark/pull/36447#issuecomment-1141086939 Can someone help review this patch? Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

[GitHub] [spark] 1104056452 commented on pull request #36726: [SPARK-39339][SQL] Support TimestampNTZ type in JDBC data source

2022-05-30 Thread GitBox
1104056452 commented on PR #36726: URL: https://github.com/apache/spark/pull/36726#issuecomment-1141086440 Can someone help review this patch? Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

[GitHub] [spark] cloud-fan commented on pull request #27924: [SPARK-31164][SQL] Inconsistent rdd and output partitioning for bucket table when output doesn't contain all bucket columns

2022-05-30 Thread GitBox
cloud-fan commented on PR #27924: URL: https://github.com/apache/spark/pull/27924#issuecomment-1141078095 I'm fine to keep backward compatible with some "by-accident" features if the cost is small. Feel free to open a PR to bring back the old behavior if 1. we do not re-introduce the corr
