[GitHub] [spark] ivoson commented on pull request #39459: [SPARK-41497][CORE] Fixing accumulator undercount in the case of the retry task with rdd cache

2023-02-25 Thread via GitHub
ivoson commented on PR #39459: URL: https://github.com/apache/spark/pull/39459#issuecomment-1445280811 Hi @mridulm comments addressed. Please take a look when you have time. Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] ivoson commented on a diff in pull request #39459: [SPARK-41497][CORE] Fixing accumulator undercount in the case of the retry task with rdd cache

2023-02-25 Thread via GitHub
ivoson commented on code in PR #39459: URL: https://github.com/apache/spark/pull/39459#discussion_r1118028518 ## core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala: ## @@ -2266,6 +2270,160 @@ class BlockManagerSuite extends SparkFunSuite with Matchers with

[GitHub] [spark] ivoson commented on a diff in pull request #39459: [SPARK-41497][CORE] Fixing accumulator undercount in the case of the retry task with rdd cache

2023-02-25 Thread via GitHub
ivoson commented on code in PR #39459: URL: https://github.com/apache/spark/pull/39459#discussion_r1118028223 ## core/src/main/scala/org/apache/spark/storage/BlockManagerStorageEndpoint.scala: ## @@ -81,6 +81,8 @@ class BlockManagerStorageEndpoint( case

[GitHub] [spark] ivoson commented on a diff in pull request #39459: [SPARK-41497][CORE] Fixing accumulator undercount in the case of the retry task with rdd cache

2023-02-25 Thread via GitHub
ivoson commented on code in PR #39459: URL: https://github.com/apache/spark/pull/39459#discussion_r1118028179 ## core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala: ## @@ -210,6 +220,65 @@ class BlockManagerMasterEndpoint( case

[GitHub] [spark] ivoson commented on a diff in pull request #39459: [SPARK-41497][CORE] Fixing accumulator undercount in the case of the retry task with rdd cache

2023-02-25 Thread via GitHub
ivoson commented on code in PR #39459: URL: https://github.com/apache/spark/pull/39459#discussion_r1118028046 ## core/src/main/scala/org/apache/spark/storage/BlockInfoManager.scala: ## @@ -525,6 +562,7 @@ private[storage] class BlockInfoManager extends Logging {

[GitHub] [spark] ivoson commented on a diff in pull request #39459: [SPARK-41497][CORE] Fixing accumulator undercount in the case of the retry task with rdd cache

2023-02-25 Thread via GitHub
ivoson commented on code in PR #39459: URL: https://github.com/apache/spark/pull/39459#discussion_r1118028028 ## core/src/main/scala/org/apache/spark/storage/BlockInfoManager.scala: ## @@ -150,6 +150,12 @@ private[storage] class BlockInfoManager extends Logging { */

[GitHub] [spark] ivoson commented on a diff in pull request #39459: [SPARK-41497][CORE] Fixing accumulator undercount in the case of the retry task with rdd cache

2023-02-25 Thread via GitHub
ivoson commented on code in PR #39459: URL: https://github.com/apache/spark/pull/39459#discussion_r1118028001 ## core/src/main/scala/org/apache/spark/storage/BlockInfoManager.scala: ## @@ -180,6 +186,27 @@ private[storage] class BlockInfoManager extends Logging { //

[GitHub] [spark] ivoson commented on a diff in pull request #39459: [SPARK-41497][CORE] Fixing accumulator undercount in the case of the retry task with rdd cache

2023-02-25 Thread via GitHub
ivoson commented on code in PR #39459: URL: https://github.com/apache/spark/pull/39459#discussion_r1118027972 ## core/src/main/scala/org/apache/spark/internal/config/package.scala: ## @@ -2468,4 +2468,15 @@ package object config { .version("3.4.0") .booleanConf

[GitHub] [spark] mridulm commented on a diff in pull request #39459: [SPARK-41497][CORE] Fixing accumulator undercount in the case of the retry task with rdd cache

2023-02-25 Thread via GitHub
mridulm commented on code in PR #39459: URL: https://github.com/apache/spark/pull/39459#discussion_r1118023335 ## core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala: ## @@ -728,7 +800,22 @@ class BlockManagerMasterEndpoint( } if

[GitHub] [spark] srowen commented on pull request #40116: [WIP]SPARK-41391 Fix

2023-02-25 Thread via GitHub
srowen commented on PR #40116: URL: https://github.com/apache/spark/pull/40116#issuecomment-1445261458 Please fix the PR description too https://spark.apache.org/contributing.html

[GitHub] [spark] srowen commented on pull request #40116: [WIP]SPARK-41391 Fix

2023-02-25 Thread via GitHub
srowen commented on PR #40116: URL: https://github.com/apache/spark/pull/40116#issuecomment-1445261438 This is about SPARK-41391? It also doesn't contain a simple description of what you're reporting, just code snippets. I can work it out, but this could be explained in just a few

[GitHub] [spark] ivoson commented on a diff in pull request #39459: [SPARK-41497][CORE] Fixing accumulator undercount in the case of the retry task with rdd cache

2023-02-25 Thread via GitHub
ivoson commented on code in PR #39459: URL: https://github.com/apache/spark/pull/39459#discussion_r1118013777 ## core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala: ## @@ -728,7 +800,22 @@ class BlockManagerMasterEndpoint( } if

[GitHub] [spark] wangyum commented on pull request #40115: [SPARK-42525][SQL] Collapse two adjacent windows with the same partition/order in subquery

2023-02-25 Thread via GitHub
wangyum commented on PR #40115: URL: https://github.com/apache/spark/pull/40115#issuecomment-1445259053 Merged to master.
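The optimization discussed in this thread (collapsing two adjacent windows that share the same partitioning and ordering, per SPARK-42525) can be illustrated with a hedged Scala sketch. The DataFrame `employees` and its columns are hypothetical, and a running SparkSession is assumed:

```scala
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, rank, sum}

// Two window expressions over the same partition/order spec.
// With the collapse optimization, Catalyst can evaluate both in a
// single Window node rather than two adjacent ones.
val w = Window.partitionBy(col("dept")).orderBy(col("salary"))
val result = employees.select(
  col("name"),
  rank().over(w).as("salary_rank"),
  sum(col("salary")).over(w).as("dept_running_total"))
```

The PR's contribution is extending this existing rewrite so it also applies inside subqueries.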

[GitHub] [spark] wangyum closed pull request #40115: [SPARK-42525][SQL] Collapse two adjacent windows with the same partition/order in subquery

2023-02-25 Thread via GitHub
wangyum closed pull request #40115: [SPARK-42525][SQL] Collapse two adjacent windows with the same partition/order in subquery URL: https://github.com/apache/spark/pull/40115

[GitHub] [spark] hvanhovell opened a new pull request, #40175: [SPARK-42580][CONNECT] Scala client add client side typed APIs

2023-02-25 Thread via GitHub
hvanhovell opened a new pull request, #40175: URL: https://github.com/apache/spark/pull/40175 ### What changes were proposed in this pull request? This PR adds the client side typed API to the Spark Connect Scala Client. ### Why are the changes needed? We want to reach API

[GitHub] [spark] amaliujia closed pull request #40174: [SPARK-42573][CONNECT][FOLLOW-UP] fix broken build after variable rename

2023-02-25 Thread via GitHub
amaliujia closed pull request #40174: [SPARK-42573][CONNECT][FOLLOW-UP] fix broken build after variable rename URL: https://github.com/apache/spark/pull/40174

[GitHub] [spark] wangyum commented on pull request #40093: [SPARK-42500][SQL] ConstantPropagation supports more cases

2023-02-25 Thread via GitHub
wangyum commented on PR #40093: URL: https://github.com/apache/spark/pull/40093#issuecomment-1445252943 TiDB also supports these optimizations: https://user-images.githubusercontent.com/5399861/221389051-80dbc027-fc94-4610-bb72-f7514abf276b.png

[GitHub] [spark] hvanhovell closed pull request #40173: [SPARK-42576][CONNECT] Add 2nd groupBy method to Dataset

2023-02-25 Thread via GitHub
hvanhovell closed pull request #40173: [SPARK-42576][CONNECT] Add 2nd groupBy method to Dataset URL: https://github.com/apache/spark/pull/40173

[GitHub] [spark] hvanhovell commented on pull request #40173: [SPARK-42576][CONNECT] Add 2nd groupBy method to Dataset

2023-02-25 Thread via GitHub
hvanhovell commented on PR #40173: URL: https://github.com/apache/spark/pull/40173#issuecomment-1445252735 Merging.

[GitHub] [spark] ivoson commented on a diff in pull request #39459: [SPARK-41497][CORE] Fixing accumulator undercount in the case of the retry task with rdd cache

2023-02-25 Thread via GitHub
ivoson commented on code in PR #39459: URL: https://github.com/apache/spark/pull/39459#discussion_r1118005755 ## core/src/main/scala/org/apache/spark/storage/BlockInfoManager.scala: ## @@ -399,7 +426,14 @@ private[storage] class BlockInfoManager extends Logging { try {

[GitHub] [spark] amaliujia commented on pull request #40174: [SPARK-42573][CONNECT][FOLLOW-UP] fix broken build after variable rename

2023-02-25 Thread via GitHub
amaliujia commented on PR #40174: URL: https://github.com/apache/spark/pull/40174#issuecomment-1445246195 Preferring https://github.com/apache/spark/pull/40173 over this PR.

[GitHub] [spark] amaliujia commented on a diff in pull request #40173: [SPARK-42576][CONNECT] Add 2nd groupBy method to Dataset

2023-02-25 Thread via GitHub
amaliujia commented on code in PR #40173: URL: https://github.com/apache/spark/pull/40173#discussion_r1118005633 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Dataset.scala: ## @@ -1990,14 +2019,14 @@ class Dataset[T] private[sql] (val sparkSession:

[GitHub] [spark] amaliujia commented on a diff in pull request #40173: [SPARK-42576][CONNECT] Add 2nd groupBy method to Dataset

2023-02-25 Thread via GitHub
amaliujia commented on code in PR #40173: URL: https://github.com/apache/spark/pull/40173#discussion_r1118005391 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Dataset.scala: ## @@ -1990,14 +2019,14 @@ class Dataset[T] private[sql] (val sparkSession:

[GitHub] [spark] amaliujia commented on pull request #40174: [SPARK-42573][CONNECT][FOLLOW-UP] fix broken build after variable rename

2023-02-25 Thread via GitHub
amaliujia commented on PR #40174: URL: https://github.com/apache/spark/pull/40174#issuecomment-1445245652 @hvanhovell

[GitHub] [spark] amaliujia opened a new pull request, #40174: [SPARK-42573][CONNECT][FOLLOW-UP] fix broken build after variable rename

2023-02-25 Thread via GitHub
amaliujia opened a new pull request, #40174: URL: https://github.com/apache/spark/pull/40174 ### What changes were proposed in this pull request? Fix BUILD caused by https://github.com/apache/spark/pull/40168. ### Why are the changes needed? Fix BUILD ###

[GitHub] [spark] hvanhovell commented on a diff in pull request #40173: [SPARK-42576][CONNECT] Add 2nd groupBy method to Dataset

2023-02-25 Thread via GitHub
hvanhovell commented on code in PR #40173: URL: https://github.com/apache/spark/pull/40173#discussion_r1118002760 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Dataset.scala: ## @@ -1990,14 +2019,14 @@ class Dataset[T] private[sql] (val sparkSession:

[GitHub] [spark] mridulm commented on a diff in pull request #39459: [SPARK-41497][CORE] Fixing accumulator undercount in the case of the retry task with rdd cache

2023-02-25 Thread via GitHub
mridulm commented on code in PR #39459: URL: https://github.com/apache/spark/pull/39459#discussion_r1117987993 ## core/src/main/scala/org/apache/spark/storage/BlockInfoManager.scala: ## @@ -180,6 +186,27 @@ private[storage] class BlockInfoManager extends Logging { //

[GitHub] [spark] github-actions[bot] closed pull request #38464: [SPARK-32628][SQL] Use bloom filter to improve dynamic partition pruning

2023-02-25 Thread via GitHub
github-actions[bot] closed pull request #38464: [SPARK-32628][SQL] Use bloom filter to improve dynamic partition pruning URL: https://github.com/apache/spark/pull/38464

[GitHub] [spark] ritikam2 commented on pull request #40116: [WIP]SPARK-41391 Fix

2023-02-25 Thread via GitHub
ritikam2 commented on PR #40116: URL: https://github.com/apache/spark/pull/40116#issuecomment-1445233793 Sean, not sure which issue you were referring to. I updated the "why the changes are needed" section of the pull request to mirror what Zheng had already put in his pull request.

[GitHub] [spark] amaliujia commented on pull request #40173: [SPARK-42576][CONNECT] Add 2nd groupBy method to Dataset

2023-02-25 Thread via GitHub
amaliujia commented on PR #40173: URL: https://github.com/apache/spark/pull/40173#issuecomment-1445232248 @hvanhovell

[GitHub] [spark] amaliujia opened a new pull request, #40173: [SPARK-42576][CONNECT] Add 2nd groupBy method to Dataset

2023-02-25 Thread via GitHub
amaliujia opened a new pull request, #40173: URL: https://github.com/apache/spark/pull/40173 ### What changes were proposed in this pull request? Add `groupBy(col1: String, cols: String*)` to Scala client Dataset API. ### Why are the changes needed? API coverage
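The overload described above can be sketched in Scala. This is an illustrative usage sketch, not the PR's actual implementation; the DataFrame `df` and the column names are hypothetical, and a Connect SparkSession is assumed:

```scala
import org.apache.spark.sql.functions.col

// New string-based overload: groupBy(col1: String, cols: String*)
val counts = df.groupBy("dept", "region").count()

// Equivalent to the pre-existing Column-based form:
val counts2 = df.groupBy(col("dept"), col("region")).count()
```

The overload mirrors the classic `sql.Dataset` API so existing code compiles unchanged against the Connect client.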

[GitHub] [spark] ritikam2 commented on pull request #40116: [WIP]SPARK-41391 Fix

2023-02-25 Thread via GitHub
ritikam2 commented on PR #40116: URL: https://github.com/apache/spark/pull/40116#issuecomment-1445232031 I have enabled the workflows on the branch. Is there something else that I need to do?

[GitHub] [spark] amaliujia commented on pull request #40172: [SPARK-42569][CONNECT][FOLLOW-UP] Throw unsupported exceptions for persist

2023-02-25 Thread via GitHub
amaliujia commented on PR #40172: URL: https://github.com/apache/spark/pull/40172#issuecomment-1445228075 @hvanhovell

[GitHub] [spark] amaliujia opened a new pull request, #40172: [SPARK-42569][CONNECT][FOLLOW-UP] Throw unsupported exceptions for persist

2023-02-25 Thread via GitHub
amaliujia opened a new pull request, #40172: URL: https://github.com/apache/spark/pull/40172 ### What changes were proposed in this pull request? Follow up https://github.com/apache/spark/pull/40164 to also throw unsupported operation exception for `persist`. Right now we are
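A minimal sketch of what such a stub might look like in the Connect client's `Dataset` (the exact signature and message in the actual PR may differ; this is an assumption, not the merged code):

```scala
// Hypothetical stub: the Connect client does not yet support caching,
// so the method fails loudly instead of silently doing nothing.
def persist(): this.type =
  throw new UnsupportedOperationException("persist is not implemented.")
```

Throwing eagerly makes the API gap visible to callers, which is preferable to a silent no-op that changes runtime behavior later.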

[GitHub] [spark] joaoleveiga commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

2023-02-25 Thread via GitHub
joaoleveiga commented on PR #37817: URL: https://github.com/apache/spark/pull/37817#issuecomment-1445218024 > Also merged to 3.3 Thank you so much! Here I was assuming I would pick up this thread on Monday, but you delivered it. Cheers

[GitHub] [spark] santosh-d3vpl3x commented on pull request #40122: [SPARK-42349][PYTHON] Support pandas cogroup with multiple df

2023-02-25 Thread via GitHub
santosh-d3vpl3x commented on PR #40122: URL: https://github.com/apache/spark/pull/40122#issuecomment-1445208101 > qq, > > > this is a breaking change to this experimental API. > > What's breaking? would be good to keep the PR desc template

[GitHub] [spark] aimtsou commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

2023-02-25 Thread via GitHub
aimtsou commented on PR #37817: URL: https://github.com/apache/spark/pull/37817#issuecomment-1445189813 Thank you @srowen, really appreciated

[GitHub] [spark] dongjoon-hyun commented on pull request #40169: [SPARK-42575][Connect][Scala] Make all client tests to extend from ConnectFunSuite

2023-02-25 Thread via GitHub
dongjoon-hyun commented on PR #40169: URL: https://github.com/apache/spark/pull/40169#issuecomment-1445189613 > Merging. FYI - we are planning a refactor of Catalyst soon (post 3.4) and then we will integrate this with Spark's exception and error framework. Thank you for sharing the

[GitHub] [spark] hvanhovell closed pull request #40166: [SPARK-42570][CONNECT][PYTHON] Fix DataFrameReader to use the default source

2023-02-25 Thread via GitHub
hvanhovell closed pull request #40166: [SPARK-42570][CONNECT][PYTHON] Fix DataFrameReader to use the default source URL: https://github.com/apache/spark/pull/40166

[GitHub] [spark] hvanhovell commented on pull request #40166: [SPARK-42570][CONNECT][PYTHON] Fix DataFrameReader to use the default source

2023-02-25 Thread via GitHub
hvanhovell commented on PR #40166: URL: https://github.com/apache/spark/pull/40166#issuecomment-1445173286 Merging.

[GitHub] [spark] hvanhovell closed pull request #40143: [SPARK-42538][CONNECT] Make `sql.functions#lit` function support more types

2023-02-25 Thread via GitHub
hvanhovell closed pull request #40143: [SPARK-42538][CONNECT] Make `sql.functions#lit` function support more types URL: https://github.com/apache/spark/pull/40143

[GitHub] [spark] hvanhovell commented on pull request #40143: [SPARK-42538][CONNECT] Make `sql.functions#lit` function support more types

2023-02-25 Thread via GitHub
hvanhovell commented on PR #40143: URL: https://github.com/apache/spark/pull/40143#issuecomment-1445172271 Merging to master/3.4. Thanks!

[GitHub] [spark] hvanhovell closed pull request #40164: [SPARK-42569][CONNECT] Throw unsupported exceptions for non-supported API

2023-02-25 Thread via GitHub
hvanhovell closed pull request #40164: [SPARK-42569][CONNECT] Throw unsupported exceptions for non-supported API URL: https://github.com/apache/spark/pull/40164

[GitHub] [spark] hvanhovell commented on pull request #40164: [SPARK-42569][CONNECT] Throw unsupported exceptions for non-supported API

2023-02-25 Thread via GitHub
hvanhovell commented on PR #40164: URL: https://github.com/apache/spark/pull/40164#issuecomment-1445171880 I am merging this one. Can you do persist in a follow-up?

[GitHub] [spark] hvanhovell commented on a diff in pull request #40164: [SPARK-42569][CONNECT] Throw unsupported exceptions for non-supported API

2023-02-25 Thread via GitHub
hvanhovell commented on code in PR #40164: URL: https://github.com/apache/spark/pull/40164#discussion_r1117957143 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Dataset.scala: ## @@ -2461,6 +2461,60 @@ class Dataset[T] private[sql] (val session:

[GitHub] [spark] hvanhovell closed pull request #40167: [SPARK-42561][CONNECT] Add temp view API to Dataset

2023-02-25 Thread via GitHub
hvanhovell closed pull request #40167: [SPARK-42561][CONNECT] Add temp view API to Dataset URL: https://github.com/apache/spark/pull/40167

[GitHub] [spark] hvanhovell commented on pull request #40167: [SPARK-42561][CONNECT] Add temp view API to Dataset

2023-02-25 Thread via GitHub
hvanhovell commented on PR #40167: URL: https://github.com/apache/spark/pull/40167#issuecomment-1445171335 Merging. Thanks!

[GitHub] [spark] hvanhovell closed pull request #40169: [SPARK-42575][Connect][Scala] Make all client tests to extend from ConnectFunSuite

2023-02-25 Thread via GitHub
hvanhovell closed pull request #40169: [SPARK-42575][Connect][Scala] Make all client tests to extend from ConnectFunSuite URL: https://github.com/apache/spark/pull/40169

[GitHub] [spark] hvanhovell commented on pull request #40169: [SPARK-42575][Connect][Scala] Make all client tests to extend from ConnectFunSuite

2023-02-25 Thread via GitHub
hvanhovell commented on PR #40169: URL: https://github.com/apache/spark/pull/40169#issuecomment-1445171047 Merging. FYI - we are planning a refactor of Catalyst soon (post 3.4) and then we will integrate this with Spark's exception and error framework.

[GitHub] [spark] hvanhovell closed pull request #40168: [SPARK-42573][Connect][Scala] Enable binary compatibility tests on all major client APIs

2023-02-25 Thread via GitHub
hvanhovell closed pull request #40168: [SPARK-42573][Connect][Scala] Enable binary compatibility tests on all major client APIs URL: https://github.com/apache/spark/pull/40168

[GitHub] [spark] hvanhovell commented on pull request #40168: [SPARK-42573][Connect][Scala] Enable binary compatibility tests on all major client APIs

2023-02-25 Thread via GitHub
hvanhovell commented on PR #40168: URL: https://github.com/apache/spark/pull/40168#issuecomment-1445170286 Merging, thanks for doing this!

[GitHub] [spark] bjornjorgensen commented on pull request #40171: [Tests] Refactor TPCH schema to separate file similar to TPCDS for code reuse

2023-02-25 Thread via GitHub
bjornjorgensen commented on PR #40171: URL: https://github.com/apache/spark/pull/40171#issuecomment-1445168808 [priv...@spark.apache.org](mailto:priv...@spark.apache.org)

[GitHub] [spark] srowen commented on pull request #37817: [SPARK-40376][PYTHON] Avoid Numpy deprecation warning

2023-02-25 Thread via GitHub
srowen commented on PR #37817: URL: https://github.com/apache/spark/pull/37817#issuecomment-1445157668 Also merged to 3.3

[GitHub] [spark] Kilo59 commented on pull request #29591: [SPARK-32714][PYTHON] Initial pyspark-stubs port.

2023-02-25 Thread via GitHub
Kilo59 commented on PR #29591: URL: https://github.com/apache/spark/pull/29591#issuecomment-1445147815 Has anyone solved the problem of trying to type-check pyspark code without installing the 200+MB pyspark package?

[GitHub] [spark] srowen commented on pull request #40116: [WIP]SPARK-41391 Fix

2023-02-25 Thread via GitHub
srowen commented on PR #40116: URL: https://github.com/apache/spark/pull/40116#issuecomment-1445132726 Eh, this does not explain the issue at all. Please do so.

[GitHub] [spark] vicennial commented on a diff in pull request #40147: [SPARK-42543][CONNECT] Specify protocol for UDF artifact transfer in JVM/Scala client

2023-02-25 Thread via GitHub
vicennial commented on code in PR #40147: URL: https://github.com/apache/spark/pull/40147#discussion_r1117931152 ## connector/connect/common/src/main/protobuf/spark/connect/base.proto: ## @@ -183,6 +183,87 @@ message ExecutePlanResponse { } } +// Request to transfer

[GitHub] [spark] vicennial commented on a diff in pull request #40147: [SPARK-42543][CONNECT] Specify protocol for UDF artifact transfer in JVM/Scala client

2023-02-25 Thread via GitHub
vicennial commented on code in PR #40147: URL: https://github.com/apache/spark/pull/40147#discussion_r1117927851 ## connector/connect/common/src/main/protobuf/spark/connect/base.proto: ## @@ -183,6 +183,60 @@ message ExecutePlanResponse { } } +// Request to transfer

[GitHub] [spark] AlanBateman commented on pull request #39909: [SPARK-42369][CORE] Fix constructor for java.nio.DirectByteBuffer

2023-02-25 Thread via GitHub
AlanBateman commented on PR #39909: URL: https://github.com/apache/spark/pull/39909#issuecomment-1445042515 It's the same issue for JDK 8, where it was completely unsupported to make use of the JDK-private constructor. From the JDK side, all we can do is strongly encourage the Spark project

[GitHub] [spark] HeartSaVioR closed pull request #40163: [SPARK-42567][SS][SQL] Track load time for state store provider and log warning if it exceeds threshold

2023-02-25 Thread via GitHub
HeartSaVioR closed pull request #40163: [SPARK-42567][SS][SQL] Track load time for state store provider and log warning if it exceeds threshold URL: https://github.com/apache/spark/pull/40163

[GitHub] [spark] HeartSaVioR commented on pull request #40163: [SPARK-42567][SS][SQL] Track load time for state store provider and log warning if it exceeds threshold

2023-02-25 Thread via GitHub
HeartSaVioR commented on PR #40163: URL: https://github.com/apache/spark/pull/40163#issuecomment-1445041909 Thanks! Merging to master.

[GitHub] [spark] HeartSaVioR closed pull request #40162: [SPARK-42566][SS] RocksDB StateStore lock acquisition should happen after getting input iterator from inputRDD

2023-02-25 Thread via GitHub
HeartSaVioR closed pull request #40162: [SPARK-42566][SS] RocksDB StateStore lock acquisition should happen after getting input iterator from inputRDD URL: https://github.com/apache/spark/pull/40162

[GitHub] [spark] HeartSaVioR commented on pull request #40162: [SPARK-42566][SS] RocksDB StateStore lock acquisition should happen after getting input iterator from inputRDD

2023-02-25 Thread via GitHub
HeartSaVioR commented on PR #40162: URL: https://github.com/apache/spark/pull/40162#issuecomment-1445041550 Thanks! Merging to master.

[GitHub] [spark] Surbhi-Vijay commented on pull request #40171: [Tests] Refactor TPCH schema to separate file similar to TPCDS for code reuse

2023-02-25 Thread via GitHub
Surbhi-Vijay commented on PR #40171: URL: https://github.com/apache/spark/pull/40171#issuecomment-1445041525 Can someone please help create an issue for this pull request? I do not have an ASF Jira account.

[GitHub] [spark] Surbhi-Vijay opened a new pull request, #40171: [Tests] Refactor TPCH schema to separate file similar to TPCDS for code reuse

2023-02-25 Thread via GitHub
Surbhi-Vijay opened a new pull request, #40171: URL: https://github.com/apache/spark/pull/40171 ### What changes were proposed in this pull request? Changes are only in tests files. - Refactored the `TPCHBase` class and created `TPCHSchema` similar to TPCDS. - Created a base class