Re: [PR] [SPARK-47118][BUILD][CORE][SQL][UI] Migrate from Jetty 10 to Jetty 11 [spark]

2024-05-10 Thread via GitHub


pan3793 commented on PR #45154:
URL: https://github.com/apache/spark/pull/45154#issuecomment-2105548275

   I raised SPARK-48238. I have no solution yet other than reverting the javax =>
jakarta namespace migration.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48210][DOC]Modify the description of whether dynamic partition… [spark]

2024-05-10 Thread via GitHub


guixiaowen commented on PR #46496:
URL: https://github.com/apache/spark/pull/46496#issuecomment-2105515304

   > sorry this description is very hard to read. I think you are trying to say 
that stage level scheduling isn't supported on k8s and yarn when dynamic 
allocation is disabled? When you say dynamic partitions do you mean dynamic 
allocation of executors?
   > 
   > That is not true we added support - 
https://issues.apache.org/jira/browse/SPARK-45495
   > 
   > The description as is:
   > 
   > > * When dynamic allocation is disabled: It allows users to specify 
different task resource requirements at the stage level and will use the same 
executors requested at startup.
   > 
   > Is correct. When dynamic allocation is off, you can specify a different 
resource profile that changes the task requirements but uses the existing 
executors.
   
   @tgravescs  Thank you for your reply.
   
   Could you help me check my configuration, code, and the returned error message?
   
   
   set spark.dynamicAllocation.enabled=false
   
   ```scala
   import org.apache.spark.resource.ResourceProfileBuilder
   import org.apache.spark.resource.TaskResourceRequests

   val rdd = sc.range(0, 9)
   val rpBuilder = new ResourceProfileBuilder()
   val taskReqs = new TaskResourceRequests().cpus(1)
   val rp = rpBuilder.require(taskReqs).build
   rdd.withResources(rp)
   ```
   
   Returned error:

   ```
   org.apache.spark.SparkException: TaskResourceProfiles are only supported for Standalone cluster for now when dynamic allocation is disabled.
     at org.apache.spark.resource.ResourceProfileManager.isSupported(ResourceProfileManager.scala:71)
     at org.apache.spark.resource.ResourceProfileManager.addResourceProfile(ResourceProfileManager.scala:126)
     at org.apache.spark.rdd.RDD.withResources(RDD.scala:1829)
     ... 48 elided
   ```
   According to the documentation, does '**specify different task resource 
requirements**' mean setting `TaskResourceRequests`?

   > When dynamic allocation is disabled: It allows users to specify different 
task resource requirements at the stage level and will use the same executors 
requested at startup.
   
   Is there a problem with my understanding? 
   Can you help me answer this? Thank you.
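   For context, here is a rough Python sketch (not Spark's actual Scala code, which lives in `ResourceProfileManager.isSupported`) of the check implied by the error message above. Names and parameters are illustrative only:

   ```python
   # Hedged sketch: approximates the check behind the error message above.
   # Names and parameters here are illustrative, not Spark's actual API.
   def check_task_resource_profile_supported(cluster_manager: str,
                                             dynamic_allocation_enabled: bool) -> None:
       # On the Spark version producing this error, TaskResourceProfiles with
       # dynamic allocation disabled are accepted only on Standalone clusters.
       if not dynamic_allocation_enabled and cluster_manager != "standalone":
           raise RuntimeError(
               "TaskResourceProfiles are only supported for Standalone "
               "cluster for now when dynamic allocation is disabled.")

   check_task_resource_profile_supported("standalone", False)  # passes without raising
   ```

   This matches the stack trace above on the version being run; SPARK-45495 (mentioned earlier in the thread) changes this behavior in later versions.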
   
   
   

   
   





Re: [PR] [SPARK-48172][SQL] Fix escaping issues in JDBC Dialects [spark]

2024-05-10 Thread via GitHub


beliefer commented on PR #46437:
URL: https://github.com/apache/spark/pull/46437#issuecomment-2105511312

   LGTM except @cloud-fan 's comment.





Re: [PR] [SPARK-48172][SQL] Fix escaping issues in JDBC Dialects [spark]

2024-05-10 Thread via GitHub


beliefer commented on code in PR #46437:
URL: https://github.com/apache/spark/pull/46437#discussion_r1597343529


##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/util/V2ExpressionSQLBuilder.java:
##
@@ -65,7 +68,6 @@ protected String escapeSpecialCharsForLikePattern(String str) 
{
   switch (c) {
 case '_' -> builder.append("\\_");
 case '%' -> builder.append("\\%");
-case '\'' -> builder.append("\\\'");

Review Comment:
   Got it.
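   For readers following along, a hedged Python sketch of the escaping behavior after this change (the real code is the Java `escapeSpecialCharsForLikePattern`; the function name here is illustrative): only the LIKE wildcards `_` and `%` are escaped, and the single quote is left to the string-literal quoting layer, which is why the `'\''` case was removed.

   ```python
   def escape_like_pattern(s: str) -> str:
       # Escape only the LIKE wildcards; quoting of the surrounding string
       # literal (including any single quotes) is handled separately.
       out = []
       for c in s:
           if c in ("_", "%"):
               out.append("\\" + c)
           else:
               out.append(c)
       return "".join(out)

   escape_like_pattern("50%_off")  # escapes to 50\%\_off
   escape_like_pattern("it's")    # unchanged: the quote is not escaped here
   ```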






[PR] [Only Test] test jaxb-runtime [spark]

2024-05-10 Thread via GitHub


panbingkun opened a new pull request, #46533:
URL: https://github.com/apache/spark/pull/46533

   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   





Re: [PR] [SPARK-48237][BUILD] Clean up `dev/pr-deps` at the end of `test-dependencies.sh` script [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun closed pull request #46531: [SPARK-48237][BUILD] Clean up 
`dev/pr-deps` at the end of `test-dependencies.sh` script
URL: https://github.com/apache/spark/pull/46531





Re: [PR] [TEST Only] test sbt 1.10.0 performance without local maven repo cache [spark]

2024-05-10 Thread via GitHub


panbingkun commented on PR #46532:
URL: https://github.com/apache/spark/pull/46532#issuecomment-2105497685

   I have narrowed the impact of not using the `local maven repo cache` down to 
the `MIMA test` step of the `lint` job.





Re: [PR] [SPARK-48220][PYTHON] Allow passing PyArrow Table to createDataFrame() [spark]

2024-05-10 Thread via GitHub


ianmcook commented on code in PR #46529:
URL: https://github.com/apache/spark/pull/46529#discussion_r1597327784


##
python/pyspark/sql/pandas/conversion.py:
##
@@ -360,42 +361,52 @@ def createDataFrame(  # type: ignore[misc]
 
 assert isinstance(self, SparkSession)
 

Review Comment:
   The git diff makes it difficult to see what I did here. I moved a few lines 
around at the top of this function and added conditional blocks below:
   
   ```py
   if type(data).__name__ == "Table":
 # new code here to handle PyArrow Table
   else:
 # existing code here to handle PandasDataFrameLike
   ```
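   As a hedged illustration of the dispatch described above (assuming the name-based check is meant to avoid a hard pyarrow import; this is a sketch, not the PR's actual code):

   ```python
   def dispatch_create_dataframe(data):
       # Checking the class name avoids importing pyarrow when it is not
       # installed; any object whose type is named "Table" takes this path.
       if type(data).__name__ == "Table":
           return "pyarrow-table path"
       return "pandas path"

   class Table:  # stands in for pyarrow.Table in this sketch
       pass

   dispatch_create_dataframe(Table())  # takes the pyarrow-table path
   ```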






Re: [PR] [TEST Only] test sbt 1.10.0 performance without local maven repo cache [spark]

2024-05-10 Thread via GitHub


panbingkun commented on PR #46532:
URL: https://github.com/apache/spark/pull/46532#issuecomment-2105493606

   cc @LuciferYang @dongjoon-hyun 





Re: [PR] [SPARK-48237][BUILD] After executing `test-dependencies.sh`, the dir `dev/pr-deps` should be deleted [spark]

2024-05-10 Thread via GitHub


panbingkun commented on PR #46531:
URL: https://github.com/apache/spark/pull/46531#issuecomment-2105481077

   > Please file a JIRA issue, @panbingkun .
   
   Done, thanks!





Re: [PR] [SPARK-48155][SQL] AQEPropagateEmptyRelation for join should check if remain child is just BroadcastQueryStageExec [spark]

2024-05-10 Thread via GitHub


AngersZh commented on PR #46523:
URL: https://github.com/apache/spark/pull/46523#issuecomment-2105466110

   > Do you think you can provide test cases for this, @AngersZh ?
   
   Added a new UT to show the difference. Please take a look again, @dongjoon-hyun 





Re: [PR] [MINOR] After executing `test-dependencies.sh`, the dir `dev/pr-deps` should be deleted [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on PR #46531:
URL: https://github.com/apache/spark/pull/46531#issuecomment-2105466841

   Please file a JIRA issue, @panbingkun .





Re: [PR] [SPARK-46937][SQL] Improve concurrency performance for FunctionRegistry [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on PR #44976:
URL: https://github.com/apache/spark/pull/44976#issuecomment-2105463595

   Thank you, @beliefer 





Re: [PR] [SPARK-48230][BUILD] Remove unused `jodd-core` [spark]

2024-05-10 Thread via GitHub


pan3793 commented on PR #46520:
URL: https://github.com/apache/spark/pull/46520#issuecomment-2105463104

   @dongjoon-hyun Okay, since this fails the CI, we should revert the dependency 
removal first and do more investigation later.
   
   As for supporting "legacy Hive UDF jars": if a user imports classes that come 
from Hive's transitive deps, I think Spark is not responsible for handling that.
   
   For example, Spark only includes part of Hive's deps (I mean all the jars 
shipped in the Hive binary tgz). Hive ships `groovy-all-2.4.4.jar` but Spark 
does not, so if a user's UDF imports classes from `groovy-all-2.4.4.jar`, it 
should fail on Spark with ClassNotFound; in that case the user should add the 
third-party deps themselves.
   
   I believe jodd, commons-lang 2.x, and jackson 1.x are in the same position.





Re: [PR] [TEST Only] test sbt 1.10.0 performance without local maven repo cache [spark]

2024-05-10 Thread via GitHub


panbingkun commented on PR #46532:
URL: https://github.com/apache/spark/pull/46532#issuecomment-2105461805

   In the `mima` step:
   1. with local maven repo cache:
   
   
   
   2. without local maven repo cache:





[PR] [TEST Only] test sbt 1.10.0 performance without local maven repo cache [spark]

2024-05-10 Thread via GitHub


panbingkun opened a new pull request, #46532:
URL: https://github.com/apache/spark/pull/46532

   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   











Re: [PR] [SPARK-46937][SQL] Improve concurrency performance for FunctionRegistry [spark]

2024-05-10 Thread via GitHub


beliefer commented on PR #44976:
URL: https://github.com/apache/spark/pull/44976#issuecomment-2105457208

   Let me rebase again.





Re: [PR] [SPARK-48155][SQL] AQEPropagateEmptyRelation for join should check if remain child is just BroadcastQueryStageExec [spark]

2024-05-10 Thread via GitHub


AngersZh commented on PR #46523:
URL: https://github.com/apache/spark/pull/46523#issuecomment-2105441711

   > Do you think you can provide test cases for this, @AngersZh ?
   
   The existing test `SPARK-39551: Invalid plan check - invalid broadcast query 
stage` can cover this. I don't know whether we need to remove the 
`ValidateSparkPlan` rule; it seems too weird and rough.





Re: [PR] [SPARK-48220][PYTHON] Allow passing PyArrow Table to createDataFrame() [spark]

2024-05-10 Thread via GitHub


ianmcook commented on code in PR #46529:
URL: https://github.com/apache/spark/pull/46529#discussion_r1597327784


##
python/pyspark/sql/pandas/conversion.py:
##
@@ -360,42 +361,52 @@ def createDataFrame(  # type: ignore[misc]
 
 assert isinstance(self, SparkSession)
 

Review Comment:
   The git diff makes it difficult to see what I did here. I just moved a few 
lines around at the top of this function and added conditional blocks below:
   
   ```py
   if isinstance(data, pa.Table):
 # some new code here
   else:
 # the existing code here
   ```






Re: [PR] [MINOR] After executing `test-dependencies.sh`, the dir `dev/pr-deps` should be deleted [spark]

2024-05-10 Thread via GitHub


panbingkun commented on PR #46531:
URL: https://github.com/apache/spark/pull/46531#issuecomment-2105431329

   cc @dongjoon-hyun 





Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-10 Thread via GitHub


panbingkun commented on code in PR #46493:
URL: https://github.com/apache/spark/pull/46493#discussion_r1597322128


##
common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala:
##
@@ -292,6 +301,7 @@ object LogKeys {
   case object LOADED_VERSION extends LogKey
   case object LOAD_FACTOR extends LogKey
   case object LOAD_TIME extends LogKey
+  case object LOCAL_DIRS extends LogKey

Review Comment:
   Hmm, the following uses it. Should we merge it into `PATH`?
   (screenshot: https://github.com/apache/spark/assets/15246973/c62dd763-969b-46c3-b4be-8b6eb84d4395)
   






Re: [PR] [SPARK-48031][SQL] Support view schema evolution [spark]

2024-05-10 Thread via GitHub


srielau commented on code in PR #46267:
URL: https://github.com/apache/spark/pull/46267#discussion_r1597322015


##
sql/core/src/main/scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala:
##
@@ -224,6 +224,7 @@ abstract class BaseSessionStateBuilder(
 TableCapabilityCheck +:
 CommandCheck +:
 CollationCheck +:
+ViewSyncSchemaToMetaStore +:

Review Comment:
   There already (?)






[PR] [MINOR] After executing `test-dependencies.sh`, the dir `dev/pr-deps` should be deleted [spark]

2024-05-10 Thread via GitHub


panbingkun opened a new pull request, #46531:
URL: https://github.com/apache/spark/pull/46531

   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   
   ### How was this patch tested?
   Manually test.
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No.
   





Re: [PR] [SPARK-48232][PYTHON][TESTS] Fix 'pyspark.sql.tests.connect.test_connect_session' in Python 3.12 build [spark]

2024-05-10 Thread via GitHub


HyukjinKwon commented on PR #46522:
URL: https://github.com/apache/spark/pull/46522#issuecomment-2105404809

   It's recovered: https://github.com/apache/spark/actions/runs/9036829724 for 
the record.





Re: [PR] [WIP] docs: restructure the docs index page [spark]

2024-05-10 Thread via GitHub


github-actions[bot] closed pull request #44812: [WIP] docs: restructure the 
docs index page
URL: https://github.com/apache/spark/pull/44812





Re: [PR] [SPARK-45708][BUILD] Retry mvn deploy [spark]

2024-05-10 Thread via GitHub


github-actions[bot] closed pull request #43559: [SPARK-45708][BUILD] Retry mvn 
deploy
URL: https://github.com/apache/spark/pull/43559





Re: [PR] [SPARK-46108][SQL] keepInnerXmlAsRaw option for Built-in XML Data Source [spark]

2024-05-10 Thread via GitHub


github-actions[bot] closed pull request #44022: [SPARK-46108][SQL] 
keepInnerXmlAsRaw option for Built-in XML Data Source 
URL: https://github.com/apache/spark/pull/44022





Re: [PR] [SPARK-46912] Use worker JAVA_HOME and SPARK_HOME instead of from submitter [spark]

2024-05-10 Thread via GitHub


github-actions[bot] commented on PR #44943:
URL: https://github.com/apache/spark/pull/44943#issuecomment-2105402855

   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!





Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-10 Thread via GitHub


gengliangwang commented on code in PR #46493:
URL: https://github.com/apache/spark/pull/46493#discussion_r1597304010


##
common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala:
##
@@ -292,6 +301,7 @@ object LogKeys {
   case object LOADED_VERSION extends LogKey
   case object LOAD_FACTOR extends LogKey
   case object LOAD_TIME extends LogKey
+  case object LOCAL_DIRS extends LogKey

Review Comment:
   we don't need this anymore






Re: [PR] [SPARK-48205][SQL][FOLLOWUP] Add missing tags for the dataSource API [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on PR #46530:
URL: https://github.com/apache/spark/pull/46530#issuecomment-2105381256

   All compilations passed.
   Merged to master for Apache Spark 4.0.0.





Re: [PR] [SPARK-48205][SQL][FOLLOWUP] Add missing tags for the dataSource API [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun closed pull request #46530: [SPARK-48205][SQL][FOLLOWUP] Add 
missing tags for the dataSource API
URL: https://github.com/apache/spark/pull/46530





Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-10 Thread via GitHub


panbingkun commented on PR #46493:
URL: https://github.com/apache/spark/pull/46493#issuecomment-2105375831

   > LGTM overall. Thanks for the work!
   
   Updated, thanks! ❤️





Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-10 Thread via GitHub


panbingkun commented on code in PR #46493:
URL: https://github.com/apache/spark/pull/46493#discussion_r1597295650


##
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java:
##
@@ -214,7 +219,9 @@ public ManagedBuffer getRddBlockData(
* this method.
*/
   public void applicationRemoved(String appId, boolean cleanupLocalDirs) {
-logger.info("Application {} removed, cleanupLocalDirs = {}", appId, 
cleanupLocalDirs);
+logger.info("Application {} removed, cleanupLocalDirs = {}",
+  MDC.of(LogKeys.APP_ID$.MODULE$, appId),
+  MDC.of(LogKeys.LOCAL_DIRS$.MODULE$, cleanupLocalDirs));

Review Comment:
   Updated






Re: [PR] [SPARK-48205][FOLLOWUP] Add missing tags for the dataSource API [spark]

2024-05-10 Thread via GitHub


allisonwang-db commented on PR #46530:
URL: https://github.com/apache/spark/pull/46530#issuecomment-2105373242

   cc @dongjoon-hyun @HyukjinKwon 





[PR] [SPARK-48205][FOLLOWUP] Add missing tags for the dataSource API [spark]

2024-05-10 Thread via GitHub


allisonwang-db opened a new pull request, #46530:
URL: https://github.com/apache/spark/pull/46530

   
   
   ### What changes were proposed in this pull request?
   
   This is a follow-up PR for https://github.com/apache/spark/pull/46487 to add 
missing tags for the `dataSource` API.
   
   
   ### Why are the changes needed?
   
   To address comments from a previous PR.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   Existing test
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No





Re: [PR] [SPARK-48236][BUILD] Add `commons-lang:commons-lang:2.6` back to support legacy Hive UDF jars [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on PR #46528:
URL: https://github.com/apache/spark/pull/46528#issuecomment-2105356804

   Merged to master for Apache Spark 4.0.0.





Re: [PR] [SPARK-48205][PYTHON] Remove the private[sql] modifier for Python data sources [spark]

2024-05-10 Thread via GitHub


allisonwang-db commented on code in PR #46487:
URL: https://github.com/apache/spark/pull/46487#discussion_r1597284865


##
sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala:
##
@@ -234,7 +234,7 @@ class SparkSession private(
   /**
* A collection of methods for registering user-defined data sources.
*/
-  private[sql] def dataSource: DataSourceRegistration = 
sessionState.dataSourceRegistration

Review Comment:
   Will do!






Re: [PR] [SPARK-48236][BUILD] Add `commons-lang:commons-lang:2.6` back to support legacy Hive UDF jars [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun closed pull request #46528: [SPARK-48236][BUILD] Add 
`commons-lang:commons-lang:2.6` back to support legacy Hive UDF jars
URL: https://github.com/apache/spark/pull/46528





Re: [PR] [SPARK-48230][BUILD] Remove unused `jodd-core` [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on PR #46520:
URL: https://github.com/apache/spark/pull/46520#issuecomment-2105347143

   For the existing Hive UDFs which assume the `jodd` library, this could be a 
breaking change like #46528.
   ```
   - import jodd.datetime.JDateTime;
   + import org.apache.hadoop.hive.ql.io.parquet.timestamp.datetime.JDateTime;
   ```
   
   Sorry but let me revert this first. We need a more graceful way of removing 
or additional verified migration path (documentation and config).





Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-10 Thread via GitHub


gengliangwang commented on PR #46493:
URL: https://github.com/apache/spark/pull/46493#issuecomment-2105340839

   LGTM overall. Thanks for the work!





Re: [PR] [SPARK-48236][BUILD] Add `commons-lang:commons-lang:2.6` back to support legacy Hive UDF jars [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on PR #46528:
URL: https://github.com/apache/spark/pull/46528#issuecomment-2105335928

   Thank you, @viirya !





Re: [PR] [SPARK-48236][BUILD] Add `commons-lang:commons-lang:2.6` back to support legacy Hive UDF jars [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on code in PR #46528:
URL: https://github.com/apache/spark/pull/46528#discussion_r1597270600


##
pom.xml:
##
@@ -192,6 +192,8 @@
 1.17.0
 1.26.1
 2.16.1
+
+2.6

Review Comment:
   Thank you. I updated the comment.






Re: [PR] [SPARK-48236][BUILD] Add `commons-lang:commons-lang:2.6` back to support legacy Hive UDF jars [spark]

2024-05-10 Thread via GitHub


viirya commented on code in PR #46528:
URL: https://github.com/apache/spark/pull/46528#discussion_r1597269950


##
pom.xml:
##
@@ -192,6 +192,8 @@
 1.17.0
 1.26.1
 2.16.1
+
+2.6

Review Comment:
    






[PR] [SPARK-48220][PYTHON] Allow passing PyArrow Table to createDataFrame() [spark]

2024-05-10 Thread via GitHub


ianmcook opened a new pull request, #46529:
URL: https://github.com/apache/spark/pull/46529

   ### What changes were proposed in this pull request?
   This PR adds support for passing PyArrow tables to `createDataFrame()`.
   
   ### Why are the changes needed?
   This seems like a logical next step after the addition of a `toArrow()` 
DataFrame method in #45481.
   
   ### Does this PR introduce _any_ user-facing change?
   
   TBD
   
   ### How was this patch tested?
   
   TBD
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No





Re: [PR] [SPARK-48230][BUILD] Remove unused `jodd-core` [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on PR #46520:
URL: https://github.com/apache/spark/pull/46520#issuecomment-2105328073

   Hi, @pan3793 .
   
   It seems that we need to re-evaluate this dependency removal. Please see the 
following, which is related to `commons-lang:commons-lang`.
   
   - #46528





Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-10 Thread via GitHub


gengliangwang commented on code in PR #46493:
URL: https://github.com/apache/spark/pull/46493#discussion_r1597257739


##
common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java:
##
@@ -363,7 +367,8 @@ static MergedShuffleFileManager 
newMergedShuffleFileManagerInstance(
   return mergeManagerSubClazz.getConstructor(TransportConf.class, 
File.class)
 .newInstance(conf, mergeManagerFile);
 } catch (Exception e) {
-  defaultLogger.error("Unable to create an instance of {}", 
mergeManagerImplClassName);
+  defaultLogger.error("Unable to create an instance of {}",
+MDC.of(LogKeys.CLASS_NAME$.MODULE$, mergeManagerImplClassName));

Review Comment:
   CLASS_NAME looks good.






Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-10 Thread via GitHub


gengliangwang commented on code in PR #46493:
URL: https://github.com/apache/spark/pull/46493#discussion_r1597250327


##
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java:
##
@@ -214,7 +219,9 @@ public ManagedBuffer getRddBlockData(
* this method.
*/
   public void applicationRemoved(String appId, boolean cleanupLocalDirs) {
-logger.info("Application {} removed, cleanupLocalDirs = {}", appId, 
cleanupLocalDirs);
+logger.info("Application {} removed, cleanupLocalDirs = {}",
+  MDC.of(LogKeys.APP_ID$.MODULE$, appId),
+  MDC.of(LogKeys.LOCAL_DIRS$.MODULE$, cleanupLocalDirs));

Review Comment:
   cleanupLocalDirs is a boolean. let's just use `CLEANUP_LOCAL_DIRS`
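
   As a rough illustration of the pattern under review, the sketch below mimics 
the key/value idea behind the structured logging migration with a hypothetical 
`MDC` record and `render` helper (this is not Spark's actual 
`org.apache.spark.internal` API): each placeholder carries a stable key such as 
`APP_ID` or `CLEANUP_LOCAL_DIRS` in addition to its value, so one log call can 
yield both a plain message and structured fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified, hypothetical sketch of structured logging with keyed placeholders.
// Names (MDC, APP_ID, CLEANUP_LOCAL_DIRS) only mirror the discussion above.
public class MdcSketch {
    // Each placeholder carries a stable key plus its value.
    record MDC(String key, Object value) {}

    // Substitute "{}" placeholders in order, while also collecting the
    // key/value pairs so they could be emitted as structured JSON fields.
    static String render(String template, MDC... mdcs) {
        StringBuilder out = new StringBuilder(template);
        Map<String, Object> fields = new LinkedHashMap<>();
        for (MDC mdc : mdcs) {
            int i = out.indexOf("{}");
            if (i >= 0) out.replace(i, i + 2, String.valueOf(mdc.value()));
            fields.put(mdc.key(), mdc.value());
        }
        return out + " " + fields;
    }

    public static void main(String[] args) {
        System.out.println(render("Application {} removed, cleanupLocalDirs = {}",
            new MDC("APP_ID", "app-42"),
            new MDC("CLEANUP_LOCAL_DIRS", true)));
    }
}
```

   The same call site produces a human-readable line plus a field map, which is 
the motivation for replacing bare `{}` arguments with keyed ones.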






Re: [PR] [SPARK-48236][BUILD] Add `commons-lang:commons-lang:2.6` back to support legacy Hive UDF jars [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on PR #46528:
URL: https://github.com/apache/spark/pull/46528#issuecomment-2105298144

   WDYT, cc @pan3793 , @sunchao , @LuciferYang , @yaooqinn , @viirya ?





[PR] [SPARK-48236][BUILD] Add `commons-lang:commons-lang:2.6` back to support legacy Hive UDF jars [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun opened a new pull request, #46528:
URL: https://github.com/apache/spark/pull/46528

   …
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   





Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-10 Thread via GitHub


gengliangwang commented on code in PR #46493:
URL: https://github.com/apache/spark/pull/46493#discussion_r1597243236


##
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java:
##
@@ -368,7 +382,8 @@ public int removeBlocks(String appId, String execId, 
String[] blockIds) {
   if (file.delete()) {
 numRemovedBlocks++;
   } else {
-logger.warn("Failed to delete block: " + file.getAbsolutePath());
+logger.warn("Failed to delete block: {}",

Review Comment:
   yes, path is fine






Re: [PR] [SPARK-47018][BUILD][SQL] Bump built-in Hive to 2.3.10 [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on PR #46468:
URL: https://github.com/apache/spark/pull/46468#issuecomment-2105258785

   It turns out that Apache Spark is unable to support all legacy Hive UDF jar 
files. Let me make a follow-up.
   - https://github.com/apache/hive/pull/4892





Re: [PR] [SPARK-47018][BUILD][SQL] Bump built-in Hive to 2.3.10 [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on PR #46468:
URL: https://github.com/apache/spark/pull/46468#issuecomment-2105239656

   I locally verified that the failure of `HiveUDFDynamicLoadSuite` is 
consistent.
   ```
   $ build/mvn -Dtest=none 
-DwildcardSuites=org.apache.spark.sql.hive.HiveUDFDynamicLoadSuite test
   - Spark should be able to run Hive UDF using jar regardless of current 
thread context classloader (UDF
   13:33:42.810 WARN org.apache.spark.SparkContext: The JAR 
file:///Users/dongjoon/APACHE/spark-merge/sql/hive/src/test/noclasspath/hive-test-udfs.jar
 at spark://localhost:55031/jars/hive-test-udfs.jar has been added already. 
Overwriting of added jar is not supported in the current version.
   
   *** RUN ABORTED ***
   A needed class was not found. This could be due to an error in your runpath. 
Missing class: org/apache/commons/lang/StringUtils
 java.lang.NoClassDefFoundError: org/apache/commons/lang/StringUtils
 at 
org.apache.hadoop.hive.contrib.udf.example.GenericUDFTrim2.performOp(GenericUDFTrim2.java:43)
 at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDFBaseTrim.evaluate(GenericUDFBaseTrim.java:75)
 at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:170)
 at 
org.apache.spark.sql.hive.HiveGenericUDFEvaluator.returnInspector$lzycompute(hiveUDFEvaluators.scala:118)
 at 
org.apache.spark.sql.hive.HiveGenericUDFEvaluator.returnInspector(hiveUDFEvaluators.scala:117)
 at 
org.apache.spark.sql.hive.HiveGenericUDF.dataType$lzycompute(hiveUDFs.scala:132)
 at org.apache.spark.sql.hive.HiveGenericUDF.dataType(hiveUDFs.scala:132)
 at 
org.apache.spark.sql.hive.HiveUDFExpressionBuilder$.makeHiveFunctionExpression(HiveSessionStateBuilder.scala:184)
 at 
org.apache.spark.sql.hive.HiveUDFExpressionBuilder$.$anonfun$makeExpression$1(HiveSessionStateBuilder.scala:164)
 at org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:185)
 ...
 Cause: java.lang.ClassNotFoundException: 
org.apache.commons.lang.StringUtils
 at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:445)
 at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:592)
 at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525)
 at 
org.apache.hadoop.hive.contrib.udf.example.GenericUDFTrim2.performOp(GenericUDFTrim2.java:43)
 at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDFBaseTrim.evaluate(GenericUDFBaseTrim.java:75)
 at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:170)
 at 
org.apache.spark.sql.hive.HiveGenericUDFEvaluator.returnInspector$lzycompute(hiveUDFEvaluators.scala:118)
 at 
org.apache.spark.sql.hive.HiveGenericUDFEvaluator.returnInspector(hiveUDFEvaluators.scala:117)
 at 
org.apache.spark.sql.hive.HiveGenericUDF.dataType$lzycompute(hiveUDFs.scala:132)
 at org.apache.spark.sql.hive.HiveGenericUDF.dataType(hiveUDFs.scala:132)
 ...
   ```
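
   The failure above can be reduced to a plain classpath probe: a legacy Hive 
UDF jar compiled against commons-lang 2.x breaks at evaluation time once 
`org.apache.commons.lang.StringUtils` is no longer shipped. The minimal sketch 
below (not Spark code; the class name is taken from the stack trace) checks 
loadability the same way the JVM does before throwing `NoClassDefFoundError`.

```java
// Minimal reproduction of the failure mode: on a JVM without the
// commons-lang 2.x jar on its classpath, the lookup fails.
public class LegacyDependencyCheck {
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String legacy = "org.apache.commons.lang.StringUtils";
        // Prints "missing" unless a commons-lang 2.x jar is on the classpath.
        System.out.println(legacy + (isLoadable(legacy) ? " -> present" : " -> missing"));
    }
}
```

   Pinning `commons-lang:commons-lang:2.6` back into the distribution makes the 
lookup succeed again for such legacy UDF jars.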





Re: [PR] [SPARK-48206][SQL][TESTS] Add tests for window rewrites with RewriteWithExpression [spark]

2024-05-10 Thread via GitHub


kelvinjian-db commented on PR #46492:
URL: https://github.com/apache/spark/pull/46492#issuecomment-2105225037

   fixed





Re: [PR] [SPARK-48229][SQL] Add collation support for inputFile expressions [spark]

2024-05-10 Thread via GitHub


amaliujia commented on code in PR #46503:
URL: https://github.com/apache/spark/pull/46503#discussion_r1597190775


##
sql/core/src/test/scala/org/apache/spark/sql/CollationSQLExpressionsSuite.scala:
##
@@ -1151,6 +1151,23 @@ class CollationSQLExpressionsSuite
 })
   }
 
+  test("Support InputFileName expression with collation") {
+// Supported collations
+Seq("UTF8_BINARY", "UTF8_BINARY_LCASE", "UNICODE", 
"UNICODE_CI").foreach(collationName => {

Review Comment:
   Stopping by this PR; maybe this was discussed a while ago. Just curious: why 
are these possible `collationName` values not listed in an enum?






Re: [PR] [SPARK-47018][BUILD][SQL] Bump built-in Hive to 2.3.10 [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on code in PR #46468:
URL: https://github.com/apache/spark/pull/46468#discussion_r1597143409


##
dev/deps/spark-deps-hadoop-3-hive-2.3:
##
@@ -46,7 +46,6 @@ commons-compress/1.26.1//commons-compress-1.26.1.jar
 commons-crypto/1.1.0//commons-crypto-1.1.0.jar
 commons-dbcp/1.4//commons-dbcp-1.4.jar
 commons-io/2.16.1//commons-io-2.16.1.jar
-commons-lang/2.6//commons-lang-2.6.jar

Review Comment:
   This seems to be too hasty or to have a bug.






Re: [PR] [SPARK-47118][BUILD][CORE][SQL][UI] Migrate from Jetty 10 to Jetty 11 [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on PR #45154:
URL: https://github.com/apache/spark/pull/45154#issuecomment-2105092801

   Please file a proper JIRA issue, @pan3793 .





Re: [PR] [SPARK-47118][BUILD][CORE][SQL][UI] Migrate from Jetty 10 to Jetty 11 [spark]

2024-05-10 Thread via GitHub


pan3793 commented on PR #45154:
URL: https://github.com/apache/spark/pull/45154#issuecomment-2105051891

   also cc the 4.0.0-preview1 release manager @cloud-fan 





Re: [PR] [SPARK-48144][SQL] Fix `canPlanAsBroadcastHashJoin` to respect shuffle join hints [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on PR #46401:
URL: https://github.com/apache/spark/pull/46401#issuecomment-2105050202

   Merged to master for Apache Spark 4.0.0.
   Thank you, @fred-db , @agubichev,  @cloud-fan .





Re: [PR] [SPARK-48144][SQL] Fix `canPlanAsBroadcastHashJoin` to respect shuffle join hints [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun closed pull request #46401: [SPARK-48144][SQL] Fix 
`canPlanAsBroadcastHashJoin` to respect shuffle join hints
URL: https://github.com/apache/spark/pull/46401





Re: [PR] [SPARK-47118][BUILD][CORE][SQL][UI] Migrate from Jetty 10 to Jetty 11 [spark]

2024-05-10 Thread via GitHub


pan3793 commented on PR #45154:
URL: https://github.com/apache/spark/pull/45154#issuecomment-2105048194

   @HiuKwok @dongjoon-hyun I meet an issue on the latest master branch that 
seems related to this change. The brief reproduction steps are:
   
   1. make dist `dev/make-distribution.sh --tgz -Phive,hive-thriftserver,yarn`
   2. set up yarn/hadoop conf, and run `spark-sql --master=yarn`
   
   ```
   2024-05-10 17:36:32 ERROR SparkContext: Error initializing SparkContext.
   org.sparkproject.jetty.util.MultiException: Multiple exceptions
at 
org.sparkproject.jetty.util.MultiException.ifExceptionThrow(MultiException.java:117)
 ~[spark-core_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.sparkproject.jetty.servlet.ServletHandler.initialize(ServletHandler.java:751)
 ~[spark-core_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.sparkproject.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:392)
 ~[spark-core_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.sparkproject.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:902)
 ~[spark-core_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.sparkproject.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:306)
 ~[spark-core_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.sparkproject.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:93)
 ~[spark-core_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.spark.ui.ServerInfo.addHandler(JettyUtils.scala:514) 
~[spark-core_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.spark.ui.SparkUI.$anonfun$attachAllHandlers$2(SparkUI.scala:81) 
~[spark-core_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.spark.ui.SparkUI.$anonfun$attachAllHandlers$2$adapted(SparkUI.scala:81)
 ~[spark-core_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:619) 
~[scala-library-2.13.13.jar:?]
at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:617) 
~[scala-library-2.13.13.jar:?]
at scala.collection.AbstractIterable.foreach(Iterable.scala:935) 
~[scala-library-2.13.13.jar:?]
at 
org.apache.spark.ui.SparkUI.$anonfun$attachAllHandlers$1(SparkUI.scala:81) ~[spark-core_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.spark.ui.SparkUI.$anonfun$attachAllHandlers$1$adapted(SparkUI.scala:79) ~[spark-core_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at scala.Option.foreach(Option.scala:437) ~[scala-library-2.13.13.jar:?]
at org.apache.spark.ui.SparkUI.attachAllHandlers(SparkUI.scala:79) ~[spark-core_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.spark.SparkContext.$anonfun$new$31(SparkContext.scala:690) ~[spark-core_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.spark.SparkContext.$anonfun$new$31$adapted(SparkContext.scala:690) ~[spark-core_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at scala.Option.foreach(Option.scala:437) ~[scala-library-2.13.13.jar:?]
at org.apache.spark.SparkContext.<init>(SparkContext.scala:690) ~[spark-core_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2963) ~[spark-core_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:1118) ~[spark-sql_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at scala.Option.getOrElse(Option.scala:201) [scala-library-2.13.13.jar:?]
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:1112) [spark-sql_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:64) [spark-hive-thriftserver_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.<init>(SparkSQLCLIDriver.scala:405) [spark-hive-thriftserver_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:162) [spark-hive-thriftserver_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala) [spark-hive-thriftserver_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) ~[?:?]
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
at java.base/java.lang.reflect.Method.invoke(Method.java:568) ~[?:?]
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) [spark-core_2.13-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 

Re: [PR] [SPARK-48146][SQL] Fix aggregate function in With expression child assertion [spark]

2024-05-10 Thread via GitHub


kelvinjian-db commented on code in PR #46443:
URL: https://github.com/apache/spark/pull/46443#discussion_r1597045459


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/With.scala:
##
@@ -92,6 +95,21 @@ object With {
 val commonExprRefs = commonExprDefs.map(new CommonExpressionRef(_))
 With(replaced(commonExprRefs), commonExprDefs)
   }
+
+  private[sql] def childContainsUnsupportedAggExpr(withExpr: With): Boolean = {
+lazy val commonExprIds = withExpr.defs.map(_.id).toSet
+withExpr.child.exists {
+  case agg: AggregateExpression =>

Review Comment:
   I agree, or we could add some pruning-based tree traversal functions like `existsWithPruning` (similar to `transformWithPruning`).
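As background, the pruning idea mentioned above can be sketched outside of Spark with a toy tree. All names below are hypothetical, not Catalyst's actual API: an `existsWithPruning`-style helper consults a cheap per-node predicate before descending, so subtrees that cannot possibly contain a match are skipped.

```java
import java.util.List;
import java.util.function.Predicate;

// Toy tree node; `containsAgg` stands in for Spark's precomputed tree-pattern bits.
public class Node {
    final String kind;
    final boolean containsAgg;
    final List<Node> children;

    public Node(String kind, boolean containsAgg, List<Node> children) {
        this.kind = kind;
        this.containsAgg = containsAgg;
        this.children = children;
    }

    // existsWithPruning-style traversal: `prune` is a cheap check deciding
    // whether descending into this subtree can possibly find a match.
    public boolean existsWithPruning(Predicate<Node> prune, Predicate<Node> p) {
        if (!prune.test(this)) {
            return false; // whole subtree skipped
        }
        if (p.test(this)) {
            return true;
        }
        for (Node c : children) {
            if (c.existsWithPruning(prune, p)) {
                return true;
            }
        }
        return false;
    }
}
```

The payoff is the same as with `transformWithPruning`: for queries where the pattern bits say "no aggregate below here", the traversal never touches those nodes.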






Re: [PR] [SPARK-48206][SQL][TESTS] Add tests for window rewrites with RewriteWithExpression [spark]

2024-05-10 Thread via GitHub


cloud-fan commented on PR #46492:
URL: https://github.com/apache/spark/pull/46492#issuecomment-2105006573

   sorry it has conflicts now





Re: [PR] [SPARK-44811][BUILD] Upgrade Guava to 33.1.0-jre [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on code in PR #42493:
URL: https://github.com/apache/spark/pull/42493#discussion_r1597016630


##
assembly/pom.xml:
##
@@ -75,11 +75,7 @@
   ${project.version}
 
 
-
+

Review Comment:
   Please open a new JIRA and PR to discuss this.






Re: [PR] [SPARK-44811][BUILD] Upgrade Guava to 33.1.0-jre [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on code in PR #42493:
URL: https://github.com/apache/spark/pull/42493#discussion_r1597015882


##
dev/test-dependencies.sh:
##
@@ -49,7 +49,7 @@ OLD_VERSION=$($MVN -q \
     --non-recursive \
     org.codehaus.mojo:exec-maven-plugin:1.6.0:exec | grep -E '[0-9]+\.[0-9]+\.[0-9]+')
 # dependency:get for guava and jetty-io are workaround for SPARK-37302.
-GUAVA_VERSION=$(build/mvn help:evaluate -Dexpression=guava.version -q -DforceStdout | grep -E "^[0-9.]+$")
+GUAVA_VERSION=$(build/mvn help:evaluate -Dexpression=guava.version -q -DforceStdout | grep -E "^[0-9\.]+")

Review Comment:
   Please open a new PR for this, @pan3793. It's worth having an independent JIRA issue.






Re: [PR] [SPARK-45717][YARN] Avoid use `spark.yarn.user.classpath.first` [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on PR #43579:
URL: https://github.com/apache/spark/pull/43579#issuecomment-2104947313

   cc @mridulm , too





Re: [PR] [SPARK-45717][YARN] Avoid use `spark.yarn.user.classpath.first` [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on code in PR #43579:
URL: https://github.com/apache/spark/pull/43579#discussion_r1596997363


##
resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala:
##
@@ -106,7 +106,7 @@ class ClientSuite extends SparkFunSuite
 val conf = new Configuration()
 val sparkConf = new SparkConf()
   .set(SPARK_JARS, Seq(SPARK))
-  .set(USER_CLASS_PATH_FIRST, true)
+  .set(DRIVER_USER_CLASS_PATH_FIRST, true)

Review Comment:
   Please make a new test case for `DRIVER_USER_CLASS_PATH_FIRST`.






Re: [PR] [SPARK-45717][YARN] Avoid use `spark.yarn.user.classpath.first` [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on code in PR #43579:
URL: https://github.com/apache/spark/pull/43579#discussion_r1596992913


##
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:
##
@@ -1490,7 +1490,7 @@ private[spark] object Client extends Logging {
 * Populate the classpath entry in the given environment map.
 *
 * User jars are generally not added to the JVM's system classpath; those are handled by the AM
-   * and executor backend. When the deprecated `spark.yarn.user.classpath.first` is used, user jars
+   * and executor backend. When the `spark.driver.userClassPathFirst` is used, user jars

Review Comment:
   Is the new comment correct? It seems we're missing the `spark.executor.userClassPathFirst` case.






Re: [PR] [SPARK-44811][BUILD] Upgrade Guava to 33.1.0-jre [spark]

2024-05-10 Thread via GitHub


pan3793 commented on code in PR #42493:
URL: https://github.com/apache/spark/pull/42493#discussion_r1596994966


##
assembly/pom.xml:
##
@@ -75,11 +75,7 @@
   ${project.version}
 
 
-
+

Review Comment:
   One additional question: should we remove `${hadoop.deps.scope}`?






Re: [PR] [SPARK-44811][BUILD] Upgrade Guava to 33.1.0-jre [spark]

2024-05-10 Thread via GitHub


pan3793 commented on code in PR #42493:
URL: https://github.com/apache/spark/pull/42493#discussion_r1596992306


##
assembly/pom.xml:
##
@@ -75,11 +75,7 @@
   ${project.version}
 
 
-
+

Review Comment:
   the existing comments are outdated, Hadoop already switched to 
`hadoop-shaded-guava-1.2.0.jar`






Re: [PR] [SPARK-45717][YARN] Avoid use `spark.yarn.user.classpath.first` [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on code in PR #43579:
URL: https://github.com/apache/spark/pull/43579#discussion_r1596991319


##
resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala:
##
@@ -106,7 +106,7 @@ class ClientSuite extends SparkFunSuite
 val conf = new Configuration()
 val sparkConf = new SparkConf()
   .set(SPARK_JARS, Seq(SPARK))
-  .set(USER_CLASS_PATH_FIRST, true)
+  .set(DRIVER_USER_CLASS_PATH_FIRST, true)

Review Comment:
   To @cxzl25, FYI: usually, Apache Spark is reluctant to change test coverage, because the deprecated option is still used in production.






Re: [PR] [SPARK-44811][BUILD] Upgrade Guava to 33.1.0-jre [spark]

2024-05-10 Thread via GitHub


pan3793 commented on code in PR #42493:
URL: https://github.com/apache/spark/pull/42493#discussion_r1596990662


##
pom.xml:
##
@@ -3446,6 +3446,7 @@
 
   org.spark-project.spark:unused
   com.google.guava:guava
+  com.google.guava:failureaccess

Review Comment:
   I refer to the shading rules in the connect module to handle Guava 
dependencies, cc @LuciferYang, would be great if you have an opportunity to 
take a look.






Re: [PR] [SPARK-44811][BUILD] Upgrade Guava to 33.1.0-jre [spark]

2024-05-10 Thread via GitHub


pan3793 commented on code in PR #42493:
URL: https://github.com/apache/spark/pull/42493#discussion_r1596988859


##
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/IntervalExpressionsSuite.scala:
##
@@ -341,7 +341,7 @@ class IntervalExpressionsSuite extends SparkFunSuite with 
ExpressionEvalHelper {
 
 Seq(
   (Period.ofMonths(2), Int.MaxValue) -> "overflow",
-  (Period.ofMonths(Int.MinValue), 10d) -> "not in range",
+  (Period.ofMonths(Int.MinValue), 10d) -> "out of range",

Review Comment:
   Yes






Re: [PR] [SPARK-44811][BUILD] Upgrade Guava to 33.1.0-jre [spark]

2024-05-10 Thread via GitHub


pan3793 commented on code in PR #42493:
URL: https://github.com/apache/spark/pull/42493#discussion_r1596986722


##
dev/test-dependencies.sh:
##
@@ -49,7 +49,7 @@ OLD_VERSION=$($MVN -q \
     --non-recursive \
     org.codehaus.mojo:exec-maven-plugin:1.6.0:exec | grep -E '[0-9]+\.[0-9]+\.[0-9]+')
 # dependency:get for guava and jetty-io are workaround for SPARK-37302.
-GUAVA_VERSION=$(build/mvn help:evaluate -Dexpression=guava.version -q -DforceStdout | grep -E "^[0-9.]+$")
+GUAVA_VERSION=$(build/mvn help:evaluate -Dexpression=guava.version -q -DforceStdout | grep -E "^[0-9\.]+")

Review Comment:
   Updated because the new Guava versions have a `-jre` or `-android` suffix.
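The effect of dropping the `$` anchor can be checked independently of the build script. Here is a small sketch; the version strings are just examples, and the methods model grep's line filtering, not the full pipeline:

```java
import java.util.regex.Pattern;

public class VersionPattern {
    // Old filter, equivalent to grep -E "^[0-9.]+$": the whole line must be
    // digits and dots, so "33.1.0-jre" is rejected.
    public static boolean oldPatternAccepts(String line) {
        return line.matches("[0-9.]+");
    }

    // New filter, equivalent to grep -E "^[0-9\.]+": grep keeps any line whose
    // prefix matches, so suffixed versions like "33.1.0-jre" survive.
    public static boolean newPatternAccepts(String line) {
        return Pattern.compile("^[0-9.]+").matcher(line).find();
    }
}
```

Note that inside a character class `\.` and `.` are equivalent, so the only behavioral change is the removed end anchor.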






Re: [PR] [SPARK-48143][SQL] Use lightweight exceptions for control-flow between UnivocityParser and FailureSafeParser [spark]

2024-05-10 Thread via GitHub


cloud-fan closed pull request #46500: [SPARK-48143][SQL] Use lightweight 
exceptions for control-flow between UnivocityParser and FailureSafeParser
URL: https://github.com/apache/spark/pull/46500





Re: [PR] [SPARK-48143][SQL] Use lightweight exceptions for control-flow between UnivocityParser and FailureSafeParser [spark]

2024-05-10 Thread via GitHub


cloud-fan commented on PR #46500:
URL: https://github.com/apache/spark/pull/46500#issuecomment-2104909079

   thanks, merging to master!





Re: [PR] [SPARK-46937][SQL] Improve concurrency performance for FunctionRegistry [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on PR #44976:
URL: https://github.com/apache/spark/pull/44976#issuecomment-2104900725

   I resolved the conflicts for you~





Re: [PR] [SPARK-48144][SQL] Fix `canPlanAsBroadcastHashJoin` to respect shuffle join hints [spark]

2024-05-10 Thread via GitHub


fred-db commented on PR #46401:
URL: https://github.com/apache/spark/pull/46401#issuecomment-2104893008

   Thanks @dongjoon-hyun ! :) Rebased the code.





Re: [PR] [SPARK-47716][SQL] Avoid view name conflict in SQLQueryTestSuite semantic sort test case [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on PR #45855:
URL: https://github.com/apache/spark/pull/45855#issuecomment-2104888508

   Hi, @jchen5 and all. How do you want to proceed with this PR?
   
   I guess https://github.com/apache/spark/pull/45855#discussion_r1550652914 is 
the latest direction. Could you update the PR in that direction?





Re: [PR] [SPARK-48143][SQL] Use lightweight exceptions for control-flow between UnivocityParser and FailureSafeParser [spark]

2024-05-10 Thread via GitHub


vladimirg-db commented on code in PR #46500:
URL: https://github.com/apache/spark/pull/46500#discussion_r1596925973


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/BadRecordException.scala:
##
@@ -67,16 +67,30 @@ case class PartialResultArrayException(
   extends Exception(cause)
 
 /**
- * Exception thrown when the underlying parser meet a bad record and can't parse it.
+ * Exception thrown when the underlying parser met a bad record and can't parse it.
  * @param record a function to return the record that cause the parser to fail
  * @param partialResults a function that returns an row array, which is the partial results of
  *  parsing this bad record.
- * @param cause the actual exception about why the record is bad and can't be parsed.
+ * @param cause the actual exception about why the record is bad and can't be parsed. It's better
+ *  to use `LazyBadRecordCauseWrapper` here to delay heavy cause construction
+ *  until it's needed.
  */
 case class BadRecordException(
     @transient record: () => UTF8String,
     @transient partialResults: () => Array[InternalRow] = () => Array.empty[InternalRow],
-    cause: Throwable) extends Exception(cause)
+    cause: Throwable) extends Exception(cause) {
+  override def getStackTrace(): Array[StackTraceElement] = new Array[StackTraceElement](0)

Review Comment:
   Added a doc for that
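The `getStackTrace` override in the diff above is one half of the usual lightweight-exception pattern for control flow. For readers unfamiliar with it, here is a minimal standalone Java sketch; the class name is hypothetical and Spark's actual implementation differs in details:

```java
// Minimal control-flow exception: skips the expensive native stack walk at
// construction time and reports an empty stack trace.
public class LightweightException extends Exception {
    public LightweightException(String message, Throwable cause) {
        // enableSuppression = false, writableStackTrace = false
        super(message, cause, false, false);
    }

    @Override
    public synchronized Throwable fillInStackTrace() {
        return this; // never capture a stack trace
    }

    @Override
    public StackTraceElement[] getStackTrace() {
        return new StackTraceElement[0];
    }
}
```

Because the stack trace is never filled in, throwing such an exception costs little more than an object allocation, which is what makes it acceptable on a per-bad-record hot path.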






Re: [PR] [SPARK-47424][SQL] Add getDatabaseCalendar method to the JdbcDialect [spark]

2024-05-10 Thread via GitHub


cloud-fan commented on code in PR #45537:
URL: https://github.com/apache/spark/pull/45537#discussion_r1596901652


##
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##
@@ -3992,6 +3992,16 @@ object SQLConf {
 .booleanConf
 .createWithDefault(false)
 
+  val USE_LOCAL_SESSION_CALENDAR = buildConf("spark.sql.jdbc.useLocalSessionCalendar")

Review Comment:
   let's turn it into a legacy config
   ```
   spark.sql.legacy.jdbc.useNullCalendar
   ```






Re: [PR] [SPARK-46937][SQL] Improve concurrency performance for FunctionRegistry [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on PR #44976:
URL: https://github.com/apache/spark/pull/44976#issuecomment-2104807458

   Could you resolve the conflicts, @beliefer ?





Re: [PR] [SPARK-47424][SQL] Add getDatabaseCalendar method to the JdbcDialect [spark]

2024-05-10 Thread via GitHub


cloud-fan commented on code in PR #45537:
URL: https://github.com/apache/spark/pull/45537#discussion_r1596899527


##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala:
##
@@ -499,8 +500,14 @@ object JdbcUtils extends Logging with SQLConfHelper {
   }
 
 case TimestampType =>
+  val calendar = if (conf.useLocalSessionCalendar) {
+dialect.getDatabaseCalendar.get

Review Comment:
   ```suggestion
   dialect.getDatabaseCalendar.orNull
   ```






Re: [PR] [SPARK-47424][SQL] Add getDatabaseCalendar method to the JdbcDialect [spark]

2024-05-10 Thread via GitHub


cloud-fan commented on code in PR #45537:
URL: https://github.com/apache/spark/pull/45537#discussion_r1596898967


##
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##
@@ -3992,6 +3992,16 @@ object SQLConf {
 .booleanConf
 .createWithDefault(false)
 
+  val USE_LOCAL_SESSION_CALENDAR = buildConf("spark.sql.jdbc.useLocalSessionCalendar")
+    .doc("When retrieving timestamp NTZ columns from JDBC data source, user will see the " +
+      "values as they where shown in data source only if the JVM time zone is equal " +
+      "to the spark local session time zone. Since these two can differ, if this " +
+      "configuration property is set to true, rs.getTimestamp and rs.setTimestamp " +
+      "API calls will use calendar with spark local session time zone as the second argument")
+.version("4.0.0")
+.booleanConf

Review Comment:
   let's add `.internal`. Users should not set this. It's only for unknown bugs.
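The ambiguity this config addresses — the same stored instant rendering differently depending on which time zone interprets it — can be seen without a JDBC connection. A sketch (the zone ids are arbitrary examples; `ResultSet.getTimestamp(col, calendar)` lets the driver perform this interpretation with the calendar's zone instead of the JVM default):

```java
import java.sql.Timestamp;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

public class TimestampZones {
    // Format one instant in a chosen zone, mimicking what a JDBC driver does
    // when materializing a timestamp column for display.
    public static String render(long epochMillis, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        return fmt.format(new Timestamp(epochMillis));
    }
}
```

The epoch instant reads as `1970-01-01 00:00:00` in UTC but `1969-12-31 16:00:00` in `America/Los_Angeles`; passing a calendar built from the Spark session time zone pins which of these the user sees.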






Re: [PR] [SPARK-47424][SQL] Add getDatabaseCalendar method to the JdbcDialect [spark]

2024-05-10 Thread via GitHub


cloud-fan commented on code in PR #45537:
URL: https://github.com/apache/spark/pull/45537#discussion_r1596897736


##
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##
@@ -3992,6 +3992,16 @@ object SQLConf {
 .booleanConf
 .createWithDefault(false)
 
+  val USE_LOCAL_SESSION_CALENDAR = buildConf("spark.sql.jdbc.useLocalSessionCalendar")
+    .doc("When retrieving timestamp NTZ columns from JDBC data source, user will see the " +

Review Comment:
   ```suggestion
    .doc("When read timestamp NTZ columns from JDBC data source as Spark timestamp LTZ, user will see the " +
   ```






Re: [PR] [SPARK-47415][SQL] Collation support: Levenshtein [spark]

2024-05-10 Thread via GitHub


nikolamand-db commented on code in PR #45963:
URL: https://github.com/apache/spark/pull/45963#discussion_r1596897091


##
common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationSupport.java:
##
@@ -722,6 +722,65 @@ public static UTF8String execLowercase(
 }
   }
 
+  /**
+   * Utility class for collation aware Levenshtein function.
+   */
+  public static class Levenshtein{
+
+    /**
+     * Implementation of SubstringEquals interface for collation aware comparison of two substrings.
+     */
+    private static class CollationSubstringEquals implements UTF8String.SubstringEquals {
+      private final int collationId;
+      private final UTF8String left, right;
+
+      CollationSubstringEquals(int collationId) {
+        this.collationId = collationId;
+        this.left = new UTF8String();
+        this.right = new UTF8String();
+      }
+
+      @Override
+      public boolean equals(UTF8String left, UTF8String right, int posLeft, int posRight,
+          int lenLeft, int lenRight) {
+        this.left.moveAddress(left, posLeft, lenLeft);
+        this.right.moveAddress(right, posRight, lenRight);
+        return CollationFactory.fetchCollation(collationId).equalsFunction
+          .apply(this.left, this.right);
+      }
+    }
+
+    public static Integer exec(final UTF8String left, final UTF8String right, final int collationId){
+      CollationFactory.Collation collation = CollationFactory.fetchCollation(collationId);
+
+      if (collation.supportsBinaryEquality){
+        return left.levenshteinDistance(right);
+      }
+      else{
+        return left.levenshteinDistance(right, new CollationSubstringEquals(collationId));
+      }
+    }
+
+    public static Integer execWithThreshold(final UTF8String left, final UTF8String right, final int threshold, final int collationId){
+      CollationFactory.Collation collation = CollationFactory.fetchCollation(collationId);
+
+      if (collation.supportsBinaryEquality){
+        return left.levenshteinDistance(right, threshold);
+      }
+      else{
+        return left.levenshteinDistance(right, threshold, new CollationSubstringEquals(collationId));
+      }

Review Comment:
   This should be done in other places as well (both threshold & non-threshold 
Levenshtein).
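As background for the review above: a Levenshtein distance whose character comparison is pluggable — the role `SubstringEquals` plays in the PR — can be sketched in plain Java. This is the textbook dynamic-programming algorithm with an equality hook, not Spark's implementation, and all names are hypothetical:

```java
import java.util.function.BiPredicate;

public class LevenshteinSketch {
    // Two-row DP Levenshtein; `eq` decides whether two characters count as
    // equal, so a collation-aware comparison can be plugged in.
    public static int distance(String a, String b, BiPredicate<Character, Character> eq) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = eq.test(a.charAt(i - 1), b.charAt(j - 1)) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1,  // insertion
                                            prev[j] + 1),      // deletion
                                   prev[j - 1] + cost);        // substitution
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

With a case-insensitive predicate, `"ABC"` and `"abc"` are distance 0, which is the behavior a case-insensitive collation would want; real collations compare grapheme clusters rather than single `char`s, which is why the PR works over substrings.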






Re: [PR] [SPARK-47424][SQL] Add getDatabaseCalendar method to the JdbcDialect [spark]

2024-05-10 Thread via GitHub


cloud-fan commented on code in PR #45537:
URL: https://github.com/apache/spark/pull/45537#discussion_r1596897400


##
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##
@@ -3992,6 +3992,16 @@ object SQLConf {
 .booleanConf
 .createWithDefault(false)
 
+  val USE_LOCAL_SESSION_CALENDAR = buildConf("spark.sql.jdbc.useLocalSessionCalendar")
+    .doc("When retrieving timestamp NTZ columns from JDBC data source, user will see the " +
+      "values as they where shown in data source only if the JVM time zone is equal " +

Review Comment:
   ```suggestion
  "values as they were shown in data source only if the JVM time zone is equal " +
   ```






Re: [PR] [SPARK-47793][TEST][FOLLOWUP] Fix flaky test for Python data source exactly once. [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun closed pull request #46481: [SPARK-47793][TEST][FOLLOWUP] Fix 
flaky test for Python data source exactly once.
URL: https://github.com/apache/spark/pull/46481





Re: [PR] [SPARK-47415][SQL] Collation support: Levenshtein [spark]

2024-05-10 Thread via GitHub


nikolamand-db commented on code in PR #45963:
URL: https://github.com/apache/spark/pull/45963#discussion_r1596887611


##
common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationSupport.java:
##
@@ -722,6 +722,65 @@ public static UTF8String execLowercase(
 }
   }
 
+  /**
+   * Utility class for collation aware Levenshtein function.
+   */
+  public static class Levenshtein{

Review Comment:
   ```suggestion
 public static class Levenshtein {
   ```



##
common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationSupport.java:
##
@@ -722,6 +722,65 @@ public static UTF8String execLowercase(
 }
   }
 
+  /**
+   * Utility class for collation aware Levenshtein function.
+   */
+  public static class Levenshtein{
+
+    /**
+     * Implementation of SubstringEquals interface for collation aware comparison of two substrings.
+     */
+    private static class CollationSubstringEquals implements UTF8String.SubstringEquals {
+      private final int collationId;
+      private final UTF8String left, right;
+
+      CollationSubstringEquals(int collationId) {
+        this.collationId = collationId;
+        this.left = new UTF8String();
+        this.right = new UTF8String();
+      }
+
+      @Override
+      public boolean equals(UTF8String left, UTF8String right, int posLeft, int posRight,
+          int lenLeft, int lenRight) {
+        this.left.moveAddress(left, posLeft, lenLeft);
+        this.right.moveAddress(right, posRight, lenRight);
+        return CollationFactory.fetchCollation(collationId).equalsFunction
+          .apply(this.left, this.right);
+      }
+    }
+
+    public static Integer exec(final UTF8String left, final UTF8String right, final int collationId){
+      CollationFactory.Collation collation = CollationFactory.fetchCollation(collationId);
+
+      if (collation.supportsBinaryEquality){
+        return left.levenshteinDistance(right);
+      }
+      else{

Review Comment:
   ```suggestion
 else {
   ```



##
common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationSupport.java:
##
@@ -722,6 +722,65 @@ public static UTF8String execLowercase(
 }
   }
 
+  /**
+   * Utility class for collation aware Levenshtein function.
+   */
+  public static class Levenshtein{
+
+/**
+ * Implementation of SubstringEquals interface for collation aware comparison of two substrings.
+ */
+private static class CollationSubstringEquals implements UTF8String.SubstringEquals {
+  private final int collationId;
+  private final UTF8String left, right;
+
+  CollationSubstringEquals(int collationId) {
+this.collationId = collationId;
+this.left = new UTF8String();
+this.right = new UTF8String();
+  }
+
+  @Override
+  public boolean equals(UTF8String left, UTF8String right, int posLeft, int posRight,
+int lenLeft, int lenRight) {
+this.left.moveAddress(left, posLeft, lenLeft);
+this.right.moveAddress(right, posRight, lenRight);
+return CollationFactory.fetchCollation(collationId).equalsFunction
+.apply(this.left, this.right);
+  }
+}
+
+public static Integer exec(final UTF8String left, final UTF8String right, final int collationId){
+  CollationFactory.Collation collation = CollationFactory.fetchCollation(collationId);
+
+  if (collation.supportsBinaryEquality){

Review Comment:
   ```suggestion
 if (collation.supportsBinaryEquality) {
   ```



Re: [PR] [SPARK-47793][TEST][FOLLOWUP] Fix flaky test for Python data source exactly once. [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on PR #46481:
URL: https://github.com/apache/spark/pull/46481#issuecomment-2104801936

   Let me bring this first for further monitoring. Thank you, @chaoqin-li1123 and @allisonwang-db .
   Merged to master.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-47441][YARN] Do not add log link for unmanaged AM in Spark UI [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun closed pull request #45565: [SPARK-47441][YARN] Do not add log link for unmanaged AM in Spark UI
URL: https://github.com/apache/spark/pull/45565





Re: [PR] [SPARK-47441][YARN] Do not add log link for unmanaged AM in Spark UI [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on PR #45565:
URL: https://github.com/apache/spark/pull/45565#issuecomment-2104797350

   Thank you, @tgravescs !





Re: [PR] [SPARK-48031][SQL] Support view schema evolution [spark]

2024-05-10 Thread via GitHub


cloud-fan commented on code in PR #46267:
URL: https://github.com/apache/spark/pull/46267#discussion_r1596850227


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/view.scala:
##
@@ -35,6 +35,56 @@ object EliminateView extends Rule[LogicalPlan] with CastSupport {
   }
 }
 
+/**
+ * ViewBindingMode is used to specify the expected schema binding mode when we want to create or
+ * replace a view in [[CreateViewStatement]].
+ */
+sealed trait ViewSchemaMode {
+  override val toString: String = getClass.getSimpleName.stripSuffix("$")
+}
+
+/**
+ * SchemaBinding means the view only tolerates minimal changes to the underlying schema.
+ * It can tolerate extra columns in SELECT * and upcast to more generic types.
+ */
+object SchemaBinding extends ViewSchemaMode {
+  override val toString: String = "BINDING"

Review Comment:
   ```suggestion
 override def toString: String = "BINDING"
   ```
   since the function body is just a string literal






Re: [PR] [SPARK-48031][SQL] Support view schema evolution [spark]

2024-05-10 Thread via GitHub


cloud-fan commented on code in PR #46267:
URL: https://github.com/apache/spark/pull/46267#discussion_r1596873275


##
sql/core/src/test/resources/sql-tests/results/view-schema-binding-config.sql.out:
##
@@ -0,0 +1,794 @@
+-- Automatically generated by SQLQueryTestSuite
+-- !query
+SET spark.sql.viewSchemaBindingMode
+-- !query schema
+struct
+-- !query output
+spark.sql.viewSchemaBindingMode	COMPENSATION
+
+
+-- !query
+SET spark.sql.viewSchemaBindingMode = EVOLUTION
+-- !query schema
+struct<>
+-- !query output
+java.lang.IllegalArgumentException
+The value of spark.sql.viewSchemaBindingMode should be one of COMPENSATION, DISABLED, but was EVOLUTION
+
+
+-- !query
+SET spark.sql.viewSchemaBindingMode = DISABLED
+-- !query schema
+struct
+-- !query output
+spark.sql.viewSchemaBindingMode	DISABLED
+
+
+-- !query
+CREATE OR REPLACE VIEW v WITH SCHEMA BINDING AS SELECT 1
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.parser.ParseException
+{
+  "errorClass" : "FEATURE_NOT_ENABLED",
+  "sqlState" : "56038",
+  "messageParameters" : {
+"configKey" : "spark.sql.viewSchemaBindingMode",
+"configValue" : "COMPENSATION",
+"featureName" : "VIEW ... WITH SCHEMA ..."
+  },
+  "queryContext" : [ {
+"objectType" : "",
+"objectName" : "",
+"startIndex" : 26,
+"stopIndex" : 44,
+"fragment" : "WITH SCHEMA BINDING"
+  } ]
+}
+
+
+-- !query
+CREATE OR REPLACE VIEW v WITH SCHEMA COMPENSATION AS SELECT 1
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.parser.ParseException
+{
+  "errorClass" : "FEATURE_NOT_ENABLED",
+  "sqlState" : "56038",
+  "messageParameters" : {
+"configKey" : "spark.sql.viewSchemaBindingMode",
+"configValue" : "COMPENSATION",
+"featureName" : "VIEW ... WITH SCHEMA ..."
+  },
+  "queryContext" : [ {
+"objectType" : "",
+"objectName" : "",
+"startIndex" : 26,
+"stopIndex" : 49,
+"fragment" : "WITH SCHEMA COMPENSATION"
+  } ]
+}
+
+
+-- !query
+CREATE OR REPLACE VIEW v WITH SCHEMA TYPE EVOLUTION AS SELECT 1
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.parser.ParseException
+{
+  "errorClass" : "FEATURE_NOT_ENABLED",
+  "sqlState" : "56038",
+  "messageParameters" : {
+"configKey" : "spark.sql.viewSchemaBindingMode",
+"configValue" : "COMPENSATION",
+"featureName" : "VIEW ... WITH SCHEMA ..."
+  },
+  "queryContext" : [ {
+"objectType" : "",
+"objectName" : "",
+"startIndex" : 26,
+"stopIndex" : 51,
+"fragment" : "WITH SCHEMA TYPE EVOLUTION"
+  } ]
+}
+
+
+-- !query
+CREATE OR REPLACE VIEW v WITH SCHEMA EVOLUTION AS SELECT 1
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.parser.ParseException
+{
+  "errorClass" : "FEATURE_NOT_ENABLED",
+  "sqlState" : "56038",
+  "messageParameters" : {
+"configKey" : "spark.sql.viewSchemaBindingMode",
+"configValue" : "COMPENSATION",
+"featureName" : "VIEW ... WITH SCHEMA ..."
+  },
+  "queryContext" : [ {
+"objectType" : "",
+"objectName" : "",
+"startIndex" : 26,
+"stopIndex" : 46,
+"fragment" : "WITH SCHEMA EVOLUTION"
+  } ]
+}
+
+
+-- !query
+CREATE OR REPLACE VIEW v AS SELECT 1
+-- !query schema
+struct<>
+-- !query output
+
+
+
+-- !query
+DESCRIBE EXTENDED v
+-- !query schema
+struct
+-- !query output
+1  int 
+   
+# Detailed Table Information   
+Catalogspark_catalog   
+Database   default 
+Table  v   
+Created Time [not included in comparison]
+Last Access [not included in comparison]
+Created By [not included in comparison]
+Type   VIEW
+View Text  SELECT 1
+View Original Text SELECT 1
+View Catalog and Namespace spark_catalog.default   
+View Query Output Columns  [1]
+
+
+-- !query
+SHOW TABLE EXTENDED LIKE 'v'

Review Comment:
   what extra information we want to get from `SHOW TABLE EXTENDED`?






Re: [PR] [SPARK-48031][SQL] Support view schema evolution [spark]

2024-05-10 Thread via GitHub


cloud-fan commented on code in PR #46267:
URL: https://github.com/apache/spark/pull/46267#discussion_r1596870581


##
sql/core/src/main/scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala:
##
@@ -224,6 +224,7 @@ abstract class BaseSessionStateBuilder(
 TableCapabilityCheck +:
 CommandCheck +:
 CollationCheck +:
+ViewSyncSchemaToMetaStore +:

Review Comment:
   we should put this rule in `HiveSessionStateBuilder` as well






Re: [PR] [SPARK-48031][SQL] Support view schema evolution [spark]

2024-05-10 Thread via GitHub


cloud-fan commented on code in PR #46267:
URL: https://github.com/apache/spark/pull/46267#discussion_r1596869407


##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala:
##
@@ -620,3 +621,65 @@ object CollationCheck extends (LogicalPlan => Unit) {
   private def isCollationExpression(expression: Expression): Boolean =
 expression.isInstanceOf[Collation] || expression.isInstanceOf[Collate]
 }
+
+
+/**
+ * This rule checks for references to views WITH SCHEMA [TYPE] EVOLUTION and synchronizes the
+ * catalog if evolution was detected.
+ * It does so by walking the resolved plan looking for View operators for persisted views.
+ */
+object ViewSyncSchemaToMetaStore extends (LogicalPlan => Unit) {
+  def apply(plan: LogicalPlan): Unit = {
+plan.foreach {
+  case View(metaData, false, viewQuery)
+if (metaData.viewSchemaMode == SchemaTypeEvolution ||
+  metaData.viewSchemaMode == SchemaEvolution) =>
+val viewSchemaMode = metaData.viewSchemaMode
+val viewFields = metaData.schema.fields
+val viewQueryFields = viewQuery.schema.fields
+val session = SparkSession.getActiveSession.get
+val redoSignature =
+  viewSchemaMode == SchemaEvolution && viewFields.length != viewQueryFields.length
+val fieldNames = viewQuery.schema.fieldNames
+val newProperties = if (redoSignature) {
+  generateViewProperties(
+metaData.properties,
+session,
+fieldNames,
+fieldNames,
+metaData.viewSchemaMode)
+} else {
+  metaData.properties
+}
+
+val redo = redoSignature || viewFields.zipWithIndex.exists { case (field, index) =>
+  val planField = viewQueryFields(index)
+  (field.dataType != planField.dataType ||
+field.nullable != planField.nullable ||
+(viewSchemaMode == SchemaEvolution && (

Review Comment:
   if we hit this condition, shall we also redo signature?
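   
   For context, the drift check the rule performs boils down to comparing the stored view schema against the analyzed query schema field by field. The sketch below is a hypothetical, simplified Java rendering of that comparison (field names, the `Field` record, and the `fullEvolution` flag are stand-ins, not Spark's actual types); it assumes the two schemas are compared positionally, as in the Scala code above.
   
   ```java
   import java.util.List;
   
   public class ViewSchemaDrift {
       // Simplified stand-in for a catalog/query schema field.
       record Field(String name, String dataType, boolean nullable) {}
   
       // Returns true when the stored view schema no longer matches the
       // analyzed query schema, i.e. the catalog entry needs rewriting.
       // fullEvolution ~ WITH SCHEMA EVOLUTION (names and arity may change);
       // otherwise only type/nullability drift is considered.
       static boolean needsSync(List<Field> viewFields, List<Field> queryFields,
                                boolean fullEvolution) {
           if (fullEvolution && viewFields.size() != queryFields.size()) return true;
           int n = Math.min(viewFields.size(), queryFields.size());
           for (int i = 0; i < n; i++) {
               Field v = viewFields.get(i), q = queryFields.get(i);
               if (!v.dataType().equals(q.dataType()) || v.nullable() != q.nullable()) return true;
               if (fullEvolution && !v.name().equals(q.name())) return true;
           }
           return false;
       }
   
       public static void main(String[] args) {
           List<Field> stored  = List.of(new Field("a", "int", false));
           List<Field> widened = List.of(new Field("a", "bigint", false));
           System.out.println(needsSync(stored, widened, false)); // true: type drifted
           System.out.println(needsSync(stored, stored, false));  // false: unchanged
       }
   }
   ```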






Re: [PR] [SPARK-44811][BUILD] Upgrade Guava to 32+ [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on code in PR #42493:
URL: https://github.com/apache/spark/pull/42493#discussion_r1596869988


##
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/IntervalExpressionsSuite.scala:
##
@@ -341,7 +341,7 @@ class IntervalExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
 
 Seq(
   (Period.ofMonths(2), Int.MaxValue) -> "overflow",
-  (Period.ofMonths(Int.MinValue), 10d) -> "not in range",
+  (Period.ofMonths(Int.MinValue), 10d) -> "out of range",

Review Comment:
   So, does this error message originate from Guava?






Re: [PR] [SPARK-44811][BUILD] Upgrade Guava to 32+ [spark]

2024-05-10 Thread via GitHub


dongjoon-hyun commented on PR #42493:
URL: https://github.com/apache/spark/pull/42493#issuecomment-2104762964

   Thank you for updating this, @pan3793 .




