dongjoon-hyun commented on PR #46396:
URL: https://github.com/apache/spark/pull/46396#issuecomment-2095144999
Please note that this is also related to the following INFRA blocker issue.
- SPARK-48094 Reduce GitHub Action usage according to ASF project allowance
--
This is an automated
chaoqin-li1123 commented on code in PR #46139:
URL: https://github.com/apache/spark/pull/46139#discussion_r1590520510
##
python/docs/source/user_guide/sql/python_data_source.rst:
##
@@ -84,6 +93,131 @@ Define the reader logic to generate synthetic data. Use the
`faker` library
chaoqin-li1123 commented on code in PR #46139:
URL: https://github.com/apache/spark/pull/46139#discussion_r1590520296
##
python/docs/source/user_guide/sql/python_data_source.rst:
##
@@ -137,3 +271,21 @@ Use the fake datasource with a different number of rows:
# | Caitlin
dongjoon-hyun commented on PR #46396:
URL: https://github.com/apache/spark/pull/46396#issuecomment-2095200623
Could you review this PR, @viirya ?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to
dongjoon-hyun opened a new pull request, #46395:
URL: https://github.com/apache/spark/pull/46395
…
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
chaoqin-li1123 commented on code in PR #46139:
URL: https://github.com/apache/spark/pull/46139#discussion_r1590521175
##
python/docs/source/user_guide/sql/python_data_source.rst:
##
@@ -59,8 +59,17 @@ Start by creating a new subclass of :class:`DataSource`.
Define the source
chaoqin-li1123 commented on code in PR #46139:
URL: https://github.com/apache/spark/pull/46139#discussion_r1590520752
##
python/docs/source/user_guide/sql/python_data_source.rst:
##
@@ -59,8 +59,17 @@ Start by creating a new subclass of :class:`DataSource`.
Define the source
panbingkun commented on PR #46288:
URL: https://github.com/apache/spark/pull/46288#issuecomment-2095042166
> Do we have a corresponding issue in the Scala community or JLine community
(or GitHub issue)?
>
> IMO, a. We can ignore Scala 2.13.14 completely due to this issue. b. We
can
chaoqin-li1123 commented on code in PR #46139:
URL: https://github.com/apache/spark/pull/46139#discussion_r1590525766
##
python/docs/source/user_guide/sql/python_data_source.rst:
##
@@ -84,6 +93,131 @@ Define the reader logic to generate synthetic data. Use the
`faker` library
dongjoon-hyun commented on PR #46395:
URL: https://github.com/apache/spark/pull/46395#issuecomment-2095237705
Could you review this PR too, @LuciferYang ?
dongjoon-hyun commented on PR #46396:
URL: https://github.com/apache/spark/pull/46396#issuecomment-2095237347
Thank you so much, @LuciferYang !
panbingkun commented on PR #46376:
URL: https://github.com/apache/spark/pull/46376#issuecomment-2095048623
late LGTM.
panbingkun commented on code in PR #46022:
URL: https://github.com/apache/spark/pull/46022#discussion_r1590471939
##
connector/profiler/src/main/scala/org/apache/spark/executor/profiler/ExecutorProfilerPlugin.scala:
##
@@ -23,7 +23,8 @@ import scala.util.Random
import
dongjoon-hyun commented on PR #46395:
URL: https://github.com/apache/spark/pull/46395#issuecomment-2095112747
WDYT, @HyukjinKwon ?
dongjoon-hyun commented on PR #46395:
URL: https://github.com/apache/spark/pull/46395#issuecomment-2095122806
Alternatively, we can ignore `YarnClusterSuite` completely. However, I'm not
sure who is going to take on that JIRA issue in the future. So, I chose this
path.
zeotuan commented on code in PR #46362:
URL: https://github.com/apache/spark/pull/46362#discussion_r1590481881
##
common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala:
##
@@ -26,13 +26,16 @@ trait LogKey {
}
/**
- * Various keys used for mapped diagnostic
dongjoon-hyun opened a new pull request, #46396:
URL: https://github.com/apache/spark/pull/46396
…
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
dongjoon-hyun commented on PR #46396:
URL: https://github.com/apache/spark/pull/46396#issuecomment-2095142285
cc @juliuszsompolski , @grundprinzip , @HyukjinKwon , @LuciferYang from
SPARK-44422
- #42009
dongjoon-hyun commented on PR #46396:
URL: https://github.com/apache/spark/pull/46396#issuecomment-2095142646
I hope we can fix the root cause in SPARK-48139 .
chaoqin-li1123 commented on PR #45950:
URL: https://github.com/apache/spark/pull/45950#issuecomment-2095178573
@HyukjinKwon both `test_python_datasource` and `test_python_streaming_datasource`
will fail with the same error if `py4j*.zip` is removed.
> Traceback (most recent call last):
>
huanliwang-db commented on PR #46351:
URL: https://github.com/apache/spark/pull/46351#issuecomment-2095178435
> Ideally all non-EOL version lines, 3.5/3.4.
@HeartSaVioR How can I do the backport?
zeotuan commented on PR #46362:
URL: https://github.com/apache/spark/pull/46362#issuecomment-2095094081
@gengliangwang Hi, please help review.
LuciferYang commented on PR #46396:
URL: https://github.com/apache/spark/pull/46396#issuecomment-2095236411
Merged into master for Spark 4.0. Thanks @dongjoon-hyun ~
LuciferYang closed pull request #46396: [SPARK-48138][CONNECT][TESTS] Disable a
flaky `SparkSessionE2ESuite.interrupt tag` test
URL: https://github.com/apache/spark/pull/46396
chenhao-db commented on code in PR #46338:
URL: https://github.com/apache/spark/pull/46338#discussion_r1590327198
##
common/variant/src/main/java/org/apache/spark/types/variant/VariantUtil.java:
##
@@ -392,21 +392,32 @@ public static double getDouble(byte[] value, int pos) {
mridulm commented on PR #46386:
URL: https://github.com/apache/spark/pull/46386#issuecomment-2094831806
Wouldn't this impact/break existing log config files where users are
customizing the template?
panbingkun opened a new pull request, #46390:
URL: https://github.com/apache/spark/pull/46390
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How was this
guixiaowen commented on PR #46243:
URL: https://github.com/apache/spark/pull/46243#issuecomment-2094808977
@dongjoon-hyun Hi, can you help me review this PR?
cloud-fan closed pull request #46372: [SPARK-48019][SQL][FOLLOWUP] Use
primitive arrays over object arrays when nulls exist
URL: https://github.com/apache/spark/pull/46372
cloud-fan commented on PR #46372:
URL: https://github.com/apache/spark/pull/46372#issuecomment-2094816913
thanks, merging to master/3.5!
cloud-fan commented on code in PR #46338:
URL: https://github.com/apache/spark/pull/46338#discussion_r1590326222
##
common/variant/src/main/java/org/apache/spark/types/variant/VariantUtil.java:
##
@@ -392,21 +392,32 @@ public static double getDouble(byte[] value, int pos) {
dongjoon-hyun commented on PR #46389:
URL: https://github.com/apache/spark/pull/46389#issuecomment-2094837554
Could you review this PR, @cloud-fan ?
dongjoon-hyun commented on PR #46389:
URL: https://github.com/apache/spark/pull/46389#issuecomment-2094655922
Could you review this PR, @beliefer ?
dongjoon-hyun commented on PR #46389:
URL: https://github.com/apache/spark/pull/46389#issuecomment-2094646444
Could you review this `SparkR` PR too? This is the same kind of activity,
@gengliangwang .
dongjoon-hyun commented on code in PR #46389:
URL: https://github.com/apache/spark/pull/46389#discussion_r1590216707
##
.github/workflows/build_and_test.yml:
##
@@ -76,17 +76,17 @@ jobs:
id: set-outputs
run: |
if [ -z "${{ inputs.jobs }}" ]; then
-
dongjoon-hyun commented on PR #45593:
URL: https://github.com/apache/spark/pull/45593#issuecomment-2094674035
Yes, I also believe this is not good for the Apache Spark community in the
long term.
dongjoon-hyun opened a new pull request, #46389:
URL: https://github.com/apache/spark/pull/46389
### What changes were proposed in this pull request?
This PR aims to run `sparkr` only in PR builder and Daily Python CIs. In
other words, only the commit builder will skip it by default.
dongjoon-hyun commented on PR #46392:
URL: https://github.com/apache/spark/pull/46392#issuecomment-2094975220
Could you review this PR when you have some time, @viirya ?
dongjoon-hyun closed pull request #46392: [SPARK-48135][INFRA] Run `buf` and
`ui` only in PR builders and Java 21 Daily CI
URL: https://github.com/apache/spark/pull/46392
dongjoon-hyun commented on PR #46392:
URL: https://github.com/apache/spark/pull/46392#issuecomment-2094999355
Thank you, @viirya !
Merged to master.
github-actions[bot] closed pull request #44712: [SPARK-46705][SS] Make RocksDB
State Store Compaction Less Likely to fall behind
URL: https://github.com/apache/spark/pull/44712
github-actions[bot] closed pull request #44337: [SPARK-42307][SQL][BUILD][DOCS]
Adding in a better name for `_LEGACY_ERROR_TEMP_2232`
URL: https://github.com/apache/spark/pull/44337
github-actions[bot] commented on PR #44875:
URL: https://github.com/apache/spark/pull/44875#issuecomment-2095014463
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
HyukjinKwon commented on PR #46391:
URL: https://github.com/apache/spark/pull/46391#issuecomment-2095016743
cc @itholic
sinaiamonkar-sai commented on PR #46391:
URL: https://github.com/apache/spark/pull/46391#issuecomment-2095018053
Thank you @dongjoon-hyun! Sure, let me add that.
HyukjinKwon opened a new pull request, #46393:
URL: https://github.com/apache/spark/pull/46393
### What changes were proposed in this pull request?
This PR proposes to upload Spark Connect log files in scheduled build for
Spark Connect
### Why are the changes needed?
HyukjinKwon commented on PR #45950:
URL: https://github.com/apache/spark/pull/45950#issuecomment-2095022472
@chaoqin-li1123 It seems this test does not work with the pure Python library.
Can you see if the tests pass after removing `python/lib/py4j*.zip`?
Let me revert this for now
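
For anyone reproducing the check described above, a minimal sketch of how one might detect whether the bundled `py4j*.zip` archives are present under `python/lib` is shown here. The helper names `find_py4j_zips` and `has_bundled_py4j` are hypothetical and are not part of Spark; only the `python/lib/py4j*.zip` pattern comes from the comment itself.

```python
from pathlib import Path


def find_py4j_zips(spark_home):
    """Return the py4j zip archives under <spark_home>/python/lib, sorted by name.

    Hypothetical helper for illustration only; not a Spark API.
    """
    lib_dir = Path(spark_home) / "python" / "lib"
    # The glob pattern mirrors the `python/lib/py4j*.zip` path from the comment.
    return sorted(lib_dir.glob("py4j*.zip"))


def has_bundled_py4j(spark_home):
    """True if at least one py4j zip archive is present."""
    return bool(find_py4j_zips(spark_home))
```

A test could then be run once with the archives in place and once after removing them, using `has_bundled_py4j` to confirm which configuration is actually being exercised.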
HyukjinKwon opened a new pull request, #46394:
URL: https://github.com/apache/spark/pull/46394
### What changes were proposed in this pull request?
This PR is a followup of https://github.com/apache/spark/pull/46298 that
properly uses the environment variable
dongjoon-hyun commented on PR #45950:
URL: https://github.com/apache/spark/pull/45950#issuecomment-2095027198
Thank you for checking and mitigating this by reverting, @HyukjinKwon .
dongjoon-hyun closed pull request #46393: [SPARK-48136][INFRA][CONNECT] Always
upload Spark Connect log files in scheduled build for Spark Connect
URL: https://github.com/apache/spark/pull/46393
dongjoon-hyun commented on PR #46393:
URL: https://github.com/apache/spark/pull/46393#issuecomment-2095028010
Merged to master for Apache Spark 4.0.0-preview preparation.
HyukjinKwon closed pull request #46394: [SPARK-48054][INFRA][FOLLOW-UP] Rename
SPARK_SKIP_JVM_REQUIRED_TESTS to SPARK_SKIP_CONNECT_COMPAT_TESTS
URL: https://github.com/apache/spark/pull/46394
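
As a rough illustration of what gating tests on an environment variable such as `SPARK_SKIP_CONNECT_COMPAT_TESTS` can look like, here is a minimal sketch using the standard `unittest` module. Only the variable name comes from the PR title; the helper `skip_connect_compat_tests`, the accepted values (`1`, `true`), and the example test class are assumptions, and this is not how Spark's own test suite is known to consume the variable.

```python
import os
import unittest

# Name taken from the PR title; everything else below is a hypothetical sketch.
ENV_VAR = "SPARK_SKIP_CONNECT_COMPAT_TESTS"


def skip_connect_compat_tests(environ=None):
    """Return True when the variable is set to a truthy value.

    Treating "1" and "true" as truthy is an assumption for this example.
    """
    env = os.environ if environ is None else environ
    return env.get(ENV_VAR, "").strip().lower() in {"1", "true"}


class ExampleCompatTest(unittest.TestCase):
    # The decorator is evaluated at import time, so the variable must be
    # exported before the test module is loaded.
    @unittest.skipIf(skip_connect_compat_tests(), f"{ENV_VAR} is set")
    def test_something(self):
        self.assertEqual(1 + 1, 2)
```

Passing a plain dict as `environ` keeps the helper easy to check without touching the real process environment.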
HyukjinKwon commented on PR #46394:
URL: https://github.com/apache/spark/pull/46394#issuecomment-2095028367
Merged to master.
zhengruifeng commented on PR #46367:
URL: https://github.com/apache/spark/pull/46367#issuecomment-2095031733
late LGTM, thanks @dongjoon-hyun
dongjoon-hyun commented on PR #46367:
URL: https://github.com/apache/spark/pull/46367#issuecomment-2095034994
Thanks, @zhengruifeng
viirya commented on code in PR #39615:
URL: https://github.com/apache/spark/pull/39615#discussion_r1590399660
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala:
##
@@ -166,317 +148,58 @@ object JavaTypeInference {
viirya commented on code in PR #39615:
URL: https://github.com/apache/spark/pull/39615#discussion_r1590399793
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala:
##
@@ -166,317 +148,58 @@ object JavaTypeInference {
viirya commented on code in PR #39615:
URL: https://github.com/apache/spark/pull/39615#discussion_r1590400148
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala:
##
@@ -166,317 +148,58 @@ object JavaTypeInference {
dongjoon-hyun commented on PR #46389:
URL: https://github.com/apache/spark/pull/46389#issuecomment-2094933246
Could you review this when you have some time, @viirya ?
dongjoon-hyun commented on PR #46389:
URL: https://github.com/apache/spark/pull/46389#issuecomment-2094935863
Thank you so much, @viirya !
Merged to master
dongjoon-hyun closed pull request #46389: [SPARK-48133][INFRA] Run `sparkr`
only in PR builders and Daily CIs
URL: https://github.com/apache/spark/pull/46389
dongjoon-hyun opened a new pull request, #46392:
URL: https://github.com/apache/spark/pull/46392
…
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
srielau commented on PR #46267:
URL: https://github.com/apache/spark/pull/46267#issuecomment-2094939625
@cloud-fan @gengliangwang This is ready now. Please review.
Docs are included in this PR.
batubond007 commented on PR #46370:
URL: https://github.com/apache/spark/pull/46370#issuecomment-2094943570
Hi @dongjoon-hyun , can you please review this PR, which is also my first one?
sinaiamonkar-sai opened a new pull request, #46391:
URL: https://github.com/apache/spark/pull/46391
### What changes were proposed in this pull request?
In a scenario where we use GroupBy in the PySpark API with relabeling of
aggregate columns and use `as_index=False`,
the columns with
sinaiamonkar-sai commented on PR #46391:
URL: https://github.com/apache/spark/pull/46391#issuecomment-2094885603
Hello, @holdenk ! This is my first Spark PR. Can you please review it?