[jira] [Resolved] (SPARK-45508) Add "--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED" so Platform can access cleaner on Java 9+
[ https://issues.apache.org/jira/browse/SPARK-45508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie resolved SPARK-45508. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43344 [https://github.com/apache/spark/pull/43344] > Add "--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED" so Platform can > access cleaner on Java 9+ > -- > > Key: SPARK-45508 > URL: https://issues.apache.org/jira/browse/SPARK-45508 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: Josh Rosen >Assignee: Josh Rosen >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > We need to add `--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED` to our > JVM options so that the code in `org.apache.spark.unsafe.Platform` can access > the JDK internal cleaner classes. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
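To illustrate what this flag unlocks, here is a minimal standalone probe (an illustration, not Spark's actual `Platform` code — the class and method names here are hypothetical): on Java 9+, `Class.forName` can still locate `jdk.internal.ref.Cleaner`, but `setAccessible` throws `InaccessibleObjectException` unless the JVM was started with `--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED`.

```java
import java.lang.reflect.Method;

// Hypothetical probe (not Spark code): reports whether the JDK-internal
// Cleaner is reflectively accessible in the current JVM.
public class CleanerAccessProbe {
    public static boolean canAccessInternalCleaner() {
        try {
            Class<?> cls = Class.forName("jdk.internal.ref.Cleaner");
            Method create = cls.getDeclaredMethod("create", Object.class, Runnable.class);
            // Throws InaccessibleObjectException on Java 9+ unless
            // --add-opens=java.base/jdk.internal.ref=ALL-UNNAMED was passed.
            create.setAccessible(true);
            return true;
        } catch (ReflectiveOperationException | RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("internal Cleaner accessible: " + canAccessInternalCleaner());
    }
}
```

Run without the flag, the probe prints `false`; with `java --add-opens=java.base/jdk.internal.ref=ALL-UNNAMED CleanerAccessProbe`, it prints `true`.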
[jira] [Assigned] (SPARK-45508) Add "--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED" so Platform can access cleaner on Java 9+
[ https://issues.apache.org/jira/browse/SPARK-45508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie reassigned SPARK-45508: Assignee: Josh Rosen > Add "--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED" so Platform can > access cleaner on Java 9+ > -- > > Key: SPARK-45508 > URL: https://issues.apache.org/jira/browse/SPARK-45508 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: Josh Rosen >Assignee: Josh Rosen >Priority: Major > Labels: pull-request-available > > We need to add `--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED` to our > JVM options so that the code in `org.apache.spark.unsafe.Platform` can access > the JDK internal cleaner classes. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45526) Refine docstring of `options` for dataframe reader and writer
[ https://issues.apache.org/jira/browse/SPARK-45526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-45526: Assignee: Hyukjin Kwon (was: Allison Wang) > Refine docstring of `options` for dataframe reader and writer > - > > Key: SPARK-45526 > URL: https://issues.apache.org/jira/browse/SPARK-45526 > Project: Spark > Issue Type: Sub-task > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Refine the docstring of the `options` method of DataFrameReader and > DataFrameWriter. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45526) Refine docstring of `options` for dataframe reader and writer
[ https://issues.apache.org/jira/browse/SPARK-45526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-45526: Assignee: Allison Wang > Refine docstring of `options` for dataframe reader and writer > - > > Key: SPARK-45526 > URL: https://issues.apache.org/jira/browse/SPARK-45526 > Project: Spark > Issue Type: Sub-task > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Labels: pull-request-available > > Refine the docstring of the `options` method of DataFrameReader and > DataFrameWriter. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45526) Refine docstring of `options` for dataframe reader and writer
[ https://issues.apache.org/jira/browse/SPARK-45526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-45526. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43358 [https://github.com/apache/spark/pull/43358] > Refine docstring of `options` for dataframe reader and writer > - > > Key: SPARK-45526 > URL: https://issues.apache.org/jira/browse/SPARK-45526 > Project: Spark > Issue Type: Sub-task > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Refine the docstring of the `options` method of DataFrameReader and > DataFrameWriter. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45530) Use `java.lang.ref.Cleaner` instead of `finalize` for `NioBufferedFileInputStream`
[ https://issues.apache.org/jira/browse/SPARK-45530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45530: --- Labels: pull-request-available (was: ) > Use `java.lang.ref.Cleaner` instead of `finalize` for > `NioBufferedFileInputStream` > -- > > Key: SPARK-45530 > URL: https://issues.apache.org/jira/browse/SPARK-45530 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
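For context on this family of sub-tasks, a generic sketch of the `java.lang.ref.Cleaner` pattern that replaces `finalize` (a minimal illustration, not the actual `NioBufferedFileInputStream` patch; the class names are hypothetical): the cleanup state is a static nested class so it cannot capture a reference to the owning object, which would otherwise keep the owner reachable and prevent the cleanup action from ever running.

```java
import java.lang.ref.Cleaner;

// Generic Cleaner-instead-of-finalize sketch (not the actual Spark patch).
public class TrackedResource implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // Holds only what cleanup needs (e.g. a file channel in the real case);
    // it must not reference the owning TrackedResource.
    public static final class State implements Runnable {
        public volatile boolean released = false;
        @Override public void run() { released = true; } // e.g. channel.close()
    }

    public final State state = new State();
    private final Cleaner.Cleanable cleanable;

    public TrackedResource() {
        // If close() is never called, the Cleaner runs State.run() after
        // this object becomes phantom-reachable.
        this.cleanable = CLEANER.register(this, state);
    }

    @Override
    public void close() {
        cleanable.clean(); // runs State.run() at most once, then deregisters
    }
}
```

Unlike `finalize`, `clean()` gives deterministic, at-most-once cleanup on explicit `close()`, with the Cleaner thread only as a safety net.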
[jira] [Updated] (SPARK-45532) Restore codetabs for the Protobuf Data Source Guide
[ https://issues.apache.org/jira/browse/SPARK-45532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45532: --- Labels: pull-request-available (was: ) > Restore codetabs for the Protobuf Data Source Guide > --- > > Key: SPARK-45532 > URL: https://issues.apache.org/jira/browse/SPARK-45532 > Project: Spark > Issue Type: Improvement > Components: Documentation >Affects Versions: 4.0.0 >Reporter: Kent Yao >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-31357) DataSourceV2: Catalog API for view metadata
[ https://issues.apache.org/jira/browse/SPARK-31357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-31357: --- Labels: SPIP pull-request-available (was: SPIP) > DataSourceV2: Catalog API for view metadata > --- > > Key: SPARK-31357 > URL: https://issues.apache.org/jira/browse/SPARK-31357 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: John Zhuge >Priority: Major > Labels: SPIP, pull-request-available > > SPARK-24252 added a catalog plugin system and `TableCatalog` API that > provided table metadata to Spark. This JIRA adds `ViewCatalog` API for view > metadata. > Details in [SPIP > document|https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing]. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45534) Use `java.lang.ref.Cleaner` instead of `finalize` for `RemoteBlockPushResolver`
Yang Jie created SPARK-45534: Summary: Use `java.lang.ref.Cleaner` instead of `finalize` for `RemoteBlockPushResolver` Key: SPARK-45534 URL: https://issues.apache.org/jira/browse/SPARK-45534 Project: Spark Issue Type: Sub-task Components: Spark Core Affects Versions: 4.0.0 Reporter: Yang Jie -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45533) Use `j.l.r.Cleaner` instead of `finalize` for `RocksDBIterator/LevelDBIterator`
Yang Jie created SPARK-45533: Summary: Use `j.l.r.Cleaner` instead of `finalize` for `RocksDBIterator/LevelDBIterator` Key: SPARK-45533 URL: https://issues.apache.org/jira/browse/SPARK-45533 Project: Spark Issue Type: Sub-task Components: Spark Core Affects Versions: 4.0.0 Reporter: Yang Jie -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45532) Restore codetabs for the Protobuf Data Source Guide
Kent Yao created SPARK-45532: Summary: Restore codetabs for the Protobuf Data Source Guide Key: SPARK-45532 URL: https://issues.apache.org/jira/browse/SPARK-45532 Project: Spark Issue Type: Improvement Components: Documentation Affects Versions: 4.0.0 Reporter: Kent Yao -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45524) Initial support for Python data source read API
[ https://issues.apache.org/jira/browse/SPARK-45524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45524: --- Labels: pull-request-available (was: ) > Initial support for Python data source read API > --- > > Key: SPARK-45524 > URL: https://issues.apache.org/jira/browse/SPARK-45524 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Priority: Major > Labels: pull-request-available > > Support Python data source API for reading data. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45531) Add more comments and rename some variable name for InjectRuntimeFilter
[ https://issues.apache.org/jira/browse/SPARK-45531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45531: --- Labels: pull-request-available (was: ) > Add more comments and rename some variable name for InjectRuntimeFilter > --- > > Key: SPARK-45531 > URL: https://issues.apache.org/jira/browse/SPARK-45531 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Jiaan Geng >Assignee: Jiaan Geng >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45531) Add more comments and rename some variable name for InjectRuntimeFilter
Jiaan Geng created SPARK-45531: -- Summary: Add more comments and rename some variable name for InjectRuntimeFilter Key: SPARK-45531 URL: https://issues.apache.org/jira/browse/SPARK-45531 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 4.0.0 Reporter: Jiaan Geng Assignee: Jiaan Geng -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45530) Use `java.lang.ref.Cleaner` instead of `finalize` for `NioBufferedFileInputStream`
Yang Jie created SPARK-45530: Summary: Use `java.lang.ref.Cleaner` instead of `finalize` for `NioBufferedFileInputStream` Key: SPARK-45530 URL: https://issues.apache.org/jira/browse/SPARK-45530 Project: Spark Issue Type: Sub-task Components: Spark Core Affects Versions: 4.0.0 Reporter: Yang Jie -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45498) Followup: Ignore task completion from old stage after retrying indeterminate stages
[ https://issues.apache.org/jira/browse/SPARK-45498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-45498. - Fix Version/s: 3.5.1 4.0.0 Resolution: Fixed Issue resolved by pull request 43326 [https://github.com/apache/spark/pull/43326] > Followup: Ignore task completion from old stage after retrying indeterminate > stages > --- > > Key: SPARK-45498 > URL: https://issues.apache.org/jira/browse/SPARK-45498 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 4.0.0, 3.5.1 >Reporter: Mayur Bhosale >Assignee: Mayur Bhosale >Priority: Minor > Labels: pull-request-available > Fix For: 3.5.1, 4.0.0 > > > With SPARK-45182, we added a fix to prevent laggard tasks from older attempts > of an indeterminate stage from marking the partition as completed in the map > output tracker. > When a task completes, the DAG scheduler also notifies all tasksets of the > stage that the partition is completed. A taskset will then not schedule such a > task if it is not already scheduled. This is not correct for an indeterminate > stage, since we want to re-run all of its tasks on a re-attempt -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45498) Followup: Ignore task completion from old stage after retrying indeterminate stages
[ https://issues.apache.org/jira/browse/SPARK-45498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-45498: --- Assignee: Mayur Bhosale > Followup: Ignore task completion from old stage after retrying indeterminate > stages > --- > > Key: SPARK-45498 > URL: https://issues.apache.org/jira/browse/SPARK-45498 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 4.0.0, 3.5.1 >Reporter: Mayur Bhosale >Assignee: Mayur Bhosale >Priority: Minor > Labels: pull-request-available > > With SPARK-45182, we added a fix to prevent laggard tasks from older attempts > of an indeterminate stage from marking the partition as completed in the map > output tracker. > When a task completes, the DAG scheduler also notifies all tasksets of the > stage that the partition is completed. A taskset will then not schedule such a > task if it is not already scheduled. This is not correct for an indeterminate > stage, since we want to re-run all of its tasks on a re-attempt -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45528) Improve the example of DataFrameReader/Writer.options to take a dictionary
[ https://issues.apache.org/jira/browse/SPARK-45528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-45528. -- Resolution: Duplicate > Improve the example of DataFrameReader/Writer.options to take a dictionary > -- > > Key: SPARK-45528 > URL: https://issues.apache.org/jira/browse/SPARK-45528 > Project: Spark > Issue Type: Documentation > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > for example, spark.read.options(**dictionary) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45529) Fix flaky KafkaSourceStressSuite
Deng Ziming created SPARK-45529: --- Summary: Fix flaky KafkaSourceStressSuite Key: SPARK-45529 URL: https://issues.apache.org/jira/browse/SPARK-45529 Project: Spark Issue Type: Improvement Components: Structured Streaming Affects Versions: 4.0.0 Reporter: Deng Ziming test("stress test with multiple topics and partitions") in KafkaSourceStressSuite is flaky; when we increase `iterations` from 50 to 100, it consistently fails locally. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45528) Improve the example of DataFrameReader/Writer.options to take a dictionary
[ https://issues.apache.org/jira/browse/SPARK-45528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45528: --- Labels: pull-request-available (was: ) > Improve the example of DataFrameReader/Writer.options to take a dictionary > -- > > Key: SPARK-45528 > URL: https://issues.apache.org/jira/browse/SPARK-45528 > Project: Spark > Issue Type: Documentation > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > for example, spark.read.options(**dictionary) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45526) Refine docstring of `options` for dataframe reader and writer
[ https://issues.apache.org/jira/browse/SPARK-45526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45526: --- Labels: pull-request-available (was: ) > Refine docstring of `options` for dataframe reader and writer > - > > Key: SPARK-45526 > URL: https://issues.apache.org/jira/browse/SPARK-45526 > Project: Spark > Issue Type: Sub-task > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Priority: Major > Labels: pull-request-available > > Refine the docstring of the `options` method of DataFrameReader and > DataFrameWriter. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45528) Improve the example of DataFrameReader/Writer.options to take a dictionary
Hyukjin Kwon created SPARK-45528: Summary: Improve the example of DataFrameReader/Writer.options to take a dictionary Key: SPARK-45528 URL: https://issues.apache.org/jira/browse/SPARK-45528 Project: Spark Issue Type: Documentation Components: Documentation, PySpark Affects Versions: 4.0.0 Reporter: Hyukjin Kwon for example, spark.read.options(**dictionary) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-45527) Task fraction resource request is not expected
[ https://issues.apache.org/jira/browse/SPARK-45527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17774756#comment-17774756 ] wuyi commented on SPARK-45527: -- cc [~wbo4958] [~tgraves] > Task fraction resource request is not expected > -- > > Key: SPARK-45527 > URL: https://issues.apache.org/jira/browse/SPARK-45527 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.2.1, 3.3.3, 3.4.1, 3.5.0 >Reporter: wuyi >Priority: Major > > > {code:java} > test("SPARK-XXX") { > import org.apache.spark.resource.{ResourceProfileBuilder, > TaskResourceRequests} > withTempDir { dir => > val scriptPath = createTempScriptWithExpectedOutput(dir, > "gpuDiscoveryScript", > """{"name": "gpu","addresses":["0"]}""") > val conf = new SparkConf() > .setAppName("test") > .setMaster("local-cluster[1, 12, 1024]") > .set("spark.executor.cores", "12") > conf.set(TASK_GPU_ID.amountConf, "0.08") > conf.set(WORKER_GPU_ID.amountConf, "1") > conf.set(WORKER_GPU_ID.discoveryScriptConf, scriptPath) > conf.set(EXECUTOR_GPU_ID.amountConf, "1") > sc = new SparkContext(conf) > val rdd = sc.range(0, 100, 1, 4) > var rdd1 = rdd.repartition(3) > val treqs = new TaskResourceRequests().cpus(1).resource("gpu", 1.0) > val rp = new ResourceProfileBuilder().require(treqs).build > rdd1 = rdd1.withResources(rp) > assert(rdd1.collect().size === 100) > } > } {code} > In the above test, the 3 tasks generated by rdd1 are expected to be executed > in sequence as we expect "new TaskResourceRequests().cpus(1).resource("gpu", > 1.0)" should override "conf.set(TASK_GPU_ID.amountConf, "0.08")". However, > those 3 tasks are run in parallel in fact. > The root cause is that ExecutorData#ExecutorResourceInfo#numParts is static. > In this case, the "gpu.numParts" is initialized with 12 (1/0.08) and won't > change even if there's a new task resource request (e.g., resource("gpu", > 1.0) in this case). Thus, those 3 tasks are able to be executed in parallel. 
> -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45527) Task fraction resource request is not expected
wuyi created SPARK-45527: Summary: Task fraction resource request is not expected Key: SPARK-45527 URL: https://issues.apache.org/jira/browse/SPARK-45527 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.5.0, 3.4.1, 3.3.3, 3.2.1 Reporter: wuyi

{code:java}
test("SPARK-XXX") {
  import org.apache.spark.resource.{ResourceProfileBuilder, TaskResourceRequests}
  withTempDir { dir =>
    val scriptPath = createTempScriptWithExpectedOutput(dir, "gpuDiscoveryScript",
      """{"name": "gpu","addresses":["0"]}""")
    val conf = new SparkConf()
      .setAppName("test")
      .setMaster("local-cluster[1, 12, 1024]")
      .set("spark.executor.cores", "12")
    conf.set(TASK_GPU_ID.amountConf, "0.08")
    conf.set(WORKER_GPU_ID.amountConf, "1")
    conf.set(WORKER_GPU_ID.discoveryScriptConf, scriptPath)
    conf.set(EXECUTOR_GPU_ID.amountConf, "1")
    sc = new SparkContext(conf)
    val rdd = sc.range(0, 100, 1, 4)
    var rdd1 = rdd.repartition(3)
    val treqs = new TaskResourceRequests().cpus(1).resource("gpu", 1.0)
    val rp = new ResourceProfileBuilder().require(treqs).build
    rdd1 = rdd1.withResources(rp)
    assert(rdd1.collect().size === 100)
  }
} {code}

In the above test, the 3 tasks generated by rdd1 are expected to run in sequence, because "new TaskResourceRequests().cpus(1).resource("gpu", 1.0)" should override "conf.set(TASK_GPU_ID.amountConf, "0.08")". However, those 3 tasks in fact run in parallel. The root cause is that ExecutorData#ExecutorResourceInfo#numParts is static. In this case, "gpu.numParts" is initialized with 12 (1/0.08) and does not change even when a new task resource request arrives (e.g., resource("gpu", 1.0) in this case). Thus, those 3 tasks are able to be executed in parallel. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45527) Task fraction resource request is not expected
[ https://issues.apache.org/jira/browse/SPARK-45527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-45527: - Description: {code:java} test("SPARK-XXX") { import org.apache.spark.resource.{ResourceProfileBuilder, TaskResourceRequests} withTempDir { dir => val scriptPath = createTempScriptWithExpectedOutput(dir, "gpuDiscoveryScript", """{"name": "gpu","addresses":["0"]}""") val conf = new SparkConf() .setAppName("test") .setMaster("local-cluster[1, 12, 1024]") .set("spark.executor.cores", "12") conf.set(TASK_GPU_ID.amountConf, "0.08") conf.set(WORKER_GPU_ID.amountConf, "1") conf.set(WORKER_GPU_ID.discoveryScriptConf, scriptPath) conf.set(EXECUTOR_GPU_ID.amountConf, "1") sc = new SparkContext(conf) val rdd = sc.range(0, 100, 1, 4) var rdd1 = rdd.repartition(3) val treqs = new TaskResourceRequests().cpus(1).resource("gpu", 1.0) val rp = new ResourceProfileBuilder().require(treqs).build rdd1 = rdd1.withResources(rp) assert(rdd1.collect().size === 100) } } {code} In the above test, the 3 tasks generated by rdd1 are expected to be executed in sequence as we expect "new TaskResourceRequests().cpus(1).resource("gpu", 1.0)" should override "conf.set(TASK_GPU_ID.amountConf, "0.08")". However, those 3 tasks are run in parallel in fact. The root cause is that ExecutorData#ExecutorResourceInfo#numParts is static. In this case, the "gpu.numParts" is initialized with 12 (1/0.08) and won't change even if there's a new task resource request (e.g., resource("gpu", 1.0) in this case). Thus, those 3 tasks are able to be executed in parallel. 
> Task fraction resource request is not expected > -- > > Key: SPARK-45527 > URL: https://issues.apache.org/jira/browse/SPARK-45527 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.2.1, 3.3.3, 3.4.1, 3.5.0 >Reporter: wuyi >Priority: Major > > > {code:java} > test("SPARK-XXX") { > import org.apache.spark.resource.{ResourceProfileBuilder, > TaskResourceRequests} > withTempDir { dir => > val scriptPath = createTempScriptWithExpectedOutput(dir, > "gpuDiscoveryScript", > """{"name": "gpu","addresses":["0"]}""") > val conf = new SparkConf() > .setAppName("test") > .setMaster("local-cluster[1, 12, 1024]") > .set("spark.executor.cores", "12") > conf.set(TASK_GPU_ID.amountConf, "0.08") > conf.set(WORKER_GPU_ID.amountConf, "1") > conf.set(WORKER_GPU_ID.discoveryScriptConf, scriptPath) > conf.set(EXECUTOR_GPU_ID.amountConf, "1") > sc = new SparkContext(conf) > val rdd = sc.range(0, 100, 1, 4) > var rdd1 = rdd.repartition(3) > val treqs = new TaskResourceRequests().cpus(1).resource("gpu", 1.0) > val rp = new ResourceProfileBuilder().require(treqs).build > rdd1 = rdd1.withResources(rp) > assert(rdd1.collect().size === 100) > } > } {code} > In the above test, the 3 tasks generated by rdd1 are expected to be executed > in sequence as we expect "new TaskResourceRequests().cpus(1).resource("gpu", > 1.0)" should override "conf.set(TASK_GPU_ID.amountConf, "0.08")". However, > those 3 tasks are run in parallel in fact. > The root cause is that
[jira] [Created] (SPARK-45526) Refine docstring of `options` for dataframe reader and writer
Allison Wang created SPARK-45526: Summary: Refine docstring of `options` for dataframe reader and writer Key: SPARK-45526 URL: https://issues.apache.org/jira/browse/SPARK-45526 Project: Spark Issue Type: Sub-task Components: Documentation, PySpark Affects Versions: 4.0.0 Reporter: Allison Wang Refine the docstring of the `options` method of DataFrameReader and DataFrameWriter. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45515) Use enhanced `switch` expressions to replace the regular `switch` statement
[ https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45515: --- Labels: pull-request-available (was: ) > Use enhanced `switch` expressions to replace the regular `switch` statement > --- > > Key: SPARK-45515 > URL: https://issues.apache.org/jira/browse/SPARK-45515 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Labels: pull-request-available > > refer to [JEP 361|https://openjdk.org/jeps/361] > > Example: > {code:java} > double getPrice(String fruit) { > switch (fruit) { > case "Apple": > return 1.0; > case "Orange": > return 1.5; > case "Mango": > return 2.0; > default: > throw new IllegalArgumentException(); > } > } {code} > Can be changed to > {code:java} > double getPrice(String fruit) { > return switch (fruit) { > case "Apple" -> 1.0; > case "Orange" -> 1.5; > case "Mango" -> 2.0; > default -> throw new IllegalArgumentException(); > }; > } {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45515) Use enhanced `switch` expressions to replace the regular `switch` statement
[ https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie resolved SPARK-45515. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43349 [https://github.com/apache/spark/pull/43349] > Use enhanced `switch` expressions to replace the regular `switch` statement > --- > > Key: SPARK-45515 > URL: https://issues.apache.org/jira/browse/SPARK-45515 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > refer to [JEP 361|https://openjdk.org/jeps/361] > > Example: > {code:java} > double getPrice(String fruit) { > switch (fruit) { > case "Apple": > return 1.0; > case "Orange": > return 1.5; > case "Mango": > return 2.0; > default: > throw new IllegalArgumentException(); > } > } {code} > Can be changed to > {code:java} > double getPrice(String fruit) { > return switch (fruit) { > case "Apple" -> 1.0; > case "Orange" -> 1.5; > case "Mango" -> 2.0; > default -> throw new IllegalArgumentException(); > }; > } {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45515) Use enhanced `switch` expressions to replace the regular `switch` statement
[ https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie reassigned SPARK-45515: Assignee: Yang Jie > Use enhanced `switch` expressions to replace the regular `switch` statement > --- > > Key: SPARK-45515 > URL: https://issues.apache.org/jira/browse/SPARK-45515 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > > refer to [JEP 361|https://openjdk.org/jeps/361] > > Example: > {code:java} > double getPrice(String fruit) { > switch (fruit) { > case "Apple": > return 1.0; > case "Orange": > return 1.5; > case "Mango": > return 2.0; > default: > throw new IllegalArgumentException(); > } > } {code} > Can be changed to > {code:java} > double getPrice(String fruit) { > return switch (fruit) { > case "Apple" -> 1.0; > case "Orange" -> 1.5; > case "Mango" -> 2.0; > default -> throw new IllegalArgumentException(); > }; > } {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45525) Initial support for Python data source write API
Allison Wang created SPARK-45525: Summary: Initial support for Python data source write API Key: SPARK-45525 URL: https://issues.apache.org/jira/browse/SPARK-45525 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 4.0.0 Reporter: Allison Wang Support for Python data source write API -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45524) Initial support for Python data source read API
Allison Wang created SPARK-45524: Summary: Initial support for Python data source read API Key: SPARK-45524 URL: https://issues.apache.org/jira/browse/SPARK-45524 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 4.0.0 Reporter: Allison Wang Support Python data source API for reading data. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45521) Avoid re-computation of nnz in VectorAssembler
[ https://issues.apache.org/jira/browse/SPARK-45521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruifeng Zheng reassigned SPARK-45521: - Assignee: Ruifeng Zheng > Avoid re-computation of nnz in VectorAssembler > -- > > Key: SPARK-45521 > URL: https://issues.apache.org/jira/browse/SPARK-45521 > Project: Spark > Issue Type: Improvement > Components: ML >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45521) Avoid re-computation of nnz in VectorAssembler
[ https://issues.apache.org/jira/browse/SPARK-45521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruifeng Zheng resolved SPARK-45521. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43353 [https://github.com/apache/spark/pull/43353] > Avoid re-computation of nnz in VectorAssembler > -- > > Key: SPARK-45521 > URL: https://issues.apache.org/jira/browse/SPARK-45521 > Project: Spark > Issue Type: Improvement > Components: ML >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45418) Change CURRENT_SCHEMA() column alias to match function name
[ https://issues.apache.org/jira/browse/SPARK-45418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-45418. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43235 [https://github.com/apache/spark/pull/43235] > Change CURRENT_SCHEMA() column alias to match function name > --- > > Key: SPARK-45418 > URL: https://issues.apache.org/jira/browse/SPARK-45418 > Project: Spark > Issue Type: Bug > Components: Documentation, SQL >Affects Versions: 3.5.0 >Reporter: Michael Zhang >Assignee: Michael Zhang >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45418) Change CURRENT_SCHEMA() column alias to match function name
[ https://issues.apache.org/jira/browse/SPARK-45418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-45418: --- Assignee: Michael Zhang > Change CURRENT_SCHEMA() column alias to match function name > --- > > Key: SPARK-45418 > URL: https://issues.apache.org/jira/browse/SPARK-45418 > Project: Spark > Issue Type: Bug > Components: Documentation, SQL >Affects Versions: 3.5.0 >Reporter: Michael Zhang >Assignee: Michael Zhang >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-34612) Whether to expose outputDeterministicLevel so custom RDDs can set deterministic level
[ https://issues.apache.org/jira/browse/SPARK-34612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-34612: --- Labels: pull-request-available (was: ) > Whether to expose outputDeterministicLevel so custom RDDs can set > deterministic level > - > > Key: SPARK-34612 > URL: https://issues.apache.org/jira/browse/SPARK-34612 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.2.0 >Reporter: L. C. Hsieh >Priority: Major > Labels: pull-request-available > > This ticket is open to track a TODO item in RDD.outputDeterministicLevel. > We need to decide if we want to expose it so users can set deterministic > level to their custom RDDs. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45505) Refactor analyzeInPython function to make it reusable
[ https://issues.apache.org/jira/browse/SPARK-45505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin resolved SPARK-45505. --- Fix Version/s: 4.0.0 Assignee: Allison Wang Resolution: Fixed Issue resolved by pull request 43340 https://github.com/apache/spark/pull/43340 > Refactor analyzeInPython function to make it reusable > - > > Key: SPARK-45505 > URL: https://issues.apache.org/jira/browse/SPARK-45505 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Refactor analyzeInPython method in UserDefinedPythonTableFunction object into > an abstract class so that it can be reused in the future. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45523) Return useful error message if UDTF returns None for non-nullable column
[ https://issues.apache.org/jira/browse/SPARK-45523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45523: --- Labels: pull-request-available (was: ) > Return useful error message if UDTF returns None for non-nullable column > > > Key: SPARK-45523 > URL: https://issues.apache.org/jira/browse/SPARK-45523 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Daniel >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45523) Return useful error message if UDTF returns None for non-nullable column
Daniel created SPARK-45523: -- Summary: Return useful error message if UDTF returns None for non-nullable column Key: SPARK-45523 URL: https://issues.apache.org/jira/browse/SPARK-45523 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 4.0.0 Reporter: Daniel -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45516) Include QueryContext in SparkThrowable proto message
[ https://issues.apache.org/jira/browse/SPARK-45516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-45516. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43352 [https://github.com/apache/spark/pull/43352] > Include QueryContext in SparkThrowable proto message > > > Key: SPARK-45516 > URL: https://issues.apache.org/jira/browse/SPARK-45516 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Yihong He >Assignee: Yihong He >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45516) Include QueryContext in SparkThrowable proto message
[ https://issues.apache.org/jira/browse/SPARK-45516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-45516: Assignee: Yihong He > Include QueryContext in SparkThrowable proto message > > > Key: SPARK-45516 > URL: https://issues.apache.org/jira/browse/SPARK-45516 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Yihong He >Assignee: Yihong He >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] (SPARK-44594) Remove redundant method parameter in kafka connector
[ https://issues.apache.org/jira/browse/SPARK-44594 ] Philip Dakin deleted comment on SPARK-44594: -- was (Author: JIRAUSER302581): Related PR: https://github.com/apache/spark/pull/42198 > Remove redundant method parameter in kafka connector > > > Key: SPARK-44594 > URL: https://issues.apache.org/jira/browse/SPARK-44594 > Project: Spark > Issue Type: Improvement > Components: Input/Output >Affects Versions: 4.0.0 >Reporter: Min Zhao >Priority: Trivial > Labels: pull-request-available > Fix For: 4.0.0 > > > There are redundant parameters in > org.apache.spark.sql.kafka010.KafkaWriter#validateQuery and > org.apache.spark.sql.kafka010.KafkaWriter#write; they are not used, so > removing them makes the code more concise. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-44594) Remove redundant method parameter in kafka connector
[ https://issues.apache.org/jira/browse/SPARK-44594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17774695#comment-17774695 ] Philip Dakin commented on SPARK-44594: -- Related PR: https://github.com/apache/spark/pull/42198 > Remove redundant method parameter in kafka connector > > > Key: SPARK-44594 > URL: https://issues.apache.org/jira/browse/SPARK-44594 > Project: Spark > Issue Type: Improvement > Components: Input/Output >Affects Versions: 4.0.0 >Reporter: Min Zhao >Priority: Trivial > Labels: pull-request-available > Fix For: 4.0.0 > > > There are redundant parameters in > org.apache.spark.sql.kafka010.KafkaWriter#validateQuery and > org.apache.spark.sql.kafka010.KafkaWriter#write; they are not used, so > removing them makes the code more concise. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-44594) Remove redundant method parameter in kafka connector
[ https://issues.apache.org/jira/browse/SPARK-44594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-44594: --- Labels: pull-request-available (was: ) > Remove redundant method parameter in kafka connector > > > Key: SPARK-44594 > URL: https://issues.apache.org/jira/browse/SPARK-44594 > Project: Spark > Issue Type: Improvement > Components: Input/Output >Affects Versions: 4.0.0 >Reporter: Min Zhao >Priority: Trivial > Labels: pull-request-available > Fix For: 4.0.0 > > > There are redundant parameters in > org.apache.spark.sql.kafka010.KafkaWriter#validateQuery and > org.apache.spark.sql.kafka010.KafkaWriter#write; they are not used, so > removing them makes the code more concise. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45506) Support ivy URIs in SparkConnect addArtifact
[ https://issues.apache.org/jira/browse/SPARK-45506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45506: --- Labels: pull-request-available (was: ) > Support ivy URIs in SparkConnect addArtifact > > > Key: SPARK-45506 > URL: https://issues.apache.org/jira/browse/SPARK-45506 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Vsevolod Stepanov >Priority: Major > Labels: pull-request-available > > Right now Spark Connect's addArtifact API supports only adding .jar & .class > files. It would be useful to extend this API to support adding arbitrary > Maven artifacts using Ivy -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45512) Fix compilation warnings related to other-nullary-override
[ https://issues.apache.org/jira/browse/SPARK-45512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45512: --- Labels: pull-request-available (was: ) > Fix compilation warnings related to other-nullary-override > -- > > Key: SPARK-45512 > URL: https://issues.apache.org/jira/browse/SPARK-45512 > Project: Spark > Issue Type: Sub-task > Components: DStreams, Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Labels: pull-request-available > > {code:java} > [error] > /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/CloseableIterator.scala:36:16: > method with a single empty parameter list overrides method hasNext in trait > Iterator defined without a parameter list [quickfixable] > [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=other-nullary-override, > site=org.apache.spark.sql.connect.client.WrappedCloseableIterator > [error] override def hasNext(): Boolean = innerIterator.hasNext > [error] ^ > [error] > /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/ExecutePlanResponseReattachableIterator.scala:136:16: > method without a parameter list overrides method hasNext in class > WrappedCloseableIterator defined with a single empty parameter list > [quickfixable] > [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=other-nullary-override, > site=org.apache.spark.sql.connect.client.ExecutePlanResponseReattachableIterator > [error] override def hasNext: Boolean = synchronized { > [error] ^ > [error] > /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala:73:20: > method without a parameter list overrides method hasNext in class > WrappedCloseableIterator defined with a single empty parameter list > [quickfixable] > [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=other-nullary-override, > site=org.apache.spark.sql.connect.client.GrpcExceptionConverter.convertIterator > [error] override def hasNext: Boolean = { > [error] ^ > [error] > /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcRetryHandler.scala:77:18: > method without a parameter list overrides method next in class > WrappedCloseableIterator defined with a single empty parameter list > [quickfixable] > [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=other-nullary-override, > site=org.apache.spark.sql.connect.client.GrpcRetryHandler.RetryIterator > [error] override def next: U = { > [error] ^ > [error] > /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcRetryHandler.scala:81:18: > method without a parameter list overrides method hasNext in class > WrappedCloseableIterator defined with a single empty parameter list > [quickfixable] > [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=other-nullary-override, > site=org.apache.spark.sql.connect.client.GrpcRetryHandler.RetryIterator > [error] override def hasNext: Boolean = { > [error] ^ > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45501) Use pattern matching for type checking and conversion
[ https://issues.apache.org/jira/browse/SPARK-45501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie resolved SPARK-45501. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43327 [https://github.com/apache/spark/pull/43327] > Use pattern matching for type checking and conversion > - > > Key: SPARK-45501 > URL: https://issues.apache.org/jira/browse/SPARK-45501 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > > Refer to [JEP 394|https://openjdk.org/jeps/394] > Example: > {code:java} > if (obj instanceof String) { > String str = (String) obj; > System.out.println(str); > } {code} > Can be replaced with > > {code:java} > if (obj instanceof String str) { > System.out.println(str); > } {code} > The new code looks more compact. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45501) Use pattern matching for type checking and conversion
[ https://issues.apache.org/jira/browse/SPARK-45501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie reassigned SPARK-45501: Assignee: Yang Jie > Use pattern matching for type checking and conversion > - > > Key: SPARK-45501 > URL: https://issues.apache.org/jira/browse/SPARK-45501 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Minor > Labels: pull-request-available > > Refer to [JEP 394|https://openjdk.org/jeps/394] > Example: > {code:java} > if (obj instanceof String) { > String str = (String) obj; > System.out.println(str); > } {code} > Can be replaced with > > {code:java} > if (obj instanceof String str) { > System.out.println(str); > } {code} > The new code looks more compact. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45502) Upgrade Kafka to 3.6.0
[ https://issues.apache.org/jira/browse/SPARK-45502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-45502. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43348 [https://github.com/apache/spark/pull/43348] > Upgrade Kafka to 3.6.0 > -- > > Key: SPARK-45502 > URL: https://issues.apache.org/jira/browse/SPARK-45502 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Assignee: Deng Ziming >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Apache Kafka 3.6.0 is released on Oct 10, 2023. > - https://downloads.apache.org/kafka/3.6.0/RELEASE_NOTES.html -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45522) Migrate jetty 9 to jetty 12
[ https://issues.apache.org/jira/browse/SPARK-45522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-45522: - Description: Jetty 12 supports JakartaEE 8/JakartaEE 9/JakartaEE 10 simultaneously. But the version span is quite large and the documentation needs a detailed read; it is not certain this can be completed within the 4.0 cycle, so it is set to low priority. > Migrate jetty 9 to jetty 12 > --- > > Key: SPARK-45522 > URL: https://issues.apache.org/jira/browse/SPARK-45522 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Minor > > Jetty 12 supports JakartaEE 8/JakartaEE 9/JakartaEE 10 simultaneously. But > the version span is quite large and the documentation needs a detailed read; > it is not certain this can be completed within the 4.0 cycle, so it is set to > low priority. > > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45522) Migrate jetty 9 to jetty 12
[ https://issues.apache.org/jira/browse/SPARK-45522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-45522: - Priority: Minor (was: Major) > Migrate jetty 9 to jetty 12 > --- > > Key: SPARK-45522 > URL: https://issues.apache.org/jira/browse/SPARK-45522 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45132) Fix IDENTIFIER clause for functions
[ https://issues.apache.org/jira/browse/SPARK-45132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-45132: --- Assignee: Serge Rielau > Fix IDENTIFIER clause for functions > --- > > Key: SPARK-45132 > URL: https://issues.apache.org/jira/browse/SPARK-45132 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: Serge Rielau >Assignee: Serge Rielau >Priority: Major > Labels: pull-request-available > > Due to a quirk in the grammar IDENTIFIER('foo')() does not resolve > depending on . > Example: > SELECT IDENTIFIER('abs')(-1) works, but > SELECT IDENTIFIER('abs')(c1) FROM VALUES(-1) AS T(c1) does not. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45521) Avoid re-computation of nnz in VectorAssembler
[ https://issues.apache.org/jira/browse/SPARK-45521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45521: --- Labels: pull-request-available (was: ) > Avoid re-computation of nnz in VectorAssembler > -- > > Key: SPARK-45521 > URL: https://issues.apache.org/jira/browse/SPARK-45521 > Project: Spark > Issue Type: Improvement > Components: ML >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45132) Fix IDENTIFIER clause for functions
[ https://issues.apache.org/jira/browse/SPARK-45132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-45132. - Fix Version/s: 3.5.1 4.0.0 Resolution: Fixed Issue resolved by pull request 42888 [https://github.com/apache/spark/pull/42888] > Fix IDENTIFIER clause for functions > --- > > Key: SPARK-45132 > URL: https://issues.apache.org/jira/browse/SPARK-45132 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: Serge Rielau >Assignee: Serge Rielau >Priority: Major > Labels: pull-request-available > Fix For: 3.5.1, 4.0.0 > > > Due to a quirk in the grammar IDENTIFIER('foo')() does not resolve > depending on . > Example: > SELECT IDENTIFIER('abs')(-1) works, but > SELECT IDENTIFIER('abs')(c1) FROM VALUES(-1) AS T(c1) does not. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45522) Migrate jetty 9 to jetty 12
Yang Jie created SPARK-45522: Summary: Migrate jetty 9 to jetty 12 Key: SPARK-45522 URL: https://issues.apache.org/jira/browse/SPARK-45522 Project: Spark Issue Type: Sub-task Components: Build Affects Versions: 4.0.0 Reporter: Yang Jie -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45521) Avoid re-computation of nnz in VectorAssembler
Ruifeng Zheng created SPARK-45521: - Summary: Avoid re-computation of nnz in VectorAssembler Key: SPARK-45521 URL: https://issues.apache.org/jira/browse/SPARK-45521 Project: Spark Issue Type: Improvement Components: ML Affects Versions: 4.0.0 Reporter: Ruifeng Zheng -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45520) Add property testing for error constructors in scala clients
Yihong He created SPARK-45520: - Summary: Add property testing for error constructors in scala clients Key: SPARK-45520 URL: https://issues.apache.org/jira/browse/SPARK-45520 Project: Spark Issue Type: Test Components: Connect Affects Versions: 4.0.0 Reporter: Yihong He -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45519) cleanSource problem on FileStreamSource for Windows env
Yunus Emre Gürses created SPARK-45519: - Summary: cleanSource problem on FileStreamSource for Windows env Key: SPARK-45519 URL: https://issues.apache.org/jira/browse/SPARK-45519 Project: Spark Issue Type: Bug Components: Structured Streaming Affects Versions: 3.4.1 Reporter: Yunus Emre Gürses We are using Spark with Scala in a Windows environment. While streaming with Spark, I set the *{{cleanSource}}* option to "archive" and the *{{sourceArchiveDir}}* option to "archived", as in the code below. {code:java} spark.readStream .option("cleanSource", "archive") .option("sourceArchiveDir", "archived"){code} When I tried this in a Linux environment, I realized that the problem was with the paths: when I set cleanSource to "delete", it works on both Linux and Windows, but "archive" mode does not work on Windows. The problem is related to how paths are appended on Windows. There is a method {code:java} override protected def cleanTask(entry: FileEntry): Unit{code} in the FileStreamSource.scala file in the org.apache.spark.sql.execution.streaming package. On line 569, the !fileSystem.rename(curPath, newPath) call is supposed to move the source file to the archive folder. However, when I debugged, I noticed that the curPath and newPath values were as follows on Windows: {code:java} curPath: file:/C:/dev/be/data-integration-suite/test-data/streaming-folder/patients/patients-success.csv{code} {code:java} newPath: file:/C:/dev/be/data-integration-suite/archived/C:/dev/be/data-integration-suite/test-data/streaming-folder/patients/patients-success.csv{code} It seems that the absolute path of the csv file was appended when creating newPath, because *C:/dev/be/data-integration-suite* appears twice in newPath. This is probably the reason Spark archiving does not work. 
Instead, newPath should be: file:/C:/dev/be/data-integration-suite/archived/test-data/streaming-folder/patients/patients-success.csv -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
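The duplicated-prefix behavior described in this report can be reproduced outside Spark with plain `java.nio.file`. The sketch below is illustrative only (the class and method names are mine, and Spark itself builds these paths with Hadoop's `Path`, not `java.nio`): joining an already-absolute source path under the archive directory splices the whole path in verbatim, while relativizing against the base directory first yields the expected target.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative sketch, not the actual FileStreamSource code.
public class ArchivePathDemo {

    // Mirrors the reported failure mode: the child path is spliced under
    // the parent verbatim, even when the child is already absolute.
    static Path brokenArchivePath(Path archiveDir, Path source) {
        return Paths.get(archiveDir.toString(), source.toString());
    }

    // The expected behavior: keep only the part of the source path below
    // the base directory when moving it under the archive directory.
    static Path fixedArchivePath(Path baseDir, Path archiveDir, Path source) {
        return archiveDir.resolve(baseDir.relativize(source));
    }

    public static void main(String[] args) {
        Path base = Paths.get("/data-integration-suite");
        Path archive = Paths.get("/data-integration-suite/archived");
        Path src = Paths.get("/data-integration-suite/test-data/patients.csv");

        // Duplicated prefix, the same shape as the newPath in the report:
        System.out.println(brokenArchivePath(archive, src));
        // Clean target path under the archive directory:
        System.out.println(fixedArchivePath(base, archive, src));
    }
}
```

Under that reading, an archive path built from the relativized source would land in `archived/test-data/...` instead of repeating the drive-qualified prefix.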
[jira] [Updated] (SPARK-45518) Error framework support for Python Spark Connect Client
[ https://issues.apache.org/jira/browse/SPARK-45518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yihong He updated SPARK-45518: -- Description: * Define APIs for the error framework, especially where to expose QueryContext * Refactor the exception constructors to support error framework parameters * Reconstruct exceptions with error framework parameters in FetchErrorDetailResponses > Error framework support for Python Spark Connect Client > --- > > Key: SPARK-45518 > URL: https://issues.apache.org/jira/browse/SPARK-45518 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 4.0.0 >Reporter: Yihong He >Priority: Major > > * Define APIs for the error framework, especially where to expose QueryContext > * Refactor the exception constructors to support error framework parameters > * Reconstruct exceptions with error framework parameters in > FetchErrorDetailResponses -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45517) Expand more exception constructors to support error framework parameters
[ https://issues.apache.org/jira/browse/SPARK-45517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yihong He updated SPARK-45517: -- Description: * SparkNumberFormatException * SparkIllegalArgumentException * SparkArithmeticException * SparkUnsupportedOperationException * SparkArrayIndexOutOfBoundsException * SparkDateTimeException * SparkRuntimeException * SparkUpgradeException was: * SparkNumberFormatException * SparkIllegalArgumentException * SparkArithmeticException * SparkUnsupportedOperationException * SparkArrayIndexOutOfBoundsException * SparkDateTimeException > Expand more exception constructors to support error framework parameters > > > Key: SPARK-45517 > URL: https://issues.apache.org/jira/browse/SPARK-45517 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Yihong He >Priority: Major > > * SparkNumberFormatException > * SparkIllegalArgumentException > * SparkArithmeticException > * SparkUnsupportedOperationException > * SparkArrayIndexOutOfBoundsException > * SparkDateTimeException > * SparkRuntimeException > * SparkUpgradeException -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45518) Error framework support for Python Spark Connect Client
Yihong He created SPARK-45518: - Summary: Error framework support for Python Spark Connect Client Key: SPARK-45518 URL: https://issues.apache.org/jira/browse/SPARK-45518 Project: Spark Issue Type: New Feature Components: Connect Affects Versions: 4.0.0 Reporter: Yihong He -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-45517) Expand more exception constructors to support error framework parameters
Yihong He created SPARK-45517: - Summary: Expand more exception constructors to support error framework parameters Key: SPARK-45517 URL: https://issues.apache.org/jira/browse/SPARK-45517 Project: Spark Issue Type: Improvement Components: Connect Affects Versions: 4.0.0 Reporter: Yihong He * SparkNumberFormatException * SparkIllegalArgumentException * SparkArithmeticException * SparkUnsupportedOperationException * SparkArrayIndexOutOfBoundsException * SparkDateTimeException -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45516) Include QueryContext in SparkThrowable proto message
[ https://issues.apache.org/jira/browse/SPARK-45516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45516: --- Labels: pull-request-available (was: ) > Include QueryContext in SparkThrowable proto message > > > Key: SPARK-45516 > URL: https://issues.apache.org/jira/browse/SPARK-45516 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Yihong He >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45433) CSV/JSON schema inference when timestamps do not match specified timestampFormat with only one row on each partition report error
[ https://issues.apache.org/jira/browse/SPARK-45433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-45433: - Fix Version/s: 3.4.2 > CSV/JSON schema inference when timestamps do not match specified > timestampFormat with only one row on each partition report error > - > > Key: SPARK-45433 > URL: https://issues.apache.org/jira/browse/SPARK-45433 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0, 3.4.0, 3.5.0 >Reporter: Jia Fan >Assignee: Jia Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.2, 4.0.0, 3.5.1 > > > CSV/JSON schema inference when timestamps do not match the specified > timestampFormat with `only one row on each partition` reports an error. > {code:java} > //eg > val csv = spark.read.option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss") > .option("inferSchema", true).csv(Seq("2884-06-24T02:45:51.138").toDS()) > csv.show() {code} > {code:java} > //error > Caused by: java.time.format.DateTimeParseException: Text > '2884-06-24T02:45:51.138' could not be parsed, unparsed text found at index > 19 {code} > This bug affects 3.3/3.4/3.5. Unlike > https://issues.apache.org/jira/browse/SPARK-45424 , this is a different bug > but has the same error message -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
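The failure mode reported in SPARK-45433 can be reproduced outside Spark with plain `java.time`, which is what Spark's timestamp inference ultimately delegates to. The sketch below (class name is illustrative, not Spark code) assumes the `timestampFormat` was a pattern without fractional seconds, so the trailing `.138` is left unparsed:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class TimestampParseDemo {
    public static void main(String[] args) {
        // A pattern without fractional seconds, as in the bug report
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss");
        String value = "2884-06-24T02:45:51.138";
        try {
            LocalDateTime.parse(value, fmt);
        } catch (DateTimeParseException e) {
            // The first 19 characters match the pattern; ".138" remains unparsed,
            // producing the same "unparsed text found at index 19" message
            System.out.println(e.getMessage());
        }
    }
}
```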
[jira] [Created] (SPARK-45516) Include QueryContext in SparkThrowable proto message
Yihong He created SPARK-45516: - Summary: Include QueryContext in SparkThrowable proto message Key: SPARK-45516 URL: https://issues.apache.org/jira/browse/SPARK-45516 Project: Spark Issue Type: Improvement Components: Connect Affects Versions: 4.0.0 Reporter: Yihong He -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-43664) Fix TABLE_OR_VIEW_NOT_FOUND from SQLParityTests
[ https://issues.apache.org/jira/browse/SPARK-43664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-43664: Assignee: Haejoon Lee > Fix TABLE_OR_VIEW_NOT_FOUND from SQLParityTests > --- > > Key: SPARK-43664 > URL: https://issues.apache.org/jira/browse/SPARK-43664 > Project: Spark > Issue Type: Sub-task > Components: Connect, Pandas API on Spark >Affects Versions: 3.5.0 >Reporter: Haejoon Lee >Assignee: Haejoon Lee >Priority: Major > Labels: pull-request-available > > Repro: run `SQLParityTests.test_sql_with_index_col` -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-43664) Fix TABLE_OR_VIEW_NOT_FOUND from SQLParityTests
[ https://issues.apache.org/jira/browse/SPARK-43664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-43664. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43237 [https://github.com/apache/spark/pull/43237] > Fix TABLE_OR_VIEW_NOT_FOUND from SQLParityTests > --- > > Key: SPARK-43664 > URL: https://issues.apache.org/jira/browse/SPARK-43664 > Project: Spark > Issue Type: Sub-task > Components: Connect, Pandas API on Spark >Affects Versions: 3.5.0 >Reporter: Haejoon Lee >Assignee: Haejoon Lee >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Repro: run `SQLParityTests.test_sql_with_index_col` -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45515) Use enhanced `switch` expressions to replace the regular `switch` statement
[ https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-45515: -- Assignee: (was: Apache Spark) > Use enhanced `switch` expressions to replace the regular `switch` statement > --- > > Key: SPARK-45515 > URL: https://issues.apache.org/jira/browse/SPARK-45515 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > > refer to [JEP 361|https://openjdk.org/jeps/361] > > Example: > {code:java} > double getPrice(String fruit) { > switch (fruit) { > case "Apple": > return 1.0; > case "Orange": > return 1.5; > case "Mango": > return 2.0; > default: > throw new IllegalArgumentException(); > } > } {code} > Can be changed to > {code:java} > double getPrice(String fruit) { > return switch (fruit) { > case "Apple" -> 1.0; > case "Orange" -> 1.5; > case "Mango" -> 2.0; > default -> throw new IllegalArgumentException(); > }; > } {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
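The `getPrice` rewrite in SPARK-45515's description is valid on Java 14 and later. As a quick check, here is a self-contained wrapper (the class name is illustrative, not from the Spark codebase) exercising the expression form, where each `->` arm yields a value and there is no fall-through:

```java
public class SwitchExprDemo {
    // Enhanced switch expression (JEP 361): yields a value directly
    static double getPrice(String fruit) {
        return switch (fruit) {
            case "Apple" -> 1.0;
            case "Orange" -> 1.5;
            case "Mango" -> 2.0;
            default -> throw new IllegalArgumentException("unknown fruit: " + fruit);
        };
    }

    public static void main(String[] args) {
        System.out.println(getPrice("Orange")); // prints 1.5
    }
}
```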
[jira] [Assigned] (SPARK-45515) Use enhanced `switch` expressions to replace the regular `switch` statement
[ https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-45515: -- Assignee: Apache Spark > Use enhanced `switch` expressions to replace the regular `switch` statement > --- > > Key: SPARK-45515 > URL: https://issues.apache.org/jira/browse/SPARK-45515 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Apache Spark >Priority: Major > > refer to [JEP 361|https://openjdk.org/jeps/361] > > Example: > {code:java} > double getPrice(String fruit) { > switch (fruit) { > case "Apple": > return 1.0; > case "Orange": > return 1.5; > case "Mango": > return 2.0; > default: > throw new IllegalArgumentException(); > } > } {code} > Can be changed to > {code:java} > double getPrice(String fruit) { > return switch (fruit) { > case "Apple" -> 1.0; > case "Orange" -> 1.5; > case "Mango" -> 2.0; > default -> throw new IllegalArgumentException(); > }; > } {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45501) Use pattern matching for type checking and conversion
[ https://issues.apache.org/jira/browse/SPARK-45501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-45501: -- Assignee: (was: Apache Spark) > Use pattern matching for type checking and conversion > - > > Key: SPARK-45501 > URL: https://issues.apache.org/jira/browse/SPARK-45501 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Minor > Labels: pull-request-available > > Refer to [JEP 394|https://openjdk.org/jeps/394] > Example: > {code:java} > if (obj instanceof String) { > String str = (String) obj; > System.out.println(str); > } {code} > Can be replaced with > > {code:java} > if (obj instanceof String str) { > System.out.println(str); > } {code} > The new code looks more compact -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45501) Use pattern matching for type checking and conversion
[ https://issues.apache.org/jira/browse/SPARK-45501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-45501: -- Assignee: Apache Spark > Use pattern matching for type checking and conversion > - > > Key: SPARK-45501 > URL: https://issues.apache.org/jira/browse/SPARK-45501 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Apache Spark >Priority: Minor > Labels: pull-request-available > > Refer to [JEP 394|https://openjdk.org/jeps/394] > Example: > {code:java} > if (obj instanceof String) { > String str = (String) obj; > System.out.println(str); > } {code} > Can be replaced with > > {code:java} > if (obj instanceof String str) { > System.out.println(str); > } {code} > The new code looks more compact -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
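The before/after fragments in SPARK-45501's description (JEP 394 pattern matching for `instanceof`, Java 16+) can be wrapped in a small runnable program to confirm the two styles behave identically; the class, method, and variable names here are illustrative, not from the Spark codebase:

```java
public class PatternMatchDemo {
    // Old style: explicit test followed by a cast
    static String describeOld(Object obj) {
        if (obj instanceof String) {
            String str = (String) obj;
            return "String of length " + str.length();
        }
        return "not a String";
    }

    // New style (JEP 394): the binding variable is declared in the test itself
    static String describeNew(Object obj) {
        if (obj instanceof String str) {
            return "String of length " + str.length();
        }
        return "not a String";
    }

    public static void main(String[] args) {
        System.out.println(describeOld("spark")); // prints: String of length 5
        System.out.println(describeNew("spark")); // prints: String of length 5
        System.out.println(describeNew(42));      // prints: not a String
    }
}
```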
[jira] [Assigned] (SPARK-44649) Runtime Filter supports passing equivalent creation side expressions
[ https://issues.apache.org/jira/browse/SPARK-44649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-44649: -- Assignee: Apache Spark > Runtime Filter supports passing equivalent creation side expressions > > > Key: SPARK-44649 > URL: https://issues.apache.org/jira/browse/SPARK-44649 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 4.0.0 >Reporter: Jiaan Geng >Assignee: Apache Spark >Priority: Major > Labels: pull-request-available > > {code:java} > SELECT > d_year, > i_brand_id, > i_class_id, > i_category_id, > i_manufact_id, > cs_quantity - COALESCE(cr_return_quantity, 0) AS sales_cnt, > cs_ext_sales_price - COALESCE(cr_return_amount, 0.0) AS sales_amt > FROM catalog_sales > JOIN item ON i_item_sk = cs_item_sk > JOIN date_dim ON d_date_sk = cs_sold_date_sk > LEFT JOIN catalog_returns ON (cs_order_number = cr_order_number > AND cs_item_sk = cr_item_sk) > WHERE i_category = 'Books' > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-44649) Runtime Filter supports passing equivalent creation side expressions
[ https://issues.apache.org/jira/browse/SPARK-44649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-44649: -- Assignee: (was: Apache Spark) > Runtime Filter supports passing equivalent creation side expressions > > > Key: SPARK-44649 > URL: https://issues.apache.org/jira/browse/SPARK-44649 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 4.0.0 >Reporter: Jiaan Geng >Priority: Major > Labels: pull-request-available > > {code:java} > SELECT > d_year, > i_brand_id, > i_class_id, > i_category_id, > i_manufact_id, > cs_quantity - COALESCE(cr_return_quantity, 0) AS sales_cnt, > cs_ext_sales_price - COALESCE(cr_return_amount, 0.0) AS sales_amt > FROM catalog_sales > JOIN item ON i_item_sk = cs_item_sk > JOIN date_dim ON d_date_sk = cs_sold_date_sk > LEFT JOIN catalog_returns ON (cs_order_number = cr_order_number > AND cs_item_sk = cr_item_sk) > WHERE i_category = 'Books' > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45515) Use enhanced `switch` expressions to replace the regular `switch` statement
[ https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-45515: - Summary: Use enhanced `switch` expressions to replace the regular `switch` statement (was: Use `Switch Expressions` to replace the regular `switch` statement) > Use enhanced `switch` expressions to replace the regular `switch` statement > --- > > Key: SPARK-45515 > URL: https://issues.apache.org/jira/browse/SPARK-45515 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > > refer to [JEP 361|https://openjdk.org/jeps/361] > > Example: > {code:java} > double getPrice(String fruit) { > switch (fruit) { > case "Apple": > return 1.0; > case "Orange": > return 1.5; > case "Mango": > return 2.0; > default: > throw new IllegalArgumentException(); > } > } {code} > Can be changed to > {code:java} > double getPrice(String fruit) { > return switch (fruit) { > case "Apple" -> 1.0; > case "Orange" -> 1.5; > case "Mango" -> 2.0; > default -> throw new IllegalArgumentException(); > }; > } {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45515) Use `Switch Expressions` to replace the regular `switch` statement
[ https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-45515: -- Assignee: (was: Apache Spark) > Use `Switch Expressions` to replace the regular `switch` statement > -- > > Key: SPARK-45515 > URL: https://issues.apache.org/jira/browse/SPARK-45515 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > > refer to [JEP 361|https://openjdk.org/jeps/361] > > Example: > {code:java} > double getPrice(String fruit) { > switch (fruit) { > case "Apple": > return 1.0; > case "Orange": > return 1.5; > case "Mango": > return 2.0; > default: > throw new IllegalArgumentException(); > } > } {code} > Can be changed to > {code:java} > double getPrice(String fruit) { > return switch (fruit) { > case "Apple" -> 1.0; > case "Orange" -> 1.5; > case "Mango" -> 2.0; > default -> throw new IllegalArgumentException(); > }; > } {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45515) Use `Switch Expressions` to replace the regular `switch` statement
[ https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-45515: -- Assignee: Apache Spark > Use `Switch Expressions` to replace the regular `switch` statement > -- > > Key: SPARK-45515 > URL: https://issues.apache.org/jira/browse/SPARK-45515 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Apache Spark >Priority: Major > > refer to [JEP 361|https://openjdk.org/jeps/361] > > Example: > {code:java} > double getPrice(String fruit) { > switch (fruit) { > case "Apple": > return 1.0; > case "Orange": > return 1.5; > case "Mango": > return 2.0; > default: > throw new IllegalArgumentException(); > } > } {code} > Can be changed to > {code:java} > double getPrice(String fruit) { > return switch (fruit) { > case "Apple" -> 1.0; > case "Orange" -> 1.5; > case "Mango" -> 2.0; > default -> throw new IllegalArgumentException(); > }; > } {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45515) Use `Switch Expressions` to replace the regular `switch` statement
[ https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-45515: - Description: refer to [JEP 361|https://openjdk.org/jeps/361] Example: {code:java} double getPrice(String fruit) { switch (fruit) { case "Apple": return 1.0; case "Orange": return 1.5; case "Mango": return 2.0; default: throw new IllegalArgumentException(); } } {code} Can be changed to {code:java} double getPrice(String fruit) { return switch (fruit) { case "Apple" -> 1.0; case "Orange" -> 1.5; case "Mango" -> 2.0; default -> throw new IllegalArgumentException(); }; } {code} was: refer to [JEP 361|https://openjdk.org/jeps/361] Example: ```java double getPrice(String fruit) { // Switch statement can be replaced with enhanced 'switch' switch (fruit) { case "Apple": return 1.0; case "Orange": return 1.5; case "Mango": return 2.0; default: throw new IllegalArgumentException(); } } ``` Can be changed to ```java double getPrice(String fruit) { return switch (fruit) { case "Apple" -> 1.0; case "Orange" -> 1.5; case "Mango" -> 2.0; default -> throw new IllegalArgumentException(); }; } ``` > Use `Switch Expressions` to replace the regular `switch` statement > -- > > Key: SPARK-45515 > URL: https://issues.apache.org/jira/browse/SPARK-45515 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > > refer to [JEP 361|https://openjdk.org/jeps/361] > > Example: > {code:java} > double getPrice(String fruit) { > switch (fruit) { > case "Apple": > return 1.0; > case "Orange": > return 1.5; > case "Mango": > return 2.0; > default: > throw new IllegalArgumentException(); > } > } {code} > Can be changed to > {code:java} > double getPrice(String fruit) { > return switch (fruit) { > case "Apple" -> 1.0; > case "Orange" -> 1.5; > case "Mango" -> 2.0; > default -> throw new IllegalArgumentException(); > }; > } {code} > -- This message was sent by Atlassian Jira 
(v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45515) Use `Switch Expressions` to replace the regular `switch` statement
[ https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-45515: -- Assignee: (was: Apache Spark) > Use `Switch Expressions` to replace the regular `switch` statement > -- > > Key: SPARK-45515 > URL: https://issues.apache.org/jira/browse/SPARK-45515 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > > refer to [JEP 361|https://openjdk.org/jeps/361] > > Example: > ```java > double getPrice(String fruit) { > // Switch statement can be replaced with enhanced 'switch' > switch (fruit) { > case "Apple": > return 1.0; > case "Orange": > return 1.5; > case "Mango": > return 2.0; > default: > throw new IllegalArgumentException(); > } > } > ``` > Can be changed to > ```java > double getPrice(String fruit) { > return switch (fruit) { > case "Apple" -> 1.0; > case "Orange" -> 1.5; > case "Mango" -> 2.0; > default -> throw new IllegalArgumentException(); > }; > } > ``` -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45515) Use `Switch Expressions` to replace the regular `switch` statement
[ https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-45515: -- Assignee: Apache Spark > Use `Switch Expressions` to replace the regular `switch` statement > -- > > Key: SPARK-45515 > URL: https://issues.apache.org/jira/browse/SPARK-45515 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Apache Spark >Priority: Major > > refer to [JEP 361|https://openjdk.org/jeps/361] > > Example: > ```java > double getPrice(String fruit) { > // Switch statement can be replaced with enhanced 'switch' > switch (fruit) { > case "Apple": > return 1.0; > case "Orange": > return 1.5; > case "Mango": > return 2.0; > default: > throw new IllegalArgumentException(); > } > } > ``` > Can be changed to > ```java > double getPrice(String fruit) { > return switch (fruit) { > case "Apple" -> 1.0; > case "Orange" -> 1.5; > case "Mango" -> 2.0; > default -> throw new IllegalArgumentException(); > }; > } > ``` -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-45515) Use `Switch Expressions` to replace the regular `switch` statement
[ https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17774382#comment-17774382 ] ASF GitHub Bot commented on SPARK-45515: User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/43349 > Use `Switch Expressions` to replace the regular `switch` statement > -- > > Key: SPARK-45515 > URL: https://issues.apache.org/jira/browse/SPARK-45515 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > > refer to [JEP 361|https://openjdk.org/jeps/361] > > Example: > ```java > double getPrice(String fruit) { > // Switch statement can be replaced with enhanced 'switch' > switch (fruit) { > case "Apple": > return 1.0; > case "Orange": > return 1.5; > case "Mango": > return 2.0; > default: > throw new IllegalArgumentException(); > } > } > ``` > Can be changed to > ```java > double getPrice(String fruit) { > return switch (fruit) { > case "Apple" -> 1.0; > case "Orange" -> 1.5; > case "Mango" -> 2.0; > default -> throw new IllegalArgumentException(); > }; > } > ``` -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45513) Replace `scala.runtime.Tuple2Zipped` to `scala.collection.LazyZip2`
[ https://issues.apache.org/jira/browse/SPARK-45513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45513: --- Labels: pull-request-available (was: ) > Replace `scala.runtime.Tuple2Zipped` to `scala.collection.LazyZip2` > --- > > Key: SPARK-45513 > URL: https://issues.apache.org/jira/browse/SPARK-45513 > Project: Spark > Issue Type: Sub-task > Components: Connect, MLlib, Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Jiaan Geng >Assignee: Jiaan Geng >Priority: Major > Labels: pull-request-available > > {code:java} > @scala.deprecated(message = "Use scala.collection.LazyZip2.", since = > "2.13.0") > final class Tuple2Zipped[El1, It1 <: scala.Iterable[El1], El2, It2 <: > scala.Iterable[El2]](colls : scala.Tuple2[It1, It2]) extends scala.AnyVal > with scala.runtime.ZippedIterable2[El1, El2] { > @scala.deprecated(message = "Use scala.collection.LazyZip2.", since = > "2.13.0") > object Tuple2Zipped extends scala.AnyRef { > final class Ops[T1, T2](x : scala.Tuple2[T1, T2]) extends scala.AnyVal { > @scala.deprecated(message = "Use xs.lazyZip(yz).map((_, _))", since = > "2.13.0") > def invert[El1, It1[a] <: scala.Iterable[a], El2, It2[a] <: > scala.Iterable[a], That](implicit w1 : scala.<:<[T1, It1[El1]], w2 : > scala.<:<[T2, It2[El2]], bf : scala.collection.BuildFrom[T1, > scala.Tuple2[El1, El2], That]) : That = { /* compiled code */ } > @scala.deprecated(message = "Use xs.lazyZip(ys)", since = "2.13.0") > def zipped[El1, It1 <: scala.Iterable[El1], El2, It2 <: > scala.Iterable[El2]](implicit w1 : scala.Function1[T1, > scala.collection.IterableOps[El1, scala.Iterable, It1] with It1], w2 : > scala.Function1[T2, scala.collection.IterableOps[El2, scala.Iterable, It2] > with It2]) : scala.runtime.Tuple2Zipped[El1, It1, El2, It2] = { /* compiled > code */ } > } > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: 
issues-h...@spark.apache.org
[jira] [Created] (SPARK-45515) Use `Switch Expressions` to replace the regular `switch` statement
Yang Jie created SPARK-45515: Summary: Use `Switch Expressions` to replace the regular `switch` statement Key: SPARK-45515 URL: https://issues.apache.org/jira/browse/SPARK-45515 Project: Spark Issue Type: Sub-task Components: Spark Core, SQL Affects Versions: 4.0.0 Reporter: Yang Jie refer to [JEP 361|https://openjdk.org/jeps/361] Example: ```java double getPrice(String fruit) { // Switch statement can be replaced with enhanced 'switch' switch (fruit) { case "Apple": return 1.0; case "Orange": return 1.5; case "Mango": return 2.0; default: throw new IllegalArgumentException(); } } ``` Can be changed to ```java double getPrice(String fruit) { return switch (fruit) { case "Apple" -> 1.0; case "Orange" -> 1.5; case "Mango" -> 2.0; default -> throw new IllegalArgumentException(); }; } ``` -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-44752) XML: Update Spark Docs
[ https://issues.apache.org/jira/browse/SPARK-44752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-44752: --- Labels: pull-request-available (was: ) > XML: Update Spark Docs > -- > > Key: SPARK-44752 > URL: https://issues.apache.org/jira/browse/SPARK-44752 > Project: Spark > Issue Type: Sub-task > Components: Documentation >Affects Versions: 4.0.0 >Reporter: Sandip Agarwala >Priority: Major > Labels: pull-request-available > > [https://spark.apache.org/docs/latest/sql-data-sources.html] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45510) Replace `scala.collection.generic.Growable` to `scala.collection.mutable.Growable`
[ https://issues.apache.org/jira/browse/SPARK-45510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie resolved SPARK-45510. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43347 [https://github.com/apache/spark/pull/43347] > Replace `scala.collection.generic.Growable` to > `scala.collection.mutable.Growable` > -- > > Key: SPARK-45510 > URL: https://issues.apache.org/jira/browse/SPARK-45510 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Jia Fan >Assignee: Jia Fan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Replace `scala.collection.generic.Growable` to > `scala.collection.mutable.Growable` -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45510) Replace `scala.collection.generic.Growable` to `scala.collection.mutable.Growable`
[ https://issues.apache.org/jira/browse/SPARK-45510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie reassigned SPARK-45510: Assignee: Jia Fan > Replace `scala.collection.generic.Growable` to > `scala.collection.mutable.Growable` > -- > > Key: SPARK-45510 > URL: https://issues.apache.org/jira/browse/SPARK-45510 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Jia Fan >Assignee: Jia Fan >Priority: Major > Labels: pull-request-available > > Replace `scala.collection.generic.Growable` to > `scala.collection.mutable.Growable`
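The Growable migration above is mechanical: in Scala 2.13 the trait moved from `scala.collection.generic.Growable` to `scala.collection.mutable.Growable`, and its contract (`addOne`, with `+=`/`++=` as aliases, plus `clear()`) is unchanged. A minimal sketch with a hypothetical accumulator class (not taken from the Spark patch):

```scala
import scala.collection.mutable
import scala.collection.mutable.Growable  // 2.13 location (was scala.collection.generic.Growable)

// Hypothetical accumulator illustrating the trait's unchanged contract:
// implement abstract addOne and clear(); += and ++= come for free.
class StringAccumulator extends Growable[String] {
  private val buf = mutable.ArrayBuffer.empty[String]
  def addOne(elem: String): this.type = { buf += elem; this }
  def clear(): Unit = buf.clear()
  def result: Seq[String] = buf.toSeq
}

val acc = new StringAccumulator
acc += "a"
acc ++= Seq("b", "c")
// acc.result == Seq("a", "b", "c")
```

Code that only consumed the trait typically needs nothing beyond the import change.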
[jira] [Created] (SPARK-45513) Replace `scala.runtime.Tuple2Zipped` to `scala.collection.LazyZip2`
Jiaan Geng created SPARK-45513: -- Summary: Replace `scala.runtime.Tuple2Zipped` to `scala.collection.LazyZip2` Key: SPARK-45513 URL: https://issues.apache.org/jira/browse/SPARK-45513 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 4.0.0 Reporter: Jiaan Geng Assignee: Jiaan Geng
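For context on the replacement above: in Scala 2.13 the `Tuple2Zipped` API (the 2.12-era `(xs, ys).zipped`) was deprecated in favor of `lazyZip`, which returns a `scala.collection.LazyZip2` and likewise operates element-wise without building an intermediate collection of tuples. A minimal sketch:

```scala
val prices = Seq(1.0, 1.5, 2.0)
val counts = Seq(2, 1, 3)

// Deprecated 2.12 spelling: (prices, counts).zipped.map(_ * _)
// 2.13 replacement: lazyZip pairs elements lazily, so no
// intermediate Seq[(Double, Int)] is allocated before the map.
val totals = prices.lazyZip(counts).map(_ * _)
// totals == Seq(2.0, 1.5, 6.0)
```

The migration is usually a one-line rewrite per call site, which is why it is tracked as a sub-task of the 2.13 cleanup.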
[jira] [Created] (SPARK-45512) Fix compilation warnings related to other-nullary-override #43332
Yang Jie created SPARK-45512: Summary: Fix compilation warnings related to other-nullary-override #43332 Key: SPARK-45512 URL: https://issues.apache.org/jira/browse/SPARK-45512 Project: Spark Issue Type: Sub-task Components: DStreams, Spark Core, SQL Affects Versions: 4.0.0 Reporter: Yang Jie

{code:java}
[error] /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/CloseableIterator.scala:36:16: method with a single empty parameter list overrides method hasNext in trait Iterator defined without a parameter list [quickfixable]
[error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=other-nullary-override, site=org.apache.spark.sql.connect.client.WrappedCloseableIterator
[error] override def hasNext(): Boolean = innerIterator.hasNext
[error]              ^
[error] /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/ExecutePlanResponseReattachableIterator.scala:136:16: method without a parameter list overrides method hasNext in class WrappedCloseableIterator defined with a single empty parameter list [quickfixable]
[error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=other-nullary-override, site=org.apache.spark.sql.connect.client.ExecutePlanResponseReattachableIterator
[error] override def hasNext: Boolean = synchronized {
[error]              ^
[error] /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala:73:20: method without a parameter list overrides method hasNext in class WrappedCloseableIterator defined with a single empty parameter list [quickfixable]
[error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=other-nullary-override, site=org.apache.spark.sql.connect.client.GrpcExceptionConverter.convertIterator
[error] override def hasNext: Boolean = {
[error]              ^
[error] /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcRetryHandler.scala:77:18: method without a parameter list overrides method next in class WrappedCloseableIterator defined with a single empty parameter list [quickfixable]
[error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=other-nullary-override, site=org.apache.spark.sql.connect.client.GrpcRetryHandler.RetryIterator
[error] override def next: U = {
[error]          ^
[error] /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcRetryHandler.scala:81:18: method without a parameter list overrides method hasNext in class WrappedCloseableIterator defined with a single empty parameter list [quickfixable]
[error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=other-nullary-override, site=org.apache.spark.sql.connect.client.GrpcRetryHandler.RetryIterator
[error] override def hasNext: Boolean = {
[error]              ^
{code}
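The warnings above all stem from mixing `def hasNext()` and `def hasNext` across an override chain: Scala 2.13's `other-nullary-override` check requires an override to match the arity of the member it overrides. A minimal sketch of the fix, using a hypothetical wrapper (mirroring the pattern, not Spark's actual `WrappedCloseableIterator`):

```scala
// scala.collection.Iterator declares `hasNext` WITHOUT a parameter
// list and `next()` WITH an empty one, so overrides must match.
class WrappedIterator[A](inner: Iterator[A]) extends Iterator[A] {
  // Writing `override def hasNext(): Boolean` here would trigger the
  // other-nullary-override warning; drop the parens to match Iterator.
  override def hasNext: Boolean = inner.hasNext
  // next() is declared with an empty parameter list, so keep the parens.
  override def next(): A = inner.next()
}

val it = new WrappedIterator(Iterator(1, 2, 3))
// it.toList == List(1, 2, 3)
```

Once the base class uses the same arity as `Iterator`, the downstream overrides (`RetryIterator`, `convertIterator`, etc. in the log) stop mismatching it as well.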
[jira] [Updated] (SPARK-45512) Fix compilation warnings related to other-nullary-override
[ https://issues.apache.org/jira/browse/SPARK-45512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-45512: - Summary: Fix compilation warnings related to other-nullary-override (was: Fix compilation warnings related to other-nullary-override #43332) > Fix compilation warnings related to other-nullary-override > -- > > Key: SPARK-45512 > URL: https://issues.apache.org/jira/browse/SPARK-45512 > Project: Spark > Issue Type: Sub-task > Components: DStreams, Spark Core, SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > > {code:java} > [error] > /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/CloseableIterator.scala:36:16: > method with a single empty parameter list overrides method hasNext in trait > Iterator defined without a parameter list [quickfixable] > [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg= of the message>, cat=other-nullary-override, > site=org.apache.spark.sql.connect.client.WrappedCloseableIterator > [error] override def hasNext(): Boolean = innerIterator.hasNext > [error] ^ > [error] > /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/ExecutePlanResponseReattachableIterator.scala:136:16: > method without a parameter list overrides method hasNext in class > WrappedCloseableIterator defined with a single empty parameter list > [quickfixable] > [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg= of the message>, cat=other-nullary-override, > site=org.apache.spark.sql.connect.client.ExecutePlanResponseReattachableIterator > [error] override def hasNext: Boolean = synchronized { > [error] ^ > [error] > /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala:73:20: > method without a parameter list overrides 
method hasNext in class > WrappedCloseableIterator defined with a single empty parameter list > [quickfixable] > [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg= of the message>, cat=other-nullary-override, > site=org.apache.spark.sql.connect.client.GrpcExceptionConverter.convertIterator > [error] override def hasNext: Boolean = { > [error] ^ > [error] > /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcRetryHandler.scala:77:18: > method without a parameter list overrides method next in class > WrappedCloseableIterator defined with a single empty parameter list > [quickfixable] > [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg= of the message>, cat=other-nullary-override, > site=org.apache.spark.sql.connect.client.GrpcRetryHandler.RetryIterator > [error] override def next: U = { > [error] ^ > [error] > /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcRetryHandler.scala:81:18: > method without a parameter list overrides method hasNext in class > WrappedCloseableIterator defined with a single empty parameter list > [quickfixable] > [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg= of the message>, cat=other-nullary-override, > site=org.apache.spark.sql.connect.client.GrpcRetryHandler.RetryIterator > [error] override def hasNext: Boolean = { > [error] ^ > {code}
[jira] [Updated] (SPARK-45511) SPIP: State Data Source - Reader
[ https://issues.apache.org/jira/browse/SPARK-45511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-45511: - Description: State Store has been a black box from the introduction of the stateful operator. It has been the “internal” data to the streaming query, and Spark does not expose the data outside of the streaming query. There is no feature/tool for users to read and modify the content of state stores. Specific to the ability to read the state, the lack of feature brings up various limitations like following: * Users are unable to see the content in the state store, leading to inability to debug. * Users have to perform some indirect approach on verifying the content of the state store in unit tests. The only option they can take is relying on the output of the query. Given that, we propose to introduce a feature which enables users to read the state from the outside of the streaming query. SPIP: [https://docs.google.com/document/d/1_iVf_CIu2RZd3yWWF6KoRNlBiz5NbSIK0yThqG0EvPY/edit?usp=sharing] was: State Store has been a black box from the introduction of the stateful operator. It has been the “internal” data to the streaming query, and Spark does not expose the data outside of the streaming query. There is no feature/tool for users to read and modify the content of state stores. Specific to the ability to read the state, the lack of feature brings up various limitations like following: * Users are unable to see the content in the state store, leading to inability to debug. * Users have to perform some indirect approach on verifying the content of the state store in unit tests. The only option they can take is relying on the output of the query. Given that, we propose to introduce a feature which enables users to read the state from the outside of the streaming query. 
SPIP: [https://docs.google.com/document/d/1HjEupRv8TRFeULtJuxRq_tEG1Wq-9UNu-ctGgCYRke0/edit?usp=sharing] > SPIP: State Data Source - Reader > > > Key: SPARK-45511 > URL: https://issues.apache.org/jira/browse/SPARK-45511 > Project: Spark > Issue Type: New Feature > Components: Structured Streaming >Affects Versions: 4.0.0 >Reporter: Jungtaek Lim >Priority: Major > Labels: SPIP > > State Store has been a black box from the introduction of the stateful > operator. It has been the “internal” data to the streaming query, and Spark > does not expose the data outside of the streaming query. There is no > feature/tool for users to read and modify the content of state stores. > Specific to the ability to read the state, the lack of feature brings up > various limitations like following: > * Users are unable to see the content in the state store, leading to > inability to debug. > * Users have to perform some indirect approach on verifying the content of > the state store in unit tests. The only option they can take is relying on > the output of the query. > Given that, we propose to introduce a feature which enables users to read the > state from the outside of the streaming query. > SPIP: > [https://docs.google.com/document/d/1_iVf_CIu2RZd3yWWF6KoRNlBiz5NbSIK0yThqG0EvPY/edit?usp=sharing] > >
[jira] [Resolved] (SPARK-45488) XML: Add support for value in 'rowTag' element
[ https://issues.apache.org/jira/browse/SPARK-45488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-45488. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43319 [https://github.com/apache/spark/pull/43319] > XML: Add support for value in 'rowTag' element > -- > > Key: SPARK-45488 > URL: https://issues.apache.org/jira/browse/SPARK-45488 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Sandip Agarwala >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > The following XML with rowTag 'book' will yield a schema with just "_id" > column and not the value: > > {code:java} > Great Book{code} > Let's parse value as well. >
[jira] [Created] (SPARK-45511) SPIP: State Data Source - Reader
Jungtaek Lim created SPARK-45511: Summary: SPIP: State Data Source - Reader Key: SPARK-45511 URL: https://issues.apache.org/jira/browse/SPARK-45511 Project: Spark Issue Type: New Feature Components: Structured Streaming Affects Versions: 4.0.0 Reporter: Jungtaek Lim State Store has been a black box from the introduction of the stateful operator. It has been the “internal” data to the streaming query, and Spark does not expose the data outside of the streaming query. There is no feature/tool for users to read and modify the content of state stores. Specific to the ability to read the state, the lack of feature brings up various limitations like following: * Users are unable to see the content in the state store, leading to inability to debug. * Users have to perform some indirect approach on verifying the content of the state store in unit tests. The only option they can take is relying on the output of the query. Given that, we propose to introduce a feature which enables users to read the state from the outside of the streaming query. SPIP: [https://docs.google.com/document/d/1HjEupRv8TRFeULtJuxRq_tEG1Wq-9UNu-ctGgCYRke0/edit?usp=sharing]
[jira] [Updated] (SPARK-44262) JdbcUtils hardcodes some SQL statements
[ https://issues.apache.org/jira/browse/SPARK-44262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-44262: --- Labels: pull-request-available (was: ) > JdbcUtils hardcodes some SQL statements > --- > > Key: SPARK-44262 > URL: https://issues.apache.org/jira/browse/SPARK-44262 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.2.0 >Reporter: Florent BIVILLE >Priority: Minor > Labels: pull-request-available > > I am currently investigating an integration with the [Neo4j JDBC > driver|https://github.com/neo4j-contrib/neo4j-jdbc] and a Spark-based cloud > vendor SDK. > > This SDK relies on Spark's {{JdbcUtils}} to run queries and insert data. > While {{JdbcUtils}} partly delegates to > {{org.apache.spark.sql.jdbc.JdbcDialect}} for some queries, some others are > hardcoded to SQL, see: > * {{org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils#dropTable}} > * > {{org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils#getInsertStatement}} > > This works fine for relational databases but breaks for NOSQL stores that do > not support SQL translation (like Neo4j). > Is there a plan to augment the {{JdbcDialect}} surface so that it is also > responsible for these currently-hardcoded queries?