[jira] [Resolved] (SPARK-45508) Add "--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED" so Platform can access cleaner on Java 9+

2023-10-12 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie resolved SPARK-45508.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43344
[https://github.com/apache/spark/pull/43344]

> Add "--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED" so Platform can 
> access cleaner on Java 9+
> --
>
> Key: SPARK-45508
> URL: https://issues.apache.org/jira/browse/SPARK-45508
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.5.0
>Reporter: Josh Rosen
>Assignee: Josh Rosen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> We need to add `--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED` to our 
> JVM options so that the code in `org.apache.spark.unsafe.Platform` can access 
> the JDK internal cleaner classes.
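For context, a minimal Scala sketch of the kind of reflective lookup involved (it mirrors the style of access Platform performs, but the snippet is illustrative, not the actual Spark code):

{code:scala}
// On Java 9+, the jdk.internal.ref package is not opened to the unnamed
// module, so this reflective access fails (with InaccessibleObjectException)
// unless the JVM is started with
// --add-opens=java.base/jdk.internal.ref=ALL-UNNAMED.
val cleanerClass = Class.forName("jdk.internal.ref.Cleaner")
val createMethod = cleanerClass.getMethod("create", classOf[Object], classOf[Runnable])
createMethod.setAccessible(true) // throws without the --add-opens flag
{code}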






[jira] [Assigned] (SPARK-45508) Add "--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED" so Platform can access cleaner on Java 9+

2023-10-12 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie reassigned SPARK-45508:


Assignee: Josh Rosen

> Add "--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED" so Platform can 
> access cleaner on Java 9+
> --
>
> Key: SPARK-45508
> URL: https://issues.apache.org/jira/browse/SPARK-45508
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.5.0
>Reporter: Josh Rosen
>Assignee: Josh Rosen
>Priority: Major
>  Labels: pull-request-available
>
> We need to add `--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED` to our 
> JVM options so that the code in `org.apache.spark.unsafe.Platform` can access 
> the JDK internal cleaner classes.






[jira] [Assigned] (SPARK-45526) Refine docstring of `options` for dataframe reader and writer

2023-10-12 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-45526:


Assignee: Hyukjin Kwon  (was: Allison Wang)

> Refine docstring of `options` for dataframe reader and writer
> -
>
> Key: SPARK-45526
> URL: https://issues.apache.org/jira/browse/SPARK-45526
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, PySpark
>Affects Versions: 4.0.0
>Reporter: Allison Wang
>Assignee: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Refine the docstring of the `options` method of DataFrameReader and 
> DataFrameWriter.






[jira] [Assigned] (SPARK-45526) Refine docstring of `options` for dataframe reader and writer

2023-10-12 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-45526:


Assignee: Allison Wang

> Refine docstring of `options` for dataframe reader and writer
> -
>
> Key: SPARK-45526
> URL: https://issues.apache.org/jira/browse/SPARK-45526
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, PySpark
>Affects Versions: 4.0.0
>Reporter: Allison Wang
>Assignee: Allison Wang
>Priority: Major
>  Labels: pull-request-available
>
> Refine the docstring of the `options` method of DataFrameReader and 
> DataFrameWriter.






[jira] [Resolved] (SPARK-45526) Refine docstring of `options` for dataframe reader and writer

2023-10-12 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-45526.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43358
[https://github.com/apache/spark/pull/43358]

> Refine docstring of `options` for dataframe reader and writer
> -
>
> Key: SPARK-45526
> URL: https://issues.apache.org/jira/browse/SPARK-45526
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, PySpark
>Affects Versions: 4.0.0
>Reporter: Allison Wang
>Assignee: Allison Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Refine the docstring of the `options` method of DataFrameReader and 
> DataFrameWriter.






[jira] [Updated] (SPARK-45530) Use `java.lang.ref.Cleaner` instead of `finalize` for `NioBufferedFileInputStream`

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45530:
---
Labels: pull-request-available  (was: )

> Use `java.lang.ref.Cleaner` instead of `finalize` for 
> `NioBufferedFileInputStream`
> --
>
> Key: SPARK-45530
> URL: https://issues.apache.org/jira/browse/SPARK-45530
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Major
>  Labels: pull-request-available
>
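For illustration, a minimal Scala sketch of the Cleaner-based pattern this migration targets (the class and resource names here are hypothetical, not the actual NioBufferedFileInputStream code):

{code:scala}
import java.lang.ref.Cleaner

object CleanerHolder {
  // A single shared Cleaner; it owns a daemon thread that runs registered
  // cleanup actions once their target objects become phantom-reachable.
  val CLEANER: Cleaner = Cleaner.create()
}

final class TrackedResource extends AutoCloseable {
  // The cleanup action must not capture `this`, otherwise the target object
  // stays reachable and the action never runs.
  private val cleanable = CleanerHolder.CLEANER.register(this, () => {
    // release the underlying native resource here
  })

  // Explicit close() remains the primary path; the Cleaner is only a safety
  // net for callers that forget, replacing the deprecated finalize().
  override def close(): Unit = cleanable.clean()
}
{code}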







[jira] [Updated] (SPARK-45532) Restore codetabs for the Protobuf Data Source Guide

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45532:
---
Labels: pull-request-available  (was: )

> Restore codetabs for the Protobuf Data Source Guide
> ---
>
> Key: SPARK-45532
> URL: https://issues.apache.org/jira/browse/SPARK-45532
> Project: Spark
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 4.0.0
>Reporter: Kent Yao
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Updated] (SPARK-31357) DataSourceV2: Catalog API for view metadata

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-31357:
---
Labels: SPIP pull-request-available  (was: SPIP)

> DataSourceV2: Catalog API for view metadata
> ---
>
> Key: SPARK-31357
> URL: https://issues.apache.org/jira/browse/SPARK-31357
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: John Zhuge
>Priority: Major
>  Labels: SPIP, pull-request-available
>
> SPARK-24252 added a catalog plugin system and `TableCatalog` API that 
> provided table metadata to Spark. This JIRA adds `ViewCatalog` API for view 
> metadata.
> Details in [SPIP 
> document|https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing].
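To make the shape of the proposal concrete, a rough Scala sketch of what a view catalog plugin could look like (the method set is extrapolated from the TableCatalog analogy; names and signatures are illustrative, not the final API from the SPIP):

{code:scala}
import org.apache.spark.sql.connector.catalog.{CatalogPlugin, Identifier}
import org.apache.spark.sql.types.StructType

// Hypothetical view-metadata holder for this sketch: the view's SQL text,
// its schema, and free-form properties.
case class View(sql: String, schema: StructType, properties: Map[String, String])

// Hypothetical plugin trait, by analogy with TableCatalog.
trait ViewCatalog extends CatalogPlugin {
  def listViews(namespace: Array[String]): Array[Identifier]
  def loadView(ident: Identifier): View
  def createView(ident: Identifier, view: View): View
  def dropView(ident: Identifier): Boolean
}
{code}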






[jira] [Created] (SPARK-45534) Use `java.lang.ref.Cleaner` instead of `finalize` for `RemoteBlockPushResolver`

2023-10-12 Thread Yang Jie (Jira)
Yang Jie created SPARK-45534:


 Summary: Use `java.lang.ref.Cleaner` instead of `finalize` for 
`RemoteBlockPushResolver`
 Key: SPARK-45534
 URL: https://issues.apache.org/jira/browse/SPARK-45534
 Project: Spark
  Issue Type: Sub-task
  Components: Spark Core
Affects Versions: 4.0.0
Reporter: Yang Jie









[jira] [Created] (SPARK-45533) Use `j.l.r.Cleaner` instead of `finalize` for `RocksDBIterator/LevelDBIterator`

2023-10-12 Thread Yang Jie (Jira)
Yang Jie created SPARK-45533:


 Summary: Use `j.l.r.Cleaner` instead of `finalize` for 
`RocksDBIterator/LevelDBIterator`
 Key: SPARK-45533
 URL: https://issues.apache.org/jira/browse/SPARK-45533
 Project: Spark
  Issue Type: Sub-task
  Components: Spark Core
Affects Versions: 4.0.0
Reporter: Yang Jie









[jira] [Created] (SPARK-45532) Restore codetabs for the Protobuf Data Source Guide

2023-10-12 Thread Kent Yao (Jira)
Kent Yao created SPARK-45532:


 Summary: Restore codetabs for the Protobuf Data Source Guide
 Key: SPARK-45532
 URL: https://issues.apache.org/jira/browse/SPARK-45532
 Project: Spark
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 4.0.0
Reporter: Kent Yao









[jira] [Updated] (SPARK-45524) Initial support for Python data source read API

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45524:
---
Labels: pull-request-available  (was: )

> Initial support for Python data source read API
> ---
>
> Key: SPARK-45524
> URL: https://issues.apache.org/jira/browse/SPARK-45524
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 4.0.0
>Reporter: Allison Wang
>Priority: Major
>  Labels: pull-request-available
>
> Support Python data source API for reading data.






[jira] [Updated] (SPARK-45531) Add more comments and rename some variable name for InjectRuntimeFilter

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45531:
---
Labels: pull-request-available  (was: )

> Add more comments and rename some variable name for InjectRuntimeFilter
> ---
>
> Key: SPARK-45531
> URL: https://issues.apache.org/jira/browse/SPARK-45531
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Jiaan Geng
>Assignee: Jiaan Geng
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Created] (SPARK-45531) Add more comments and rename some variable name for InjectRuntimeFilter

2023-10-12 Thread Jiaan Geng (Jira)
Jiaan Geng created SPARK-45531:
--

 Summary: Add more comments and rename some variable name for 
InjectRuntimeFilter
 Key: SPARK-45531
 URL: https://issues.apache.org/jira/browse/SPARK-45531
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 4.0.0
Reporter: Jiaan Geng
Assignee: Jiaan Geng









[jira] [Created] (SPARK-45530) Use `java.lang.ref.Cleaner` instead of `finalize` for `NioBufferedFileInputStream`

2023-10-12 Thread Yang Jie (Jira)
Yang Jie created SPARK-45530:


 Summary: Use `java.lang.ref.Cleaner` instead of `finalize` for 
`NioBufferedFileInputStream`
 Key: SPARK-45530
 URL: https://issues.apache.org/jira/browse/SPARK-45530
 Project: Spark
  Issue Type: Sub-task
  Components: Spark Core
Affects Versions: 4.0.0
Reporter: Yang Jie









[jira] [Resolved] (SPARK-45498) Followup: Ignore task completion from old stage after retrying indeterminate stages

2023-10-12 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-45498.
-
Fix Version/s: 3.5.1
   4.0.0
   Resolution: Fixed

Issue resolved by pull request 43326
[https://github.com/apache/spark/pull/43326]

> Followup: Ignore task completion from old stage after retrying indeterminate 
> stages
> ---
>
> Key: SPARK-45498
> URL: https://issues.apache.org/jira/browse/SPARK-45498
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 4.0.0, 3.5.1
>Reporter: Mayur Bhosale
>Assignee: Mayur Bhosale
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.5.1, 4.0.0
>
>
> SPARK-45182 added a fix that stops laggard tasks from older attempts of an
> indeterminate stage from marking their partition as completed in the map
> output tracker.
> When a task completes, the DAG scheduler also notifies all task sets of the
> stage that the partition is completed, and a task set will not schedule such
> a task if it has not already been scheduled. This is not correct for an
> indeterminate stage, since we want to re-run all of its tasks on re-attempt.






[jira] [Assigned] (SPARK-45498) Followup: Ignore task completion from old stage after retrying indeterminate stages

2023-10-12 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-45498:
---

Assignee: Mayur Bhosale

> Followup: Ignore task completion from old stage after retrying indeterminate 
> stages
> ---
>
> Key: SPARK-45498
> URL: https://issues.apache.org/jira/browse/SPARK-45498
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 4.0.0, 3.5.1
>Reporter: Mayur Bhosale
>Assignee: Mayur Bhosale
>Priority: Minor
>  Labels: pull-request-available
>
> SPARK-45182 added a fix that stops laggard tasks from older attempts of an
> indeterminate stage from marking their partition as completed in the map
> output tracker.
> When a task completes, the DAG scheduler also notifies all task sets of the
> stage that the partition is completed, and a task set will not schedule such
> a task if it has not already been scheduled. This is not correct for an
> indeterminate stage, since we want to re-run all of its tasks on re-attempt.






[jira] [Resolved] (SPARK-45528) Improve the example of DataFrameReader/Writer.options to take a dictionary

2023-10-12 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-45528.
--
Resolution: Duplicate

> Improve the example of DataFrameReader/Writer.options to take a dictionary
> --
>
> Key: SPARK-45528
> URL: https://issues.apache.org/jira/browse/SPARK-45528
> Project: Spark
>  Issue Type: Documentation
>  Components: Documentation, PySpark
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
>
> for example, spark.read.options(**dictionary)






[jira] [Created] (SPARK-45529) Fix flaky KafkaSourceStressSuite

2023-10-12 Thread Deng Ziming (Jira)
Deng Ziming created SPARK-45529:
---

 Summary: Fix flaky KafkaSourceStressSuite
 Key: SPARK-45529
 URL: https://issues.apache.org/jira/browse/SPARK-45529
 Project: Spark
  Issue Type: Improvement
  Components: Structured Streaming
Affects Versions: 4.0.0
Reporter: Deng Ziming


test("stress test with multiple topics and partitions") in 
KafkaSourceStressSuite is flaky, when we increase the `iterations` from 50 to 
100, it will always fail locally.






[jira] [Updated] (SPARK-45528) Improve the example of DataFrameReader/Writer.options to take a dictionary

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45528:
---
Labels: pull-request-available  (was: )

> Improve the example of DataFrameReader/Writer.options to take a dictionary
> --
>
> Key: SPARK-45528
> URL: https://issues.apache.org/jira/browse/SPARK-45528
> Project: Spark
>  Issue Type: Documentation
>  Components: Documentation, PySpark
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
>
> for example, spark.read.options(**dictionary)






[jira] [Updated] (SPARK-45526) Refine docstring of `options` for dataframe reader and writer

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45526:
---
Labels: pull-request-available  (was: )

> Refine docstring of `options` for dataframe reader and writer
> -
>
> Key: SPARK-45526
> URL: https://issues.apache.org/jira/browse/SPARK-45526
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, PySpark
>Affects Versions: 4.0.0
>Reporter: Allison Wang
>Priority: Major
>  Labels: pull-request-available
>
> Refine the docstring of the `options` method of DataFrameReader and 
> DataFrameWriter.






[jira] [Created] (SPARK-45528) Improve the example of DataFrameReader/Writer.options to take a dictionary

2023-10-12 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-45528:


 Summary: Improve the example of DataFrameReader/Writer.options to 
take a dictionary
 Key: SPARK-45528
 URL: https://issues.apache.org/jira/browse/SPARK-45528
 Project: Spark
  Issue Type: Documentation
  Components: Documentation, PySpark
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon



for example, spark.read.options(**dictionary)







[jira] [Commented] (SPARK-45527) Task fraction resource request is not expected

2023-10-12 Thread wuyi (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-45527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774756#comment-17774756
 ] 

wuyi commented on SPARK-45527:
--

cc [~wbo4958]   [~tgraves] 

> Task fraction resource request is not expected
> --
>
> Key: SPARK-45527
> URL: https://issues.apache.org/jira/browse/SPARK-45527
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.2.1, 3.3.3, 3.4.1, 3.5.0
>Reporter: wuyi
>Priority: Major
>
>  
> {code:java}
> test("SPARK-XXX") {
>   import org.apache.spark.resource.{ResourceProfileBuilder, 
> TaskResourceRequests}
>   withTempDir { dir =>
> val scriptPath = createTempScriptWithExpectedOutput(dir, 
> "gpuDiscoveryScript",
>   """{"name": "gpu","addresses":["0"]}""")
> val conf = new SparkConf()
>   .setAppName("test")
>   .setMaster("local-cluster[1, 12, 1024]")
>   .set("spark.executor.cores", "12")
> conf.set(TASK_GPU_ID.amountConf, "0.08")
> conf.set(WORKER_GPU_ID.amountConf, "1")
> conf.set(WORKER_GPU_ID.discoveryScriptConf, scriptPath)
> conf.set(EXECUTOR_GPU_ID.amountConf, "1")
> sc = new SparkContext(conf)
> val rdd = sc.range(0, 100, 1, 4)
> var rdd1 = rdd.repartition(3)
> val treqs = new TaskResourceRequests().cpus(1).resource("gpu", 1.0)
> val rp = new ResourceProfileBuilder().require(treqs).build
> rdd1 = rdd1.withResources(rp)
> assert(rdd1.collect().size === 100)
>   }
> } {code}
> In the above test, the 3 tasks generated by rdd1 are expected to execute in
> sequence, since "new TaskResourceRequests().cpus(1).resource("gpu",
> 1.0)" should override "conf.set(TASK_GPU_ID.amountConf, "0.08")". In fact,
> however, those 3 tasks run in parallel.
> The root cause is that ExecutorData#ExecutorResourceInfo#numParts is static.
> In this case, "gpu.numParts" is initialized to 12 (1 / 0.08) and does not
> change even when a new task resource request arrives (e.g., resource("gpu",
> 1.0) here). Thus, those 3 tasks can execute in parallel.
>  






[jira] [Created] (SPARK-45527) Task fraction resource request is not expected

2023-10-12 Thread wuyi (Jira)
wuyi created SPARK-45527:


 Summary: Task fraction resource request is not expected
 Key: SPARK-45527
 URL: https://issues.apache.org/jira/browse/SPARK-45527
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 3.5.0, 3.4.1, 3.3.3, 3.2.1
Reporter: wuyi


 
{code:java}
test("SPARK-XXX") {
  import org.apache.spark.resource.{ResourceProfileBuilder, 
TaskResourceRequests}

  withTempDir { dir =>
val scriptPath = createTempScriptWithExpectedOutput(dir, 
"gpuDiscoveryScript",
  """{"name": "gpu","addresses":["0"]}""")

val conf = new SparkConf()
  .setAppName("test")
  .setMaster("local-cluster[1, 12, 1024]")
  .set("spark.executor.cores", "12")
conf.set(TASK_GPU_ID.amountConf, "0.08")
conf.set(WORKER_GPU_ID.amountConf, "1")
conf.set(WORKER_GPU_ID.discoveryScriptConf, scriptPath)
conf.set(EXECUTOR_GPU_ID.amountConf, "1")
sc = new SparkContext(conf)
val rdd = sc.range(0, 100, 1, 4)
var rdd1 = rdd.repartition(3)
val treqs = new TaskResourceRequests().cpus(1).resource("gpu", 1.0)
val rp = new ResourceProfileBuilder().require(treqs).build
rdd1 = rdd1.withResources(rp)
assert(rdd1.collect().size === 100)
  }
} {code}
In the above test, the 3 tasks generated by rdd1 are expected to execute in
sequence, since "new TaskResourceRequests().cpus(1).resource("gpu", 1.0)" should
override "conf.set(TASK_GPU_ID.amountConf, "0.08")". In fact, however, those 3
tasks run in parallel.

The root cause is that ExecutorData#ExecutorResourceInfo#numParts is static. In
this case, "gpu.numParts" is initialized to 12 (1 / 0.08) and does not change
even when a new task resource request arrives (e.g., resource("gpu", 1.0) here).
Thus, those 3 tasks can execute in parallel.
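A back-of-the-envelope view of the slot arithmetic described above (plain Scala, not the actual Spark internals):

{code:scala}
// With conf.set(TASK_GPU_ID.amountConf, "0.08"), one GPU address is split into
// 1 / 0.08 = 12 slots, so up to 12 tasks can share the GPU at once.
val defaultTaskAmount = 0.08
val numParts = (1.0 / defaultTaskAmount).toInt // 12

// A ResourceProfile that requires a full GPU per task should leave one slot,
// forcing the 3 tasks to run sequentially. But numParts is never recomputed,
// so the scheduler still sees 12 slots and runs them in parallel.
val profileTaskAmount = 1.0
val expectedParts = (1.0 / profileTaskAmount).toInt // 1
{code}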
 






[jira] [Updated] (SPARK-45527) Task fraction resource request is not expected

2023-10-12 Thread wuyi (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wuyi updated SPARK-45527:
-
Description: 
 
{code:java}
test("SPARK-XXX") {
  import org.apache.spark.resource.{ResourceProfileBuilder, 
TaskResourceRequests}

  withTempDir { dir =>
val scriptPath = createTempScriptWithExpectedOutput(dir, 
"gpuDiscoveryScript",
  """{"name": "gpu","addresses":["0"]}""")

val conf = new SparkConf()
  .setAppName("test")
  .setMaster("local-cluster[1, 12, 1024]")
  .set("spark.executor.cores", "12")
conf.set(TASK_GPU_ID.amountConf, "0.08")
conf.set(WORKER_GPU_ID.amountConf, "1")
conf.set(WORKER_GPU_ID.discoveryScriptConf, scriptPath)
conf.set(EXECUTOR_GPU_ID.amountConf, "1")
sc = new SparkContext(conf)
val rdd = sc.range(0, 100, 1, 4)
var rdd1 = rdd.repartition(3)
val treqs = new TaskResourceRequests().cpus(1).resource("gpu", 1.0)
val rp = new ResourceProfileBuilder().require(treqs).build
rdd1 = rdd1.withResources(rp)
assert(rdd1.collect().size === 100)
  }
} {code}
In the above test, the 3 tasks generated by rdd1 are expected to execute in
sequence, since "new TaskResourceRequests().cpus(1).resource("gpu", 1.0)" should
override "conf.set(TASK_GPU_ID.amountConf, "0.08")". In fact, however, those 3
tasks run in parallel.

The root cause is that ExecutorData#ExecutorResourceInfo#numParts is static. In
this case, "gpu.numParts" is initialized to 12 (1 / 0.08) and does not change
even when a new task resource request arrives (e.g., resource("gpu", 1.0) here).
Thus, those 3 tasks can execute in parallel.
 

  was:
 
{code:java}
test("SPARK-XXX") {
  import org.apache.spark.resource.{ResourceProfileBuilder, 
TaskResourceRequests}

  withTempDir { dir =>
val scriptPath = createTempScriptWithExpectedOutput(dir, 
"gpuDiscoveryScript",
  """{"name": "gpu","addresses":["0"]}""")

val conf = new SparkConf()
  .setAppName("test")
  .setMaster("local-cluster[1, 12, 1024]")
  .set("spark.executor.cores", "12")
conf.set(TASK_GPU_ID.amountConf, "0.08")
conf.set(WORKER_GPU_ID.amountConf, "1")
conf.set(WORKER_GPU_ID.discoveryScriptConf, scriptPath)
conf.set(EXECUTOR_GPU_ID.amountConf, "1")
sc = new SparkContext(conf)
val rdd = sc.range(0, 100, 1, 4)
var rdd1 = rdd.repartition(3)
val treqs = new TaskResourceRequests().cpus(1).resource("gpu", 1.0)
val rp = new ResourceProfileBuilder().require(treqs).build
rdd1 = rdd1.withResources(rp)
assert(rdd1.collect().size === 100)
  }
} {code}
In the above test, the 3 tasks generated by rdd1 are expected to be executed in 
sequence as we expect "new TaskResourceRequests().cpus(1).resource("gpu", 1.0)" 
should override "conf.set(TASK_GPU_ID.amountConf, "0.08")". However, those 3 
tasks are run in parallel in fact.

 

 

The root cause is that ExecutorData#ExecutorResourceInfo#numParts is static. In 
this case, the "gpu.numParts" is initialized with 12 (1/0.08) and won't change 
even if there's a new task resource request (e.g., resource("gpu", 1.0) in this 
case). Thus, those 3 tasks are able to be executed in parallel.
 


> Task fraction resource request is not expected
> --
>
> Key: SPARK-45527
> URL: https://issues.apache.org/jira/browse/SPARK-45527
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.2.1, 3.3.3, 3.4.1, 3.5.0
>Reporter: wuyi
>Priority: Major
>
>  
> {code:java}
> test("SPARK-XXX") {
>   import org.apache.spark.resource.{ResourceProfileBuilder, 
> TaskResourceRequests}
>   withTempDir { dir =>
> val scriptPath = createTempScriptWithExpectedOutput(dir, 
> "gpuDiscoveryScript",
>   """{"name": "gpu","addresses":["0"]}""")
> val conf = new SparkConf()
>   .setAppName("test")
>   .setMaster("local-cluster[1, 12, 1024]")
>   .set("spark.executor.cores", "12")
> conf.set(TASK_GPU_ID.amountConf, "0.08")
> conf.set(WORKER_GPU_ID.amountConf, "1")
> conf.set(WORKER_GPU_ID.discoveryScriptConf, scriptPath)
> conf.set(EXECUTOR_GPU_ID.amountConf, "1")
> sc = new SparkContext(conf)
> val rdd = sc.range(0, 100, 1, 4)
> var rdd1 = rdd.repartition(3)
> val treqs = new TaskResourceRequests().cpus(1).resource("gpu", 1.0)
> val rp = new ResourceProfileBuilder().require(treqs).build
> rdd1 = rdd1.withResources(rp)
> assert(rdd1.collect().size === 100)
>   }
> } {code}
> In the above test, the 3 tasks generated by rdd1 are expected to execute in
> sequence, since "new TaskResourceRequests().cpus(1).resource("gpu",
> 1.0)" should override "conf.set(TASK_GPU_ID.amountConf, "0.08")". In fact,
> however, those 3 tasks run in parallel.
> The root cause is that ExecutorData#ExecutorResourceInfo#numParts is static:
> "gpu.numParts" is initialized to 12 (1 / 0.08) and does not change when a new
> task resource request arrives, so those 3 tasks can execute in parallel.

[jira] [Created] (SPARK-45526) Refine docstring of `options` for dataframe reader and writer

2023-10-12 Thread Allison Wang (Jira)
Allison Wang created SPARK-45526:


 Summary: Refine docstring of `options` for dataframe reader and 
writer
 Key: SPARK-45526
 URL: https://issues.apache.org/jira/browse/SPARK-45526
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, PySpark
Affects Versions: 4.0.0
Reporter: Allison Wang


Refine the docstring of the `options` method of DataFrameReader and 
DataFrameWriter.






[jira] [Updated] (SPARK-45515) Use enhanced `switch` expressions to replace the regular `switch` statement

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45515:
---
Labels: pull-request-available  (was: )

> Use enhanced `switch` expressions to replace the regular `switch` statement
> ---
>
> Key: SPARK-45515
> URL: https://issues.apache.org/jira/browse/SPARK-45515
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Major
>  Labels: pull-request-available
>
> refer to [JEP 361|https://openjdk.org/jeps/361] 
>  
> Example:
> {code:java}
> double getPrice(String fruit) {
>   switch (fruit) {
>     case "Apple":
>       return 1.0;
>     case "Orange":
>       return 1.5;
>     case "Mango":
>       return 2.0;
>     default:
>       throw new IllegalArgumentException();
>    }
>  } {code}
> Can be changed to 
> {code:java}
> double getPrice(String fruit) {
>     return switch (fruit) {
>       case "Apple" -> 1.0;
>       case "Orange" -> 1.5;
>       case "Mango" -> 2.0;
>       default -> throw new IllegalArgumentException();
>     };
> } {code}
>  






[jira] [Resolved] (SPARK-45515) Use enhanced `switch` expressions to replace the regular `switch` statement

2023-10-12 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie resolved SPARK-45515.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43349
[https://github.com/apache/spark/pull/43349]

> Use enhanced `switch` expressions to replace the regular `switch` statement
> ---
>
> Key: SPARK-45515
> URL: https://issues.apache.org/jira/browse/SPARK-45515
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> refer to [JEP 361|https://openjdk.org/jeps/361] 
>  
> Example:
> {code:java}
> double getPrice(String fruit) {
>   switch (fruit) {
>     case "Apple":
>       return 1.0;
>     case "Orange":
>       return 1.5;
>     case "Mango":
>       return 2.0;
>     default:
>       throw new IllegalArgumentException();
>    }
>  } {code}
> Can be changed to 
> {code:java}
> double getPrice(String fruit) {
>     return switch (fruit) {
>       case "Apple" -> 1.0;
>       case "Orange" -> 1.5;
>       case "Mango" -> 2.0;
>       default -> throw new IllegalArgumentException();
>     };
> } {code}
>  






[jira] [Assigned] (SPARK-45515) Use enhanced `switch` expressions to replace the regular `switch` statement

2023-10-12 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie reassigned SPARK-45515:


Assignee: Yang Jie

> Use enhanced `switch` expressions to replace the regular `switch` statement
> ---
>
> Key: SPARK-45515
> URL: https://issues.apache.org/jira/browse/SPARK-45515
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
>
> refer to [JEP 361|https://openjdk.org/jeps/361] 
>  
> Example:
> {code:java}
> double getPrice(String fruit) {
>   switch (fruit) {
>     case "Apple":
>       return 1.0;
>     case "Orange":
>       return 1.5;
>     case "Mango":
>       return 2.0;
>     default:
>       throw new IllegalArgumentException();
>    }
>  } {code}
> Can be changed to 
> {code:java}
> double getPrice(String fruit) {
>     return switch (fruit) {
>       case "Apple" -> 1.0;
>       case "Orange" -> 1.5;
>       case "Mango" -> 2.0;
>       default -> throw new IllegalArgumentException();
>     };
> } {code}
>  






[jira] [Created] (SPARK-45525) Initial support for Python data source write API

2023-10-12 Thread Allison Wang (Jira)
Allison Wang created SPARK-45525:


 Summary: Initial support for Python data source write API
 Key: SPARK-45525
 URL: https://issues.apache.org/jira/browse/SPARK-45525
 Project: Spark
  Issue Type: Sub-task
  Components: PySpark
Affects Versions: 4.0.0
Reporter: Allison Wang


Support for Python data source write API






[jira] [Created] (SPARK-45524) Initial support for Python data source read API

2023-10-12 Thread Allison Wang (Jira)
Allison Wang created SPARK-45524:


 Summary: Initial support for Python data source read API
 Key: SPARK-45524
 URL: https://issues.apache.org/jira/browse/SPARK-45524
 Project: Spark
  Issue Type: Sub-task
  Components: PySpark
Affects Versions: 4.0.0
Reporter: Allison Wang


Support Python data source API for reading data.






[jira] [Assigned] (SPARK-45521) Avoid re-computation of nnz in VectorAssembler

2023-10-12 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng reassigned SPARK-45521:
-

Assignee: Ruifeng Zheng

> Avoid re-computation of nnz in VectorAssembler
> --
>
> Key: SPARK-45521
> URL: https://issues.apache.org/jira/browse/SPARK-45521
> Project: Spark
>  Issue Type: Improvement
>  Components: ML
>Affects Versions: 4.0.0
>Reporter: Ruifeng Zheng
>Assignee: Ruifeng Zheng
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Resolved] (SPARK-45521) Avoid re-computation of nnz in VectorAssembler

2023-10-12 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng resolved SPARK-45521.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43353
[https://github.com/apache/spark/pull/43353]

> Avoid re-computation of nnz in VectorAssembler
> --
>
> Key: SPARK-45521
> URL: https://issues.apache.org/jira/browse/SPARK-45521
> Project: Spark
>  Issue Type: Improvement
>  Components: ML
>Affects Versions: 4.0.0
>Reporter: Ruifeng Zheng
>Assignee: Ruifeng Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Resolved] (SPARK-45418) Change CURRENT_SCHEMA() column alias to match function name

2023-10-12 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-45418.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43235
[https://github.com/apache/spark/pull/43235]

> Change CURRENT_SCHEMA() column alias to match function name
> ---
>
> Key: SPARK-45418
> URL: https://issues.apache.org/jira/browse/SPARK-45418
> Project: Spark
>  Issue Type: Bug
>  Components: Documentation, SQL
>Affects Versions: 3.5.0
>Reporter: Michael Zhang
>Assignee: Michael Zhang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-45418) Change CURRENT_SCHEMA() column alias to match function name

2023-10-12 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-45418:
---

Assignee: Michael Zhang

> Change CURRENT_SCHEMA() column alias to match function name
> ---
>
> Key: SPARK-45418
> URL: https://issues.apache.org/jira/browse/SPARK-45418
> Project: Spark
>  Issue Type: Bug
>  Components: Documentation, SQL
>Affects Versions: 3.5.0
>Reporter: Michael Zhang
>Assignee: Michael Zhang
>Priority: Minor
>  Labels: pull-request-available
>







[jira] [Updated] (SPARK-34612) Whether to expose outputDeterministicLevel so custom RDDs can set deterministic level

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-34612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-34612:
---
Labels: pull-request-available  (was: )

> Whether to expose outputDeterministicLevel so custom RDDs can set 
> deterministic level
> -
>
> Key: SPARK-34612
> URL: https://issues.apache.org/jira/browse/SPARK-34612
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.2.0
>Reporter: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
>
> This ticket is open to track a TODO item in RDD.outputDeterministicLevel.
> We need to decide whether we want to expose it so that users can set the
> deterministic level of their custom RDDs.






[jira] [Resolved] (SPARK-45505) Refactor analyzeInPython function to make it reusable

2023-10-12 Thread Takuya Ueshin (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takuya Ueshin resolved SPARK-45505.
---
Fix Version/s: 4.0.0
 Assignee: Allison Wang
   Resolution: Fixed

Issue resolved by pull request 43340
https://github.com/apache/spark/pull/43340

> Refactor analyzeInPython function to make it reusable
> -
>
> Key: SPARK-45505
> URL: https://issues.apache.org/jira/browse/SPARK-45505
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 4.0.0
>Reporter: Allison Wang
>Assignee: Allison Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Refactor the analyzeInPython method in the UserDefinedPythonTableFunction
> object into an abstract class so that it can be reused in the future.






[jira] [Updated] (SPARK-45523) Return useful error message if UDTF returns None for non-nullable column

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45523:
---
Labels: pull-request-available  (was: )

> Return useful error message if UDTF returns None for non-nullable column
> 
>
> Key: SPARK-45523
> URL: https://issues.apache.org/jira/browse/SPARK-45523
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Daniel
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Created] (SPARK-45523) Return useful error message if UDTF returns None for non-nullable column

2023-10-12 Thread Daniel (Jira)
Daniel created SPARK-45523:
--

 Summary: Return useful error message if UDTF returns None for 
non-nullable column
 Key: SPARK-45523
 URL: https://issues.apache.org/jira/browse/SPARK-45523
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Daniel









[jira] [Resolved] (SPARK-45516) Include QueryContext in SparkThrowable proto message

2023-10-12 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-45516.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43352
[https://github.com/apache/spark/pull/43352]

> Include QueryContext in SparkThrowable proto message
> 
>
> Key: SPARK-45516
> URL: https://issues.apache.org/jira/browse/SPARK-45516
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect
>Affects Versions: 4.0.0
>Reporter: Yihong He
>Assignee: Yihong He
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-45516) Include QueryContext in SparkThrowable proto message

2023-10-12 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-45516:


Assignee: Yihong He

> Include QueryContext in SparkThrowable proto message
> 
>
> Key: SPARK-45516
> URL: https://issues.apache.org/jira/browse/SPARK-45516
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect
>Affects Versions: 4.0.0
>Reporter: Yihong He
>Assignee: Yihong He
>Priority: Major
>  Labels: pull-request-available
>







[jira] (SPARK-44594) Remove redundant method parameter in kafka connector

2023-10-12 Thread Philip Dakin (Jira)


[ https://issues.apache.org/jira/browse/SPARK-44594 ]


Philip Dakin deleted comment on SPARK-44594:
--

was (Author: JIRAUSER302581):
Related PR: https://github.com/apache/spark/pull/42198

> Remove redundant method parameter in kafka connector
> 
>
> Key: SPARK-44594
> URL: https://issues.apache.org/jira/browse/SPARK-44594
> Project: Spark
>  Issue Type: Improvement
>  Components: Input/Output
>Affects Versions: 4.0.0
>Reporter: Min Zhao
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> There are redundant parameters in
> org.apache.spark.sql.kafka010.KafkaWriter#validateQuery and
> org.apache.spark.sql.kafka010.KafkaWriter#write. They are not used; removing
> them makes the code more concise.






[jira] [Commented] (SPARK-44594) Remove redundant method parameter in kafka connector

2023-10-12 Thread Philip Dakin (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-44594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774695#comment-17774695
 ] 

Philip Dakin commented on SPARK-44594:
--

Related PR: https://github.com/apache/spark/pull/42198

> Remove redundant method parameter in kafka connector
> 
>
> Key: SPARK-44594
> URL: https://issues.apache.org/jira/browse/SPARK-44594
> Project: Spark
>  Issue Type: Improvement
>  Components: Input/Output
>Affects Versions: 4.0.0
>Reporter: Min Zhao
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> There are redundant parameters in
> org.apache.spark.sql.kafka010.KafkaWriter#validateQuery and
> org.apache.spark.sql.kafka010.KafkaWriter#write. They are not used; removing
> them makes the code more concise.






[jira] [Updated] (SPARK-44594) Remove redundant method parameter in kafka connector

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-44594:
---
Labels: pull-request-available  (was: )

> Remove redundant method parameter in kafka connector
> 
>
> Key: SPARK-44594
> URL: https://issues.apache.org/jira/browse/SPARK-44594
> Project: Spark
>  Issue Type: Improvement
>  Components: Input/Output
>Affects Versions: 4.0.0
>Reporter: Min Zhao
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> There are redundant parameters in
> org.apache.spark.sql.kafka010.KafkaWriter#validateQuery and
> org.apache.spark.sql.kafka010.KafkaWriter#write. They are not used; removing
> them makes the code more concise.






[jira] [Updated] (SPARK-45506) Support ivy URIs in SparkConnect addArtifact

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45506:
---
Labels: pull-request-available  (was: )

> Support ivy URIs in SparkConnect addArtifact
> 
>
> Key: SPARK-45506
> URL: https://issues.apache.org/jira/browse/SPARK-45506
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect
>Affects Versions: 4.0.0
>Reporter: Vsevolod Stepanov
>Priority: Major
>  Labels: pull-request-available
>
> Right now Spark Connect's addArtifact API supports only adding .jar and
> .class files. It would be useful to extend this API to support adding
> arbitrary Maven artifacts using Ivy.
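The intended usage might look roughly like this (a hedged sketch: the ivy:// coordinate scheme is borrowed from SparkContext.addJar's existing syntax, and its support in addArtifact is exactly what this ticket proposes, not something that exists yet):

{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().remote("sc://localhost").getOrCreate()

// Hypothetical call: resolve the Maven coordinate (and its transitive
// dependencies) with Ivy, then upload the resolved jars as session artifacts.
spark.addArtifact("ivy://org.apache.commons:commons-lang3:3.12.0")
{code}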






[jira] [Updated] (SPARK-45512) Fix compilation warnings related to other-nullary-override

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45512:
---
Labels: pull-request-available  (was: )

> Fix compilation warnings related to other-nullary-override
> --
>
> Key: SPARK-45512
> URL: https://issues.apache.org/jira/browse/SPARK-45512
> Project: Spark
>  Issue Type: Sub-task
>  Components: DStreams, Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Major
>  Labels: pull-request-available
>
> {code:java}
> [error] 
> /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/CloseableIterator.scala:36:16:
>  method with a single empty parameter list overrides method hasNext in trait 
> Iterator defined without a parameter list [quickfixable]
> [error] Applicable -Wconf / @nowarn filters for this fatal warning:
> msg=<part of the message>, cat=other-nullary-override,
> site=org.apache.spark.sql.connect.client.WrappedCloseableIterator
> [error]   override def hasNext(): Boolean = innerIterator.hasNext
> [error]                ^
> [error] 
> /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/ExecutePlanResponseReattachableIterator.scala:136:16:
>  method without a parameter list overrides method hasNext in class 
> WrappedCloseableIterator defined with a single empty parameter list 
> [quickfixable]
> [error] Applicable -Wconf / @nowarn filters for this fatal warning:
> msg=<part of the message>, cat=other-nullary-override,
> site=org.apache.spark.sql.connect.client.ExecutePlanResponseReattachableIterator
> [error]   override def hasNext: Boolean = synchronized {
> [error]                ^
> [error] 
> /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala:73:20:
>  method without a parameter list overrides method hasNext in class 
> WrappedCloseableIterator defined with a single empty parameter list 
> [quickfixable]
> [error] Applicable -Wconf / @nowarn filters for this fatal warning:
> msg=<part of the message>, cat=other-nullary-override,
> site=org.apache.spark.sql.connect.client.GrpcExceptionConverter.convertIterator
> [error]       override def hasNext: Boolean = {
> [error]                    ^
> [error] 
> /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcRetryHandler.scala:77:18:
>  method without a parameter list overrides method next in class 
> WrappedCloseableIterator defined with a single empty parameter list 
> [quickfixable]
> [error] Applicable -Wconf / @nowarn filters for this fatal warning:
> msg=<part of the message>, cat=other-nullary-override,
> site=org.apache.spark.sql.connect.client.GrpcRetryHandler.RetryIterator
> [error]     override def next: U = {
> [error]                  ^
> [error] 
> /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcRetryHandler.scala:81:18:
>  method without a parameter list overrides method hasNext in class 
> WrappedCloseableIterator defined with a single empty parameter list 
> [quickfixable]
> [error] Applicable -Wconf / @nowarn filters for this fatal warning:
> msg=<part of the message>, cat=other-nullary-override,
> site=org.apache.spark.sql.connect.client.GrpcRetryHandler.RetryIterator
> [error]     override def hasNext: Boolean = {
> [error]                  ^
>  {code}
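The fix is mechanical: make each override's parameter-list arity match the member it overrides. A minimal sketch with an illustrative class (not the actual Spark code):

{code:scala}
// scala.collection.Iterator declares `def hasNext: Boolean` without a
// parameter list and `def next(): T` with one, so overrides must match:
class WrappedIterator[T](inner: Iterator[T]) extends Iterator[T] {
  override def hasNext: Boolean = inner.hasNext // not: hasNext(): Boolean
  override def next(): T = inner.next()         // next() keeps its parens
}
{code}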






[jira] [Resolved] (SPARK-45501) Use pattern matching for type checking and conversion

2023-10-12 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie resolved SPARK-45501.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43327
[https://github.com/apache/spark/pull/43327]

> Use pattern matching for type checking and conversion
> -
>
> Key: SPARK-45501
> URL: https://issues.apache.org/jira/browse/SPARK-45501
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Refer to [JEP 394|https://openjdk.org/jeps/394]
> Example:
> {code:java}
> if (obj instanceof String) {
>     String str = (String) obj;
>     System.out.println(str);
> } {code}
> Can be replaced with
>  
> {code:java}
> if (obj instanceof String str) {
>     System.out.println(str);
> } {code}
> The new code looks more compact.






[jira] [Assigned] (SPARK-45501) Use pattern matching for type checking and conversion

2023-10-12 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie reassigned SPARK-45501:


Assignee: Yang Jie

> Use pattern matching for type checking and conversion
> -
>
> Key: SPARK-45501
> URL: https://issues.apache.org/jira/browse/SPARK-45501
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Minor
>  Labels: pull-request-available
>
> Refer to [JEP 394|https://openjdk.org/jeps/394]
> Example:
> {code:java}
> if (obj instanceof String) {
>     String str = (String) obj;
>     System.out.println(str);
> } {code}
> Can be replaced with
>  
> {code:java}
> if (obj instanceof String str) {
>     System.out.println(str);
> } {code}
> The new code looks more compact.






[jira] [Resolved] (SPARK-45502) Upgrade Kafka to 3.6.0

2023-10-12 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-45502.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43348
[https://github.com/apache/spark/pull/43348]

> Upgrade Kafka to 3.6.0
> --
>
> Key: SPARK-45502
> URL: https://issues.apache.org/jira/browse/SPARK-45502
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Deng Ziming
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Apache Kafka 3.6.0 was released on Oct 10, 2023.
> - https://downloads.apache.org/kafka/3.6.0/RELEASE_NOTES.html






[jira] [Updated] (SPARK-45522) Migrate jetty 9 to jetty 12

2023-10-12 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-45522:
-
Description: 
Jetty 12 supports JakartaEE 8/JakartaEE 9/JakartaEE 10 simultaneously. However, 
the version span is quite large and the documentation needs to be read in 
detail; it is not certain that this can be completed within the 4.0 cycle, so 
it is set to low priority.

> Migrate jetty 9 to jetty 12
> ---
>
> Key: SPARK-45522
> URL: https://issues.apache.org/jira/browse/SPARK-45522
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Minor
>
> Jetty 12 supports JakartaEE 8/JakartaEE 9/JakartaEE 10 simultaneously. 
> However, the version span is quite large and the documentation needs to be 
> read in detail; it is not certain that this can be completed within the 4.0 
> cycle, so it is set to low priority.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45522) Migrate jetty 9 to jetty 12

2023-10-12 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-45522:
-
Priority: Minor  (was: Major)

> Migrate jetty 9 to jetty 12
> ---
>
> Key: SPARK-45522
> URL: https://issues.apache.org/jira/browse/SPARK-45522
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-45132) Fix IDENTIFIER clause for functions

2023-10-12 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-45132:
---

Assignee: Serge Rielau

> Fix IDENTIFIER clause for functions
> ---
>
> Key: SPARK-45132
> URL: https://issues.apache.org/jira/browse/SPARK-45132
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.5.0
>Reporter: Serge Rielau
>Assignee: Serge Rielau
>Priority: Major
>  Labels: pull-request-available
>
> Due to a quirk in the grammar, IDENTIFIER('foo')() does not resolve 
> depending on .
> Example:
> SELECT IDENTIFIER('abs')(-1) works, but
> SELECT IDENTIFIER('abs')(c1) FROM VALUES(-1) AS T(c1) does not.
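> A minimal repro sketch of the two cases above (hypothetical session; assumes 
> a Spark 3.5 shell):
> {code:scala}
> // Works: constant argument
> spark.sql("SELECT IDENTIFIER('abs')(-1)").show()
>
> // Fails per this report: column argument from a derived relation
> spark.sql("SELECT IDENTIFIER('abs')(c1) FROM VALUES(-1) AS T(c1)").show()
> {code}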



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45521) Avoid re-computation of nnz in VectorAssembler

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45521:
---
Labels: pull-request-available  (was: )

> Avoid re-computation of nnz in VectorAssembler
> --
>
> Key: SPARK-45521
> URL: https://issues.apache.org/jira/browse/SPARK-45521
> Project: Spark
>  Issue Type: Improvement
>  Components: ML
>Affects Versions: 4.0.0
>Reporter: Ruifeng Zheng
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-45132) Fix IDENTIFIER clause for functions

2023-10-12 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-45132.
-
Fix Version/s: 3.5.1
   4.0.0
   Resolution: Fixed

Issue resolved by pull request 42888
[https://github.com/apache/spark/pull/42888]

> Fix IDENTIFIER clause for functions
> ---
>
> Key: SPARK-45132
> URL: https://issues.apache.org/jira/browse/SPARK-45132
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.5.0
>Reporter: Serge Rielau
>Assignee: Serge Rielau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.1, 4.0.0
>
>
> Due to a quirk in the grammar, IDENTIFIER('foo')() does not resolve 
> depending on .
> Example:
> SELECT IDENTIFIER('abs')(-1) works, but
> SELECT IDENTIFIER('abs')(c1) FROM VALUES(-1) AS T(c1) does not.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45522) Migrate jetty 9 to jetty 12

2023-10-12 Thread Yang Jie (Jira)
Yang Jie created SPARK-45522:


 Summary: Migrate jetty 9 to jetty 12
 Key: SPARK-45522
 URL: https://issues.apache.org/jira/browse/SPARK-45522
 Project: Spark
  Issue Type: Sub-task
  Components: Build
Affects Versions: 4.0.0
Reporter: Yang Jie






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45521) Avoid re-computation of nnz in VectorAssembler

2023-10-12 Thread Ruifeng Zheng (Jira)
Ruifeng Zheng created SPARK-45521:
-

 Summary: Avoid re-computation of nnz in VectorAssembler
 Key: SPARK-45521
 URL: https://issues.apache.org/jira/browse/SPARK-45521
 Project: Spark
  Issue Type: Improvement
  Components: ML
Affects Versions: 4.0.0
Reporter: Ruifeng Zheng






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45520) Add property testing for error constructors in scala clients

2023-10-12 Thread Yihong He (Jira)
Yihong He created SPARK-45520:
-

 Summary: Add property testing for error constructors in scala 
clients
 Key: SPARK-45520
 URL: https://issues.apache.org/jira/browse/SPARK-45520
 Project: Spark
  Issue Type: Test
  Components: Connect
Affects Versions: 4.0.0
Reporter: Yihong He






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45519) cleanSource problem on FileStreamSource for Windows env

2023-10-12 Thread Jira
Yunus Emre Gürses created SPARK-45519:
-

 Summary: cleanSource problem on FileStreamSource for Windows env
 Key: SPARK-45519
 URL: https://issues.apache.org/jira/browse/SPARK-45519
 Project: Spark
  Issue Type: Bug
  Components: Structured Streaming
Affects Versions: 3.4.1
Reporter: Yunus Emre Gürses


We are using Spark with Scala in a Windows environment. While streaming with 
Spark, I set the *{{cleanSource}}* option to "archive" and the 
*{{sourceArchiveDir}}* option to "archived", as in the code below.
{code:java}
spark.readStream
  .option("cleanSource", "archive")
  .option("sourceArchiveDir", "archived"){code}
When I tried this in a Linux environment, I realized that the problem was with 
the paths: when I set the clean mode to "delete", it works on both Linux and 
Windows, but "archive" mode does not work on Windows. 

The problem is related to how paths are appended on Windows. There is a method

 
{code:java}
override protected def cleanTask(entry: FileEntry): Unit{code}
in the FileStreamSource.scala file in the 
org.apache.spark.sql.execution.streaming package. On line 569, the 
!fileSystem.rename(curPath, newPath) code is supposed to move the source file 
to the archive folder. However, when I debugged, I noticed that the curPath 
and newPath values were as follows on Windows:

 
{code:java}
curPath: 
file:/C:/dev/be/data-integration-suite/test-data/streaming-folder/patients/patients-success.csv{code}
{code:java}
newPath: 
file:/C:/dev/be/data-integration-suite/archived/C:/dev/be/data-integration-suite/test-data/streaming-folder/patients/patients-success.csv{code}
It seems that the absolute path of the CSV file was appended when creating 
newPath, because *C:/dev/be/data-integration-suite* appears twice in the 
newPath. This is probably the reason Spark archiving does not work. Instead, 
newPath should be: 
file:/C:/dev/be/data-integration-suite/archived/test-data/streaming-folder/patients/patients-success.csv
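A minimal sketch of the relativization that would avoid the duplication (a 
hypothetical helper, not the actual Spark fix; it assumes Hadoop Path/URI 
semantics):
{code:scala}
import java.net.URI
import org.apache.hadoop.fs.Path

// Hypothetical helper: strip the source root from the entry path before
// appending it to the archive directory, so the drive-qualified absolute
// path is not duplicated on Windows.
def archiveDestination(sourceRoot: Path, entry: Path, archiveDir: Path): Path = {
  // URI.relativize only strips the prefix when the base ends with "/".
  val base = new URI(sourceRoot.toUri.toString.stripSuffix("/") + "/")
  val relative = base.relativize(entry.toUri).getPath
  // e.g. entry  = file:/C:/.../streaming-folder/patients/patients-success.csv
  //      result = <archiveDir>/patients-success.csv
  new Path(archiveDir, relative)
}
{code}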



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45518) Error framework support for Python Spark Connect Client

2023-10-12 Thread Yihong He (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yihong He updated SPARK-45518:
--
Description: 
* Define APIs for the error framework, especially where to expose QueryContext
 * Refactor the exception constructors to support error framework parameters
 * Reconstruct exceptions with error framework parameters in 
FetchErrorDetailResponses 

> Error framework support for Python Spark Connect Client
> ---
>
> Key: SPARK-45518
> URL: https://issues.apache.org/jira/browse/SPARK-45518
> Project: Spark
>  Issue Type: New Feature
>  Components: Connect
>Affects Versions: 4.0.0
>Reporter: Yihong He
>Priority: Major
>
> * Define APIs for the error framework, especially where to expose QueryContext
>  * Refactor the exception constructors to support error framework parameters
>  * Reconstruct exceptions with error framework parameters in 
> FetchErrorDetailResponses 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45517) Expand more exception constructors to support error framework parameters

2023-10-12 Thread Yihong He (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yihong He updated SPARK-45517:
--
Description: 
* SparkNumberFormatException
 * SparkIllegalArgumentException
 * SparkArithmeticException
 * SparkUnsupportedOperationException
 * SparkArrayIndexOutOfBoundsException
 * SparkDateTimeException
 * SparkRuntimeException
 * SparkUpgradeException

  was:
* SparkNumberFormatException
 * SparkIllegalArgumentException
 * SparkArithmeticException
 * SparkUnsupportedOperationException
 * SparkArrayIndexOutOfBoundsException
 * SparkDateTimeException


> Expand more exception constructors to support error framework parameters
> 
>
> Key: SPARK-45517
> URL: https://issues.apache.org/jira/browse/SPARK-45517
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect
>Affects Versions: 4.0.0
>Reporter: Yihong He
>Priority: Major
>
> * SparkNumberFormatException
>  * SparkIllegalArgumentException
>  * SparkArithmeticException
>  * SparkUnsupportedOperationException
>  * SparkArrayIndexOutOfBoundsException
>  * SparkDateTimeException
>  * SparkRuntimeException
>  * SparkUpgradeException



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45518) Error framework support for Python Spark Connect Client

2023-10-12 Thread Yihong He (Jira)
Yihong He created SPARK-45518:
-

 Summary: Error framework support for Python Spark Connect Client
 Key: SPARK-45518
 URL: https://issues.apache.org/jira/browse/SPARK-45518
 Project: Spark
  Issue Type: New Feature
  Components: Connect
Affects Versions: 4.0.0
Reporter: Yihong He






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45517) Expand more exception constructors to support error framework parameters

2023-10-12 Thread Yihong He (Jira)
Yihong He created SPARK-45517:
-

 Summary: Expand more exception constructors to support error 
framework parameters
 Key: SPARK-45517
 URL: https://issues.apache.org/jira/browse/SPARK-45517
 Project: Spark
  Issue Type: Improvement
  Components: Connect
Affects Versions: 4.0.0
Reporter: Yihong He


* SparkNumberFormatException
 * SparkIllegalArgumentException
 * SparkArithmeticException
 * SparkUnsupportedOperationException
 * SparkArrayIndexOutOfBoundsException
 * SparkDateTimeException
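
For context, a hedged sketch of what such an expanded constructor could look 
like (illustrative only; the actual Spark signatures may differ):
{code:scala}
import org.apache.spark.{QueryContext, SparkThrowable}

// Illustrative shape: the message is derived from an error class plus
// message parameters, and the exception carries QueryContext so the error
// framework can report where in the query the failure happened.
class SparkNumberFormatException(
    errorClass: String,
    messageParameters: Map[String, String],
    context: Array[QueryContext])
  extends NumberFormatException(s"[$errorClass] " + messageParameters.mkString(", "))
  with SparkThrowable {

  import scala.jdk.CollectionConverters._

  override def getErrorClass: String = errorClass
  override def getMessageParameters: java.util.Map[String, String] =
    messageParameters.asJava
  override def getQueryContext: Array[QueryContext] = context
}
{code}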



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45516) Include QueryContext in SparkThrowable proto message

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45516:
---
Labels: pull-request-available  (was: )

> Include QueryContext in SparkThrowable proto message
> 
>
> Key: SPARK-45516
> URL: https://issues.apache.org/jira/browse/SPARK-45516
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect
>Affects Versions: 4.0.0
>Reporter: Yihong He
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45433) CSV/JSON schema inference when timestamps do not match specified timestampFormat with only one row on each partition report error

2023-10-12 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-45433:
-
Fix Version/s: 3.4.2

> CSV/JSON schema inference when timestamps do not match specified 
> timestampFormat with only one row on each partition report error
> -
>
> Key: SPARK-45433
> URL: https://issues.apache.org/jira/browse/SPARK-45433
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0, 3.4.0, 3.5.0
>Reporter: Jia Fan
>Assignee: Jia Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.2, 4.0.0, 3.5.1
>
>
> CSV/JSON schema inference reports an error when timestamps do not match the 
> specified timestampFormat and there is `only one row on each partition`.
> {code:java}
> //eg
> val csv = spark.read.option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss")
>   .option("inferSchema", true).csv(Seq("2884-06-24T02:45:51.138").toDS())
> csv.show() {code}
> {code:java}
> //error
> Caused by: java.time.format.DateTimeParseException: Text 
> '2884-06-24T02:45:51.138' could not be parsed, unparsed text found at index 
> 19 {code}
> This bug affects 3.3/3.4/3.5. Unlike 
> https://issues.apache.org/jira/browse/SPARK-45424 , this is a different bug 
> with the same error message.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45516) Include QueryContext in SparkThrowable proto message

2023-10-12 Thread Yihong He (Jira)
Yihong He created SPARK-45516:
-

 Summary: Include QueryContext in SparkThrowable proto message
 Key: SPARK-45516
 URL: https://issues.apache.org/jira/browse/SPARK-45516
 Project: Spark
  Issue Type: Improvement
  Components: Connect
Affects Versions: 4.0.0
Reporter: Yihong He






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-43664) Fix TABLE_OR_VIEW_NOT_FOUND from SQLParityTests

2023-10-12 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-43664:


Assignee: Haejoon Lee

> Fix TABLE_OR_VIEW_NOT_FOUND from SQLParityTests
> ---
>
> Key: SPARK-43664
> URL: https://issues.apache.org/jira/browse/SPARK-43664
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, Pandas API on Spark
>Affects Versions: 3.5.0
>Reporter: Haejoon Lee
>Assignee: Haejoon Lee
>Priority: Major
>  Labels: pull-request-available
>
> Repro: run `SQLParityTests.test_sql_with_index_col`



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-43664) Fix TABLE_OR_VIEW_NOT_FOUND from SQLParityTests

2023-10-12 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-43664.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43237
[https://github.com/apache/spark/pull/43237]

> Fix TABLE_OR_VIEW_NOT_FOUND from SQLParityTests
> ---
>
> Key: SPARK-43664
> URL: https://issues.apache.org/jira/browse/SPARK-43664
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, Pandas API on Spark
>Affects Versions: 3.5.0
>Reporter: Haejoon Lee
>Assignee: Haejoon Lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Repro: run `SQLParityTests.test_sql_with_index_col`



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-45515) Use enhanced `switch` expressions to replace the regular `switch` statement

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot reassigned SPARK-45515:
--

Assignee: (was: Apache Spark)

> Use enhanced `switch` expressions to replace the regular `switch` statement
> ---
>
> Key: SPARK-45515
> URL: https://issues.apache.org/jira/browse/SPARK-45515
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Major
>
> refer to [JEP 361|https://openjdk.org/jeps/361] 
>  
> Example:
> {code:java}
> double getPrice(String fruit) {
>   switch (fruit) {
>     case "Apple":
>       return 1.0;
>     case "Orange":
>       return 1.5;
>     case "Mango":
>       return 2.0;
>     default:
>       throw new IllegalArgumentException();
>    }
>  } {code}
> Can be changed to 
> {code:java}
> double getPrice(String fruit) {
>     return switch (fruit) {
>       case "Apple" -> 1.0;
>       case "Orange" -> 1.5;
>       case "Mango" -> 2.0;
>       default -> throw new IllegalArgumentException();
>     };
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-45515) Use enhanced `switch` expressions to replace the regular `switch` statement

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot reassigned SPARK-45515:
--

Assignee: Apache Spark

> Use enhanced `switch` expressions to replace the regular `switch` statement
> ---
>
> Key: SPARK-45515
> URL: https://issues.apache.org/jira/browse/SPARK-45515
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Major
>
> refer to [JEP 361|https://openjdk.org/jeps/361] 
>  
> Example:
> {code:java}
> double getPrice(String fruit) {
>   switch (fruit) {
>     case "Apple":
>       return 1.0;
>     case "Orange":
>       return 1.5;
>     case "Mango":
>       return 2.0;
>     default:
>       throw new IllegalArgumentException();
>    }
>  } {code}
> Can be changed to 
> {code:java}
> double getPrice(String fruit) {
>     return switch (fruit) {
>       case "Apple" -> 1.0;
>       case "Orange" -> 1.5;
>       case "Mango" -> 2.0;
>       default -> throw new IllegalArgumentException();
>     };
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-45501) Use pattern matching for type checking and conversion

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot reassigned SPARK-45501:
--

Assignee: (was: Apache Spark)

> Use pattern matching for type checking and conversion
> -
>
> Key: SPARK-45501
> URL: https://issues.apache.org/jira/browse/SPARK-45501
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Minor
>  Labels: pull-request-available
>
> Refer to [JEP 394|https://openjdk.org/jeps/394]
> Example:
> {code:java}
> if (obj instanceof String) {
>     String str = (String) obj;
>     System.out.println(str);
> } {code}
> Can be replaced with
>  
> {code:java}
> if (obj instanceof String str) {
>     System.out.println(str);
> } {code}
> The new code looks more compact.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-45501) Use pattern matching for type checking and conversion

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot reassigned SPARK-45501:
--

Assignee: Apache Spark

> Use pattern matching for type checking and conversion
> -
>
> Key: SPARK-45501
> URL: https://issues.apache.org/jira/browse/SPARK-45501
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Minor
>  Labels: pull-request-available
>
> Refer to [JEP 394|https://openjdk.org/jeps/394]
> Example:
> {code:java}
> if (obj instanceof String) {
>     String str = (String) obj;
>     System.out.println(str);
> } {code}
> Can be replaced with
>  
> {code:java}
> if (obj instanceof String str) {
>     System.out.println(str);
> } {code}
> The new code looks more compact.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-44649) Runtime Filter supports passing equivalent creation side expressions

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot reassigned SPARK-44649:
--

Assignee: Apache Spark

> Runtime Filter supports passing equivalent creation side expressions
> 
>
> Key: SPARK-44649
> URL: https://issues.apache.org/jira/browse/SPARK-44649
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Jiaan Geng
>Assignee: Apache Spark
>Priority: Major
>  Labels: pull-request-available
>
> {code:java}
> SELECT
>   d_year,
>   i_brand_id,
>   i_class_id,
>   i_category_id,
>   i_manufact_id,
>   cs_quantity - COALESCE(cr_return_quantity, 0) AS sales_cnt,
>   cs_ext_sales_price - COALESCE(cr_return_amount, 0.0) AS sales_amt
> FROM catalog_sales
>   JOIN item ON i_item_sk = cs_item_sk
>   JOIN date_dim ON d_date_sk = cs_sold_date_sk
>   LEFT JOIN catalog_returns ON (cs_order_number = cr_order_number
> AND cs_item_sk = cr_item_sk)
> WHERE i_category = 'Books'
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-45501) Use pattern matching for type checking and conversion

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot reassigned SPARK-45501:
--

Assignee: Apache Spark

> Use pattern matching for type checking and conversion
> -
>
> Key: SPARK-45501
> URL: https://issues.apache.org/jira/browse/SPARK-45501
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Minor
>  Labels: pull-request-available
>
> Refer to [JEP 394|https://openjdk.org/jeps/394]
> Example:
> {code:java}
> if (obj instanceof String) {
>     String str = (String) obj;
>     System.out.println(str);
> } {code}
> Can be replaced with
>  
> {code:java}
> if (obj instanceof String str) {
>     System.out.println(str);
> } {code}
> The new code looks more compact.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-44649) Runtime Filter supports passing equivalent creation side expressions

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot reassigned SPARK-44649:
--

Assignee: (was: Apache Spark)

> Runtime Filter supports passing equivalent creation side expressions
> 
>
> Key: SPARK-44649
> URL: https://issues.apache.org/jira/browse/SPARK-44649
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Jiaan Geng
>Priority: Major
>  Labels: pull-request-available
>
> {code:java}
> SELECT
>   d_year,
>   i_brand_id,
>   i_class_id,
>   i_category_id,
>   i_manufact_id,
>   cs_quantity - COALESCE(cr_return_quantity, 0) AS sales_cnt,
>   cs_ext_sales_price - COALESCE(cr_return_amount, 0.0) AS sales_amt
> FROM catalog_sales
>   JOIN item ON i_item_sk = cs_item_sk
>   JOIN date_dim ON d_date_sk = cs_sold_date_sk
>   LEFT JOIN catalog_returns ON (cs_order_number = cr_order_number
> AND cs_item_sk = cr_item_sk)
> WHERE i_category = 'Books'
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-45501) Use pattern matching for type checking and conversion

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot reassigned SPARK-45501:
--

Assignee: (was: Apache Spark)

> Use pattern matching for type checking and conversion
> -
>
> Key: SPARK-45501
> URL: https://issues.apache.org/jira/browse/SPARK-45501
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Minor
>  Labels: pull-request-available
>
> Refer to [JEP 394|https://openjdk.org/jeps/394]
> Example:
> {code:java}
> if (obj instanceof String) {
>     String str = (String) obj;
>     System.out.println(str);
> } {code}
> Can be replaced with
>  
> {code:java}
> if (obj instanceof String str) {
>     System.out.println(str);
> } {code}
> The new code looks more compact.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-44649) Runtime Filter supports passing equivalent creation side expressions

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot reassigned SPARK-44649:
--

Assignee: (was: Apache Spark)

> Runtime Filter supports passing equivalent creation side expressions
> 
>
> Key: SPARK-44649
> URL: https://issues.apache.org/jira/browse/SPARK-44649
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Jiaan Geng
>Priority: Major
>  Labels: pull-request-available
>
> {code:java}
> SELECT
>   d_year,
>   i_brand_id,
>   i_class_id,
>   i_category_id,
>   i_manufact_id,
>   cs_quantity - COALESCE(cr_return_quantity, 0) AS sales_cnt,
>   cs_ext_sales_price - COALESCE(cr_return_amount, 0.0) AS sales_amt
> FROM catalog_sales
>   JOIN item ON i_item_sk = cs_item_sk
>   JOIN date_dim ON d_date_sk = cs_sold_date_sk
>   LEFT JOIN catalog_returns ON (cs_order_number = cr_order_number
> AND cs_item_sk = cr_item_sk)
> WHERE i_category = 'Books'
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-44649) Runtime Filter supports passing equivalent creation side expressions

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot reassigned SPARK-44649:
--

Assignee: Apache Spark

> Runtime Filter supports passing equivalent creation side expressions
> 
>
> Key: SPARK-44649
> URL: https://issues.apache.org/jira/browse/SPARK-44649
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Jiaan Geng
>Assignee: Apache Spark
>Priority: Major
>  Labels: pull-request-available
>
> {code:java}
> SELECT
>   d_year,
>   i_brand_id,
>   i_class_id,
>   i_category_id,
>   i_manufact_id,
>   cs_quantity - COALESCE(cr_return_quantity, 0) AS sales_cnt,
>   cs_ext_sales_price - COALESCE(cr_return_amount, 0.0) AS sales_amt
> FROM catalog_sales
>   JOIN item ON i_item_sk = cs_item_sk
>   JOIN date_dim ON d_date_sk = cs_sold_date_sk
>   LEFT JOIN catalog_returns ON (cs_order_number = cr_order_number
> AND cs_item_sk = cr_item_sk)
> WHERE i_category = 'Books'
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45515) Use enhanced `switch` expressions to replace the regular `switch` statement

2023-10-12 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-45515:
-
Summary: Use enhanced `switch` expressions to replace the regular `switch` 
statement  (was: Use `Switch Expressions` to replace the regular `switch` 
statement)

> Use enhanced `switch` expressions to replace the regular `switch` statement
> ---
>
> Key: SPARK-45515
> URL: https://issues.apache.org/jira/browse/SPARK-45515
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Major
>
> refer to [JEP 361|https://openjdk.org/jeps/361] 
>  
> Example:
> {code:java}
> double getPrice(String fruit) {
>   switch (fruit) {
>     case "Apple":
>       return 1.0;
>     case "Orange":
>       return 1.5;
>     case "Mango":
>       return 2.0;
>     default:
>       throw new IllegalArgumentException();
>    }
>  } {code}
> Can be changed to 
> {code:java}
> double getPrice(String fruit) {
>     return switch (fruit) {
>       case "Apple" -> 1.0;
>       case "Orange" -> 1.5;
>       case "Mango" -> 2.0;
>       default -> throw new IllegalArgumentException();
>     };
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-45515) Use `Switch Expressions` to replace the regular `switch` statement

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot reassigned SPARK-45515:
--

Assignee: (was: Apache Spark)

> Use `Switch Expressions` to replace the regular `switch` statement
> --
>
> Key: SPARK-45515
> URL: https://issues.apache.org/jira/browse/SPARK-45515
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Major
>
> refer to [JEP 361|https://openjdk.org/jeps/361] 
>  
> Example:
> {code:java}
> double getPrice(String fruit) {
>   switch (fruit) {
>     case "Apple":
>       return 1.0;
>     case "Orange":
>       return 1.5;
>     case "Mango":
>       return 2.0;
>     default:
>       throw new IllegalArgumentException();
>    }
>  } {code}
> Can be changed to 
> {code:java}
> double getPrice(String fruit) {
>     return switch (fruit) {
>       case "Apple" -> 1.0;
>       case "Orange" -> 1.5;
>       case "Mango" -> 2.0;
>       default -> throw new IllegalArgumentException();
>     };
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-45515) Use `Switch Expressions` to replace the regular `switch` statement

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot reassigned SPARK-45515:
--

Assignee: Apache Spark

> Use `Switch Expressions` to replace the regular `switch` statement
> --
>
> Key: SPARK-45515
> URL: https://issues.apache.org/jira/browse/SPARK-45515
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Major
>
> refer to [JEP 361|https://openjdk.org/jeps/361] 
>  
> Example:
> {code:java}
> double getPrice(String fruit) {
>   switch (fruit) {
>     case "Apple":
>       return 1.0;
>     case "Orange":
>       return 1.5;
>     case "Mango":
>       return 2.0;
>     default:
>       throw new IllegalArgumentException();
>    }
>  } {code}
> Can be changed to 
> {code:java}
> double getPrice(String fruit) {
>     return switch (fruit) {
>       case "Apple" -> 1.0;
>       case "Orange" -> 1.5;
>       case "Mango" -> 2.0;
>       default -> throw new IllegalArgumentException();
>     };
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45515) Use `Switch Expressions` to replace the regular `switch` statement

2023-10-12 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-45515:
-
Description: 
refer to [JEP 361|https://openjdk.org/jeps/361] 

 

Example:
{code:java}
double getPrice(String fruit) {
  switch (fruit) {
    case "Apple":
      return 1.0;
    case "Orange":
      return 1.5;
    case "Mango":
      return 2.0;
    default:
      throw new IllegalArgumentException();
   }
 } {code}

Can be changed to 
{code:java}
double getPrice(String fruit) {
    return switch (fruit) {
      case "Apple" -> 1.0;
      case "Orange" -> 1.5;
      case "Mango" -> 2.0;
      default -> throw new IllegalArgumentException();
    };
} {code}
 

  was:
refer to [JEP 361|https://openjdk.org/jeps/361] 

 

Example:
```java
double getPrice(String fruit) {
  // Switch statement can be replaced with enhanced 'switch'
  switch (fruit) {
    case "Apple":
      return 1.0;
    case "Orange":
      return 1.5;
    case "Mango":
      return 2.0;
    default:
      throw new IllegalArgumentException();
   }
 }
```  
Can be changed to 
```java 
  double getPrice(String fruit) {
    return switch (fruit) {
      case "Apple" -> 1.0;
      case "Orange" -> 1.5;
      case "Mango" -> 2.0;
      default -> throw new IllegalArgumentException();
    };
  }
```


> Use `Switch Expressions` to replace the regular `switch` statement
> --
>
> Key: SPARK-45515
> URL: https://issues.apache.org/jira/browse/SPARK-45515
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Major
>
> refer to [JEP 361|https://openjdk.org/jeps/361] 
>  
> Example:
> {code:java}
> double getPrice(String fruit) {
>   switch (fruit) {
>     case "Apple":
>       return 1.0;
>     case "Orange":
>       return 1.5;
>     case "Mango":
>       return 2.0;
>     default:
>       throw new IllegalArgumentException();
>    }
>  } {code}
> Can be changed to 
> {code:java}
> double getPrice(String fruit) {
>     return switch (fruit) {
>       case "Apple" -> 1.0;
>       case "Orange" -> 1.5;
>       case "Mango" -> 2.0;
>       default -> throw new IllegalArgumentException();
>     };
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-45515) Use `Switch Expressions` to replace the regular `switch` statement

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot reassigned SPARK-45515:
--

Assignee: (was: Apache Spark)

> Use `Switch Expressions` to replace the regular `switch` statement
> --
>
> Key: SPARK-45515
> URL: https://issues.apache.org/jira/browse/SPARK-45515
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Major
>
> refer to [JEP 361|https://openjdk.org/jeps/361] 
>  
> Example:
> ```java
> double getPrice(String fruit) {
>   // Switch statement can be replaced with enhanced 'switch'
>   switch (fruit) {
>     case "Apple":
>       return 1.0;
>     case "Orange":
>       return 1.5;
>     case "Mango":
>       return 2.0;
>     default:
>       throw new IllegalArgumentException();
>    }
>  }
> ```  
> Can be changed to 
> ```java 
>   double getPrice(String fruit) {
>     return switch (fruit) {
>       case "Apple" -> 1.0;
>       case "Orange" -> 1.5;
>       case "Mango" -> 2.0;
>       default -> throw new IllegalArgumentException();
>     };
>   }
> ```



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-45515) Use `Switch Expressions` to replace the regular `switch` statement

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot reassigned SPARK-45515:
--

Assignee: Apache Spark

> Use `Switch Expressions` to replace the regular `switch` statement
> --
>
> Key: SPARK-45515
> URL: https://issues.apache.org/jira/browse/SPARK-45515
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Major
>
> refer to [JEP 361|https://openjdk.org/jeps/361] 
>  
> Example:
> ```java
> double getPrice(String fruit) {
>   // Switch statement can be replaced with enhanced 'switch'
>   switch (fruit) {
>     case "Apple":
>       return 1.0;
>     case "Orange":
>       return 1.5;
>     case "Mango":
>       return 2.0;
>     default:
>       throw new IllegalArgumentException();
>    }
>  }
> ```  
> Can be changed to 
> ```java 
>   double getPrice(String fruit) {
>     return switch (fruit) {
>       case "Apple" -> 1.0;
>       case "Orange" -> 1.5;
>       case "Mango" -> 2.0;
>       default -> throw new IllegalArgumentException();
>     };
>   }
> ```



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-45515) Use `Switch Expressions` to replace the regular `switch` statement

2023-10-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-45515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17774382#comment-17774382
 ] 

ASF GitHub Bot commented on SPARK-45515:


User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/43349

> Use `Switch Expressions` to replace the regular `switch` statement
> --
>
> Key: SPARK-45515
> URL: https://issues.apache.org/jira/browse/SPARK-45515
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Major
>
> refer to [JEP 361|https://openjdk.org/jeps/361] 
>  
> Example:
> ```java
> double getPrice(String fruit) {
>   // Switch statement can be replaced with enhanced 'switch'
>   switch (fruit) {
>     case "Apple":
>       return 1.0;
>     case "Orange":
>       return 1.5;
>     case "Mango":
>       return 2.0;
>     default:
>       throw new IllegalArgumentException();
>    }
>  }
> ```  
> Can be changed to 
> ```java 
>   double getPrice(String fruit) {
>     return switch (fruit) {
>       case "Apple" -> 1.0;
>       case "Orange" -> 1.5;
>       case "Mango" -> 2.0;
>       default -> throw new IllegalArgumentException();
>     };
>   }
> ```



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45513) Replace `scala.runtime.Tuple2Zipped` to `scala.collection.LazyZip2`

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45513:
---
Labels: pull-request-available  (was: )

> Replace `scala.runtime.Tuple2Zipped` to `scala.collection.LazyZip2`
> ---
>
> Key: SPARK-45513
> URL: https://issues.apache.org/jira/browse/SPARK-45513
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, MLlib, Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Jiaan Geng
>Assignee: Jiaan Geng
>Priority: Major
>  Labels: pull-request-available
>
> {code:java}
> @scala.deprecated(message = "Use scala.collection.LazyZip2.", since = 
> "2.13.0")
> final class Tuple2Zipped[El1, It1 <: scala.Iterable[El1], El2, It2 <: 
> scala.Iterable[El2]](colls : scala.Tuple2[It1, It2]) extends scala.AnyVal 
> with scala.runtime.ZippedIterable2[El1, El2] {
> @scala.deprecated(message = "Use scala.collection.LazyZip2.", since = 
> "2.13.0")
> object Tuple2Zipped extends scala.AnyRef {
>   final class Ops[T1, T2](x : scala.Tuple2[T1, T2]) extends scala.AnyVal {
> @scala.deprecated(message = "Use xs.lazyZip(yz).map((_, _))", since = 
> "2.13.0")
> def invert[El1, It1[a] <: scala.Iterable[a], El2, It2[a] <: 
> scala.Iterable[a], That](implicit w1 : scala.<:<[T1, It1[El1]], w2 : 
> scala.<:<[T2, It2[El2]], bf : scala.collection.BuildFrom[T1, 
> scala.Tuple2[El1, El2], That]) : That = { /* compiled code */ }
> @scala.deprecated(message = "Use xs.lazyZip(ys)", since = "2.13.0")
> def zipped[El1, It1 <: scala.Iterable[El1], El2, It2 <: 
> scala.Iterable[El2]](implicit w1 : scala.Function1[T1, 
> scala.collection.IterableOps[El1, scala.Iterable, It1] with It1], w2 : 
> scala.Function1[T2, scala.collection.IterableOps[El2, scala.Iterable, It2] 
> with It2]) : scala.runtime.Tuple2Zipped[El1, It1, El2, It2] = { /* compiled 
> code */ }
>   }
> {code}
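> A minimal migration sketch (assuming Scala 2.13; the values are illustrative):
> {code:scala}
> val xs = Seq(1, 2, 3)
> val ys = Seq("a", "b", "c")
>
> // Deprecated: goes through scala.runtime.Tuple2Zipped
> // val out = (xs, ys).zipped.map((n, s) => s * n)
>
> // Replacement: scala.collection.LazyZip2
> val out = xs.lazyZip(ys).map((n, s) => s * n)  // Seq("a", "bb", "ccc")
> {code}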



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45515) Use `Switch Expressions` to replace the regular `switch` statement

2023-10-12 Thread Yang Jie (Jira)
Yang Jie created SPARK-45515:


 Summary: Use `Switch Expressions` to replace the regular `switch` 
statement
 Key: SPARK-45515
 URL: https://issues.apache.org/jira/browse/SPARK-45515
 Project: Spark
  Issue Type: Sub-task
  Components: Spark Core, SQL
Affects Versions: 4.0.0
Reporter: Yang Jie


refer to [JEP 361|https://openjdk.org/jeps/361] 

 

Example:
```java
double getPrice(String fruit) {
  // Switch statement can be replaced with enhanced 'switch'
  switch (fruit) {
    case "Apple":
      return 1.0;
    case "Orange":
      return 1.5;
    case "Mango":
      return 2.0;
    default:
      throw new IllegalArgumentException();
   }
 }
```  
Can be changed to 
```java 
  double getPrice(String fruit) {
    return switch (fruit) {
      case "Apple" -> 1.0;
      case "Orange" -> 1.5;
      case "Mango" -> 2.0;
      default -> throw new IllegalArgumentException();
    };
  }
```



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-44752) XML: Update Spark Docs

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-44752:
---
Labels: pull-request-available  (was: )

> XML: Update Spark Docs
> --
>
> Key: SPARK-44752
> URL: https://issues.apache.org/jira/browse/SPARK-44752
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 4.0.0
>Reporter: Sandip Agarwala
>Priority: Major
>  Labels: pull-request-available
>
>  [https://spark.apache.org/docs/latest/sql-data-sources.html]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-45510) Replace `scala.collection.generic.Growable` to `scala.collection.mutable.Growable`

2023-10-12 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie resolved SPARK-45510.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43347
[https://github.com/apache/spark/pull/43347]

> Replace `scala.collection.generic.Growable` to 
> `scala.collection.mutable.Growable`
> --
>
> Key: SPARK-45510
> URL: https://issues.apache.org/jira/browse/SPARK-45510
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Jia Fan
>Assignee: Jia Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Replace `scala.collection.generic.Growable` to 
> `scala.collection.mutable.Growable`
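> A minimal sketch of the 2.13 form (illustrative):
> {code:scala}
> import scala.collection.mutable.{ArrayBuffer, Growable}
>
> // Growable moved from scala.collection.generic to scala.collection.mutable
> // in Scala 2.13; the API itself is unchanged for this use.
> def fillAll[A](g: Growable[A], items: Seq[A]): g.type = g.addAll(items)
>
> fillAll(ArrayBuffer.empty[Int], Seq(1, 2, 3))  // ArrayBuffer(1, 2, 3)
> {code}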



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-45510) Replace `scala.collection.generic.Growable` to `scala.collection.mutable.Growable`

2023-10-12 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie reassigned SPARK-45510:


Assignee: Jia Fan

> Replace `scala.collection.generic.Growable` to 
> `scala.collection.mutable.Growable`
> --
>
> Key: SPARK-45510
> URL: https://issues.apache.org/jira/browse/SPARK-45510
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Jia Fan
>Assignee: Jia Fan
>Priority: Major
>  Labels: pull-request-available
>
> Replace `scala.collection.generic.Growable` to 
> `scala.collection.mutable.Growable`



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45513) Replace `scala.runtime.Tuple2Zipped` to `scala.collection.LazyZip2`

2023-10-12 Thread Jiaan Geng (Jira)
Jiaan Geng created SPARK-45513:
--

 Summary: Replace `scala.runtime.Tuple2Zipped` to 
`scala.collection.LazyZip2`
 Key: SPARK-45513
 URL: https://issues.apache.org/jira/browse/SPARK-45513
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Jiaan Geng
Assignee: Jiaan Geng






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45512) Fix compilation warnings related to other-nullary-override #43332

2023-10-12 Thread Yang Jie (Jira)
Yang Jie created SPARK-45512:


 Summary: Fix compilation warnings related to 
other-nullary-override #43332
 Key: SPARK-45512
 URL: https://issues.apache.org/jira/browse/SPARK-45512
 Project: Spark
  Issue Type: Sub-task
  Components: DStreams, Spark Core, SQL
Affects Versions: 4.0.0
Reporter: Yang Jie


{code:java}
[error] 
/Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/CloseableIterator.scala:36:16:
 method with a single empty parameter list overrides method hasNext in trait 
Iterator defined without a parameter list [quickfixable]
[error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=other-nullary-override, 
site=org.apache.spark.sql.connect.client.WrappedCloseableIterator
[error]   override def hasNext(): Boolean = innerIterator.hasNext
[error]                ^
[error] 
/Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/ExecutePlanResponseReattachableIterator.scala:136:16:
 method without a parameter list overrides method hasNext in class 
WrappedCloseableIterator defined with a single empty parameter list 
[quickfixable]
[error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=other-nullary-override, 
site=org.apache.spark.sql.connect.client.ExecutePlanResponseReattachableIterator
[error]   override def hasNext: Boolean = synchronized {
[error]                ^
[error] 
/Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala:73:20:
 method without a parameter list overrides method hasNext in class 
WrappedCloseableIterator defined with a single empty parameter list 
[quickfixable]
[error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=other-nullary-override, 
site=org.apache.spark.sql.connect.client.GrpcExceptionConverter.convertIterator
[error]       override def hasNext: Boolean = {
[error]                    ^
[error] 
/Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcRetryHandler.scala:77:18:
 method without a parameter list overrides method next in class 
WrappedCloseableIterator defined with a single empty parameter list 
[quickfixable]
[error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=other-nullary-override, 
site=org.apache.spark.sql.connect.client.GrpcRetryHandler.RetryIterator
[error]     override def next: U = {
[error]                  ^
[error] 
/Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcRetryHandler.scala:81:18:
 method without a parameter list overrides method hasNext in class 
WrappedCloseableIterator defined with a single empty parameter list 
[quickfixable]
[error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=other-nullary-override, 
site=org.apache.spark.sql.connect.client.GrpcRetryHandler.RetryIterator
[error]     override def hasNext: Boolean = {
[error]                  ^
 {code}
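
A minimal sketch of the fix these warnings ask for (illustrative class, not 
the actual Spark code): the override must match the parameter-list shape of 
the member it overrides.
{code:scala}
// scala.collection.Iterator declares `def hasNext: Boolean` without a
// parameter list and `def next(): A` with an empty one; overrides must match.
class WrappedIterator[T](inner: Iterator[T]) extends Iterator[T] {
  override def hasNext: Boolean = inner.hasNext // `def hasNext()` would warn
  override def next(): T = inner.next()         // `def next` would warn
}
{code}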



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45512) Fix compilation warnings related to other-nullary-override

2023-10-12 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-45512:
-
Summary: Fix compilation warnings related to other-nullary-override  (was: 
Fix compilation warnings related to other-nullary-override #43332)

> Fix compilation warnings related to other-nullary-override
> --
>
> Key: SPARK-45512
> URL: https://issues.apache.org/jira/browse/SPARK-45512
> Project: Spark
>  Issue Type: Sub-task
>  Components: DStreams, Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Major
>
> {code:java}
> [error] 
> /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/CloseableIterator.scala:36:16:
>  method with a single empty parameter list overrides method hasNext in trait 
> Iterator defined without a parameter list [quickfixable]
> [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=other-nullary-override, 
> site=org.apache.spark.sql.connect.client.WrappedCloseableIterator
> [error]   override def hasNext(): Boolean = innerIterator.hasNext
> [error]                ^
> [error] 
> /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/ExecutePlanResponseReattachableIterator.scala:136:16:
>  method without a parameter list overrides method hasNext in class 
> WrappedCloseableIterator defined with a single empty parameter list 
> [quickfixable]
> [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=other-nullary-override, 
> site=org.apache.spark.sql.connect.client.ExecutePlanResponseReattachableIterator
> [error]   override def hasNext: Boolean = synchronized {
> [error]                ^
> [error] 
> /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala:73:20:
>  method without a parameter list overrides method hasNext in class 
> WrappedCloseableIterator defined with a single empty parameter list 
> [quickfixable]
> [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=other-nullary-override, 
> site=org.apache.spark.sql.connect.client.GrpcExceptionConverter.convertIterator
> [error]       override def hasNext: Boolean = {
> [error]                    ^
> [error] 
> /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcRetryHandler.scala:77:18:
>  method without a parameter list overrides method next in class 
> WrappedCloseableIterator defined with a single empty parameter list 
> [quickfixable]
> [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=other-nullary-override, 
> site=org.apache.spark.sql.connect.client.GrpcRetryHandler.RetryIterator
> [error]     override def next: U = {
> [error]                  ^
> [error] 
> /Users/yangjie01/SourceCode/git/spark-mine-sbt/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcRetryHandler.scala:81:18:
>  method without a parameter list overrides method hasNext in class 
> WrappedCloseableIterator defined with a single empty parameter list 
> [quickfixable]
> [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=other-nullary-override, 
> site=org.apache.spark.sql.connect.client.GrpcRetryHandler.RetryIterator
> [error]     override def hasNext: Boolean = {
> [error]                  ^
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45511) SPIP: State Data Source - Reader

2023-10-12 Thread Jungtaek Lim (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim updated SPARK-45511:
-
Description: 
The State Store has been a black box since the introduction of stateful 
operators. It has been the “internal” data of the streaming query, and Spark 
does not expose that data outside of the streaming query. There is no 
feature or tool for users to read or modify the content of state stores.

Specific to the ability to read state, the lack of such a feature brings 
various limitations, such as the following:
 * Users are unable to see the content of the state store, which makes 
debugging very difficult.
 * Users have to take an indirect approach to verifying the content of the 
state store in unit tests; the only option available is to rely on the 
output of the query.

Given that, we propose to introduce a feature that enables users to read 
state from outside the streaming query.

SPIP: 
[https://docs.google.com/document/d/1_iVf_CIu2RZd3yWWF6KoRNlBiz5NbSIK0yThqG0EvPY/edit?usp=sharing]

 

 

  was:
The State Store has been a black box since the introduction of stateful 
operators. It has been the “internal” data of the streaming query, and Spark 
does not expose that data outside of the streaming query. There is no 
feature or tool for users to read or modify the content of state stores.

Specific to the ability to read state, the lack of such a feature brings 
various limitations, such as the following:
 * Users are unable to see the content of the state store, which makes 
debugging very difficult.
 * Users have to take an indirect approach to verifying the content of the 
state store in unit tests; the only option available is to rely on the 
output of the query.

Given that, we propose to introduce a feature that enables users to read 
state from outside the streaming query.

SPIP: 
[https://docs.google.com/document/d/1HjEupRv8TRFeULtJuxRq_tEG1Wq-9UNu-ctGgCYRke0/edit?usp=sharing]

 

 


> SPIP: State Data Source - Reader
> 
>
> Key: SPARK-45511
> URL: https://issues.apache.org/jira/browse/SPARK-45511
> Project: Spark
>  Issue Type: New Feature
>  Components: Structured Streaming
>Affects Versions: 4.0.0
>Reporter: Jungtaek Lim
>Priority: Major
>  Labels: SPIP
>
> The State Store has been a black box since the introduction of stateful 
> operators. It has been the “internal” data of the streaming query, and Spark 
> does not expose that data outside of the streaming query. There is no 
> feature or tool for users to read or modify the content of state stores.
> Specific to the ability to read state, the lack of such a feature brings 
> various limitations, such as the following:
>  * Users are unable to see the content of the state store, which makes 
> debugging very difficult.
>  * Users have to take an indirect approach to verifying the content of the 
> state store in unit tests; the only option available is to rely on the 
> output of the query.
> Given that, we propose to introduce a feature that enables users to read 
> state from outside the streaming query.
> SPIP: 
> [https://docs.google.com/document/d/1_iVf_CIu2RZd3yWWF6KoRNlBiz5NbSIK0yThqG0EvPY/edit?usp=sharing]
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-45488) XML: Add support for value in 'rowTag' element

2023-10-12 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-45488.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43319
[https://github.com/apache/spark/pull/43319]

> XML: Add support for value in 'rowTag' element
> --
>
> Key: SPARK-45488
> URL: https://issues.apache.org/jira/browse/SPARK-45488
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Sandip Agarwala
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> The following XML with rowTag 'book' will yield a schema with just an "_id" 
> column and not the value:
>  
> {code:java}
> <book id="1">Great Book</book>{code}
> Let's parse the value as well.
>  
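> For context, a minimal sketch of how this surfaces through the built-in XML 
> data source (the {{_VALUE}} column named in the comments is an assumption 
> based on the source's default value tag):
> {code:scala}
> import spark.implicits._
> 
> // Write a one-row XML document to disk so the XML reader has a file to scan.
> Seq("""<book id="1">Great Book</book>""").toDF("value").write.text("/tmp/books")
> 
> val df = spark.read
>   .format("xml")
>   .option("rowTag", "book")
>   .load("/tmp/books")
> 
> // Before the fix: the schema contains only the attribute column `_id`, and
> // the character data "Great Book" is dropped.
> // After the fix: the element's text should also be parsed (e.g. into the
> // default `_VALUE` column), so the value is preserved.
> df.printSchema()
> {code}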



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45511) SPIP: State Data Source - Reader

2023-10-12 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-45511:


 Summary: SPIP: State Data Source - Reader
 Key: SPARK-45511
 URL: https://issues.apache.org/jira/browse/SPARK-45511
 Project: Spark
  Issue Type: New Feature
  Components: Structured Streaming
Affects Versions: 4.0.0
Reporter: Jungtaek Lim


The State Store has been a black box since the introduction of stateful 
operators. It has been the “internal” data of the streaming query, and Spark 
does not expose that data outside of the streaming query. There is no 
feature or tool for users to read or modify the content of state stores.

Specific to the ability to read state, the lack of such a feature brings 
various limitations, such as the following:
 * Users are unable to see the content of the state store, which makes 
debugging very difficult.
 * Users have to take an indirect approach to verifying the content of the 
state store in unit tests; the only option available is to rely on the 
output of the query.

Given that, we propose to introduce a feature that enables users to read 
state from outside the streaming query.

SPIP: 
[https://docs.google.com/document/d/1HjEupRv8TRFeULtJuxRq_tEG1Wq-9UNu-ctGgCYRke0/edit?usp=sharing]
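
As a rough illustration of the user experience this SPIP aims for (the format 
name and option names below are assumptions for illustration, not a finalized 
API):

{code:scala}
// Hypothetical sketch: read the state of a stateful operator from a
// checkpoint location as a batch DataFrame, outside the streaming query.
val stateDf = spark.read
  .format("statestore")                    // assumed data source name
  .option("batchId", 42)                   // assumed option: which micro-batch
  .option("operatorId", 0)                 // assumed option: which operator
  .load("/checkpoints/my-streaming-query")

stateDf.printSchema()
stateDf.show()
{code}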

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-44262) JdbcUtils hardcodes some SQL statements

2023-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-44262:
---
Labels: pull-request-available  (was: )

> JdbcUtils hardcodes some SQL statements
> ---
>
> Key: SPARK-44262
> URL: https://issues.apache.org/jira/browse/SPARK-44262
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Florent BIVILLE
>Priority: Minor
>  Labels: pull-request-available
>
> I am currently investigating an integration with the [Neo4j JDBC 
> driver|https://github.com/neo4j-contrib/neo4j-jdbc] and a Spark-based cloud 
> vendor SDK.
>  
> This SDK relies on Spark's {{JdbcUtils}} to run queries and insert data.
> While {{JdbcUtils}} partly delegates to 
> {{org.apache.spark.sql.jdbc.JdbcDialect}} for some queries, others are 
> hardcoded SQL statements; see:
>  * {{org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils#dropTable}}
>  * 
> {{org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils#getInsertStatement}}
>  
> This works fine for relational databases but breaks for NoSQL stores that do 
> not support SQL translation (like Neo4j).
> Is there a plan to augment the {{JdbcDialect}} surface so that it is also 
> responsible for these currently-hardcoded queries?
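> A hypothetical sketch of what such an augmented dialect surface could look 
> like (the {{dropTableQuery}} and {{insertStatement}} hooks below do not 
> exist in {{JdbcDialect}} today; they only illustrate the requested 
> extension point):
> {code:scala}
> import org.apache.spark.sql.jdbc.JdbcDialect
> 
> class Neo4jDialect extends JdbcDialect {
>   override def canHandle(url: String): Boolean = url.startsWith("jdbc:neo4j:")
> 
>   // Hypothetical hook that would replace the SQL hardcoded in
>   // JdbcUtils#dropTable ("DROP TABLE ...").
>   def dropTableQuery(table: String): String =
>     s"MATCH (n:`$table`) DETACH DELETE n"
> 
>   // Hypothetical hook that would replace JdbcUtils#getInsertStatement
>   // ("INSERT INTO ... VALUES (?, ...)").
>   def insertStatement(table: String, columns: Seq[String]): String = {
>     val props = columns.map(c => s"`$c`: ?").mkString(", ")
>     s"CREATE (:`$table` { $props })"
>   }
> }
> {code}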



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org