[jira] [Created] (SPARK-46666) Make lxml an optional testing dependency in test_session
Hyukjin Kwon created SPARK-46666:
---------------------------------

Summary: Make lxml an optional testing dependency in test_session
Key: SPARK-46666
URL: https://issues.apache.org/jira/browse/SPARK-46666
Project: Spark
Issue Type: Test
Components: PySpark, Tests
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon

{code}
Traceback (most recent call last):
  File "", line 198, in _run_module_as_main
  File "", line 88, in _run_code
  File "/__w/spark/spark/python/pyspark/sql/tests/test_session.py", line 22, in
    from lxml import etree
ModuleNotFoundError: No module named 'lxml'
{code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
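The fix the ticket asks for is a standard pattern: probe for lxml once at import time and skip the dependent tests when it is missing, instead of letting the whole module fail with `ModuleNotFoundError`. A minimal sketch; the test class and its contents are hypothetical illustrations, not Spark's actual test_session code:

```python
import unittest

# Probe for lxml once; record the result instead of failing the module import.
try:
    from lxml import etree

    have_lxml = True
except ImportError:
    have_lxml = False


class XmlReportTest(unittest.TestCase):
    # This test only runs when lxml is installed; otherwise it is skipped.
    @unittest.skipIf(not have_lxml, "lxml is not installed")
    def test_report_is_well_formed(self):
        root = etree.fromstring(b"<suite><case name='t1'/></suite>")
        self.assertEqual(root.tag, "suite")
```

With this layout, environments without lxml report the test as skipped rather than erroring out during collection.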
[jira] [Assigned] (SPARK-46651) Split `FrameTakeTests`
[ https://issues.apache.org/jira/browse/SPARK-46651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reassigned SPARK-46651:
------------------------------------

Assignee: Ruifeng Zheng

> Split `FrameTakeTests`
> ----------------------
>
> Key: SPARK-46651
> URL: https://issues.apache.org/jira/browse/SPARK-46651
> Project: Spark
> Issue Type: Sub-task
> Components: PS, Tests
> Affects Versions: 4.0.0
> Reporter: Ruifeng Zheng
> Assignee: Ruifeng Zheng
> Priority: Major
> Labels: pull-request-available
[jira] [Resolved] (SPARK-46651) Split `FrameTakeTests`
[ https://issues.apache.org/jira/browse/SPARK-46651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-46651.
----------------------------------
Fix Version/s: 4.0.0
Resolution: Fixed

Issue resolved by pull request 44656
[https://github.com/apache/spark/pull/44656]

> Split `FrameTakeTests`
> ----------------------
>
> Key: SPARK-46651
> URL: https://issues.apache.org/jira/browse/SPARK-46651
> Project: Spark
> Issue Type: Sub-task
> Components: PS, Tests
> Affects Versions: 4.0.0
> Reporter: Ruifeng Zheng
> Assignee: Ruifeng Zheng
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
[jira] [Assigned] (SPARK-46649) Run PyPy 3 and Python 3.10 tests independently
[ https://issues.apache.org/jira/browse/SPARK-46649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reassigned SPARK-46649:
------------------------------------

Assignee: Hyukjin Kwon

> Run PyPy 3 and Python 3.10 tests independently
> ----------------------------------------------
>
> Key: SPARK-46649
> URL: https://issues.apache.org/jira/browse/SPARK-46649
> Project: Spark
> Issue Type: Test
> Components: Project Infra
> Affects Versions: 4.0.0
> Reporter: Hyukjin Kwon
> Assignee: Hyukjin Kwon
> Priority: Minor
> Labels: pull-request-available
>
> https://github.com/apache/spark/actions/runs/7462843546/job/20306241275
> The job seems to terminate midway because of OOM; we should split the runs.
[jira] [Assigned] (SPARK-46645) Exclude unittest-xml-reporting in Python 3.12 image
[ https://issues.apache.org/jira/browse/SPARK-46645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reassigned SPARK-46645:
------------------------------------

Assignee: Hyukjin Kwon

> Exclude unittest-xml-reporting in Python 3.12 image
> ---------------------------------------------------
>
> Key: SPARK-46645
> URL: https://issues.apache.org/jira/browse/SPARK-46645
> Project: Spark
> Issue Type: Test
> Components: Project Infra
> Affects Versions: 4.0.0
> Reporter: Hyukjin Kwon
> Assignee: Hyukjin Kwon
> Priority: Major
> Labels: pull-request-available
>
> unittest-xml-reporting does not seem to support Python 3.12, and it hides the real error:
> {code}
> File "/__w/spark/spark/python/pyspark/streaming/tests/test_kinesis.py", line 118, in
>   unittest.main(testRunner=testRunner, verbosity=2)
> File "/usr/lib/python3.12/unittest/main.py", line 105, in __init__
>   self.runTests()
> File "/usr/lib/python3.12/unittest/main.py", line 281, in runTests
>   self.result = testRunner.run(self.test)
> File "/usr/local/lib/python3.12/dist-packages/xmlrunner/runner.py", line 67, in run
>   test(result)
> File "/usr/lib/python3.12/unittest/suite.py", line 84, in __call__
>   return self.run(*args, **kwds)
> File "/usr/lib/python3.12/unittest/suite.py", line 122, in run
>   test(result)
> File "/usr/lib/python3.12/unittest/suite.py", line 84, in __call__
>   return self.run(*args, **kwds)
> File "/usr/lib/python3.12/unittest/suite.py", line 122, in run
>   test(result)
> File "/usr/lib/python3.12/unittest/case.py", line 692, in __call__
>   return self.run(*args, **kwds)
> File "/usr/lib/python3.12/unittest/case.py", line 662, in run
>   result.stopTest(self)
> File "/usr/local/lib/python3.12/dist-packages/xmlrunner/result.py", line 327, in stopTest
>   self.callback()
> File "/usr/local/lib/python3.12/dist-packages/xmlrunner/result.py", line 235, in callback
>   test_info.test_finished()
> File "/usr/local/lib/python3.12/dist-packages/xmlrunner/result.py", line 180, in test_finished
>   self.test_result.stop_time - self.test_result.start_time
> AttributeError: '_XMLTestResult' object has no attribute 'start_time'. Did you mean: 'stop_time'?
> {code}
> This is an optional testing dependency, so we can exclude it.
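A common way to make such a reporter optional is to fall back to unittest's default text runner when the `xmlrunner` package (unittest-xml-reporting) is missing or broken on the current Python version. A sketch under that assumption; the output directory is illustrative, not Spark's actual configuration:

```python
# Prefer XML test reports when unittest-xml-reporting is importable,
# otherwise leave the runner unset so unittest falls back to TextTestRunner.
try:
    import xmlrunner

    test_runner = xmlrunner.XMLTestRunner(output="target/test-reports", verbosity=2)
except ImportError:
    test_runner = None  # unittest.main(testRunner=None) uses its default runner

# A test module's entry point would then pass the selected runner:
# unittest.main(testRunner=test_runner, verbosity=2)
```

Because `unittest.main` treats `testRunner=None` as "use the default", the same entry point works with or without the XML reporter installed.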
[jira] [Resolved] (SPARK-46649) Run PyPy 3 and Python 3.10 tests independently
[ https://issues.apache.org/jira/browse/SPARK-46649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-46649.
----------------------------------
Fix Version/s: 4.0.0
Resolution: Fixed

Issue resolved by pull request 44655
[https://github.com/apache/spark/pull/44655]

> Run PyPy 3 and Python 3.10 tests independently
> ----------------------------------------------
>
> Key: SPARK-46649
> URL: https://issues.apache.org/jira/browse/SPARK-46649
> Project: Spark
> Issue Type: Test
> Components: Project Infra
> Affects Versions: 4.0.0
> Reporter: Hyukjin Kwon
> Assignee: Hyukjin Kwon
> Priority: Minor
> Labels: pull-request-available
> Fix For: 4.0.0
>
> https://github.com/apache/spark/actions/runs/7462843546/job/20306241275
> The job seems to terminate midway because of OOM; we should split the runs.
[jira] [Resolved] (SPARK-46645) Exclude unittest-xml-reporting in Python 3.12 image
[ https://issues.apache.org/jira/browse/SPARK-46645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-46645.
----------------------------------
Fix Version/s: 4.0.0
Resolution: Fixed

Issue resolved by pull request 44652
[https://github.com/apache/spark/pull/44652]

> Exclude unittest-xml-reporting in Python 3.12 image
> ---------------------------------------------------
>
> Key: SPARK-46645
> URL: https://issues.apache.org/jira/browse/SPARK-46645
> Project: Spark
> Issue Type: Test
> Components: Project Infra
> Affects Versions: 4.0.0
> Reporter: Hyukjin Kwon
> Assignee: Hyukjin Kwon
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> unittest-xml-reporting does not seem to support Python 3.12, and it hides the real error (the full `AttributeError: '_XMLTestResult' object has no attribute 'start_time'` stack trace is quoted in the earlier SPARK-46645 message above). This is an optional testing dependency, so we can exclude it.
[jira] [Updated] (SPARK-46649) Run PyPy 3 and Python 3.10 tests independently
[ https://issues.apache.org/jira/browse/SPARK-46649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-46649:
---------------------------------
Description:
https://github.com/apache/spark/actions/runs/7462843546/job/20306241275
The job seems to terminate midway because of OOM; we should split the runs.

> Run PyPy 3 and Python 3.10 tests independently
> ----------------------------------------------
>
> Key: SPARK-46649
> URL: https://issues.apache.org/jira/browse/SPARK-46649
> Project: Spark
> Issue Type: Test
> Components: Project Infra
> Affects Versions: 4.0.0
> Reporter: Hyukjin Kwon
> Priority: Minor
>
> https://github.com/apache/spark/actions/runs/7462843546/job/20306241275
> The job seems to terminate midway because of OOM; we should split the runs.
[jira] [Created] (SPARK-46649) Run PyPy 3 and Python 3.10 tests independently
Hyukjin Kwon created SPARK-46649:
---------------------------------

Summary: Run PyPy 3 and Python 3.10 tests independently
Key: SPARK-46649
URL: https://issues.apache.org/jira/browse/SPARK-46649
Project: Spark
Issue Type: Test
Components: Project Infra
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon
[jira] [Resolved] (SPARK-46536) Support GROUP BY calendar_interval_type
[ https://issues.apache.org/jira/browse/SPARK-46536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-46536.
----------------------------------
Fix Version/s: 4.0.0
Resolution: Fixed

Issue resolved by pull request 44538
[https://github.com/apache/spark/pull/44538]

> Support GROUP BY calendar_interval_type
> ---------------------------------------
>
> Key: SPARK-46536
> URL: https://issues.apache.org/jira/browse/SPARK-46536
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Wenchen Fan
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Currently, Spark GROUP BY only allows orderable data types; otherwise plan analysis fails:
> https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExprUtils.scala#L197-L203
> However, this is too strict, as GROUP BY only cares about equality, not ordering. The CalendarInterval type is not orderable (given 1 month and 30 days, we don't know which is larger), but it has well-defined equality. In fact, we already support `SELECT DISTINCT calendar_interval_type` in some cases (when hash aggregate is picked by the planner).
> The proposal here is to officially support the calendar interval type in GROUP BY. We should relax the check inside `CheckAnalysis`, make `CalendarInterval` implement `Comparable` using natural ordering (compare months first, then days, then seconds), and test with both hash aggregate and sort aggregate.
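The natural ordering proposed in the ticket (months first, then days, then the sub-day part) can be modeled in a few lines via lexicographic tuple comparison. This is a hypothetical Python model of the JVM-side `CalendarInterval` (months, days, microseconds), not Spark's actual implementation:

```python
from functools import total_ordering


@total_ordering
class CalendarInterval:
    """Toy model of an interval: compare months, then days, then microseconds."""

    def __init__(self, months, days, microseconds):
        self.months = months
        self.days = days
        self.microseconds = microseconds

    def _key(self):
        # Tuple comparison gives exactly the proposed lexicographic ordering.
        return (self.months, self.days, self.microseconds)

    def __eq__(self, other):
        return self._key() == other._key()

    def __lt__(self, other):
        return self._key() < other._key()
```

Under this ordering, 1 month sorts after 30 days (months compare first), which is an arbitrary but total order — enough for sort aggregate, while equality alone suffices for hash aggregate.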
[jira] [Created] (SPARK-46647) Add unittest-xml-reporting into Python 3.12 image
Hyukjin Kwon created SPARK-46647:
---------------------------------

Summary: Add unittest-xml-reporting into Python 3.12 image
Key: SPARK-46647
URL: https://issues.apache.org/jira/browse/SPARK-46647
Project: Spark
Issue Type: Test
Components: Project Infra
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon

unittest-xml-reporting does not seem to support Python 3.12. We should add it back once it does; see also SPARK-46645.
[jira] [Created] (SPARK-46645) Exclude unittest-xml-reporting in Python 3.12 image
Hyukjin Kwon created SPARK-46645:
---------------------------------

Summary: Exclude unittest-xml-reporting in Python 3.12 image
Key: SPARK-46645
URL: https://issues.apache.org/jira/browse/SPARK-46645
Project: Spark
Issue Type: Bug
Components: Project Infra
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon

unittest-xml-reporting does not seem to support Python 3.12, and it hides the real error (the full `AttributeError: '_XMLTestResult' object has no attribute 'start_time'` stack trace is quoted in the reassignment notice for SPARK-46645 above). This is an optional testing dependency, so we can exclude it.
[jira] [Updated] (SPARK-46645) Exclude unittest-xml-reporting in Python 3.12 image
[ https://issues.apache.org/jira/browse/SPARK-46645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-46645:
---------------------------------
Issue Type: Test (was: Improvement)

> Exclude unittest-xml-reporting in Python 3.12 image
> ---------------------------------------------------
>
> Key: SPARK-46645
> URL: https://issues.apache.org/jira/browse/SPARK-46645
> Project: Spark
> Issue Type: Test
> Components: Project Infra
> Affects Versions: 4.0.0
> Reporter: Hyukjin Kwon
> Priority: Major
>
> unittest-xml-reporting does not seem to support Python 3.12, and it hides the real error (the full `AttributeError: '_XMLTestResult' object has no attribute 'start_time'` stack trace is quoted in the reassignment notice for SPARK-46645 above). This is an optional testing dependency, so we can exclude it.
[jira] [Updated] (SPARK-46645) Exclude unittest-xml-reporting in Python 3.12 image
[ https://issues.apache.org/jira/browse/SPARK-46645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-46645:
---------------------------------
Issue Type: Improvement (was: Bug)

> Exclude unittest-xml-reporting in Python 3.12 image
> ---------------------------------------------------
>
> Key: SPARK-46645
> URL: https://issues.apache.org/jira/browse/SPARK-46645
> Project: Spark
> Issue Type: Improvement
> Components: Project Infra
> Affects Versions: 4.0.0
> Reporter: Hyukjin Kwon
> Priority: Major
>
> unittest-xml-reporting does not seem to support Python 3.12, and it hides the real error (the full `AttributeError: '_XMLTestResult' object has no attribute 'start_time'` stack trace is quoted in the reassignment notice for SPARK-46645 above). This is an optional testing dependency, so we can exclude it.
[jira] [Assigned] (SPARK-37039) np.nan series.astype(bool) should be True
[ https://issues.apache.org/jira/browse/SPARK-37039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reassigned SPARK-37039:
------------------------------------

Assignee: Haejoon Lee

> np.nan series.astype(bool) should be True
> -----------------------------------------
>
> Key: SPARK-37039
> URL: https://issues.apache.org/jira/browse/SPARK-37039
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 3.3.0
> Reporter: Yikun Jiang
> Assignee: Haejoon Lee
> Priority: Major
> Labels: pull-request-available
>
> np.nan series.astype(bool) should be True, rather than False:
> https://github.com/apache/spark/blob/46bcef7472edd40c23afd9ac74cffe13c6a608ad/python/pyspark/pandas/data_type_ops/base.py#L147
> >>> pd.Series([1, 2, np.nan], dtype=float).astype(bool)
> >>> pd.Series([1, 2, np.nan], dtype=str).astype(bool)
> >>> pd.Series([datetime.date(1994, 1, 31), datetime.date(1994, 2, 1), np.nan])
> 0 True
> 1 True
> 2 True
> dtype: bool
> But in pyspark, it is:
> 0 True
> 1 True
> 2 False
> dtype: bool
[jira] [Resolved] (SPARK-37039) np.nan series.astype(bool) should be True
[ https://issues.apache.org/jira/browse/SPARK-37039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-37039.
----------------------------------
Fix Version/s: 4.0.0
Resolution: Fixed

Issue resolved by pull request 44570
[https://github.com/apache/spark/pull/44570]

> np.nan series.astype(bool) should be True
> -----------------------------------------
>
> Key: SPARK-37039
> URL: https://issues.apache.org/jira/browse/SPARK-37039
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 3.3.0
> Reporter: Yikun Jiang
> Assignee: Haejoon Lee
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> np.nan series.astype(bool) should be True, rather than False (see the pandas examples quoted in the reassignment notice above): pandas returns True for the NaN element, but pandas-on-Spark returns False.
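The pandas side of the behavior the ticket describes is easy to reproduce: `float('nan')` is truthy in Python, so pandas maps NaN to True under `astype(bool)`:

```python
import numpy as np
import pandas as pd

# NaN is a non-zero float, so bool(NaN) is True; pandas follows suit.
s = pd.Series([1.0, 2.0, np.nan]).astype(bool)
print(s.tolist())  # → [True, True, True]
```

The ticket's point is that pandas-on-Spark returned `False` for the NaN element here, diverging from pandas.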
[jira] [Resolved] (SPARK-46633) Reading a non-empty Avro file with empty blocks returns 0 records
[ https://issues.apache.org/jira/browse/SPARK-46633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-46633.
----------------------------------
Fix Version/s: 4.0.0
Resolution: Fixed

Issue resolved by pull request 44635
[https://github.com/apache/spark/pull/44635]

> Reading a non-empty Avro file with empty blocks returns 0 records
> -----------------------------------------------------------------
>
> Key: SPARK-46633
> URL: https://issues.apache.org/jira/browse/SPARK-46633
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.5.0, 4.0.0
> Reporter: Ivan Sadikov
> Assignee: Ivan Sadikov
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> When an Avro file contains empty blocks, Spark returns 0 records, while "fastavro" and "avro-python-3" both read the file correctly and return records.
>
> This is due to the way Spark handles empty blocks (or fails to handle them). A call to `hasNext` loads the next block and, if that block is empty, returns false. Instead of exiting the loop there, we need to probe subsequent blocks until the sync point is reached.
[jira] [Assigned] (SPARK-46633) Reading a non-empty Avro file with empty blocks returns 0 records
[ https://issues.apache.org/jira/browse/SPARK-46633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reassigned SPARK-46633:
------------------------------------

Assignee: Ivan Sadikov

> Reading a non-empty Avro file with empty blocks returns 0 records
> -----------------------------------------------------------------
>
> Key: SPARK-46633
> URL: https://issues.apache.org/jira/browse/SPARK-46633
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.5.0, 4.0.0
> Reporter: Ivan Sadikov
> Assignee: Ivan Sadikov
> Priority: Major
> Labels: pull-request-available
>
> When an Avro file contains empty blocks, Spark returns 0 records, while "fastavro" and "avro-python-3" both read the file correctly and return records. The fix is to keep probing subsequent blocks until the sync point is reached instead of stopping at the first empty block (full description in the resolution notice above).
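The described fix — keep probing past empty blocks instead of stopping at the first one — can be sketched generically. This toy reader models Avro blocks as lists of records; it is an illustration of the control flow, not Spark's actual Scala reader:

```python
def records(blocks):
    """Yield records from an iterator of blocks, where each block is a
    list of records; empty lists model Avro blocks with zero records."""
    for block in blocks:
        if not block:
            # Empty block: don't report end-of-data; probe the next block.
            continue
        yield from block


# A file whose first and third blocks are empty still yields all records:
data = [[], ["r1", "r2"], [], ["r3"]]
```

The buggy behavior corresponds to returning as soon as one block turns out to be empty, which would drop `r1`..`r3` here whenever the first block is empty.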
[jira] [Assigned] (SPARK-46437) Enable conditional includes in Jekyll documentation
[ https://issues.apache.org/jira/browse/SPARK-46437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reassigned SPARK-46437:
------------------------------------

Assignee: Nicholas Chammas

> Enable conditional includes in Jekyll documentation
> ---------------------------------------------------
>
> Key: SPARK-46437
> URL: https://issues.apache.org/jira/browse/SPARK-46437
> Project: Spark
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 3.5.0
> Reporter: Nicholas Chammas
> Assignee: Nicholas Chammas
> Priority: Minor
> Labels: pull-request-available
[jira] [Resolved] (SPARK-46437) Enable conditional includes in Jekyll documentation
[ https://issues.apache.org/jira/browse/SPARK-46437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-46437.
----------------------------------
Fix Version/s: 4.0.0
Resolution: Fixed

Issue resolved by pull request 44630
[https://github.com/apache/spark/pull/44630]

> Enable conditional includes in Jekyll documentation
> ---------------------------------------------------
>
> Key: SPARK-46437
> URL: https://issues.apache.org/jira/browse/SPARK-46437
> Project: Spark
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 3.5.0
> Reporter: Nicholas Chammas
> Assignee: Nicholas Chammas
> Priority: Minor
> Labels: pull-request-available
> Fix For: 4.0.0
[jira] [Resolved] (SPARK-46593) Refactor `data_type_ops` tests
[ https://issues.apache.org/jira/browse/SPARK-46593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-46593.
----------------------------------
Resolution: Fixed

Issue resolved by pull request 44637
[https://github.com/apache/spark/pull/44637]

> Refactor `data_type_ops` tests
> ------------------------------
>
> Key: SPARK-46593
> URL: https://issues.apache.org/jira/browse/SPARK-46593
> Project: Spark
> Issue Type: Sub-task
> Components: PS, Tests
> Affects Versions: 4.0.0
> Reporter: Ruifeng Zheng
> Assignee: Ruifeng Zheng
> Priority: Minor
> Labels: pull-request-available
> Fix For: 4.0.0
[jira] [Resolved] (SPARK-46637) Enhancing the Visual Appeal of Spark doc website
[ https://issues.apache.org/jira/browse/SPARK-46637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-46637.
----------------------------------
Fix Version/s: 4.0.0
Resolution: Fixed

Issue resolved by pull request 44642
[https://github.com/apache/spark/pull/44642]

> Enhancing the Visual Appeal of Spark doc website
> ------------------------------------------------
>
> Key: SPARK-46637
> URL: https://issues.apache.org/jira/browse/SPARK-46637
> Project: Spark
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 4.0.0, 3.5.1
> Reporter: Gengliang Wang
> Assignee: Gengliang Wang
> Priority: Minor
> Labels: pull-request-available
> Fix For: 4.0.0
[jira] [Resolved] (SPARK-46630) XML: Validate XML element name on write
[ https://issues.apache.org/jira/browse/SPARK-46630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-46630.
----------------------------------
Fix Version/s: 4.0.0
Resolution: Fixed

Issue resolved by pull request 44634
[https://github.com/apache/spark/pull/44634]

> XML: Validate XML element name on write
> ---------------------------------------
>
> Key: SPARK-46630
> URL: https://issues.apache.org/jira/browse/SPARK-46630
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Sandip Agarwala
> Assignee: Sandip Agarwala
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
[jira] [Assigned] (SPARK-46630) XML: Validate XML element name on write
[ https://issues.apache.org/jira/browse/SPARK-46630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reassigned SPARK-46630:
------------------------------------

Assignee: Sandip Agarwala

> XML: Validate XML element name on write
> ---------------------------------------
>
> Key: SPARK-46630
> URL: https://issues.apache.org/jira/browse/SPARK-46630
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Sandip Agarwala
> Assignee: Sandip Agarwala
> Priority: Major
> Labels: pull-request-available
[jira] [Assigned] (SPARK-46626) Bump jekyll version to support Ruby 3.3
[ https://issues.apache.org/jira/browse/SPARK-46626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reassigned SPARK-46626:
------------------------------------

Assignee: Nicholas Chammas

> Bump jekyll version to support Ruby 3.3
> ---------------------------------------
>
> Key: SPARK-46626
> URL: https://issues.apache.org/jira/browse/SPARK-46626
> Project: Spark
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 4.0.0
> Reporter: Nicholas Chammas
> Assignee: Nicholas Chammas
> Priority: Minor
> Labels: pull-request-available
[jira] [Resolved] (SPARK-46626) Bump jekyll version to support Ruby 3.3
[ https://issues.apache.org/jira/browse/SPARK-46626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-46626.
----------------------------------
Fix Version/s: 4.0.0
Resolution: Fixed

Issue resolved by pull request 44628
[https://github.com/apache/spark/pull/44628]

> Bump jekyll version to support Ruby 3.3
> ---------------------------------------
>
> Key: SPARK-46626
> URL: https://issues.apache.org/jira/browse/SPARK-46626
> Project: Spark
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 4.0.0
> Reporter: Nicholas Chammas
> Assignee: Nicholas Chammas
> Priority: Minor
> Labels: pull-request-available
> Fix For: 4.0.0
[jira] [Updated] (SPARK-46621) Address null from Exception.getMessage in Py4J captured exception
[ https://issues.apache.org/jira/browse/SPARK-46621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-46621: - Priority: Minor (was: Major) > Address null from Exception.getMessage in Py4J captured exception > - > > Key: SPARK-46621 > URL: https://issues.apache.org/jira/browse/SPARK-46621 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Minor > Labels: pull-request-available > > If JVM throws an exception without a message, the message becomes null and > returns: > {code} > File "/.../pyspark/errors/exceptions/captured.py", line 88, in __str__ > desc = desc + "\n\nJVM stacktrace:\n%s" % self._stackTrace > TypeError: unsupported operand type(s) for +: 'NoneType' and 'str' > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
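The fix amounts to guarding against a None message before the string concatenation in `__str__`; a minimal sketch (hypothetical helper name, not the actual patch in pyspark/errors/exceptions/captured.py):

```python
def build_description(desc, stack_trace):
    # Exception.getMessage on the JVM side can return null, which Py4J
    # surfaces as None; fall back to an empty string before concatenating
    # so the TypeError ('NoneType' + 'str') cannot occur.
    if desc is None:
        desc = ""
    return desc + "\n\nJVM stacktrace:\n%s" % stack_trace
```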
[jira] [Updated] (SPARK-46621) Address null from Exception.getMessage in Py4J captured exception
[ https://issues.apache.org/jira/browse/SPARK-46621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-46621: - Description: If JVM throws an exception without a message, the message becomes null and returns: {code} File "/.../pyspark/errors/exceptions/captured.py", line 88, in __str__ desc = desc + "\n\nJVM stacktrace:\n%s" % self._stackTrace TypeError: unsupported operand type(s) for +: 'NoneType' and 'str' {code} was: If JVM throws an exception without a message, the message becomes null and returns: {code} pyspark.errors.exceptions.captured.UnsupportedOperationException: JVM stacktrace: java.lang.UnsupportedOperationException at com.databricks.sql.acl.PlaceholderScimClient.getUserInfo(MockScimClient.scala:49) at com.databricks.sql.acl.InlineUserInfoExpressions.userInfo$lzycompute$1(InlineUserInfoExpressions.scala:73) at com.databricks.sql.acl.InlineUserInfoExpressions.com$databricks$sql$acl$InlineUserInfoExpressions$$userInfo$1(InlineUserInfoExpressions.scala:73) at com.databricks.sql.acl.InlineUserInfoExpressions$$anonfun$rewrite$2.$anonfun$applyOrElse$2(InlineUserInfoExpressions.scala:98) at scala.Option.getOrElse(Option.scala:189) at com.databricks.sql.acl.InlineUserInfoExpressions$$anonfun$rewrite$2.applyOrElse(InlineUserInfoExpressions.scala:98) at com.databricks.sql.acl.InlineUserInfoExpressions$$anonfun$rewrite$2.applyOrElse(InlineUserInfoExpressions.scala:84) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:473) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:83) at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:473) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$3(TreeNode.scala:478) at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1277) at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1276) at 
org.apache.spark.sql.catalyst.expressions.UnaryExpression.mapChildren(Expression.scala:656) at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:478) at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$transformExpressionsDownWithPruning$1(QueryPlan.scala:174) at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$1(QueryPlan.scala:215) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:83) at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:215) at org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:226) at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$3(QueryPlan.scala:231) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at scala.collection.TraversableLike.map(TraversableLike.scala:286) at scala.collection.TraversableLike.map$(TraversableLike.scala:279) at scala.collection.AbstractTraversable.map(Traversable.scala:108) {code} > Address null from Exception.getMessage in Py4J captured exception > - > > Key: SPARK-46621 > URL: https://issues.apache.org/jira/browse/SPARK-46621 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > > If JVM throws an exception without a message, the message becomes null and > returns: > {code} > File "/.../pyspark/errors/exceptions/captured.py", line 88, in __str__ > desc = desc + "\n\nJVM stacktrace:\n%s" % self._stackTrace > TypeError: unsupported operand type(s) for +: 'NoneType' and 'str' > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional 
commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46601) Fix log error in handleStatusMessage
[ https://issues.apache.org/jira/browse/SPARK-46601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46601: Assignee: qingbo jiao > Fix log error in handleStatusMessage > > > Key: SPARK-46601 > URL: https://issues.apache.org/jira/browse/SPARK-46601 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: qingbo jiao >Assignee: qingbo jiao >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46601) Fix log error in handleStatusMessage
[ https://issues.apache.org/jira/browse/SPARK-46601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46601. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44606 [https://github.com/apache/spark/pull/44606] > Fix log error in handleStatusMessage > > > Key: SPARK-46601 > URL: https://issues.apache.org/jira/browse/SPARK-46601 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: qingbo jiao >Assignee: qingbo jiao >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46522) Block Python data source registration with name conflicts
[ https://issues.apache.org/jira/browse/SPARK-46522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46522. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44507 [https://github.com/apache/spark/pull/44507] > Block Python data source registration with name conflicts > - > > Key: SPARK-46522 > URL: https://issues.apache.org/jira/browse/SPARK-46522 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Users should not be allowed to register Python data sources with names that > are the same as builtin or existing Scala/Java data sources. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
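The intended guard can be pictured with a toy registry (hypothetical class and method names; the real check happens inside Spark's data source lookup path, not in user code):

```python
class DataSourceRegistry:
    """Toy sketch of blocking registration under a conflicting name."""

    def __init__(self, builtin_names):
        # Names of built-in / existing Scala-Java data sources.
        self._builtin = set(builtin_names)
        self._registered = {}

    def register(self, name, source):
        # Reject names that collide with built-ins or earlier registrations.
        if name in self._builtin or name in self._registered:
            raise ValueError("Data source '%s' already exists" % name)
        self._registered[name] = source
```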
[jira] [Assigned] (SPARK-46522) Block Python data source registration with name conflicts
[ https://issues.apache.org/jira/browse/SPARK-46522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46522: Assignee: Allison Wang > Block Python data source registration with name conflicts > - > > Key: SPARK-46522 > URL: https://issues.apache.org/jira/browse/SPARK-46522 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Labels: pull-request-available > > Users should not be allowed to register Python data sources with names that > are the same as builtin or existing Scala/Java data sources. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Reopened] (SPARK-46437) Remove unnecessary cruft from SQL built-in functions docs
[ https://issues.apache.org/jira/browse/SPARK-46437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reopened SPARK-46437: -- Assignee: (was: Nicholas Chammas) Reverted at https://github.com/apache/spark/commit/a88c64e7dbdd813fa0a9df85a0ce9f1db6706ede > Remove unnecessary cruft from SQL built-in functions docs > - > > Key: SPARK-46437 > URL: https://issues.apache.org/jira/browse/SPARK-46437 > Project: Spark > Issue Type: Improvement > Components: Documentation, SQL >Affects Versions: 3.5.0 >Reporter: Nicholas Chammas >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-46437) Remove unnecessary cruft from SQL built-in functions docs
[ https://issues.apache.org/jira/browse/SPARK-46437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-46437: - Fix Version/s: (was: 4.0.0) > Remove unnecessary cruft from SQL built-in functions docs > - > > Key: SPARK-46437 > URL: https://issues.apache.org/jira/browse/SPARK-46437 > Project: Spark > Issue Type: Improvement > Components: Documentation, SQL >Affects Versions: 3.5.0 >Reporter: Nicholas Chammas >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46613) Log full exception when failed to lookup Python Data Sources
[ https://issues.apache.org/jira/browse/SPARK-46613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46613. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44617 [https://github.com/apache/spark/pull/44617] > Log full exception when failed to lookup Python Data Sources > > > Key: SPARK-46613 > URL: https://issues.apache.org/jira/browse/SPARK-46613 > Project: Spark > Issue Type: Sub-task > Components: PySpark, SQL >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > > See https://github.com/apache/spark/pull/44617 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46613) Log full exception when failed to lookup Python Data Sources
[ https://issues.apache.org/jira/browse/SPARK-46613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46613: Assignee: Hyukjin Kwon > Log full exception when failed to lookup Python Data Sources > > > Key: SPARK-46613 > URL: https://issues.apache.org/jira/browse/SPARK-46613 > Project: Spark > Issue Type: Sub-task > Components: PySpark, SQL >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Minor > Labels: pull-request-available > > See https://github.com/apache/spark/pull/44617 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46603) Refine docstring of `parse_url/url_encode/url_decode`
[ https://issues.apache.org/jira/browse/SPARK-46603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46603: Assignee: Yang Jie > Refine docstring of `parse_url/url_encode/url_decode` > - > > Key: SPARK-46603 > URL: https://issues.apache.org/jira/browse/SPARK-46603 > Project: Spark > Issue Type: Sub-task > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46603) Refine docstring of `parse_url/url_encode/url_decode`
[ https://issues.apache.org/jira/browse/SPARK-46603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46603. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44604 [https://github.com/apache/spark/pull/44604] > Refine docstring of `parse_url/url_encode/url_decode` > - > > Key: SPARK-46603 > URL: https://issues.apache.org/jira/browse/SPARK-46603 > Project: Spark > Issue Type: Sub-task > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46606) Refine docstring of `convert_timezone/make_dt_interval/make_interval`
[ https://issues.apache.org/jira/browse/SPARK-46606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46606. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44610 [https://github.com/apache/spark/pull/44610] > Refine docstring of `convert_timezone/make_dt_interval/make_interval` > - > > Key: SPARK-46606 > URL: https://issues.apache.org/jira/browse/SPARK-46606 > Project: Spark > Issue Type: Sub-task > Components: Documentation >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46606) Refine docstring of `convert_timezone/make_dt_interval/make_interval`
[ https://issues.apache.org/jira/browse/SPARK-46606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46606: Assignee: BingKun Pan > Refine docstring of `convert_timezone/make_dt_interval/make_interval` > - > > Key: SPARK-46606 > URL: https://issues.apache.org/jira/browse/SPARK-46606 > Project: Spark > Issue Type: Sub-task > Components: Documentation >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-46613) Log full exception when failed to lookup Python Data Sources
[ https://issues.apache.org/jira/browse/SPARK-46613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-46613: - Description: See https://github.com/apache/spark/pull/44617 > Log full exception when failed to lookup Python Data Sources > > > Key: SPARK-46613 > URL: https://issues.apache.org/jira/browse/SPARK-46613 > Project: Spark > Issue Type: Sub-task > Components: PySpark, SQL >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Minor > Labels: pull-request-available > > See https://github.com/apache/spark/pull/44617 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-46613) Log full exception when failed to lookup Python Data Sources
Hyukjin Kwon created SPARK-46613: Summary: Log full exception when failed to lookup Python Data Sources Key: SPARK-46613 URL: https://issues.apache.org/jira/browse/SPARK-46613 Project: Spark Issue Type: Sub-task Components: PySpark, SQL Affects Versions: 4.0.0 Reporter: Hyukjin Kwon -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46248) Support ignoreCorruptFiles and ignoreMissingFiles options in XML
[ https://issues.apache.org/jira/browse/SPARK-46248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46248: Assignee: Shujing Yang > Support ignoreCorruptFiles and ignoreMissingFiles options in XML > > > Key: SPARK-46248 > URL: https://issues.apache.org/jira/browse/SPARK-46248 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Shujing Yang >Assignee: Shujing Yang >Priority: Major > Labels: pull-request-available > > This PR corrects the handling of corrupt or missing multiline XML files by > respecting user-specified options. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46607) Check the testing mode
[ https://issues.apache.org/jira/browse/SPARK-46607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46607: Assignee: Ruifeng Zheng > Check the testing mode > -- > > Key: SPARK-46607 > URL: https://issues.apache.org/jira/browse/SPARK-46607 > Project: Spark > Issue Type: Sub-task > Components: PySpark, Tests >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46248) Support ignoreCorruptFiles and ignoreMissingFiles options in XML
[ https://issues.apache.org/jira/browse/SPARK-46248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46248. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44163 [https://github.com/apache/spark/pull/44163] > Support ignoreCorruptFiles and ignoreMissingFiles options in XML > > > Key: SPARK-46248 > URL: https://issues.apache.org/jira/browse/SPARK-46248 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Shujing Yang >Assignee: Shujing Yang >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > This PR corrects the handling of corrupt or missing multiline XML files by > respecting user-specified options. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46607) Check the testing mode
[ https://issues.apache.org/jira/browse/SPARK-46607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46607. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44611 [https://github.com/apache/spark/pull/44611] > Check the testing mode > -- > > Key: SPARK-46607 > URL: https://issues.apache.org/jira/browse/SPARK-46607 > Project: Spark > Issue Type: Sub-task > Components: PySpark, Tests >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46599) XML: Use TypeCoercion.findTightestCommonType for compatibility check
[ https://issues.apache.org/jira/browse/SPARK-46599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46599. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44601 [https://github.com/apache/spark/pull/44601] > XML: Use TypeCoercion.findTightestCommonType for compatibility check > > > Key: SPARK-46599 > URL: https://issues.apache.org/jira/browse/SPARK-46599 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Sandip Agarwala >Assignee: Sandip Agarwala >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46599) XML: Use TypeCoercion.findTightestCommonType for compatibility check
[ https://issues.apache.org/jira/browse/SPARK-46599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46599: Assignee: Sandip Agarwala > XML: Use TypeCoercion.findTightestCommonType for compatibility check > > > Key: SPARK-46599 > URL: https://issues.apache.org/jira/browse/SPARK-46599 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Sandip Agarwala >Assignee: Sandip Agarwala >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46587) XML: Fix XSD big integer conversion
[ https://issues.apache.org/jira/browse/SPARK-46587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46587. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44587 [https://github.com/apache/spark/pull/44587] > XML: Fix XSD big integer conversion > --- > > Key: SPARK-46587 > URL: https://issues.apache.org/jira/browse/SPARK-46587 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Sandip Agarwala >Assignee: Sandip Agarwala >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-46582) Upgrade R Tools version from 4.0.2 to 4.3.2 in AppVeyor
Hyukjin Kwon created SPARK-46582: Summary: Upgrade R Tools version from 4.0.2 to 4.3.2 in AppVeyor Key: SPARK-46582 URL: https://issues.apache.org/jira/browse/SPARK-46582 Project: Spark Issue Type: Bug Components: Project Infra, R Affects Versions: 4.0.0 Reporter: Hyukjin Kwon R Tools 4.3.X is for R 4.3.X. We previously did not upgrade because of a test failure. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46571) Re-enable TODOs that are resolved from recent Pandas
[ https://issues.apache.org/jira/browse/SPARK-46571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46571: Assignee: Haejoon Lee > Re-enable TODOs that are resolved from recent Pandas > > > Key: SPARK-46571 > URL: https://issues.apache.org/jira/browse/SPARK-46571 > Project: Spark > Issue Type: Bug > Components: Pandas API on Spark >Affects Versions: 4.0.0 >Reporter: Haejoon Lee >Assignee: Haejoon Lee >Priority: Major > Labels: pull-request-available > > We can uncomment some TODOs that are already resolved in tests. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46571) Re-enable TODOs that are resolved from recent Pandas
[ https://issues.apache.org/jira/browse/SPARK-46571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46571. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44568 [https://github.com/apache/spark/pull/44568] > Re-enable TODOs that are resolved from recent Pandas > > > Key: SPARK-46571 > URL: https://issues.apache.org/jira/browse/SPARK-46571 > Project: Spark > Issue Type: Bug > Components: Pandas API on Spark >Affects Versions: 4.0.0 >Reporter: Haejoon Lee >Assignee: Haejoon Lee >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > We can uncomment some TODOs that are already resolved in tests. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46565) Improve Python data source error classes and messages
[ https://issues.apache.org/jira/browse/SPARK-46565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46565: Assignee: Allison Wang > Improve Python data source error classes and messages > - > > Key: SPARK-46565 > URL: https://issues.apache.org/jira/browse/SPARK-46565 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46565) Improve Python data source error classes and messages
[ https://issues.apache.org/jira/browse/SPARK-46565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46565. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44560 [https://github.com/apache/spark/pull/44560] > Improve Python data source error classes and messages > - > > Key: SPARK-46565 > URL: https://issues.apache.org/jira/browse/SPARK-46565 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-44001) Improve parsing of well known wrapper types
[ https://issues.apache.org/jira/browse/SPARK-44001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-44001. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 43767 [https://github.com/apache/spark/pull/43767] > Improve parsing of well known wrapper types > --- > > Key: SPARK-44001 > URL: https://issues.apache.org/jira/browse/SPARK-44001 > Project: Spark > Issue Type: Improvement > Components: Protobuf >Affects Versions: 3.4.0 >Reporter: Parth Upadhyay >Assignee: Parth Upadhyay >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Under `com.google.protobuf`, there are some well known wrapper types for > primitives, > [namely|https://github.com/protocolbuffers/protobuf/blob/main/src/google/protobuf/wrappers.proto], > useful for distinguishing between absence of primitive fields and their > default values, as well as for use within `google.protobuf.Any` types. These > types are: > {code} > DoubleValue > FloatValue > Int64Value > Uint64Value > Int32Value > Uint32Value > BoolValue > StringValue > BytesValue > {code} > Currently, when we deserialize these from a serialized protobuf into a spark > struct, we expand them as if they were normal messages. Concretely, if we have > {code} > syntax = "proto3"; > import "google/protobuf/wrappers.proto" > message WktExample { > google.protobuf.BoolValue bool_val = 1; > google.protobuf.Int32Value int32_val = 2; > } > {code} > And a message like > {code} > WktExample(true, 100) > {code} > Then the behavior today is to deserialize this as. 
> {code} > {"bool_val": {"value": true}, "int32_val": {"value": 100}} > {code} > This is quite difficult to work with and not in the spirit of the wrapper > type, so it would be nice to deserialize as > {code} > {"bool_val": true, "int32_val": 100} > {code} > This is also the behavior of other popular deserialization libraries, > including java protobuf util > [Jsonformat|https://github.com/protocolbuffers/protobuf/blob/main/java/util/src/main/java/com/google/protobuf/util/JsonFormat.java#L904-L914] > and golang's > [jsonpb|https://github.com/gogo/protobuf/blob/master/jsonpb/jsonpb.go#L207-L214]. > So for consistency with other libraries and improved usability, I propose we > deserialize well known types in this way. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
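The proposed unwrapping can be pictured with a toy transform over dict-shaped rows (an illustration only; the real conversion happens inside the protobuf connector's schema handling, and production code dispatches on the protobuf message type rather than on the key set):

```python
def unwrap_wrappers(value):
    # Collapse {"value": x} structs (the struct form of the well-known
    # wrapper messages such as BoolValue or Int32Value) into the bare
    # primitive, mirroring JsonFormat/jsonpb behavior.
    # Caveat: a genuine message whose only field happens to be named
    # "value" would also be collapsed by this toy version.
    if isinstance(value, dict):
        if set(value) == {"value"}:
            return value["value"]
        return {k: unwrap_wrappers(v) for k, v in value.items()}
    return value
```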
[jira] [Resolved] (SPARK-46570) Run Python 3.11 and 3.12 test independently
[ https://issues.apache.org/jira/browse/SPARK-46570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46570. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44566 [https://github.com/apache/spark/pull/44566] > Run Python 3.11 and 3.12 test independently > --- > > Key: SPARK-46570 > URL: https://issues.apache.org/jira/browse/SPARK-46570 > Project: Spark > Issue Type: Test > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46570) Run Python 3.11 and 3.12 test independently
[ https://issues.apache.org/jira/browse/SPARK-46570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46570: Assignee: Dongjoon Hyun > Run Python 3.11 and 3.12 test independently > --- > > Key: SPARK-46570 > URL: https://issues.apache.org/jira/browse/SPARK-46570 > Project: Spark > Issue Type: Test > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46553) FutureWarning for interpolate with object dtype
[ https://issues.apache.org/jira/browse/SPARK-46553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46553. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44550 [https://github.com/apache/spark/pull/44550] > FutureWarning for interpolate with object dtype > --- > > Key: SPARK-46553 > URL: https://issues.apache.org/jira/browse/SPARK-46553 > Project: Spark > Issue Type: Bug > Components: Pandas API on Spark >Affects Versions: 4.0.0 >Reporter: Haejoon Lee >Assignee: Haejoon Lee >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > >>> pdf.interpolate() > :1: FutureWarning: DataFrame.interpolate with object dtype is > deprecated and will raise in a future version. Call > obj.infer_objects(copy=False) before interpolating instead. > A B > 0 a 1 > 1 b 2 > 2 c 3 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
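On the pandas side, the warning can be sidestepped by interpolating only the numeric columns (a sketch against pandas 2.x; `pdf` mirrors the frame in the report, and this is a workaround, not the Pandas-API-on-Spark fix itself):

```python
import pandas as pd

pdf = pd.DataFrame({"A": ["a", "b", "c"], "B": [1.0, None, 3.0]})

# Interpolate only the numeric columns; the object-dtype column "A" passes
# through untouched, so the deprecated object-dtype path is never hit.
out = pdf.copy()
num_cols = pdf.select_dtypes("number").columns
out[num_cols] = pdf[num_cols].interpolate()
```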
[jira] [Assigned] (SPARK-46553) FutureWarning for interpolate with object dtype
[ https://issues.apache.org/jira/browse/SPARK-46553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46553: Assignee: Haejoon Lee > FutureWarning for interpolate with object dtype > --- > > Key: SPARK-46553 > URL: https://issues.apache.org/jira/browse/SPARK-46553 > Project: Spark > Issue Type: Bug > Components: Pandas API on Spark >Affects Versions: 4.0.0 >Reporter: Haejoon Lee >Assignee: Haejoon Lee >Priority: Major > Labels: pull-request-available > > >>> pdf.interpolate() > :1: FutureWarning: DataFrame.interpolate with object dtype is > deprecated and will raise in a future version. Call > obj.infer_objects(copy=False) before interpolating instead. > A B > 0 a 1 > 1 b 2 > 2 c 3 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
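The FutureWarning above recommends calling `obj.infer_objects(copy=False)` before interpolating, i.e. converting object-typed entries to a numeric dtype first. A minimal stdlib-only sketch of what that conversion-then-interpolation amounts to (the helper names are illustrative, not part of pandas or Spark):

```python
# Sketch: convert object-typed entries to floats where possible, then
# linearly interpolate interior gaps -- the behaviour pandas keeps for
# numeric dtypes but deprecates for object dtype.
def infer_numeric(values):
    """Best-effort conversion of object values to float (None stays None)."""
    out = []
    for v in values:
        if v is None:
            out.append(None)
        else:
            try:
                out.append(float(v))
            except (TypeError, ValueError):
                return None  # not a numeric column; leave untouched
    return out

def interpolate_linear(values):
    """Fill interior None gaps by linear interpolation between neighbours."""
    result = list(values)
    i = 0
    while i < len(result):
        if result[i] is None:
            j = i
            while j < len(result) and result[j] is None:
                j += 1
            if 0 < i and j < len(result):  # only gaps with both neighbours
                lo, hi = result[i - 1], result[j]
                step = (hi - lo) / (j - i + 1)
                for k in range(i, j):
                    result[k] = lo + step * (k - i + 1)
            i = j
        else:
            i += 1
    return result

nums = infer_numeric(["1", 2, None, 4])   # object-like column
print(interpolate_linear(nums))           # [1.0, 2.0, 3.0, 4.0]
```

This mirrors why pandas deprecates the object-dtype path: interpolation is only well-defined once the values are known to be numeric.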
[jira] [Created] (SPARK-46564) Exclude unrelated files by using omit options properly in PySpark coverage report
Hyukjin Kwon created SPARK-46564: Summary: Exclude unrelated files by using omit options properly in PySpark coverage report Key: SPARK-46564 URL: https://issues.apache.org/jira/browse/SPARK-46564 Project: Spark Issue Type: Bug Components: Project Infra, PySpark Affects Versions: 4.0.0 Reporter: Hyukjin Kwon For some reason, the files are not excluded in the PySpark test coverage report (https://app.codecov.io/gh/apache/spark) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
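With coverage.py, exclusions of this kind are usually expressed through the `omit` option. A hedged sketch of the configuration shape (the glob patterns here are illustrative, not the actual patterns used in the Spark repository):

```ini
# .coveragerc -- illustrative only; the real Spark configuration may differ
[run]
omit =
    */pyspark/cloudpickle/*
    */pyspark/tests/*
    */site-packages/*
```

The same patterns can equivalently be passed on the command line via `coverage run --omit=...`; a pattern that does not match the paths as the report tool sees them silently fails to exclude anything, which is one common cause of unrelated files showing up in a report.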
[jira] [Assigned] (SPARK-46557) Refine docstring for DataFrame.schema/explain/printSchema
[ https://issues.apache.org/jira/browse/SPARK-46557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46557: Assignee: Hyukjin Kwon > Refine docstring for DataFrame.schema/explain/printSchema > - > > Key: SPARK-46557 > URL: https://issues.apache.org/jira/browse/SPARK-46557 > Project: Spark > Issue Type: Sub-task > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46557) Refine docstring for DataFrame.schema/explain/printSchema
[ https://issues.apache.org/jira/browse/SPARK-46557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46557. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44553 [https://github.com/apache/spark/pull/44553] > Refine docstring for DataFrame.schema/explain/printSchema > - > > Key: SPARK-46557 > URL: https://issues.apache.org/jira/browse/SPARK-46557 > Project: Spark > Issue Type: Sub-task > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46540) Respect column names when Python data source read function outputs named Row objects
[ https://issues.apache.org/jira/browse/SPARK-46540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46540. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44531 [https://github.com/apache/spark/pull/44531] > Respect column names when Python data source read function outputs named Row > objects > > > Key: SPARK-46540 > URL: https://issues.apache.org/jira/browse/SPARK-46540 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46540) Respect column names when Python data source read function outputs named Row objects
[ https://issues.apache.org/jira/browse/SPARK-46540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46540: Assignee: Allison Wang > Respect column names when Python data source read function outputs named Row > objects > > > Key: SPARK-46540 > URL: https://issues.apache.org/jira/browse/SPARK-46540 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46556) Refine docstring for DataFrame.createGlobalTempView/createOrReplaceGlobalTempView
[ https://issues.apache.org/jira/browse/SPARK-46556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46556. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44552 [https://github.com/apache/spark/pull/44552] > Refine docstring for > DataFrame.createGlobalTempView/createOrReplaceGlobalTempView > - > > Key: SPARK-46556 > URL: https://issues.apache.org/jira/browse/SPARK-46556 > Project: Spark > Issue Type: Sub-task > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46556) Refine docstring for DataFrame.createGlobalTempView/createOrReplaceGlobalTempView
[ https://issues.apache.org/jira/browse/SPARK-46556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46556: Assignee: Hyukjin Kwon > Refine docstring for > DataFrame.createGlobalTempView/createOrReplaceGlobalTempView > - > > Key: SPARK-46556 > URL: https://issues.apache.org/jira/browse/SPARK-46556 > Project: Spark > Issue Type: Sub-task > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46555) Refine docstring for DataFrame.createTempView/createOrReplaceTempView
[ https://issues.apache.org/jira/browse/SPARK-46555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46555: Assignee: Hyukjin Kwon > Refine docstring for DataFrame.createTempView/createOrReplaceTempView > - > > Key: SPARK-46555 > URL: https://issues.apache.org/jira/browse/SPARK-46555 > Project: Spark > Issue Type: Sub-task > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46555) Refine docstring for DataFrame.createTempView/createOrReplaceTempView
[ https://issues.apache.org/jira/browse/SPARK-46555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46555. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44551 [https://github.com/apache/spark/pull/44551] > Refine docstring for DataFrame.createTempView/createOrReplaceTempView > - > > Key: SPARK-46555 > URL: https://issues.apache.org/jira/browse/SPARK-46555 > Project: Spark > Issue Type: Sub-task > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-46557) Refine docstring for DataFrame.schema/explain/printSchema
[ https://issues.apache.org/jira/browse/SPARK-46557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-46557: - Summary: Refine docstring for DataFrame.schema/explain/printSchema (was: Refine docstring for DataFrame.explain/printSchema) > Refine docstring for DataFrame.schema/explain/printSchema > - > > Key: SPARK-46557 > URL: https://issues.apache.org/jira/browse/SPARK-46557 > Project: Spark > Issue Type: Sub-task > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-46557) Refine docstring for DataFrame.explain/printSchema
Hyukjin Kwon created SPARK-46557: Summary: Refine docstring for DataFrame.explain/printSchema Key: SPARK-46557 URL: https://issues.apache.org/jira/browse/SPARK-46557 Project: Spark Issue Type: Sub-task Components: Documentation, PySpark Affects Versions: 4.0.0 Reporter: Hyukjin Kwon -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-46556) Refine docstring for DataFrame.createGlobalTempView/createOrReplaceGlobalTempView
Hyukjin Kwon created SPARK-46556: Summary: Refine docstring for DataFrame.createGlobalTempView/createOrReplaceGlobalTempView Key: SPARK-46556 URL: https://issues.apache.org/jira/browse/SPARK-46556 Project: Spark Issue Type: Sub-task Components: Documentation, PySpark Affects Versions: 4.0.0 Reporter: Hyukjin Kwon -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-46555) Refine docstring for DataFrame.createTempView/createOrReplaceTempView
[ https://issues.apache.org/jira/browse/SPARK-46555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-46555: - Summary: Refine docstring for DataFrame.createTempView/createOrReplaceTempView (was: Refine docstring for DataFrame.registerTempTable/createTempView/createOrReplaceTempView) > Refine docstring for DataFrame.createTempView/createOrReplaceTempView > - > > Key: SPARK-46555 > URL: https://issues.apache.org/jira/browse/SPARK-46555 > Project: Spark > Issue Type: Sub-task > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-46555) Refine docstring for DataFrame.registerTempTable/createTempView/createOrReplaceTempView
Hyukjin Kwon created SPARK-46555: Summary: Refine docstring for DataFrame.registerTempTable/createTempView/createOrReplaceTempView Key: SPARK-46555 URL: https://issues.apache.org/jira/browse/SPARK-46555 Project: Spark Issue Type: Sub-task Components: Documentation, PySpark Affects Versions: 4.0.0 Reporter: Hyukjin Kwon -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45914) Support `commit` and `abort` API for Python data source write
[ https://issues.apache.org/jira/browse/SPARK-45914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-45914. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44497 [https://github.com/apache/spark/pull/44497] > Support `commit` and `abort` API for Python data source write > - > > Key: SPARK-45914 > URL: https://issues.apache.org/jira/browse/SPARK-45914 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Support `commit` and `abort` API for Python data source write. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45914) Support `commit` and `abort` API for Python data source write
[ https://issues.apache.org/jira/browse/SPARK-45914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-45914: Assignee: Allison Wang > Support `commit` and `abort` API for Python data source write > - > > Key: SPARK-45914 > URL: https://issues.apache.org/jira/browse/SPARK-45914 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Labels: pull-request-available > > Support `commit` and `abort` API for Python data source write. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
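The `commit`/`abort` pair described here is a standard two-phase write protocol: tasks write to a staging area and report back, and the driver then publishes or discards the staged results. A stdlib-only sketch of that pattern (the class and method names other than `commit` and `abort` are illustrative, not the exact PySpark data source API):

```python
# Sketch of a commit/abort write protocol: each task writes to a staging
# area and returns a commit message; the driver either commits all staged
# results at once or aborts and cleans them up.
class StagingWriter:
    def __init__(self):
        self.staged = []      # results produced by individual write tasks
        self.committed = []   # results visible only after a successful commit

    def write(self, row):
        """Simulate one task writing a row to staging."""
        self.staged.append(row)
        return {"rows": 1}    # per-task commit message

    def commit(self, messages):
        """All tasks succeeded: publish the staged results atomically."""
        self.committed = list(self.staged)
        self.staged = []
        return sum(m["rows"] for m in messages)

    def abort(self, messages):
        """Some task failed: discard everything written so far."""
        self.staged = []

writer = StagingWriter()
msgs = [writer.write(r) for r in ["a", "b", "c"]]
total = writer.commit(msgs)
print(total, writer.committed)  # 3 ['a', 'b', 'c']
```

The design point is that readers never observe a partially written result: nothing moves out of staging until `commit`, and `abort` leaves the destination untouched.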
[jira] [Assigned] (SPARK-46382) XML: Capture values interspersed between elements
[ https://issues.apache.org/jira/browse/SPARK-46382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46382: Assignee: Shujing Yang > XML: Capture values interspersed between elements > - > > Key: SPARK-46382 > URL: https://issues.apache.org/jira/browse/SPARK-46382 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Shujing Yang >Assignee: Shujing Yang >Priority: Major > Labels: pull-request-available > > In XML, elements typically consist of a name and a value, with the value > enclosed between the opening and closing tags. But XML also allows arbitrary > values to be interspersed between these elements. To address this, we > provide an option named `valueTags`, which is enabled by default, to capture > these values. Consider the following example: > ``` > > 1 > value1 > > value2 > 2 > value3 > > > ``` > In this example, ``,``, and `` are named elements with their > respective values enclosed within tags. There are arbitrary values value1 > value2 value3 interspersed between the elements. Please note that there can > be multiple occurrences of values in a single element (i.e. there are value2, > value3 in the element ) > > We should parse the values between tags into the valueTags field. If there > are multiple occurrences of value tags, the value tag field will be converted > to an array type. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46382) XML: Capture values interspersed between elements
[ https://issues.apache.org/jira/browse/SPARK-46382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46382. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44318 [https://github.com/apache/spark/pull/44318] > XML: Capture values interspersed between elements > - > > Key: SPARK-46382 > URL: https://issues.apache.org/jira/browse/SPARK-46382 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Shujing Yang >Assignee: Shujing Yang >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > In XML, elements typically consist of a name and a value, with the value > enclosed between the opening and closing tags. But XML also allows arbitrary > values to be interspersed between these elements. To address this, we > provide an option named `valueTags`, which is enabled by default, to capture > these values. Consider the following example: > ``` > > 1 > value1 > > value2 > 2 > value3 > > > ``` > In this example, ``,``, and `` are named elements with their > respective values enclosed within tags. There are arbitrary values value1 > value2 value3 interspersed between the elements. Please note that there can > be multiple occurrences of values in a single element (i.e. there are value2, > value3 in the element ) > > We should parse the values between tags into the valueTags field. If there > are multiple occurrences of value tags, the value tag field will be converted > to an array type. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
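The "values interspersed between elements" this ticket describes are what `xml.etree` exposes as an element's `text` (before the first child) and each child's `tail` (after that child's closing tag). A stdlib sketch of collecting them, using a reconstructed document with illustrative tag names (the `valueTags` behaviour itself is the Spark option proposed above, not reproduced here):

```python
import xml.etree.ElementTree as ET

# Interspersed character data lives in .text (before the first child)
# and in each child's .tail (after that child's closing tag).
doc = "<ROW><a>1</a>value1<b>value2<c>2</c>value3</b></ROW>"
root = ET.fromstring(doc)

def interspersed_values(elem):
    """Collect non-tag character data appearing directly inside elem."""
    values = []
    if elem.text and elem.text.strip():
        values.append(elem.text.strip())
    for child in elem:
        if child.tail and child.tail.strip():
            values.append(child.tail.strip())
    return values

print(interspersed_values(root))           # ['value1']
print(interspersed_values(root.find("b"))) # ['value2', 'value3']
```

The second call illustrates the multiple-occurrence case the ticket mentions: one element can carry several interspersed values, which is why a value-tag field may need to become an array type.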
[jira] [Created] (SPARK-46530) Check Python executable when looking up available Data Sources
Hyukjin Kwon created SPARK-46530: Summary: Check Python executable when looking up available Data Sources Key: SPARK-46530 URL: https://issues.apache.org/jira/browse/SPARK-46530 Project: Spark Issue Type: Sub-task Components: PySpark, SQL Affects Versions: 4.0.0 Reporter: Hyukjin Kwon When looking up available Data Sources, we should check whether the `python` executable is available on the system. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
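A minimal sketch of the kind of availability check this ticket describes, using the stdlib's `shutil.which` (the actual Spark-side check lives elsewhere in the codebase, so this is illustrative only):

```python
import shutil
import sys

def python_available(executable="python3"):
    """Return the resolved path of the executable, or None if not on PATH."""
    return shutil.which(executable)

# Fall back gracefully instead of failing the data source lookup outright.
path = python_available() or python_available("python")
if path is None:
    print("No Python executable found; skipping Python data source lookup")
else:
    print(f"Using Python executable at {path}")
```

`shutil.which` mirrors the shell's `which`: it honours PATH (and PATHEXT on Windows), which makes it a safer probe than spawning the interpreter just to see whether the spawn fails.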
[jira] [Resolved] (SPARK-45917) Statically register Python Data Source
[ https://issues.apache.org/jira/browse/SPARK-45917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-45917. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44504 [https://github.com/apache/spark/pull/44504] > Statically register Python Data Source > -- > > Key: SPARK-45917 > URL: https://issues.apache.org/jira/browse/SPARK-45917 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > See the inlined comment in {{DataSourceManager}}. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45917) Statically register Python Data Source
[ https://issues.apache.org/jira/browse/SPARK-45917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-45917: Assignee: Hyukjin Kwon > Statically register Python Data Source > -- > > Key: SPARK-45917 > URL: https://issues.apache.org/jira/browse/SPARK-45917 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > See the inlined comment in {{DataSourceManager}}. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46521) Refine docstring of `array_compact/array_distinct/array_remove`
[ https://issues.apache.org/jira/browse/SPARK-46521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46521. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44506 [https://github.com/apache/spark/pull/44506] > Refine docstring of `array_compact/array_distinct/array_remove` > --- > > Key: SPARK-46521 > URL: https://issues.apache.org/jira/browse/SPARK-46521 > Project: Spark > Issue Type: Sub-task > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46521) Refine docstring of `array_compact/array_distinct/array_remove`
[ https://issues.apache.org/jira/browse/SPARK-46521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46521: Assignee: Yang Jie > Refine docstring of `array_compact/array_distinct/array_remove` > --- > > Key: SPARK-46521 > URL: https://issues.apache.org/jira/browse/SPARK-46521 > Project: Spark > Issue Type: Sub-task > Components: Documentation, PySpark >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46528) Upgrade zstd-jni to 1.5.5-11
[ https://issues.apache.org/jira/browse/SPARK-46528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46528: Assignee: Dongjoon Hyun > Upgrade zstd-jni to 1.5.5-11 > > > Key: SPARK-46528 > URL: https://issues.apache.org/jira/browse/SPARK-46528 > Project: Spark > Issue Type: Bug > Components: Build >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46528) Upgrade zstd-jni to 1.5.5-11
[ https://issues.apache.org/jira/browse/SPARK-46528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46528. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44515 [https://github.com/apache/spark/pull/44515] > Upgrade zstd-jni to 1.5.5-11 > > > Key: SPARK-46528 > URL: https://issues.apache.org/jira/browse/SPARK-46528 > Project: Spark > Issue Type: Bug > Components: Build >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46520) Support overwrite mode for Python data source write
[ https://issues.apache.org/jira/browse/SPARK-46520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46520: Assignee: Allison Wang > Support overwrite mode for Python data source write > --- > > Key: SPARK-46520 > URL: https://issues.apache.org/jira/browse/SPARK-46520 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Labels: pull-request-available > > Support the `overwrite` mode for Python data source -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46520) Support overwrite mode for Python data source write
[ https://issues.apache.org/jira/browse/SPARK-46520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46520. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44505 [https://github.com/apache/spark/pull/44505] > Support overwrite mode for Python data source write > --- > > Key: SPARK-46520 > URL: https://issues.apache.org/jira/browse/SPARK-46520 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Support the `overwrite` mode for Python data source -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46513) Move `BasicIndexingTests` to `pyspark.pandas.tests.indexes.*`
[ https://issues.apache.org/jira/browse/SPARK-46513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46513. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44499 [https://github.com/apache/spark/pull/44499] > Move `BasicIndexingTests` to `pyspark.pandas.tests.indexes.*` > - > > Key: SPARK-46513 > URL: https://issues.apache.org/jira/browse/SPARK-46513 > Project: Spark > Issue Type: Sub-task > Components: PS, Tests >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46513) Move `BasicIndexingTests` to `pyspark.pandas.tests.indexes.*`
[ https://issues.apache.org/jira/browse/SPARK-46513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46513: Assignee: Ruifeng Zheng > Move `BasicIndexingTests` to `pyspark.pandas.tests.indexes.*` > - > > Key: SPARK-46513 > URL: https://issues.apache.org/jira/browse/SPARK-46513 > Project: Spark > Issue Type: Sub-task > Components: PS, Tests >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45559) Support spark.read.schema(...) for Python data source API
[ https://issues.apache.org/jira/browse/SPARK-45559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-45559. -- Resolution: Invalid It is supported, and I see some test cases in `PythonDataSourceSuite` > Support spark.read.schema(...) for Python data source API > - > > Key: SPARK-45559 > URL: https://issues.apache.org/jira/browse/SPARK-45559 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Priority: Major > > Support `spark.read.schema(...)` for Python data source read. > Add test cases where we send the schema as a string instead of StructType, > and a positive case as well as a negative case where it doesn't parse > successfully with fromDDL? -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46503) Move test_default_index to `pyspark.pandas.tests.indexes.*`
[ https://issues.apache.org/jira/browse/SPARK-46503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46503. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44482 [https://github.com/apache/spark/pull/44482] > Move test_default_index to `pyspark.pandas.tests.indexes.*` > --- > > Key: SPARK-46503 > URL: https://issues.apache.org/jira/browse/SPARK-46503 > Project: Spark > Issue Type: Sub-task > Components: PS, Tests >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46503) Move test_default_index to `pyspark.pandas.tests.indexes.*`
[ https://issues.apache.org/jira/browse/SPARK-46503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46503: Assignee: Ruifeng Zheng > Move test_default_index to `pyspark.pandas.tests.indexes.*` > --- > > Key: SPARK-46503 > URL: https://issues.apache.org/jira/browse/SPARK-46503 > Project: Spark > Issue Type: Sub-task > Components: PS, Tests >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46437) Remove unnecessary cruft from SQL built-in functions docs
[ https://issues.apache.org/jira/browse/SPARK-46437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46437. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44393 [https://github.com/apache/spark/pull/44393] > Remove unnecessary cruft from SQL built-in functions docs > - > > Key: SPARK-46437 > URL: https://issues.apache.org/jira/browse/SPARK-46437 > Project: Spark > Issue Type: Improvement > Components: Documentation, SQL >Affects Versions: 3.5.0 >Reporter: Nicholas Chammas >Assignee: Nicholas Chammas >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46437) Remove unnecessary cruft from SQL built-in functions docs
[ https://issues.apache.org/jira/browse/SPARK-46437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46437: Assignee: Nicholas Chammas > Remove unnecessary cruft from SQL built-in functions docs > - > > Key: SPARK-46437 > URL: https://issues.apache.org/jira/browse/SPARK-46437 > Project: Spark > Issue Type: Improvement > Components: Documentation, SQL >Affects Versions: 3.5.0 >Reporter: Nicholas Chammas >Assignee: Nicholas Chammas >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46465) Implement Column.isNaN
[ https://issues.apache.org/jira/browse/SPARK-46465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46465: Assignee: Ruifeng Zheng > Implement Column.isNaN > -- > > Key: SPARK-46465 > URL: https://issues.apache.org/jira/browse/SPARK-46465 > Project: Spark > Issue Type: New Feature > Components: Connect, PySpark >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46465) Implement Column.isNaN
[ https://issues.apache.org/jira/browse/SPARK-46465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-46465.
----------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

Issue resolved by pull request 44422
[https://github.com/apache/spark/pull/44422]

> Implement Column.isNaN
> ----------------------
>
>                 Key: SPARK-46465
>                 URL: https://issues.apache.org/jira/browse/SPARK-46465
>             Project: Spark
>          Issue Type: New Feature
>          Components: Connect, PySpark
>    Affects Versions: 4.0.0
>            Reporter: Ruifeng Zheng
>            Assignee: Ruifeng Zheng
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
[jira] [Assigned] (SPARK-46462) Reorganize `OpsOnDiffFramesGroupByRollingTests`
[ https://issues.apache.org/jira/browse/SPARK-46462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reassigned SPARK-46462:
------------------------------------
    Assignee: Ruifeng Zheng

> Reorganize `OpsOnDiffFramesGroupByRollingTests`
> -----------------------------------------------
>
>                 Key: SPARK-46462
>                 URL: https://issues.apache.org/jira/browse/SPARK-46462
>             Project: Spark
>          Issue Type: Sub-task
>          Components: PS, Tests
>    Affects Versions: 4.0.0
>            Reporter: Ruifeng Zheng
>            Assignee: Ruifeng Zheng
>            Priority: Minor
>              Labels: pull-request-available
>
[jira] [Resolved] (SPARK-46462) Reorganize `OpsOnDiffFramesGroupByRollingTests`
[ https://issues.apache.org/jira/browse/SPARK-46462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-46462.
----------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

Issue resolved by pull request 44420
[https://github.com/apache/spark/pull/44420]

> Reorganize `OpsOnDiffFramesGroupByRollingTests`
> -----------------------------------------------
>
>                 Key: SPARK-46462
>                 URL: https://issues.apache.org/jira/browse/SPARK-46462
>             Project: Spark
>          Issue Type: Sub-task
>          Components: PS, Tests
>    Affects Versions: 4.0.0
>            Reporter: Ruifeng Zheng
>            Assignee: Ruifeng Zheng
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
[jira] [Resolved] (SPARK-46463) Reorganize `OpsOnDiffFramesGroupByExpandingTests`
[ https://issues.apache.org/jira/browse/SPARK-46463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-46463.
----------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

Issue resolved by pull request 44421
[https://github.com/apache/spark/pull/44421]

> Reorganize `OpsOnDiffFramesGroupByExpandingTests`
> -------------------------------------------------
>
>                 Key: SPARK-46463
>                 URL: https://issues.apache.org/jira/browse/SPARK-46463
>             Project: Spark
>          Issue Type: Sub-task
>          Components: PS, Tests
>    Affects Versions: 4.0.0
>            Reporter: Ruifeng Zheng
>            Assignee: Ruifeng Zheng
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
[jira] [Assigned] (SPARK-46463) Reorganize `OpsOnDiffFramesGroupByExpandingTests`
[ https://issues.apache.org/jira/browse/SPARK-46463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reassigned SPARK-46463:
------------------------------------
    Assignee: Ruifeng Zheng

> Reorganize `OpsOnDiffFramesGroupByExpandingTests`
> -------------------------------------------------
>
>                 Key: SPARK-46463
>                 URL: https://issues.apache.org/jira/browse/SPARK-46463
>             Project: Spark
>          Issue Type: Sub-task
>          Components: PS, Tests
>    Affects Versions: 4.0.0
>            Reporter: Ruifeng Zheng
>            Assignee: Ruifeng Zheng
>            Priority: Minor
>              Labels: pull-request-available
>
[jira] [Resolved] (SPARK-46413) Validate returnType of Arrow Python UDF
[ https://issues.apache.org/jira/browse/SPARK-46413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-46413.
----------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

Issue resolved by pull request 44362
[https://github.com/apache/spark/pull/44362]

> Validate returnType of Arrow Python UDF
> ---------------------------------------
>
>                 Key: SPARK-46413
>                 URL: https://issues.apache.org/jira/browse/SPARK-46413
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 4.0.0
>            Reporter: Xinrong Meng
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
> Validate returnType of Arrow Python UDF