[jira] [Commented] (SPARK-41290) Support GENERATED ALWAYS AS in create table

2022-11-28 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640184#comment-17640184
 ] 

Apache Spark commented on SPARK-41290:
--

User 'allisonport-db' has created a pull request for this issue:
https://github.com/apache/spark/pull/38823

> Support GENERATED ALWAYS AS in create table
> ---
>
> Key: SPARK-41290
> URL: https://issues.apache.org/jira/browse/SPARK-41290
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Allison Portis
>Priority: Major
>
> Support GENERATED ALWAYS AS syntax for defining generated columns in create 
> table.
> For example,
>  
> {code:java}
> CREATE TABLE default.example (
>     time TIMESTAMP,
>     date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
> )
> {code}
>  
>  
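As a rough illustration of what a generated column does, the sketch below runs the same table shape through SQLite from Python's standard library. SQLite is only a stand-in here, not Spark. The sketch assumes SQLite 3.31 or newer, stores the values as text, and uses SQLite's `date()` function where the Spark example uses `CAST(time AS DATE)`. The sample timestamp is made up.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Spark (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE example (
        time TEXT,
        date TEXT GENERATED ALWAYS AS (date(time)) VIRTUAL
    )
    """
)

# Only `time` is written; `date` is derived from it by the database.
conn.execute("INSERT INTO example (time) VALUES ('2022-11-28 14:03:00')")
row = conn.execute("SELECT time, date FROM example").fetchone()
print(row)  # ('2022-11-28 14:03:00', '2022-11-28')
```

The point of the syntax is that writers never supply `date`; it always follows from `time`.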



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-41290) Support GENERATED ALWAYS AS in create table

2022-11-28 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640183#comment-17640183
 ] 

Apache Spark commented on SPARK-41290:
--

User 'allisonport-db' has created a pull request for this issue:
https://github.com/apache/spark/pull/38823

> Support GENERATED ALWAYS AS in create table
> ---
>
> Key: SPARK-41290
> URL: https://issues.apache.org/jira/browse/SPARK-41290
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Allison Portis
>Priority: Major
>
> Support GENERATED ALWAYS AS syntax for defining generated columns in create 
> table.
> For example,
>  
> {code:java}
> CREATE TABLE default.example (
>     time TIMESTAMP,
>     date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
> )
> {code}
>  
>  






[jira] [Assigned] (SPARK-41290) Support GENERATED ALWAYS AS in create table

2022-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41290:


Assignee: Apache Spark

> Support GENERATED ALWAYS AS in create table
> ---
>
> Key: SPARK-41290
> URL: https://issues.apache.org/jira/browse/SPARK-41290
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Allison Portis
>Assignee: Apache Spark
>Priority: Major
>
> Support GENERATED ALWAYS AS syntax for defining generated columns in create 
> table.
> For example,
>  
> {code:java}
> CREATE TABLE default.example (
>     time TIMESTAMP,
>     date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
> )
> {code}
>  
>  






[jira] [Assigned] (SPARK-41290) Support GENERATED ALWAYS AS in create table

2022-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41290:


Assignee: (was: Apache Spark)

> Support GENERATED ALWAYS AS in create table
> ---
>
> Key: SPARK-41290
> URL: https://issues.apache.org/jira/browse/SPARK-41290
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Allison Portis
>Priority: Major
>
> Support GENERATED ALWAYS AS syntax for defining generated columns in create 
> table.
> For example,
>  
> {code:java}
> CREATE TABLE default.example (
>     time TIMESTAMP,
>     date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
> )
> {code}
>  
>  






[jira] [Updated] (SPARK-41303) Assign a name to the error class _LEGACY_ERROR_TEMP_2422

2022-11-28 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-41303:
-
Description: Assign a name to the legacy error class 
_LEGACY_ERROR_TEMP_2422, improve error message and tests.  (was: Assign a name 
to the legacy error class _LEGACY_ERROR_TEMP_1106, improve error message and 
tests.)

> Assign a name to the error class _LEGACY_ERROR_TEMP_2422
> 
>
> Key: SPARK-41303
> URL: https://issues.apache.org/jira/browse/SPARK-41303
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: BingKun Pan
>Priority: Minor
>
> Assign a name to the legacy error class _LEGACY_ERROR_TEMP_2422, improve 
> error message and tests.






[jira] [Created] (SPARK-41303) Assign a name to the error class _LEGACY_ERROR_TEMP_2422

2022-11-28 Thread Max Gekk (Jira)
Max Gekk created SPARK-41303:


 Summary: Assign a name to the error class _LEGACY_ERROR_TEMP_2422
 Key: SPARK-41303
 URL: https://issues.apache.org/jira/browse/SPARK-41303
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.4.0
Reporter: BingKun Pan


Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1106, improve error 
message and tests.






[jira] [Created] (SPARK-41302) Assign a name to the error class _LEGACY_ERROR_TEMP_1185

2022-11-28 Thread Max Gekk (Jira)
Max Gekk created SPARK-41302:


 Summary: Assign a name to the error class _LEGACY_ERROR_TEMP_1185
 Key: SPARK-41302
 URL: https://issues.apache.org/jira/browse/SPARK-41302
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.4.0
Reporter: BingKun Pan


Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1106, improve error 
message and tests.






[jira] [Updated] (SPARK-41302) Assign a name to the error class _LEGACY_ERROR_TEMP_1185

2022-11-28 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-41302:
-
Description: Assign a name to the legacy error class 
_LEGACY_ERROR_TEMP_1185, improve error message and tests.  (was: Assign a name 
to the legacy error class _LEGACY_ERROR_TEMP_1106, improve error message and 
tests.)

> Assign a name to the error class _LEGACY_ERROR_TEMP_1185
> 
>
> Key: SPARK-41302
> URL: https://issues.apache.org/jira/browse/SPARK-41302
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: BingKun Pan
>Priority: Minor
>
> Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1185, improve 
> error message and tests.






[jira] [Resolved] (SPARK-41180) Assign an error class to "Cannot parse the data type"

2022-11-28 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk resolved SPARK-41180.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 38754
[https://github.com/apache/spark/pull/38754]

> Assign an error class to "Cannot parse the data type"
> -
>
> Key: SPARK-41180
> URL: https://issues.apache.org/jira/browse/SPARK-41180
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Max Gekk
>Assignee: Yang Jie
>Priority: Major
> Fix For: 3.4.0
>
>
> The code below shows the issue:
> {code}
> > select from_csv('1', 'a InvalidType');
> org.apache.spark.sql.AnalysisException
> {
>   "errorClass" : "LEGACY",
>   "messageParameters" : {
>     "message" : "Cannot parse the data type: \n[PARSE_SYNTAX_ERROR] Syntax error at or near 'InvalidType': extra input 'InvalidType'(line 1, pos 2)\n\n== SQL ==\na InvalidType\n--^^^\n\nFailed fallback parsing: \nDataType invalidtype is not supported.(line 1, pos 2)\n\n== SQL ==\na InvalidType\n--^^^\n; line 1 pos 7"
>   }
> }
> {code}






[jira] [Assigned] (SPARK-41180) Assign an error class to "Cannot parse the data type"

2022-11-28 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk reassigned SPARK-41180:


Assignee: Yang Jie

> Assign an error class to "Cannot parse the data type"
> -
>
> Key: SPARK-41180
> URL: https://issues.apache.org/jira/browse/SPARK-41180
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Max Gekk
>Assignee: Yang Jie
>Priority: Major
>
> The code below shows the issue:
> {code}
> > select from_csv('1', 'a InvalidType');
> org.apache.spark.sql.AnalysisException
> {
>   "errorClass" : "LEGACY",
>   "messageParameters" : {
>     "message" : "Cannot parse the data type: \n[PARSE_SYNTAX_ERROR] Syntax error at or near 'InvalidType': extra input 'InvalidType'(line 1, pos 2)\n\n== SQL ==\na InvalidType\n--^^^\n\nFailed fallback parsing: \nDataType invalidtype is not supported.(line 1, pos 2)\n\n== SQL ==\na InvalidType\n--^^^\n; line 1 pos 7"
>   }
> }
> {code}






[jira] [Assigned] (SPARK-41301) SparkSession.range should treat end as optional

2022-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41301:


Assignee: Apache Spark

> SparkSession.range should treat end as optional
> ---
>
> Key: SPARK-41301
> URL: https://issues.apache.org/jira/browse/SPARK-41301
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Martin Grund
>Assignee: Apache Spark
>Priority: Major
>







[jira] [Assigned] (SPARK-41301) SparkSession.range should treat end as optional

2022-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41301:


Assignee: (was: Apache Spark)

> SparkSession.range should treat end as optional
> ---
>
> Key: SPARK-41301
> URL: https://issues.apache.org/jira/browse/SPARK-41301
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Martin Grund
>Priority: Major
>







[jira] [Commented] (SPARK-41301) SparkSession.range should treat end as optional

2022-11-28 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640107#comment-17640107
 ] 

Apache Spark commented on SPARK-41301:
--

User 'grundprinzip' has created a pull request for this issue:
https://github.com/apache/spark/pull/38822

> SparkSession.range should treat end as optional
> ---
>
> Key: SPARK-41301
> URL: https://issues.apache.org/jira/browse/SPARK-41301
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Martin Grund
>Priority: Major
>







[jira] [Created] (SPARK-41301) SparkSession.range should treat end as optional

2022-11-28 Thread Martin Grund (Jira)
Martin Grund created SPARK-41301:


 Summary: SparkSession.range should treat end as optional
 Key: SPARK-41301
 URL: https://issues.apache.org/jira/browse/SPARK-41301
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Martin Grund
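
The issue has no description, so the following is only a guess at the intended behavior, based on the title: when `range` is called with a single argument, that argument acts as the end and the start defaults to 0. `normalize_range_args` is a hypothetical helper written for illustration; it is not part of Spark or of the linked pull request.

```python
from typing import Optional, Tuple


def normalize_range_args(
    start: int, end: Optional[int] = None, step: int = 1
) -> Tuple[int, int, int]:
    """Return (start, end, step), treating a lone argument as the end."""
    if end is None:
        # range(10) means "0 up to 10", so the single value becomes the end.
        return 0, start, step
    return start, end, step


print(normalize_range_args(10))        # (0, 10, 1)
print(normalize_range_args(2, 10, 3))  # (2, 10, 3)
```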









[jira] [Resolved] (SPARK-41300) Unset Read.schema is incorrectly read when unset

2022-11-28 Thread Herman van Hövell (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Herman van Hövell resolved SPARK-41300.
---
Fix Version/s: 3.4.0
 Assignee: Martin Grund
   Resolution: Fixed

> Unset Read.schema is incorrectly read when unset
> 
>
> Key: SPARK-41300
> URL: https://issues.apache.org/jira/browse/SPARK-41300
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Martin Grund
>Assignee: Martin Grund
>Priority: Major
> Fix For: 3.4.0
>
>
> The following query fails because the schema property is wrongly interpreted.
>  
> ```
> readDf = self.connect.read.format("csv").option("header", 
> True).load(path=tmpPath)
> ```
>  






[jira] [Assigned] (SPARK-41254) YarnAllocator.rpIdToYarnResource map is not properly updated

2022-11-28 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen reassigned SPARK-41254:


Assignee: Zhang Liang

> YarnAllocator.rpIdToYarnResource map is not properly updated
> 
>
> Key: SPARK-41254
> URL: https://issues.apache.org/jira/browse/SPARK-41254
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 3.1.0, 3.2.0, 3.3.1
>Reporter: Zhang Liang
>Assignee: Zhang Liang
>Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The log message "INFO YarnAllocator: Resource profile 0 doesn't exist, adding 
> it" is repeated many times in the YARN stderr.
> It should be logged only once, because it is emitted in 
> _YarnAllocator.createYarnResourceForResourceProfile_ for the default 
> ResourceProfile.
> After digging into the code, I found a bug caused by a misleading use of 
> _ConcurrentHashMap.contains_, which tests whether a value (not a key) is 
> present in the map.






[jira] [Resolved] (SPARK-41254) YarnAllocator.rpIdToYarnResource map is not properly updated

2022-11-28 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen resolved SPARK-41254.
--
Fix Version/s: 3.2.4
   3.3.2
   3.4.0
   Resolution: Fixed

Issue resolved by pull request 38790
[https://github.com/apache/spark/pull/38790]

> YarnAllocator.rpIdToYarnResource map is not properly updated
> 
>
> Key: SPARK-41254
> URL: https://issues.apache.org/jira/browse/SPARK-41254
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 3.1.0, 3.2.0, 3.3.1
>Reporter: Zhang Liang
>Assignee: Zhang Liang
>Priority: Minor
> Fix For: 3.2.4, 3.3.2, 3.4.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The log message "INFO YarnAllocator: Resource profile 0 doesn't exist, adding 
> it" is repeated many times in the YARN stderr.
> It should be logged only once, because it is emitted in 
> _YarnAllocator.createYarnResourceForResourceProfile_ for the default 
> ResourceProfile.
> After digging into the code, I found a bug caused by a misleading use of 
> _ConcurrentHashMap.contains_, which tests whether a value (not a key) is 
> present in the map.






[jira] [Assigned] (SPARK-41300) Unset Read.schema is incorrectly read when unset

2022-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41300:


Assignee: Apache Spark

> Unset Read.schema is incorrectly read when unset
> 
>
> Key: SPARK-41300
> URL: https://issues.apache.org/jira/browse/SPARK-41300
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Martin Grund
>Assignee: Apache Spark
>Priority: Major
>
> The following query fails because the schema property is wrongly interpreted.
>  
> ```
> readDf = self.connect.read.format("csv").option("header", 
> True).load(path=tmpPath)
> ```
>  






[jira] [Commented] (SPARK-41300) Unset Read.schema is incorrectly read when unset

2022-11-28 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640027#comment-17640027
 ] 

Apache Spark commented on SPARK-41300:
--

User 'grundprinzip' has created a pull request for this issue:
https://github.com/apache/spark/pull/38821

> Unset Read.schema is incorrectly read when unset
> 
>
> Key: SPARK-41300
> URL: https://issues.apache.org/jira/browse/SPARK-41300
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Martin Grund
>Priority: Major
>
> The following query fails because the schema property is wrongly interpreted.
>  
> ```
> readDf = self.connect.read.format("csv").option("header", 
> True).load(path=tmpPath)
> ```
>  






[jira] [Assigned] (SPARK-41300) Unset Read.schema is incorrectly read when unset

2022-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41300:


Assignee: (was: Apache Spark)

> Unset Read.schema is incorrectly read when unset
> 
>
> Key: SPARK-41300
> URL: https://issues.apache.org/jira/browse/SPARK-41300
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Martin Grund
>Priority: Major
>
> The following query fails because the schema property is wrongly interpreted.
>  
> ```
> readDf = self.connect.read.format("csv").option("header", 
> True).load(path=tmpPath)
> ```
>  






[jira] [Created] (SPARK-41300) Unset Read.schema is incorrectly read when unset

2022-11-28 Thread Martin Grund (Jira)
Martin Grund created SPARK-41300:


 Summary: Unset Read.schema is incorrectly read when unset
 Key: SPARK-41300
 URL: https://issues.apache.org/jira/browse/SPARK-41300
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Martin Grund


The following query fails because the schema property is wrongly interpreted.

 
```
readDf = self.connect.read.format("csv").option("header", 
True).load(path=tmpPath)
```
 






[jira] [Updated] (SPARK-41299) OOM when filter pushdown `last_day` function

2022-11-28 Thread André F. (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

André F. updated SPARK-41299:
-
Description: 
I'm using the following transformation on Spark 3.3.1:
{code:java}
df.where($"date" === last_day($"date")) {code}
Here `df` is a DataFrame created from a set of Parquet files. I'm trying to 
keep only the rows whose `date` is the last day of its month.

Executors are dying with the following error:
{code:java}
java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.regex.Pattern.compile(Pattern.java:1722) ~[?:1.8.0_252]
at java.util.regex.Pattern.<init>(Pattern.java:1352) ~[?:1.8.0_252]
at java.util.regex.Pattern.compile(Pattern.java:1028) ~[?:1.8.0_252] {code}
After *disabling* the predicate pushdown rule, the job works normally.

 Also, this works normally on Spark 3.3.0. I could not confirm that other 
date functions fail in the same way.

  was:
Using the following transformation on Spark 3.3.1:
{code:java}
df.where($"date" === last_day($"date")) {code}
Where `df` is a dataframe created from a set Parquet files.  I'm trying to 
filter dates where they match with the last day of the month of where `date` 
happened.

Executors are dying with the following error:
{code:java}
java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.regex.Pattern.compile(Pattern.java:1722) ~[?:1.8.0_252]
at java.util.regex.Pattern.<init>(Pattern.java:1352) ~[?:1.8.0_252]
at java.util.regex.Pattern.compile(Pattern.java:1028) ~[?:1.8.0_252] {code}
By *{*}disabling{*}* the predicate pushdown rule, the job works normally.

 Also, this works normally on Spark 3.3.0. I also couldn't verify other date 
functions failing on the same way.


> OOM when filter pushdown `last_day` function
> 
>
> Key: SPARK-41299
> URL: https://issues.apache.org/jira/browse/SPARK-41299
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer
>Affects Versions: 3.3.1
> Environment: Spark 3.3.1
> JDK 8 (openjdk version "1.8.0_352")
>Reporter: André F.
>Priority: Major
>
> I'm using the following transformation on Spark 3.3.1:
> {code:java}
> df.where($"date" === last_day($"date")) {code}
> Here `df` is a DataFrame created from a set of Parquet files. I'm trying to 
> keep only the rows whose `date` is the last day of its month.
> Executors are dying with the following error:
> {code:java}
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> at java.util.regex.Pattern.compile(Pattern.java:1722) ~[?:1.8.0_252]
> at java.util.regex.Pattern.<init>(Pattern.java:1352) ~[?:1.8.0_252]
> at java.util.regex.Pattern.compile(Pattern.java:1028) ~[?:1.8.0_252] {code}
> After *disabling* the predicate pushdown rule, the job works normally.
>  Also, this works normally on Spark 3.3.0. I could not confirm that other 
> date functions fail in the same way.
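
To make the intent of the filter concrete, here is a small plain-Python sketch of the condition `date === last_day(date)`. It does not use Spark and says nothing about the cause of the OOM; the helper names and sample dates are made up for illustration.

```python
import calendar
import datetime as dt


def last_day(d: dt.date) -> dt.date:
    """Last calendar day of the month that contains `d`."""
    days_in_month = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=days_in_month)


def is_month_end(d: dt.date) -> bool:
    """The row-level condition the filter expresses."""
    return d == last_day(d)


dates = [
    dt.date(2022, 2, 28),   # month end (2022 is not a leap year)
    dt.date(2022, 11, 28),  # not a month end
    dt.date(2024, 2, 28),   # not a month end (2024 is a leap year)
    dt.date(2024, 2, 29),   # month end
]
month_ends = [d for d in dates if is_month_end(d)]
print(month_ends)
```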






[jira] [Updated] (SPARK-41299) OOM when filter pushdown `last_day` function

2022-11-28 Thread André F. (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

André F. updated SPARK-41299:
-
Description: 
I'm using the following transformation on Spark 3.3.1:
{code:java}
df.where($"date" === last_day($"date")) {code}
Here `df` is a DataFrame created from a set of Parquet files. I'm trying to 
keep only the rows whose `date` is the last day of its month.

Executors are dying with the following error:
{code:java}
java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.regex.Pattern.compile(Pattern.java:1722) ~[?:1.8.0_252]
at java.util.regex.Pattern.<init>(Pattern.java:1352) ~[?:1.8.0_252]
at java.util.regex.Pattern.compile(Pattern.java:1028) ~[?:1.8.0_252] {code}
After *{*}disabling{*}* the predicate pushdown rule, the job works normally.

 Also, this works normally on Spark 3.3.0. I could not confirm that other 
date functions fail in the same way.

  was:
Using the following transformation on Spark 3.3.1:
df.where($"date" === last_day($"date"))
Where `df` is a dataframe created from a set Parquet files.  I'm trying to 
filter dates where they match with the last day of the month of where `date` 
happened.

Executors are dying with the following error:
java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.regex.Pattern.compile(Pattern.java:1722) ~[?:1.8.0_252]
at java.util.regex.Pattern.<init>(Pattern.java:1352) ~[?:1.8.0_252]
at java.util.regex.Pattern.compile(Pattern.java:1028) ~[?:1.8.0_252]
By **disabling** the predicate pushdown rule, the job works normally.

 Also, this works normally on Spark 3.3.0.


> OOM when filter pushdown `last_day` function
> 
>
> Key: SPARK-41299
> URL: https://issues.apache.org/jira/browse/SPARK-41299
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer
>Affects Versions: 3.3.1
> Environment: Spark 3.3.1
> JDK 8 (openjdk version "1.8.0_352")
>Reporter: André F.
>Priority: Major
>
> I'm using the following transformation on Spark 3.3.1:
> {code:java}
> df.where($"date" === last_day($"date")) {code}
> Here `df` is a DataFrame created from a set of Parquet files. I'm trying to 
> keep only the rows whose `date` is the last day of its month.
> Executors are dying with the following error:
> {code:java}
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> at java.util.regex.Pattern.compile(Pattern.java:1722) ~[?:1.8.0_252]
> at java.util.regex.Pattern.<init>(Pattern.java:1352) ~[?:1.8.0_252]
> at java.util.regex.Pattern.compile(Pattern.java:1028) ~[?:1.8.0_252] {code}
> After *{*}disabling{*}* the predicate pushdown rule, the job works normally.
>  Also, this works normally on Spark 3.3.0. I could not confirm that other 
> date functions fail in the same way.






[jira] [Created] (SPARK-41299) OOM when filter pushdown `last_day` function

2022-11-28 Thread André F. (Jira)
André F. created SPARK-41299:


 Summary: OOM when filter pushdown `last_day` function
 Key: SPARK-41299
 URL: https://issues.apache.org/jira/browse/SPARK-41299
 Project: Spark
  Issue Type: Bug
  Components: Optimizer
Affects Versions: 3.3.1
 Environment: Spark 3.3.1

JDK 8 (openjdk version "1.8.0_352")
Reporter: André F.


I'm using the following transformation on Spark 3.3.1:
df.where($"date" === last_day($"date"))
Here `df` is a DataFrame created from a set of Parquet files. I'm trying to 
keep only the rows whose `date` is the last day of its month.

Executors are dying with the following error:
java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.regex.Pattern.compile(Pattern.java:1722) ~[?:1.8.0_252]
at java.util.regex.Pattern.<init>(Pattern.java:1352) ~[?:1.8.0_252]
at java.util.regex.Pattern.compile(Pattern.java:1028) ~[?:1.8.0_252]
After **disabling** the predicate pushdown rule, the job works normally.

 Also, this works normally on Spark 3.3.0.






[jira] [Updated] (SPARK-41298) Getting Count on data frame is giving the performance issue

2022-11-28 Thread Ramakrishna (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramakrishna updated SPARK-41298:

Description: 
We are running the following against Teradata:

1) Dataframe df = spark.format("jdbc"). . . load();

2) int count = df.count();

When we execute df.count(), Spark internally issues the query below on 
Teradata. It wastes a lot of CPU there, and the DBAs are complaining about it.

 

Query : SELECT 1 FROM ()SPARK_SUB_TAB

Response:

1
1
1
1
1
..
1

 

Is this expected behavior from Spark, or is it a bug?

  was:
We are invoking  below query on Teradata 

1) Dataframe df = spark.format("jdbc"). . . load();

2) int count = df.count();

When we executed the df.count spark internally issuing the below query on 
teradata which is wasting the lot of CPU on teradata and DBAs are making noise 
by seeing this query.

 

Query : SELECT 1 FROM ()SPARK_SUB_TAB

Response:

1

1

1

1

1

..

1

 

Is this expected behavior form spark.


> Getting Count on data frame is giving the performance issue
> ---
>
> Key: SPARK-41298
> URL: https://issues.apache.org/jira/browse/SPARK-41298
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.4
>Reporter: Ramakrishna
>Priority: Major
>
> We are running the following against Teradata:
> 1) Dataframe df = spark.format("jdbc"). . . load();
> 2) int count = df.count();
> When we execute df.count(), Spark internally issues the query below on 
> Teradata. It wastes a lot of CPU there, and the DBAs are complaining about 
> it.
>  
> Query : SELECT 1 FROM ()SPARK_SUB_TAB
> Response:
> 1
> 1
> 1
> 1
> 1
> ..
> 1
>  
> Is this expected behavior from Spark, or is it a bug?
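
One way to avoid transferring a row per record is to ask the database for the count directly. The sketch below is only an illustration under assumptions, not a statement about what Spark 2.4.4 should do: `jdbc_count_options` is a hypothetical helper, the URL and table name are placeholders, and it relies on the JDBC data source's `query` option.

```python
from typing import Dict


def jdbc_count_options(url: str, table: str) -> Dict[str, str]:
    """Reader options that ask the database itself for the row count."""
    return {
        "url": url,
        # The statement runs on the database side, so a single row holding
        # the count comes back instead of one row per record.
        "query": f"SELECT COUNT(*) AS cnt FROM {table}",
    }


# Placeholder connection details (assumptions, not taken from the report).
opts = jdbc_count_options("jdbc:teradata://host/DATABASE=db", "my_table")
print(opts["query"])

# Hypothetical usage with an existing SparkSession named `spark`:
#   count = spark.read.format("jdbc").options(**opts).load().first()[0]
```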






[jira] [Created] (SPARK-41298) Getting Count on data frame is giving the performance issue

2022-11-28 Thread Ramakrishna (Jira)
Ramakrishna created SPARK-41298:
---

 Summary: Getting Count on data frame is giving the performance 
issue
 Key: SPARK-41298
 URL: https://issues.apache.org/jira/browse/SPARK-41298
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 2.4.4
Reporter: Ramakrishna


We are running the following against Teradata:

1) Dataframe df = spark.format("jdbc"). . . load();

2) int count = df.count();

When we execute df.count(), Spark internally issues the query below on 
Teradata. It wastes a lot of CPU there, and the DBAs are complaining about it.

 

Query : SELECT 1 FROM ()SPARK_SUB_TAB

Response:

1
1
1
1
1
..
1

 

Is this expected behavior from Spark?






[jira] [Updated] (SPARK-41293) Code cleanup for assertXXX methods in ExpressionTypeCheckingSuite

2022-11-28 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-41293:
-
Summary: Code cleanup for assertXXX methods in ExpressionTypeCheckingSuite  
(was: Deduplicate helper method in ExpressionTypeCheckingSuite)

> Code cleanup for assertXXX methods in ExpressionTypeCheckingSuite
> -
>
> Key: SPARK-41293
> URL: https://issues.apache.org/jira/browse/SPARK-41293
> Project: Spark
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Minor
> Fix For: 3.4.0
>
>
> https://github.com/apache/spark/blob/d979736a9eb754725d33fd5baca88a1c1a8c23ce/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala#L61-L108






[jira] [Assigned] (SPARK-41293) Deduplicate helper method in ExpressionTypeCheckingSuite

2022-11-28 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk reassigned SPARK-41293:


Assignee: Yang Jie

> Deduplicate helper method in ExpressionTypeCheckingSuite
> 
>
> Key: SPARK-41293
> URL: https://issues.apache.org/jira/browse/SPARK-41293
> Project: Spark
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Minor
>
> https://github.com/apache/spark/blob/d979736a9eb754725d33fd5baca88a1c1a8c23ce/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala#L61-L108






[jira] [Resolved] (SPARK-41293) Deduplicate helper method in ExpressionTypeCheckingSuite

2022-11-28 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk resolved SPARK-41293.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 38820
[https://github.com/apache/spark/pull/38820]

> Deduplicate helper method in ExpressionTypeCheckingSuite
> 
>
> Key: SPARK-41293
> URL: https://issues.apache.org/jira/browse/SPARK-41293
> Project: Spark
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Minor
> Fix For: 3.4.0
>
>
> https://github.com/apache/spark/blob/d979736a9eb754725d33fd5baca88a1c1a8c23ce/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala#L61-L108






[jira] [Updated] (SPARK-41297) Support string sql expressions in DF.where()

2022-11-28 Thread Martin Grund (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martin Grund updated SPARK-41297:
-
Priority: Blocker  (was: Major)

> Support string sql expressions in DF.where()
> 
>
> Key: SPARK-41297
> URL: https://issues.apache.org/jira/browse/SPARK-41297
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Martin Grund
>Priority: Blocker
>







[jira] [Created] (SPARK-41297) Support string sql expressions in DF.where()

2022-11-28 Thread Martin Grund (Jira)
Martin Grund created SPARK-41297:


 Summary: Support string sql expressions in DF.where()
 Key: SPARK-41297
 URL: https://issues.apache.org/jira/browse/SPARK-41297
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Martin Grund









[jira] [Updated] (SPARK-41296) Assign a name to the error class _LEGACY_ERROR_TEMP_1106

2022-11-28 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-41296:
-
Description: Assign a name to the legacy error class 
_LEGACY_ERROR_TEMP_1106, improve error message and tests.  (was: Assign a name 
to the legacy error class _LEGACY_ERROR_TEMP_1105, improve error message and 
tests.)

> Assign a name to the error class _LEGACY_ERROR_TEMP_1106
> 
>
> Key: SPARK-41296
> URL: https://issues.apache.org/jira/browse/SPARK-41296
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: BingKun Pan
>Priority: Minor
>
> Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1106, improve 
> error message and tests.






[jira] [Updated] (SPARK-41295) Assign a name to the error class _LEGACY_ERROR_TEMP_1105

2022-11-28 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-41295:
-
Description: Assign a name to the legacy error class 
_LEGACY_ERROR_TEMP_1105, improve error message and tests.  (was: Assign a name 
to the legacy error class _LEGACY_ERROR_TEMP_1203, improve error message and 
tests.)

> Assign a name to the error class _LEGACY_ERROR_TEMP_1105
> 
>
> Key: SPARK-41295
> URL: https://issues.apache.org/jira/browse/SPARK-41295
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: BingKun Pan
>Priority: Minor
>
> Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1105, improve 
> error message and tests.






[jira] [Created] (SPARK-41296) Assign a name to the error class _LEGACY_ERROR_TEMP_1106

2022-11-28 Thread Max Gekk (Jira)
Max Gekk created SPARK-41296:


 Summary: Assign a name to the error class _LEGACY_ERROR_TEMP_1106
 Key: SPARK-41296
 URL: https://issues.apache.org/jira/browse/SPARK-41296
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.4.0
Reporter: BingKun Pan


Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1105, improve error 
message and tests.






[jira] [Created] (SPARK-41295) Assign a name to the error class _LEGACY_ERROR_TEMP_1105

2022-11-28 Thread Max Gekk (Jira)
Max Gekk created SPARK-41295:


 Summary: Assign a name to the error class _LEGACY_ERROR_TEMP_1105
 Key: SPARK-41295
 URL: https://issues.apache.org/jira/browse/SPARK-41295
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.4.0
Reporter: BingKun Pan


Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1203, improve error 
message and tests.






[jira] [Updated] (SPARK-41294) Assign a name to the error class _LEGACY_ERROR_TEMP_1203

2022-11-28 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-41294:
-
Description: Assign a name to the legacy error class 
_LEGACY_ERROR_TEMP_1203, improve error message and tests.

> Assign a name to the error class _LEGACY_ERROR_TEMP_1203
> 
>
> Key: SPARK-41294
> URL: https://issues.apache.org/jira/browse/SPARK-41294
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: BingKun Pan
>Priority: Minor
>
> Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1203, improve 
> error message and tests.






[jira] [Created] (SPARK-41294) Assign a name to the error class _LEGACY_ERROR_TEMP_1203

2022-11-28 Thread Max Gekk (Jira)
Max Gekk created SPARK-41294:


 Summary: Assign a name to the error class _LEGACY_ERROR_TEMP_1203
 Key: SPARK-41294
 URL: https://issues.apache.org/jira/browse/SPARK-41294
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.4.0
Reporter: BingKun Pan
Assignee: BingKun Pan
 Fix For: 3.4.0









[jira] [Assigned] (SPARK-41294) Assign a name to the error class _LEGACY_ERROR_TEMP_1203

2022-11-28 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk reassigned SPARK-41294:


Assignee: (was: BingKun Pan)

> Assign a name to the error class _LEGACY_ERROR_TEMP_1203
> 
>
> Key: SPARK-41294
> URL: https://issues.apache.org/jira/browse/SPARK-41294
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: BingKun Pan
>Priority: Minor
>







[jira] [Updated] (SPARK-41294) Assign a name to the error class _LEGACY_ERROR_TEMP_1203

2022-11-28 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-41294:
-
Fix Version/s: (was: 3.4.0)

> Assign a name to the error class _LEGACY_ERROR_TEMP_1203
> 
>
> Key: SPARK-41294
> URL: https://issues.apache.org/jira/browse/SPARK-41294
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>







[jira] [Assigned] (SPARK-41287) Add a test workflow to help test image in fork repo

2022-11-28 Thread Yikun Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yikun Jiang reassigned SPARK-41287:
---

Assignee: Yikun Jiang

> Add a test workflow to help test image in fork repo
> ---
>
> Key: SPARK-41287
> URL: https://issues.apache.org/jira/browse/SPARK-41287
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Docker
>Affects Versions: 3.4.0
>Reporter: Yikun Jiang
>Assignee: Yikun Jiang
>Priority: Major
>







[jira] [Resolved] (SPARK-41287) Add a test workflow to help test image in fork repo

2022-11-28 Thread Yikun Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yikun Jiang resolved SPARK-41287.
-
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 26
[https://github.com/apache/spark-docker/pull/26]

> Add a test workflow to help test image in fork repo
> ---
>
> Key: SPARK-41287
> URL: https://issues.apache.org/jira/browse/SPARK-41287
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Docker
>Affects Versions: 3.4.0
>Reporter: Yikun Jiang
>Assignee: Yikun Jiang
>Priority: Major
> Fix For: 3.4.0
>
>







[jira] [Assigned] (SPARK-41293) Deduplicate helper method in ExpressionTypeCheckingSuite

2022-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41293:


Assignee: (was: Apache Spark)

> Deduplicate helper method in ExpressionTypeCheckingSuite
> 
>
> Key: SPARK-41293
> URL: https://issues.apache.org/jira/browse/SPARK-41293
> Project: Spark
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Minor
>
> https://github.com/apache/spark/blob/d979736a9eb754725d33fd5baca88a1c1a8c23ce/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala#L61-L108






[jira] [Commented] (SPARK-41293) Deduplicate helper method in ExpressionTypeCheckingSuite

2022-11-28 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17639910#comment-17639910
 ] 

Apache Spark commented on SPARK-41293:
--

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/38820

> Deduplicate helper method in ExpressionTypeCheckingSuite
> 
>
> Key: SPARK-41293
> URL: https://issues.apache.org/jira/browse/SPARK-41293
> Project: Spark
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Minor
>
> https://github.com/apache/spark/blob/d979736a9eb754725d33fd5baca88a1c1a8c23ce/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala#L61-L108






[jira] [Assigned] (SPARK-41293) Deduplicate helper method in ExpressionTypeCheckingSuite

2022-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41293:


Assignee: Apache Spark

> Deduplicate helper method in ExpressionTypeCheckingSuite
> 
>
> Key: SPARK-41293
> URL: https://issues.apache.org/jira/browse/SPARK-41293
> Project: Spark
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Minor
>
> https://github.com/apache/spark/blob/d979736a9eb754725d33fd5baca88a1c1a8c23ce/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala#L61-L108






[jira] [Resolved] (SPARK-41273) Update plugins to latest versions

2022-11-28 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng resolved SPARK-41273.
---
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 38809
[https://github.com/apache/spark/pull/38809]

> Update plugins to latest versions
> -
>
> Key: SPARK-41273
> URL: https://issues.apache.org/jira/browse/SPARK-41273
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.4.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
> Fix For: 3.4.0
>
>







[jira] [Assigned] (SPARK-41273) Update plugins to latest versions

2022-11-28 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng reassigned SPARK-41273:
-

Assignee: BingKun Pan

> Update plugins to latest versions
> -
>
> Key: SPARK-41273
> URL: https://issues.apache.org/jira/browse/SPARK-41273
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.4.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>







[jira] [Updated] (SPARK-41293) Deduplicate helper method in ExpressionTypeCheckingSuite

2022-11-28 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-41293:
-
Description: 
https://github.com/apache/spark/blob/d979736a9eb754725d33fd5baca88a1c1a8c23ce/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala#L61-L108

> Deduplicate helper method in ExpressionTypeCheckingSuite
> 
>
> Key: SPARK-41293
> URL: https://issues.apache.org/jira/browse/SPARK-41293
> Project: Spark
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Minor
>
> https://github.com/apache/spark/blob/d979736a9eb754725d33fd5baca88a1c1a8c23ce/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala#L61-L108






[jira] [Created] (SPARK-41293) Deduplicate helper method in ExpressionTypeCheckingSuite

2022-11-28 Thread Yang Jie (Jira)
Yang Jie created SPARK-41293:


 Summary: Deduplicate helper method in ExpressionTypeCheckingSuite
 Key: SPARK-41293
 URL: https://issues.apache.org/jira/browse/SPARK-41293
 Project: Spark
  Issue Type: Sub-task
  Components: Tests
Affects Versions: 3.4.0
Reporter: Yang Jie









[jira] [Assigned] (SPARK-41148) Implement `DataFrame.dropna ` and `DataFrame.na.drop `

2022-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41148:


Assignee: Ruifeng Zheng  (was: Apache Spark)

> Implement `DataFrame.dropna ` and `DataFrame.na.drop `
> --
>
> Key: SPARK-41148
> URL: https://issues.apache.org/jira/browse/SPARK-41148
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, PySpark
>Affects Versions: 3.4.0
>Reporter: Ruifeng Zheng
>Assignee: Ruifeng Zheng
>Priority: Major
>







[jira] [Assigned] (SPARK-41148) Implement `DataFrame.dropna ` and `DataFrame.na.drop `

2022-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41148:


Assignee: Apache Spark  (was: Ruifeng Zheng)

> Implement `DataFrame.dropna ` and `DataFrame.na.drop `
> --
>
> Key: SPARK-41148
> URL: https://issues.apache.org/jira/browse/SPARK-41148
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, PySpark
>Affects Versions: 3.4.0
>Reporter: Ruifeng Zheng
>Assignee: Apache Spark
>Priority: Major
>







[jira] [Commented] (SPARK-41148) Implement `DataFrame.dropna ` and `DataFrame.na.drop `

2022-11-28 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17639898#comment-17639898
 ] 

Apache Spark commented on SPARK-41148:
--

User 'zhengruifeng' has created a pull request for this issue:
https://github.com/apache/spark/pull/38819

> Implement `DataFrame.dropna ` and `DataFrame.na.drop `
> --
>
> Key: SPARK-41148
> URL: https://issues.apache.org/jira/browse/SPARK-41148
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, PySpark
>Affects Versions: 3.4.0
>Reporter: Ruifeng Zheng
>Assignee: Ruifeng Zheng
>Priority: Major
>







[jira] [Commented] (SPARK-41148) Implement `DataFrame.dropna ` and `DataFrame.na.drop `

2022-11-28 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17639897#comment-17639897
 ] 

Apache Spark commented on SPARK-41148:
--

User 'zhengruifeng' has created a pull request for this issue:
https://github.com/apache/spark/pull/38819

> Implement `DataFrame.dropna ` and `DataFrame.na.drop `
> --
>
> Key: SPARK-41148
> URL: https://issues.apache.org/jira/browse/SPARK-41148
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, PySpark
>Affects Versions: 3.4.0
>Reporter: Ruifeng Zheng
>Assignee: Ruifeng Zheng
>Priority: Major
>







[jira] [Commented] (SPARK-41238) Support more datatypes

2022-11-28 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17639892#comment-17639892
 ] 

Apache Spark commented on SPARK-41238:
--

User 'zhengruifeng' has created a pull request for this issue:
https://github.com/apache/spark/pull/38818

> Support more datatypes
> --
>
> Key: SPARK-41238
> URL: https://issues.apache.org/jira/browse/SPARK-41238
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark, SQL
>Affects Versions: 3.4.0
>Reporter: Ruifeng Zheng
>Assignee: Ruifeng Zheng
>Priority: Major
> Fix For: 3.4.0
>
>







[jira] [Resolved] (SPARK-41272) Assign a name to the error class _LEGACY_ERROR_TEMP_2019

2022-11-28 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk resolved SPARK-41272.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 38808
[https://github.com/apache/spark/pull/38808]

> Assign a name to the error class _LEGACY_ERROR_TEMP_2019
> 
>
> Key: SPARK-41272
> URL: https://issues.apache.org/jira/browse/SPARK-41272
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
> Fix For: 3.4.0
>
>







[jira] [Assigned] (SPARK-41272) Assign a name to the error class _LEGACY_ERROR_TEMP_2019

2022-11-28 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk reassigned SPARK-41272:


Assignee: BingKun Pan

> Assign a name to the error class _LEGACY_ERROR_TEMP_2019
> 
>
> Key: SPARK-41272
> URL: https://issues.apache.org/jira/browse/SPARK-41272
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>







[jira] [Resolved] (SPARK-41003) BHJ LeftAnti does not update numOutputRows when codegen is disabled

2022-11-28 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-41003.
-
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 38489
[https://github.com/apache/spark/pull/38489]

> BHJ LeftAnti does not update numOutputRows when codegen is disabled
> ---
>
> Key: SPARK-41003
> URL: https://issues.apache.org/jira/browse/SPARK-41003
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: dzcxzl
>Assignee: dzcxzl
>Priority: Minor
> Fix For: 3.4.0
>
>







[jira] [Assigned] (SPARK-41003) BHJ LeftAnti does not update numOutputRows when codegen is disabled

2022-11-28 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-41003:
---

Assignee: dzcxzl

> BHJ LeftAnti does not update numOutputRows when codegen is disabled
> ---
>
> Key: SPARK-41003
> URL: https://issues.apache.org/jira/browse/SPARK-41003
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: dzcxzl
>Assignee: dzcxzl
>Priority: Minor
>






