[jira] [Commented] (SPARK-41290) Support GENERATED ALWAYS AS in create table
[ https://issues.apache.org/jira/browse/SPARK-41290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640184#comment-17640184 ]

Apache Spark commented on SPARK-41290:
--------------------------------------

User 'allisonport-db' has created a pull request for this issue:
https://github.com/apache/spark/pull/38823

> Support GENERATED ALWAYS AS in create table
> -------------------------------------------
>
>                 Key: SPARK-41290
>                 URL: https://issues.apache.org/jira/browse/SPARK-41290
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Allison Portis
>            Priority: Major
>
> Support GENERATED ALWAYS AS syntax for defining generated columns in create
> table.
> For example,
>
> {code:java}
> CREATE TABLE default.example (
>     time TIMESTAMP,
>     date DATE GENERATED ALWAYS AS (CAST(time AS DATE))
> )
> {code}
>
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-41290) Support GENERATED ALWAYS AS in create table
[ https://issues.apache.org/jira/browse/SPARK-41290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41290: Assignee: Apache Spark > Support GENERATED ALWAYS AS in create table > --- > > Key: SPARK-41290 > URL: https://issues.apache.org/jira/browse/SPARK-41290 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 3.4.0 >Reporter: Allison Portis >Assignee: Apache Spark >Priority: Major > > Support GENERATED ALWAYS AS syntax for defining generated columns in create > table. > For example, > > {code:java} > CREATE TABLE default.example ( > time TIMESTAMP, > date DATE GENERATED ALWAYS AS (CAST(time AS DATE)) > ) > {code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-41290) Support GENERATED ALWAYS AS in create table
[ https://issues.apache.org/jira/browse/SPARK-41290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41290: Assignee: (was: Apache Spark) > Support GENERATED ALWAYS AS in create table > --- > > Key: SPARK-41290 > URL: https://issues.apache.org/jira/browse/SPARK-41290 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 3.4.0 >Reporter: Allison Portis >Priority: Major > > Support GENERATED ALWAYS AS syntax for defining generated columns in create > table. > For example, > > {code:java} > CREATE TABLE default.example ( > time TIMESTAMP, > date DATE GENERATED ALWAYS AS (CAST(time AS DATE)) > ) > {code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
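The SPARK-41290 example defines `date` as a value derived from `time` rather than one supplied by the writer. As a rough illustration of that idea only (not Spark's implementation), the Java sketch below computes the derived value for a single row. The class and method names are hypothetical, and the sketch ignores time zones, which a real `CAST(time AS DATE)` may take into account.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;

class GeneratedColumnSketch {
    // Hypothetical helper: what the generated `date` column would hold for a
    // given `time` value, i.e. the date part of the timestamp.
    // A null input yields a null result, as a SQL CAST of NULL would.
    static LocalDate deriveDate(LocalDateTime time) {
        return time == null ? null : time.toLocalDate();
    }

    public static void main(String[] args) {
        LocalDateTime time = LocalDateTime.of(2022, 11, 28, 23, 59, 30);
        System.out.println(time + " -> " + deriveDate(time));
    }
}
```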
[jira] [Updated] (SPARK-41303) Assign a name to the error class _LEGACY_ERROR_TEMP_2422
[ https://issues.apache.org/jira/browse/SPARK-41303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-41303: - Description: Assign a name to the legacy error class _LEGACY_ERROR_TEMP_2422, improve error message and tests. (was: Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1106, improve error message and tests.) > Assign a name to the error class _LEGACY_ERROR_TEMP_2422 > > > Key: SPARK-41303 > URL: https://issues.apache.org/jira/browse/SPARK-41303 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: BingKun Pan >Priority: Minor > > Assign a name to the legacy error class _LEGACY_ERROR_TEMP_2422, improve > error message and tests. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-41303) Assign a name to the error class _LEGACY_ERROR_TEMP_2422
Max Gekk created SPARK-41303: Summary: Assign a name to the error class _LEGACY_ERROR_TEMP_2422 Key: SPARK-41303 URL: https://issues.apache.org/jira/browse/SPARK-41303 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: BingKun Pan Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1106, improve error message and tests. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-41302) Assign a name to the error class _LEGACY_ERROR_TEMP_1185
Max Gekk created SPARK-41302: Summary: Assign a name to the error class _LEGACY_ERROR_TEMP_1185 Key: SPARK-41302 URL: https://issues.apache.org/jira/browse/SPARK-41302 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: BingKun Pan Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1106, improve error message and tests. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-41302) Assign a name to the error class _LEGACY_ERROR_TEMP_1185
[ https://issues.apache.org/jira/browse/SPARK-41302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-41302: - Description: Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1185, improve error message and tests. (was: Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1106, improve error message and tests.) > Assign a name to the error class _LEGACY_ERROR_TEMP_1185 > > > Key: SPARK-41302 > URL: https://issues.apache.org/jira/browse/SPARK-41302 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: BingKun Pan >Priority: Minor > > Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1185, improve > error message and tests. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-41180) Assign an error class to "Cannot parse the data type"
[ https://issues.apache.org/jira/browse/SPARK-41180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Gekk resolved SPARK-41180.
------------------------------
    Fix Version/s: 3.4.0
       Resolution: Fixed

Issue resolved by pull request 38754
[https://github.com/apache/spark/pull/38754]

> Assign an error class to "Cannot parse the data type"
> -----------------------------------------------------
>
>                 Key: SPARK-41180
>                 URL: https://issues.apache.org/jira/browse/SPARK-41180
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Max Gekk
>            Assignee: Yang Jie
>            Priority: Major
>             Fix For: 3.4.0
>
>
> The code below shows the issue:
> {code}
> > select from_csv('1', 'a InvalidType');
> org.apache.spark.sql.AnalysisException
> {
>   "errorClass" : "LEGACY",
>   "messageParameters" : {
>     "message" : "Cannot parse the data type: \n[PARSE_SYNTAX_ERROR] Syntax
> error at or near 'InvalidType': extra input 'InvalidType'(line 1, pos
> 2)\n\n== SQL ==\na InvalidType\n--^^^\n\nFailed fallback parsing: \nDataType
> invalidtype is not supported.(line 1, pos 2)\n\n== SQL ==\na
> InvalidType\n--^^^\n; line 1 pos 7"
>   }
> {code}
[jira] [Assigned] (SPARK-41180) Assign an error class to "Cannot parse the data type"
[ https://issues.apache.org/jira/browse/SPARK-41180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-41180: Assignee: Yang Jie > Assign an error class to "Cannot parse the data type" > - > > Key: SPARK-41180 > URL: https://issues.apache.org/jira/browse/SPARK-41180 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Assignee: Yang Jie >Priority: Major > > The code below shows the issue: > {code} > > select from_csv('1', 'a InvalidType'); > org.apache.spark.sql.AnalysisException > { > "errorClass" : "LEGACY", > "messageParameters" : { > "message" : "Cannot parse the data type: \n[PARSE_SYNTAX_ERROR] Syntax > error at or near 'InvalidType': extra input 'InvalidType'(line 1, pos > 2)\n\n== SQL ==\na InvalidType\n--^^^\n\nFailed fallback parsing: \nDataType > invalidtype is not supported.(line 1, pos 2)\n\n== SQL ==\na > InvalidType\n--^^^\n; line 1 pos 7" > } > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-41301) SparkSession.range should treat end as optional
[ https://issues.apache.org/jira/browse/SPARK-41301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41301: Assignee: Apache Spark > SparkSession.range should treat end as optional > --- > > Key: SPARK-41301 > URL: https://issues.apache.org/jira/browse/SPARK-41301 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Martin Grund >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-41301) SparkSession.range should treat end as optional
[ https://issues.apache.org/jira/browse/SPARK-41301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41301: Assignee: (was: Apache Spark) > SparkSession.range should treat end as optional > --- > > Key: SPARK-41301 > URL: https://issues.apache.org/jira/browse/SPARK-41301 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Martin Grund >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-41301) SparkSession.range should treat end as optional
[ https://issues.apache.org/jira/browse/SPARK-41301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640107#comment-17640107 ] Apache Spark commented on SPARK-41301: -- User 'grundprinzip' has created a pull request for this issue: https://github.com/apache/spark/pull/38822 > SparkSession.range should treat end as optional > --- > > Key: SPARK-41301 > URL: https://issues.apache.org/jira/browse/SPARK-41301 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Martin Grund >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-41301) SparkSession.range should treat end as optional
Martin Grund created SPARK-41301: Summary: SparkSession.range should treat end as optional Key: SPARK-41301 URL: https://issues.apache.org/jira/browse/SPARK-41301 Project: Spark Issue Type: Sub-task Components: Connect Affects Versions: 3.4.0 Reporter: Martin Grund -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
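SPARK-41301 carries no description beyond its title. One plausible reading, assumed here, is the convention PySpark's `SparkSession.range(start, end=None, step=1)` already follows: when only one argument is given, it is the exclusive end and the range starts at 0. The Java sketch below illustrates that convention with a hypothetical helper; it is not the Spark Connect API, and the ticket does not confirm that the fix works this way.

```java
import java.util.Arrays;
import java.util.stream.LongStream;

class RangeSketch {
    // Hypothetical helper, not Spark's API: when `end` is null, the single
    // argument is treated as the exclusive end and the range starts at 0.
    static long[] range(long start, Long end) {
        long lo = (end == null) ? 0L : start;
        long hi = (end == null) ? start : end;
        return LongStream.range(lo, hi).toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(range(3, null))); // one-argument form
        System.out.println(Arrays.toString(range(2, 5L)));   // explicit start and end
    }
}
```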
[jira] [Resolved] (SPARK-41300) Unset Read.schema is incorrectly read when unset
[ https://issues.apache.org/jira/browse/SPARK-41300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Herman van Hövell resolved SPARK-41300.
---------------------------------------
    Fix Version/s: 3.4.0
         Assignee: Martin Grund
       Resolution: Fixed

> Unset Read.schema is incorrectly read when unset
> ------------------------------------------------
>
>                 Key: SPARK-41300
>                 URL: https://issues.apache.org/jira/browse/SPARK-41300
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Connect
>    Affects Versions: 3.4.0
>            Reporter: Martin Grund
>            Assignee: Martin Grund
>            Priority: Major
>             Fix For: 3.4.0
>
>
> The following query fails because the schema property is wrongly interpreted.
>
> ```
> readDf = self.connect.read.format("csv").option("header", True).load(path=tmpPath)
> ```
>
[jira] [Assigned] (SPARK-41254) YarnAllocator.rpIdToYarnResource map is not properly updated
[ https://issues.apache.org/jira/browse/SPARK-41254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen reassigned SPARK-41254: Assignee: Zhang Liang > YarnAllocator.rpIdToYarnResource map is not properly updated > > > Key: SPARK-41254 > URL: https://issues.apache.org/jira/browse/SPARK-41254 > Project: Spark > Issue Type: Bug > Components: YARN >Affects Versions: 3.1.0, 3.2.0, 3.3.1 >Reporter: Zhang Liang >Assignee: Zhang Liang >Priority: Minor > Original Estimate: 24h > Remaining Estimate: 24h > > Log messages "INFO YarnAllocator: Resource profile 0 doesn't exist, adding > it" repeats multiple times in yarn stderr. > The log should be outputted only once because it happens in > _YarnAllocator.createYarnResourceForResourceProfile_ on a default > ResourceProfile > After digging into the code, I found a bug caused by misleading usage of > _ConcurrentHashMap.contains_ -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-41254) YarnAllocator.rpIdToYarnResource map is not properly updated
[ https://issues.apache.org/jira/browse/SPARK-41254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean R. Owen resolved SPARK-41254.
----------------------------------
    Fix Version/s: 3.2.4
                   3.3.2
                   3.4.0
       Resolution: Fixed

Issue resolved by pull request 38790
[https://github.com/apache/spark/pull/38790]

> YarnAllocator.rpIdToYarnResource map is not properly updated
> ------------------------------------------------------------
>
>                 Key: SPARK-41254
>                 URL: https://issues.apache.org/jira/browse/SPARK-41254
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 3.1.0, 3.2.0, 3.3.1
>            Reporter: Zhang Liang
>            Assignee: Zhang Liang
>            Priority: Minor
>             Fix For: 3.2.4, 3.3.2, 3.4.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Log messages "INFO YarnAllocator: Resource profile 0 doesn't exist, adding
> it" repeats multiple times in yarn stderr.
> The log should be outputted only once because it happens in
> _YarnAllocator.createYarnResourceForResourceProfile_ on a default
> ResourceProfile
> After digging into the code, I found a bug caused by misleading usage of
> _ConcurrentHashMap.contains_
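The SPARK-41254 description blames a misleading use of `ConcurrentHashMap.contains`. In the JDK, `contains(Object)` is a legacy method that tests whether some key maps to the given *value* (it behaves like `containsValue`), so passing a key to it normally returns false. The Java sketch below reproduces that pitfall with a hypothetical map from profile id to a resource string. It is not the YarnAllocator code, and the ticket text here does not say exactly how the fix was written.

```java
import java.util.concurrent.ConcurrentHashMap;

class ContainsSketch {
    // Buggy check: contains(Object) compares against the map's VALUES, so a
    // key never matches and the entry is "added" on every call.
    static boolean addIfMissingBuggy(ConcurrentHashMap<Integer, String> map, int rpId) {
        if (!map.contains(rpId)) {
            map.put(rpId, "resource-" + rpId);
            return true; // corresponds to logging "doesn't exist, adding it"
        }
        return false;
    }

    // Key-based check: containsKey(Object) looks at the keys, as intended.
    static boolean addIfMissingFixed(ConcurrentHashMap<Integer, String> map, int rpId) {
        if (!map.containsKey(rpId)) {
            map.put(rpId, "resource-" + rpId);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<Integer, String> buggy = new ConcurrentHashMap<>();
        System.out.println(addIfMissingBuggy(buggy, 0));
        System.out.println(addIfMissingBuggy(buggy, 0));

        ConcurrentHashMap<Integer, String> fixed = new ConcurrentHashMap<>();
        System.out.println(addIfMissingFixed(fixed, 0));
        System.out.println(addIfMissingFixed(fixed, 0));
    }
}
```

With the value-based check, the second call for the same id still reports the entry as missing, which matches the repeated log line described in the issue.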
[jira] [Assigned] (SPARK-41300) Unset Read.schema is incorrectly read when unset
[ https://issues.apache.org/jira/browse/SPARK-41300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41300: Assignee: Apache Spark > Unset Read.schema is incorrectly read when unset > > > Key: SPARK-41300 > URL: https://issues.apache.org/jira/browse/SPARK-41300 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Martin Grund >Assignee: Apache Spark >Priority: Major > > The following query fails because the schema property is wrongly interpreted. > > ``` > readDf = self.connect.read.format("csv").option("header", > True).load(path=tmpPath) > ``` > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-41300) Unset Read.schema is incorrectly read when unset
[ https://issues.apache.org/jira/browse/SPARK-41300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640027#comment-17640027 ] Apache Spark commented on SPARK-41300: -- User 'grundprinzip' has created a pull request for this issue: https://github.com/apache/spark/pull/38821 > Unset Read.schema is incorrectly read when unset > > > Key: SPARK-41300 > URL: https://issues.apache.org/jira/browse/SPARK-41300 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Martin Grund >Priority: Major > > The following query fails because the schema property is wrongly interpreted. > > ``` > readDf = self.connect.read.format("csv").option("header", > True).load(path=tmpPath) > ``` > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-41300) Unset Read.schema is incorrectly read when unset
[ https://issues.apache.org/jira/browse/SPARK-41300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41300: Assignee: (was: Apache Spark) > Unset Read.schema is incorrectly read when unset > > > Key: SPARK-41300 > URL: https://issues.apache.org/jira/browse/SPARK-41300 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Martin Grund >Priority: Major > > The following query fails because the schema property is wrongly interpreted. > > ``` > readDf = self.connect.read.format("csv").option("header", > True).load(path=tmpPath) > ``` > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-41300) Unset Read.schema is incorrectly read when unset
Martin Grund created SPARK-41300: Summary: Unset Read.schema is incorrectly read when unset Key: SPARK-41300 URL: https://issues.apache.org/jira/browse/SPARK-41300 Project: Spark Issue Type: Sub-task Components: Connect Affects Versions: 3.4.0 Reporter: Martin Grund The following query fails because the schema property is wrongly interpreted. ``` readDf = self.connect.read.format("csv").option("header", True).load(path=tmpPath) ``` -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-41299) OOM when filter pushdown `last_day` function
[ https://issues.apache.org/jira/browse/SPARK-41299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

André F. updated SPARK-41299:
-----------------------------
    Description: 
Using the following transformation on Spark 3.3.1:
{code:java}
df.where($"date" === last_day($"date")) {code}
Where `df` is a dataframe created from a set Parquet files. I'm trying to filter dates where they match with the last day of the month of where `date` happened.

Executors are dying with the following error:
{code:java}
java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.regex.Pattern.compile(Pattern.java:1722) ~[?:1.8.0_252]
at java.util.regex.Pattern.<init>(Pattern.java:1352) ~[?:1.8.0_252]
at java.util.regex.Pattern.compile(Pattern.java:1028) ~[?:1.8.0_252] {code}
By *disabling* the predicate pushdown rule, the job works normally.

Also, this works normally on Spark 3.3.0. I also couldn't verify other date functions failing on the same way.

  was:
Using the following transformation on Spark 3.3.1:
{code:java}
df.where($"date" === last_day($"date")) {code}
Where `df` is a dataframe created from a set Parquet files. I'm trying to filter dates where they match with the last day of the month of where `date` happened.

Executors are dying with the following error:
{code:java}
java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.regex.Pattern.compile(Pattern.java:1722) ~[?:1.8.0_252]
at java.util.regex.Pattern.<init>(Pattern.java:1352) ~[?:1.8.0_252]
at java.util.regex.Pattern.compile(Pattern.java:1028) ~[?:1.8.0_252] {code}
By *{*}disabling{*}* the predicate pushdown rule, the job works normally.

Also, this works normally on Spark 3.3.0. I also couldn't verify other date functions failing on the same way.


> OOM when filter pushdown `last_day` function
> --------------------------------------------
>
>                 Key: SPARK-41299
>                 URL: https://issues.apache.org/jira/browse/SPARK-41299
>             Project: Spark
>          Issue Type: Bug
>          Components: Optimizer
>    Affects Versions: 3.3.1
>         Environment: Spark 3.3.1
> JDK 8 (openjdk version "1.8.0_352")
>            Reporter: André F.
>            Priority: Major
>
> Using the following transformation on Spark 3.3.1:
> {code:java}
> df.where($"date" === last_day($"date")) {code}
> Where `df` is a dataframe created from a set Parquet files. I'm trying to
> filter dates where they match with the last day of the month of where `date`
> happened.
> Executors are dying with the following error:
> {code:java}
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> at java.util.regex.Pattern.compile(Pattern.java:1722) ~[?:1.8.0_252]
> at java.util.regex.Pattern.<init>(Pattern.java:1352) ~[?:1.8.0_252]
> at java.util.regex.Pattern.compile(Pattern.java:1028) ~[?:1.8.0_252] {code}
> By *disabling* the predicate pushdown rule, the job works normally.
> Also, this works normally on Spark 3.3.0. I also couldn't verify other date
> functions failing on the same way.
[jira] [Updated] (SPARK-41299) OOM when filter pushdown `last_day` function
[ https://issues.apache.org/jira/browse/SPARK-41299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

André F. updated SPARK-41299:
-----------------------------
    Description: 
Using the following transformation on Spark 3.3.1:
{code:java}
df.where($"date" === last_day($"date")) {code}
Where `df` is a dataframe created from a set Parquet files. I'm trying to filter dates where they match with the last day of the month of where `date` happened.

Executors are dying with the following error:
{code:java}
java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.regex.Pattern.compile(Pattern.java:1722) ~[?:1.8.0_252]
at java.util.regex.Pattern.<init>(Pattern.java:1352) ~[?:1.8.0_252]
at java.util.regex.Pattern.compile(Pattern.java:1028) ~[?:1.8.0_252] {code}
By *{*}disabling{*}* the predicate pushdown rule, the job works normally.

Also, this works normally on Spark 3.3.0. I also couldn't verify other date functions failing on the same way.

  was:
Using the following transformation on Spark 3.3.1:

df.where($"date" === last_day($"date"))

Where `df` is a dataframe created from a set Parquet files. I'm trying to filter dates where they match with the last day of the month of where `date` happened.

Executors are dying with the following error:

java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.regex.Pattern.compile(Pattern.java:1722) ~[?:1.8.0_252]
at java.util.regex.Pattern.<init>(Pattern.java:1352) ~[?:1.8.0_252]
at java.util.regex.Pattern.compile(Pattern.java:1028) ~[?:1.8.0_252]

By **disabling** the predicate pushdown rule, the job works normally.

Also, this works normally on Spark 3.3.0.


> OOM when filter pushdown `last_day` function
> --------------------------------------------
>
>                 Key: SPARK-41299
>                 URL: https://issues.apache.org/jira/browse/SPARK-41299
>             Project: Spark
>          Issue Type: Bug
>          Components: Optimizer
>    Affects Versions: 3.3.1
>         Environment: Spark 3.3.1
> JDK 8 (openjdk version "1.8.0_352")
>            Reporter: André F.
>            Priority: Major
>
> Using the following transformation on Spark 3.3.1:
> {code:java}
> df.where($"date" === last_day($"date")) {code}
> Where `df` is a dataframe created from a set Parquet files. I'm trying to
> filter dates where they match with the last day of the month of where `date`
> happened.
> Executors are dying with the following error:
> {code:java}
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> at java.util.regex.Pattern.compile(Pattern.java:1722) ~[?:1.8.0_252]
> at java.util.regex.Pattern.<init>(Pattern.java:1352) ~[?:1.8.0_252]
> at java.util.regex.Pattern.compile(Pattern.java:1028) ~[?:1.8.0_252] {code}
> By *{*}disabling{*}* the predicate pushdown rule, the job works normally.
> Also, this works normally on Spark 3.3.0. I also couldn't verify other date
> functions failing on the same way.
[jira] [Created] (SPARK-41299) OOM when filter pushdown `last_day` function
André F. created SPARK-41299:
--------------------------------

             Summary: OOM when filter pushdown `last_day` function
                 Key: SPARK-41299
                 URL: https://issues.apache.org/jira/browse/SPARK-41299
             Project: Spark
          Issue Type: Bug
          Components: Optimizer
    Affects Versions: 3.3.1
         Environment: Spark 3.3.1
JDK 8 (openjdk version "1.8.0_352")
            Reporter: André F.


Using the following transformation on Spark 3.3.1:

df.where($"date" === last_day($"date"))

Where `df` is a dataframe created from a set Parquet files. I'm trying to filter dates where they match with the last day of the month of where `date` happened.

Executors are dying with the following error:

java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.regex.Pattern.compile(Pattern.java:1722) ~[?:1.8.0_252]
at java.util.regex.Pattern.<init>(Pattern.java:1352) ~[?:1.8.0_252]
at java.util.regex.Pattern.compile(Pattern.java:1028) ~[?:1.8.0_252]

By **disabling** the predicate pushdown rule, the job works normally.

Also, this works normally on Spark 3.3.0.
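For readers unfamiliar with the predicate in SPARK-41299, `$"date" === last_day($"date")` keeps only rows whose date is the final day of its own month. The Java sketch below shows the same per-row test with `java.time`. The class and method names are hypothetical, and the sketch says nothing about the pushdown or the out-of-memory failure itself.

```java
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

class LastDaySketch {
    // Last day of the month that `date` falls in.
    static LocalDate lastDay(LocalDate date) {
        return date.with(TemporalAdjusters.lastDayOfMonth());
    }

    // Per-row equivalent of the filter `date === last_day(date)`.
    static boolean isLastDayOfMonth(LocalDate date) {
        return date.equals(lastDay(date));
    }

    public static void main(String[] args) {
        System.out.println(isLastDayOfMonth(LocalDate.of(2022, 11, 30)));
        System.out.println(isLastDayOfMonth(LocalDate.of(2022, 11, 29)));
    }
}
```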
[jira] [Updated] (SPARK-41298) Getting Count on data frame is giving the performance issue
[ https://issues.apache.org/jira/browse/SPARK-41298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramakrishna updated SPARK-41298:
--------------------------------
    Description: 
We are invoking below query on Teradata 
1) Dataframe df = spark.format("jdbc"). . . load();
2) int count = df.count();

When we executed the df.count spark internally issuing the below query on teradata which is wasting the lot of CPU on teradata and DBAs are making noise by seeing this query.

Query : SELECT 1 FROM ()SPARK_SUB_TAB

Response:
1
1
1
1
1
..
1

Is this expected behavior from spark or is it bug.

  was:
We are invoking below query on Teradata 
1) Dataframe df = spark.format("jdbc"). . . load();
2) int count = df.count();

When we executed the df.count spark internally issuing the below query on teradata which is wasting the lot of CPU on teradata and DBAs are making noise by seeing this query.

Query : SELECT 1 FROM ()SPARK_SUB_TAB

Response:
1
1
1
1
1
..
1

Is this expected behavior form spark.


> Getting Count on data frame is giving the performance issue
> -----------------------------------------------------------
>
>                 Key: SPARK-41298
>                 URL: https://issues.apache.org/jira/browse/SPARK-41298
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.4
>            Reporter: Ramakrishna
>            Priority: Major
>
> We are invoking below query on Teradata
> 1) Dataframe df = spark.format("jdbc"). . . load();
> 2) int count = df.count();
> When we executed the df.count spark internally issuing the below query on
> teradata which is wasting the lot of CPU on teradata and DBAs are making
> noise by seeing this query.
>
> Query : SELECT 1 FROM ()SPARK_SUB_TAB
> Response:
> 1
> 1
> 1
> 1
> 1
> ..
> 1
>
> Is this expected behavior from spark or is it bug.
[jira] [Created] (SPARK-41298) Getting Count on data frame is giving the performance issue
Ramakrishna created SPARK-41298:
-----------------------------------

             Summary: Getting Count on data frame is giving the performance issue
                 Key: SPARK-41298
                 URL: https://issues.apache.org/jira/browse/SPARK-41298
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.4.4
            Reporter: Ramakrishna


We are invoking below query on Teradata 
1) Dataframe df = spark.format("jdbc"). . . load();
2) int count = df.count();

When we executed the df.count spark internally issuing the below query on teradata which is wasting the lot of CPU on teradata and DBAs are making noise by seeing this query.

Query : SELECT 1 FROM ()SPARK_SUB_TAB

Response:
1
1
1
1
1
..
1

Is this expected behavior form spark.
[jira] [Updated] (SPARK-41293) Code cleanup for assertXXX methods in ExpressionTypeCheckingSuite
[ https://issues.apache.org/jira/browse/SPARK-41293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-41293: - Summary: Code cleanup for assertXXX methods in ExpressionTypeCheckingSuite (was: Deduplicate helper method in ExpressionTypeCheckingSuite) > Code cleanup for assertXXX methods in ExpressionTypeCheckingSuite > - > > Key: SPARK-41293 > URL: https://issues.apache.org/jira/browse/SPARK-41293 > Project: Spark > Issue Type: Sub-task > Components: Tests >Affects Versions: 3.4.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Minor > Fix For: 3.4.0 > > > https://github.com/apache/spark/blob/d979736a9eb754725d33fd5baca88a1c1a8c23ce/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala#L61-L108 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-41293) Deduplicate helper method in ExpressionTypeCheckingSuite
[ https://issues.apache.org/jira/browse/SPARK-41293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-41293: Assignee: Yang Jie > Deduplicate helper method in ExpressionTypeCheckingSuite > > > Key: SPARK-41293 > URL: https://issues.apache.org/jira/browse/SPARK-41293 > Project: Spark > Issue Type: Sub-task > Components: Tests >Affects Versions: 3.4.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Minor > > https://github.com/apache/spark/blob/d979736a9eb754725d33fd5baca88a1c1a8c23ce/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala#L61-L108 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-41293) Deduplicate helper method in ExpressionTypeCheckingSuite
[ https://issues.apache.org/jira/browse/SPARK-41293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-41293. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 38820 [https://github.com/apache/spark/pull/38820] > Deduplicate helper method in ExpressionTypeCheckingSuite > > > Key: SPARK-41293 > URL: https://issues.apache.org/jira/browse/SPARK-41293 > Project: Spark > Issue Type: Sub-task > Components: Tests >Affects Versions: 3.4.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Minor > Fix For: 3.4.0 > > > https://github.com/apache/spark/blob/d979736a9eb754725d33fd5baca88a1c1a8c23ce/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala#L61-L108
[jira] [Updated] (SPARK-41297) Support string sql expressions in DF.where()
[ https://issues.apache.org/jira/browse/SPARK-41297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martin Grund updated SPARK-41297: - Priority: Blocker (was: Major) > Support string sql expressions in DF.where() > > > Key: SPARK-41297 > URL: https://issues.apache.org/jira/browse/SPARK-41297 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Martin Grund >Priority: Blocker
[jira] [Created] (SPARK-41297) Support string sql expressions in DF.where()
Martin Grund created SPARK-41297: Summary: Support string sql expressions in DF.where() Key: SPARK-41297 URL: https://issues.apache.org/jira/browse/SPARK-41297 Project: Spark Issue Type: Sub-task Components: Connect Affects Versions: 3.4.0 Reporter: Martin Grund
[jira] [Updated] (SPARK-41296) Assign a name to the error class _LEGACY_ERROR_TEMP_1106
[ https://issues.apache.org/jira/browse/SPARK-41296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-41296: - Description: Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1106, improve error message and tests. (was: Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1105, improve error message and tests.) > Assign a name to the error class _LEGACY_ERROR_TEMP_1106 > > > Key: SPARK-41296 > URL: https://issues.apache.org/jira/browse/SPARK-41296 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: BingKun Pan >Priority: Minor > > Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1106, improve > error message and tests.
[jira] [Updated] (SPARK-41295) Assign a name to the error class _LEGACY_ERROR_TEMP_1105
[ https://issues.apache.org/jira/browse/SPARK-41295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-41295: - Description: Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1105, improve error message and tests. (was: Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1203, improve error message and tests.) > Assign a name to the error class _LEGACY_ERROR_TEMP_1105 > > > Key: SPARK-41295 > URL: https://issues.apache.org/jira/browse/SPARK-41295 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: BingKun Pan >Priority: Minor > > Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1105, improve > error message and tests.
[jira] [Created] (SPARK-41296) Assign a name to the error class _LEGACY_ERROR_TEMP_1106
Max Gekk created SPARK-41296: Summary: Assign a name to the error class _LEGACY_ERROR_TEMP_1106 Key: SPARK-41296 URL: https://issues.apache.org/jira/browse/SPARK-41296 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: BingKun Pan Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1105, improve error message and tests.
[jira] [Created] (SPARK-41295) Assign a name to the error class _LEGACY_ERROR_TEMP_1105
Max Gekk created SPARK-41295: Summary: Assign a name to the error class _LEGACY_ERROR_TEMP_1105 Key: SPARK-41295 URL: https://issues.apache.org/jira/browse/SPARK-41295 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: BingKun Pan Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1203, improve error message and tests.
[jira] [Updated] (SPARK-41294) Assign a name to the error class _LEGACY_ERROR_TEMP_1203
[ https://issues.apache.org/jira/browse/SPARK-41294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-41294: - Description: Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1203, improve error message and tests. > Assign a name to the error class _LEGACY_ERROR_TEMP_1203 > > > Key: SPARK-41294 > URL: https://issues.apache.org/jira/browse/SPARK-41294 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: BingKun Pan >Priority: Minor > > Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1203, improve > error message and tests.
[jira] [Created] (SPARK-41294) Assign a name to the error class _LEGACY_ERROR_TEMP_1203
Max Gekk created SPARK-41294: Summary: Assign a name to the error class _LEGACY_ERROR_TEMP_1203 Key: SPARK-41294 URL: https://issues.apache.org/jira/browse/SPARK-41294 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: BingKun Pan Assignee: BingKun Pan Fix For: 3.4.0
[jira] [Assigned] (SPARK-41294) Assign a name to the error class _LEGACY_ERROR_TEMP_1203
[ https://issues.apache.org/jira/browse/SPARK-41294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-41294: Assignee: (was: BingKun Pan) > Assign a name to the error class _LEGACY_ERROR_TEMP_1203 > > > Key: SPARK-41294 > URL: https://issues.apache.org/jira/browse/SPARK-41294 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: BingKun Pan >Priority: Minor
[jira] [Updated] (SPARK-41294) Assign a name to the error class _LEGACY_ERROR_TEMP_1203
[ https://issues.apache.org/jira/browse/SPARK-41294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-41294: - Fix Version/s: (was: 3.4.0) > Assign a name to the error class _LEGACY_ERROR_TEMP_1203 > > > Key: SPARK-41294 > URL: https://issues.apache.org/jira/browse/SPARK-41294 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor
[jira] [Assigned] (SPARK-41287) Add a test workflow to help test image in fork repo
[ https://issues.apache.org/jira/browse/SPARK-41287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yikun Jiang reassigned SPARK-41287: --- Assignee: Yikun Jiang > Add a test workflow to help test image in fork repo > --- > > Key: SPARK-41287 > URL: https://issues.apache.org/jira/browse/SPARK-41287 > Project: Spark > Issue Type: Sub-task > Components: Spark Docker >Affects Versions: 3.4.0 >Reporter: Yikun Jiang >Assignee: Yikun Jiang >Priority: Major
[jira] [Resolved] (SPARK-41287) Add a test workflow to help test image in fork repo
[ https://issues.apache.org/jira/browse/SPARK-41287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yikun Jiang resolved SPARK-41287. - Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 26 [https://github.com/apache/spark-docker/pull/26] > Add a test workflow to help test image in fork repo > --- > > Key: SPARK-41287 > URL: https://issues.apache.org/jira/browse/SPARK-41287 > Project: Spark > Issue Type: Sub-task > Components: Spark Docker >Affects Versions: 3.4.0 >Reporter: Yikun Jiang >Assignee: Yikun Jiang >Priority: Major > Fix For: 3.4.0
[jira] [Assigned] (SPARK-41293) Deduplicate helper method in ExpressionTypeCheckingSuite
[ https://issues.apache.org/jira/browse/SPARK-41293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41293: Assignee: (was: Apache Spark) > Deduplicate helper method in ExpressionTypeCheckingSuite > > > Key: SPARK-41293 > URL: https://issues.apache.org/jira/browse/SPARK-41293 > Project: Spark > Issue Type: Sub-task > Components: Tests >Affects Versions: 3.4.0 >Reporter: Yang Jie >Priority: Minor > > https://github.com/apache/spark/blob/d979736a9eb754725d33fd5baca88a1c1a8c23ce/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala#L61-L108
[jira] [Commented] (SPARK-41293) Deduplicate helper method in ExpressionTypeCheckingSuite
[ https://issues.apache.org/jira/browse/SPARK-41293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17639910#comment-17639910 ] Apache Spark commented on SPARK-41293: -- User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/38820 > Deduplicate helper method in ExpressionTypeCheckingSuite > > > Key: SPARK-41293 > URL: https://issues.apache.org/jira/browse/SPARK-41293 > Project: Spark > Issue Type: Sub-task > Components: Tests >Affects Versions: 3.4.0 >Reporter: Yang Jie >Priority: Minor > > https://github.com/apache/spark/blob/d979736a9eb754725d33fd5baca88a1c1a8c23ce/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala#L61-L108
[jira] [Assigned] (SPARK-41293) Deduplicate helper method in ExpressionTypeCheckingSuite
[ https://issues.apache.org/jira/browse/SPARK-41293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41293: Assignee: Apache Spark > Deduplicate helper method in ExpressionTypeCheckingSuite > > > Key: SPARK-41293 > URL: https://issues.apache.org/jira/browse/SPARK-41293 > Project: Spark > Issue Type: Sub-task > Components: Tests >Affects Versions: 3.4.0 >Reporter: Yang Jie >Assignee: Apache Spark >Priority: Minor > > https://github.com/apache/spark/blob/d979736a9eb754725d33fd5baca88a1c1a8c23ce/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala#L61-L108
[jira] [Resolved] (SPARK-41273) Update plugins to latest versions
[ https://issues.apache.org/jira/browse/SPARK-41273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruifeng Zheng resolved SPARK-41273. --- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 38809 [https://github.com/apache/spark/pull/38809] > Update plugins to latest versions > - > > Key: SPARK-41273 > URL: https://issues.apache.org/jira/browse/SPARK-41273 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.4.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor > Fix For: 3.4.0
[jira] [Assigned] (SPARK-41273) Update plugins to latest versions
[ https://issues.apache.org/jira/browse/SPARK-41273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruifeng Zheng reassigned SPARK-41273: - Assignee: BingKun Pan > Update plugins to latest versions > - > > Key: SPARK-41273 > URL: https://issues.apache.org/jira/browse/SPARK-41273 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.4.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor
[jira] [Updated] (SPARK-41293) Deduplicate helper method in ExpressionTypeCheckingSuite
[ https://issues.apache.org/jira/browse/SPARK-41293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-41293: - Description: https://github.com/apache/spark/blob/d979736a9eb754725d33fd5baca88a1c1a8c23ce/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala#L61-L108 > Deduplicate helper method in ExpressionTypeCheckingSuite > > > Key: SPARK-41293 > URL: https://issues.apache.org/jira/browse/SPARK-41293 > Project: Spark > Issue Type: Sub-task > Components: Tests >Affects Versions: 3.4.0 >Reporter: Yang Jie >Priority: Minor > > https://github.com/apache/spark/blob/d979736a9eb754725d33fd5baca88a1c1a8c23ce/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala#L61-L108
[jira] [Created] (SPARK-41293) Deduplicate helper method in ExpressionTypeCheckingSuite
Yang Jie created SPARK-41293: Summary: Deduplicate helper method in ExpressionTypeCheckingSuite Key: SPARK-41293 URL: https://issues.apache.org/jira/browse/SPARK-41293 Project: Spark Issue Type: Sub-task Components: Tests Affects Versions: 3.4.0 Reporter: Yang Jie
[jira] [Assigned] (SPARK-41148) Implement `DataFrame.dropna ` and `DataFrame.na.drop `
[ https://issues.apache.org/jira/browse/SPARK-41148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41148: Assignee: Ruifeng Zheng (was: Apache Spark) > Implement `DataFrame.dropna ` and `DataFrame.na.drop ` > -- > > Key: SPARK-41148 > URL: https://issues.apache.org/jira/browse/SPARK-41148 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major
[jira] [Assigned] (SPARK-41148) Implement `DataFrame.dropna ` and `DataFrame.na.drop `
[ https://issues.apache.org/jira/browse/SPARK-41148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41148: Assignee: Apache Spark (was: Ruifeng Zheng) > Implement `DataFrame.dropna ` and `DataFrame.na.drop ` > -- > > Key: SPARK-41148 > URL: https://issues.apache.org/jira/browse/SPARK-41148 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Assignee: Apache Spark >Priority: Major
[jira] [Commented] (SPARK-41148) Implement `DataFrame.dropna ` and `DataFrame.na.drop `
[ https://issues.apache.org/jira/browse/SPARK-41148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17639898#comment-17639898 ] Apache Spark commented on SPARK-41148: -- User 'zhengruifeng' has created a pull request for this issue: https://github.com/apache/spark/pull/38819 > Implement `DataFrame.dropna ` and `DataFrame.na.drop ` > -- > > Key: SPARK-41148 > URL: https://issues.apache.org/jira/browse/SPARK-41148 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major
[jira] [Commented] (SPARK-41148) Implement `DataFrame.dropna ` and `DataFrame.na.drop `
[ https://issues.apache.org/jira/browse/SPARK-41148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17639897#comment-17639897 ] Apache Spark commented on SPARK-41148: -- User 'zhengruifeng' has created a pull request for this issue: https://github.com/apache/spark/pull/38819 > Implement `DataFrame.dropna ` and `DataFrame.na.drop ` > -- > > Key: SPARK-41148 > URL: https://issues.apache.org/jira/browse/SPARK-41148 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major
[jira] [Commented] (SPARK-41238) Support more datatypes
[ https://issues.apache.org/jira/browse/SPARK-41238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17639892#comment-17639892 ] Apache Spark commented on SPARK-41238: -- User 'zhengruifeng' has created a pull request for this issue: https://github.com/apache/spark/pull/38818 > Support more datatypes > -- > > Key: SPARK-41238 > URL: https://issues.apache.org/jira/browse/SPARK-41238 > Project: Spark > Issue Type: Sub-task > Components: PySpark, SQL >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Fix For: 3.4.0
[jira] [Resolved] (SPARK-41272) Assign a name to the error class _LEGACY_ERROR_TEMP_2019
[ https://issues.apache.org/jira/browse/SPARK-41272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-41272. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 38808 [https://github.com/apache/spark/pull/38808] > Assign a name to the error class _LEGACY_ERROR_TEMP_2019 > > > Key: SPARK-41272 > URL: https://issues.apache.org/jira/browse/SPARK-41272 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor > Fix For: 3.4.0
[jira] [Assigned] (SPARK-41272) Assign a name to the error class _LEGACY_ERROR_TEMP_2019
[ https://issues.apache.org/jira/browse/SPARK-41272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-41272: Assignee: BingKun Pan > Assign a name to the error class _LEGACY_ERROR_TEMP_2019 > > > Key: SPARK-41272 > URL: https://issues.apache.org/jira/browse/SPARK-41272 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor
[jira] [Resolved] (SPARK-41003) BHJ LeftAnti does not update numOutputRows when codegen is disabled
[ https://issues.apache.org/jira/browse/SPARK-41003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-41003. - Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 38489 [https://github.com/apache/spark/pull/38489] > BHJ LeftAnti does not update numOutputRows when codegen is disabled > --- > > Key: SPARK-41003 > URL: https://issues.apache.org/jira/browse/SPARK-41003 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.0 >Reporter: dzcxzl >Assignee: dzcxzl >Priority: Minor > Fix For: 3.4.0
[jira] [Assigned] (SPARK-41003) BHJ LeftAnti does not update numOutputRows when codegen is disabled
[ https://issues.apache.org/jira/browse/SPARK-41003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-41003: --- Assignee: dzcxzl > BHJ LeftAnti does not update numOutputRows when codegen is disabled > --- > > Key: SPARK-41003 > URL: https://issues.apache.org/jira/browse/SPARK-41003 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.0 >Reporter: dzcxzl >Assignee: dzcxzl >Priority: Minor