[jira] [Assigned] (SPARK-42542) Support Pivot without providing pivot column values
[ https://issues.apache.org/jira/browse/SPARK-42542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42542: Assignee: Apache Spark (was: Rui Wang) > Support Pivot without providing pivot column values > --- > > Key: SPARK-42542 > URL: https://issues.apache.org/jira/browse/SPARK-42542 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Rui Wang >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42542) Support Pivot without providing pivot column values
[ https://issues.apache.org/jira/browse/SPARK-42542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17694152#comment-17694152 ] Apache Spark commented on SPARK-42542: -- User 'amaliujia' has created a pull request for this issue: https://github.com/apache/spark/pull/40200 > Support Pivot without providing pivot column values > --- > > Key: SPARK-42542 > URL: https://issues.apache.org/jira/browse/SPARK-42542 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major >
[jira] [Assigned] (SPARK-42542) Support Pivot without providing pivot column values
[ https://issues.apache.org/jira/browse/SPARK-42542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42542: Assignee: Rui Wang (was: Apache Spark) > Support Pivot without providing pivot column values > --- > > Key: SPARK-42542 > URL: https://issues.apache.org/jira/browse/SPARK-42542 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major >
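For context on the feature being ported to Spark Connect: when `pivot()` is called without an explicit value list, Spark first computes the distinct values of the pivot column and uses them as the output columns. A pure-Python sketch of that inference (no Spark required; the function and all data names here are illustrative, not Spark internals):

```python
# Minimal model of "pivot without providing pivot column values":
# when the caller passes no values, infer them as the distinct values
# of the pivot column, then aggregate per (group, pivot value).
def pivot_sum(rows, group_key, pivot_key, value_key, values=None):
    if values is None:
        # SPARK-42542: this inference step requires an extra pass over
        # the data, which is why explicit values are cheaper.
        values = sorted({r[pivot_key] for r in rows})
    out = {}
    for r in rows:
        group = out.setdefault(r[group_key], {v: 0 for v in values})
        if r[pivot_key] in group:
            group[r[pivot_key]] += r[value_key]
    return values, out

rows = [
    {"year": 2023, "course": "java", "earnings": 2},
    {"year": 2023, "course": "python", "earnings": 5},
    {"year": 2024, "course": "python", "earnings": 3},
]
cols, table = pivot_sum(rows, "year", "course", "earnings")
```

With explicit values, only the listed columns appear; with inference, every distinct value of the pivot column becomes a column, at the cost of the extra distinct computation.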
[jira] [Assigned] (SPARK-42596) [YARN] OMP_NUM_THREADS not set to number of executor cores by default
[ https://issues.apache.org/jira/browse/SPARK-42596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42596: Assignee: Apache Spark > [YARN] OMP_NUM_THREADS not set to number of executor cores by default > - > > Key: SPARK-42596 > URL: https://issues.apache.org/jira/browse/SPARK-42596 > Project: Spark > Issue Type: Bug > Components: PySpark, YARN >Affects Versions: 3.3.2 >Reporter: John Zhuge >Assignee: Apache Spark >Priority: Major > > Run this PySpark script with `spark.executor.cores=1` > {code:python} > import os > from pyspark.sql import SparkSession > from pyspark.sql.functions import udf > spark = SparkSession.builder.getOrCreate() > var_name = 'OMP_NUM_THREADS' > def get_env_var(): > return os.getenv(var_name) > udf_get_env_var = udf(get_env_var) > spark.range(1).toDF("id").withColumn(f"env_{var_name}", > udf_get_env_var()).show(truncate=False) > {code} > Output with release `3.3.2`: > {noformat} > +---+---+ > |id |env_OMP_NUM_THREADS| > +---+---+ > |0 |null | > +---+---+ > {noformat} > Output with release `3.3.0`: > {noformat} > +---+---+ > |id |env_OMP_NUM_THREADS| > +---+---+ > |0 |1 | > +---+---+ > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42596) [YARN] OMP_NUM_THREADS not set to number of executor cores by default
[ https://issues.apache.org/jira/browse/SPARK-42596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17694145#comment-17694145 ] Apache Spark commented on SPARK-42596: -- User 'jzhuge' has created a pull request for this issue: https://github.com/apache/spark/pull/40199 > [YARN] OMP_NUM_THREADS not set to number of executor cores by default > - > > Key: SPARK-42596 > URL: https://issues.apache.org/jira/browse/SPARK-42596 > Project: Spark > Issue Type: Bug > Components: PySpark, YARN >Affects Versions: 3.3.2 >Reporter: John Zhuge >Priority: Major > > Run this PySpark script with `spark.executor.cores=1` > {code:python} > import os > from pyspark.sql import SparkSession > from pyspark.sql.functions import udf > spark = SparkSession.builder.getOrCreate() > var_name = 'OMP_NUM_THREADS' > def get_env_var(): > return os.getenv(var_name) > udf_get_env_var = udf(get_env_var) > spark.range(1).toDF("id").withColumn(f"env_{var_name}", > udf_get_env_var()).show(truncate=False) > {code} > Output with release `3.3.2`: > {noformat} > +---+---+ > |id |env_OMP_NUM_THREADS| > +---+---+ > |0 |null | > +---+---+ > {noformat} > Output with release `3.3.0`: > {noformat} > +---+---+ > |id |env_OMP_NUM_THREADS| > +---+---+ > |0 |1 | > +---+---+ > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42596) [YARN] OMP_NUM_THREADS not set to number of executor cores by default
[ https://issues.apache.org/jira/browse/SPARK-42596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42596: Assignee: (was: Apache Spark) > [YARN] OMP_NUM_THREADS not set to number of executor cores by default > - > > Key: SPARK-42596 > URL: https://issues.apache.org/jira/browse/SPARK-42596 > Project: Spark > Issue Type: Bug > Components: PySpark, YARN >Affects Versions: 3.3.2 >Reporter: John Zhuge >Priority: Major > > Run this PySpark script with `spark.executor.cores=1` > {code:python} > import os > from pyspark.sql import SparkSession > from pyspark.sql.functions import udf > spark = SparkSession.builder.getOrCreate() > var_name = 'OMP_NUM_THREADS' > def get_env_var(): > return os.getenv(var_name) > udf_get_env_var = udf(get_env_var) > spark.range(1).toDF("id").withColumn(f"env_{var_name}", > udf_get_env_var()).show(truncate=False) > {code} > Output with release `3.3.2`: > {noformat} > +---+---+ > |id |env_OMP_NUM_THREADS| > +---+---+ > |0 |null | > +---+---+ > {noformat} > Output with release `3.3.0`: > {noformat} > +---+---+ > |id |env_OMP_NUM_THREADS| > +---+---+ > |0 |1 | > +---+---+ > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
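The regression above concerns a default Spark applies when launching Python workers. A pure-Python sketch of the intended defaulting rule (illustrative only, not Spark's actual code):

```python
# Expected behavior: before launching a Python worker, default
# OMP_NUM_THREADS to the executor's core count unless the user set it.
def worker_env(user_env, executor_cores):
    env = dict(user_env)
    # SPARK-42596: on YARN in 3.3.2 this default stopped being applied,
    # so native libraries (e.g. NumPy/OpenBLAS) may spawn one thread per
    # physical CPU instead of one per allocated core.
    env.setdefault("OMP_NUM_THREADS", str(executor_cores))
    return env
```

Until a fix lands, one workaround should be to set the variable explicitly (e.g. via `spark.executorEnv.OMP_NUM_THREADS`) rather than relying on the default.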
[jira] [Commented] (SPARK-42337) Add error class INVALID_TEMP_OBJ_REFERENCE
[ https://issues.apache.org/jira/browse/SPARK-42337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17694142#comment-17694142 ] Apache Spark commented on SPARK-42337: -- User 'allisonwang-db' has created a pull request for this issue: https://github.com/apache/spark/pull/40198 > Add error class INVALID_TEMP_OBJ_REFERENCE > -- > > Key: SPARK-42337 > URL: https://issues.apache.org/jira/browse/SPARK-42337 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Fix For: 3.5.0 > > > Add the new error class INVALID_TEMP_OBJ_REFERENCE and move the following > error classes to use the new one: > * _LEGACY_ERROR_TEMP_1283 > * _LEGACY_ERROR_TEMP_1284
[jira] [Commented] (SPARK-42605) Implement TypedColumn
[ https://issues.apache.org/jira/browse/SPARK-42605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17694088#comment-17694088 ] Apache Spark commented on SPARK-42605: -- User 'hvanhovell' has created a pull request for this issue: https://github.com/apache/spark/pull/40197 > Implement TypedColumn > - > > Key: SPARK-42605 > URL: https://issues.apache.org/jira/browse/SPARK-42605 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: Herman van Hövell >Priority: Major > > Implement TypedColumn for the Scala client
[jira] [Assigned] (SPARK-42603) Set spark.sql.legacy.createHiveTableByDefault to false
[ https://issues.apache.org/jira/browse/SPARK-42603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42603: Assignee: (was: Apache Spark) > Set spark.sql.legacy.createHiveTableByDefault to false > -- > > Key: SPARK-42603 > URL: https://issues.apache.org/jira/browse/SPARK-42603 > Project: Spark > Issue Type: Task > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: xiaoping.huang >Priority: Minor >
[jira] [Commented] (SPARK-42603) Set spark.sql.legacy.createHiveTableByDefault to false
[ https://issues.apache.org/jira/browse/SPARK-42603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17694006#comment-17694006 ] Apache Spark commented on SPARK-42603: -- User 'huangxiaopingRD' has created a pull request for this issue: https://github.com/apache/spark/pull/40196 > Set spark.sql.legacy.createHiveTableByDefault to false > -- > > Key: SPARK-42603 > URL: https://issues.apache.org/jira/browse/SPARK-42603 > Project: Spark > Issue Type: Task > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: xiaoping.huang >Priority: Minor >
[jira] [Assigned] (SPARK-42603) Set spark.sql.legacy.createHiveTableByDefault to false
[ https://issues.apache.org/jira/browse/SPARK-42603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42603: Assignee: Apache Spark > Set spark.sql.legacy.createHiveTableByDefault to false > -- > > Key: SPARK-42603 > URL: https://issues.apache.org/jira/browse/SPARK-42603 > Project: Spark > Issue Type: Task > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: xiaoping.huang >Assignee: Apache Spark >Priority: Minor >
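For readers unfamiliar with the flag this ticket flips: with `spark.sql.legacy.createHiveTableByDefault` set to true (the legacy behavior), a plain `CREATE TABLE` with no `USING`/`STORED AS` clause creates a Hive SerDe table; with it false, Spark creates a native data-source table. A pure-Python sketch of that decision (names and the `parquet` fallback are illustrative, not Spark internals):

```python
# Toy model of provider resolution for CREATE TABLE without USING/STORED AS.
def table_provider(explicit_provider, conf):
    if explicit_provider is not None:
        return explicit_provider
    legacy = conf.get("spark.sql.legacy.createHiveTableByDefault", "true")
    # SPARK-42603 proposes flipping this default to "false", so plain
    # CREATE TABLE statements create native data-source tables instead
    # of Hive SerDe tables.
    if legacy == "true":
        return "hive"
    return conf.get("spark.sql.sources.default", "parquet")
```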
[jira] [Commented] (SPARK-42553) NonReserved keyword "interval" can't be column name
[ https://issues.apache.org/jira/browse/SPARK-42553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693997#comment-17693997 ] Apache Spark commented on SPARK-42553: -- User 'jiang13021' has created a pull request for this issue: https://github.com/apache/spark/pull/40195 > NonReserved keyword "interval" can't be column name > --- > > Key: SPARK-42553 > URL: https://issues.apache.org/jira/browse/SPARK-42553 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0, 3.3.1, 3.2.3, 3.3.2 > Environment: Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java > 1.8.0_345) > Spark version 3.2.3-SNAPSHOT >Reporter: jiang13021 >Priority: Major > > INTERVAL is a Non-Reserved keyword in spark. "Non-Reserved keywords" have a > special meaning in particular contexts and can be used as identifiers in > other contexts. So by design, interval can be used as a column name. > {code:java} > scala> spark.sql("select interval from mytable") > org.apache.spark.sql.catalyst.parser.ParseException: > at least one time unit should be given for interval literal(line 1, pos 7)== > SQL == > select interval from mytable > ---^^^ at > org.apache.spark.sql.errors.QueryParsingErrors$.invalidIntervalLiteralError(QueryParsingErrors.scala:196) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$parseIntervalLiteral$1(AstBuilder.scala:2481) > at > org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:133) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.parseIntervalLiteral(AstBuilder.scala:2466) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$visitInterval$1(AstBuilder.scala:2432) > at > org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:133) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitInterval(AstBuilder.scala:2431) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitInterval(AstBuilder.scala:57) > at > 
org.apache.spark.sql.catalyst.parser.SqlBaseParser$IntervalContext.accept(SqlBaseParser.java:17308) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitChildren(AstBuilder.scala:71) > at > org.apache.spark.sql.catalyst.parser.SqlBaseBaseVisitor.visitIntervalLiteral(SqlBaseBaseVisitor.java:1581) > at > org.apache.spark.sql.catalyst.parser.SqlBaseParser$IntervalLiteralContext.accept(SqlBaseParser.java:16929) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitChildren(AstBuilder.scala:71) > at > org.apache.spark.sql.catalyst.parser.SqlBaseBaseVisitor.visitConstantDefault(SqlBaseBaseVisitor.java:1511) > at > org.apache.spark.sql.catalyst.parser.SqlBaseParser$ConstantDefaultContext.accept(SqlBaseParser.java:15905) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitChildren(AstBuilder.scala:71) > at > org.apache.spark.sql.catalyst.parser.SqlBaseBaseVisitor.visitValueExpressionDefault(SqlBaseBaseVisitor.java:1392) > at > org.apache.spark.sql.catalyst.parser.SqlBaseParser$ValueExpressionDefaultContext.accept(SqlBaseParser.java:15298) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.typedVisit(AstBuilder.scala:61) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.expression(AstBuilder.scala:1412) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$visitPredicated$1(AstBuilder.scala:1548) > at > org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:133) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitPredicated(AstBuilder.scala:1547) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitPredicated(AstBuilder.scala:57) > at > org.apache.spark.sql.catalyst.parser.SqlBaseParser$PredicatedContext.accept(SqlBaseParser.java:14745) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitChildren(AstBuilder.scala:71) > at > org.apache.spark.sql.catalyst.parser.SqlBaseBaseVisitor.visitExpression(SqlBaseBaseVisitor.java:1343) > at > 
org.apache.spark.sql.catalyst.parser.SqlBaseParser$ExpressionContext.accept(SqlBaseParser.java:14606) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.typedVisit(AstBuilder.scala:61) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.expression(AstBuilder.scala:1412) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$visitNamedExpression$1(AstBuilder.scala:1434) > at > org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:133) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitNamedExpression(AstBuilder.scala:1433) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitNamedExpression(AstBuilder.scala:57) > at > org.apache.spark.sql.catalyst.parser.SqlBaseParser$NamedExpressionContext.accept(SqlBaseParser.java:14124) > at >
[jira] [Assigned] (SPARK-42553) NonReserved keyword "interval" can't be column name
[ https://issues.apache.org/jira/browse/SPARK-42553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42553: Assignee: Apache Spark > NonReserved keyword "interval" can't be column name > --- > > Key: SPARK-42553 > URL: https://issues.apache.org/jira/browse/SPARK-42553 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0, 3.3.1, 3.2.3, 3.3.2 > Environment: Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java > 1.8.0_345) > Spark version 3.2.3-SNAPSHOT >Reporter: jiang13021 >Assignee: Apache Spark >Priority: Major > > INTERVAL is a Non-Reserved keyword in spark. "Non-Reserved keywords" have a > special meaning in particular contexts and can be used as identifiers in > other contexts. So by design, interval can be used as a column name. > {code:java} > scala> spark.sql("select interval from mytable") > org.apache.spark.sql.catalyst.parser.ParseException: > at least one time unit should be given for interval literal(line 1, pos 7)== > SQL == > select interval from mytable > ---^^^ at > org.apache.spark.sql.errors.QueryParsingErrors$.invalidIntervalLiteralError(QueryParsingErrors.scala:196) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$parseIntervalLiteral$1(AstBuilder.scala:2481) > at > org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:133) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.parseIntervalLiteral(AstBuilder.scala:2466) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$visitInterval$1(AstBuilder.scala:2432) > at > org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:133) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitInterval(AstBuilder.scala:2431) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitInterval(AstBuilder.scala:57) > at > org.apache.spark.sql.catalyst.parser.SqlBaseParser$IntervalContext.accept(SqlBaseParser.java:17308) > at > 
org.apache.spark.sql.catalyst.parser.AstBuilder.visitChildren(AstBuilder.scala:71) > at > org.apache.spark.sql.catalyst.parser.SqlBaseBaseVisitor.visitIntervalLiteral(SqlBaseBaseVisitor.java:1581) > at > org.apache.spark.sql.catalyst.parser.SqlBaseParser$IntervalLiteralContext.accept(SqlBaseParser.java:16929) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitChildren(AstBuilder.scala:71) > at > org.apache.spark.sql.catalyst.parser.SqlBaseBaseVisitor.visitConstantDefault(SqlBaseBaseVisitor.java:1511) > at > org.apache.spark.sql.catalyst.parser.SqlBaseParser$ConstantDefaultContext.accept(SqlBaseParser.java:15905) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitChildren(AstBuilder.scala:71) > at > org.apache.spark.sql.catalyst.parser.SqlBaseBaseVisitor.visitValueExpressionDefault(SqlBaseBaseVisitor.java:1392) > at > org.apache.spark.sql.catalyst.parser.SqlBaseParser$ValueExpressionDefaultContext.accept(SqlBaseParser.java:15298) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.typedVisit(AstBuilder.scala:61) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.expression(AstBuilder.scala:1412) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$visitPredicated$1(AstBuilder.scala:1548) > at > org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:133) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitPredicated(AstBuilder.scala:1547) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitPredicated(AstBuilder.scala:57) > at > org.apache.spark.sql.catalyst.parser.SqlBaseParser$PredicatedContext.accept(SqlBaseParser.java:14745) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitChildren(AstBuilder.scala:71) > at > org.apache.spark.sql.catalyst.parser.SqlBaseBaseVisitor.visitExpression(SqlBaseBaseVisitor.java:1343) > at > org.apache.spark.sql.catalyst.parser.SqlBaseParser$ExpressionContext.accept(SqlBaseParser.java:14606) > at > 
org.apache.spark.sql.catalyst.parser.AstBuilder.typedVisit(AstBuilder.scala:61) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.expression(AstBuilder.scala:1412) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$visitNamedExpression$1(AstBuilder.scala:1434) > at > org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:133) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitNamedExpression(AstBuilder.scala:1433) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitNamedExpression(AstBuilder.scala:57) > at > org.apache.spark.sql.catalyst.parser.SqlBaseParser$NamedExpressionContext.accept(SqlBaseParser.java:14124) > at >
[jira] [Assigned] (SPARK-42553) NonReserved keyword "interval" can't be column name
[ https://issues.apache.org/jira/browse/SPARK-42553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42553: Assignee: (was: Apache Spark) > NonReserved keyword "interval" can't be column name > --- > > Key: SPARK-42553 > URL: https://issues.apache.org/jira/browse/SPARK-42553 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0, 3.3.1, 3.2.3, 3.3.2 > Environment: Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java > 1.8.0_345) > Spark version 3.2.3-SNAPSHOT >Reporter: jiang13021 >Priority: Major > > INTERVAL is a Non-Reserved keyword in spark. "Non-Reserved keywords" have a > special meaning in particular contexts and can be used as identifiers in > other contexts. So by design, interval can be used as a column name. > {code:java} > scala> spark.sql("select interval from mytable") > org.apache.spark.sql.catalyst.parser.ParseException: > at least one time unit should be given for interval literal(line 1, pos 7)== > SQL == > select interval from mytable > ---^^^ at > org.apache.spark.sql.errors.QueryParsingErrors$.invalidIntervalLiteralError(QueryParsingErrors.scala:196) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$parseIntervalLiteral$1(AstBuilder.scala:2481) > at > org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:133) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.parseIntervalLiteral(AstBuilder.scala:2466) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$visitInterval$1(AstBuilder.scala:2432) > at > org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:133) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitInterval(AstBuilder.scala:2431) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitInterval(AstBuilder.scala:57) > at > org.apache.spark.sql.catalyst.parser.SqlBaseParser$IntervalContext.accept(SqlBaseParser.java:17308) > at > 
org.apache.spark.sql.catalyst.parser.AstBuilder.visitChildren(AstBuilder.scala:71) > at > org.apache.spark.sql.catalyst.parser.SqlBaseBaseVisitor.visitIntervalLiteral(SqlBaseBaseVisitor.java:1581) > at > org.apache.spark.sql.catalyst.parser.SqlBaseParser$IntervalLiteralContext.accept(SqlBaseParser.java:16929) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitChildren(AstBuilder.scala:71) > at > org.apache.spark.sql.catalyst.parser.SqlBaseBaseVisitor.visitConstantDefault(SqlBaseBaseVisitor.java:1511) > at > org.apache.spark.sql.catalyst.parser.SqlBaseParser$ConstantDefaultContext.accept(SqlBaseParser.java:15905) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitChildren(AstBuilder.scala:71) > at > org.apache.spark.sql.catalyst.parser.SqlBaseBaseVisitor.visitValueExpressionDefault(SqlBaseBaseVisitor.java:1392) > at > org.apache.spark.sql.catalyst.parser.SqlBaseParser$ValueExpressionDefaultContext.accept(SqlBaseParser.java:15298) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.typedVisit(AstBuilder.scala:61) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.expression(AstBuilder.scala:1412) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$visitPredicated$1(AstBuilder.scala:1548) > at > org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:133) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitPredicated(AstBuilder.scala:1547) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitPredicated(AstBuilder.scala:57) > at > org.apache.spark.sql.catalyst.parser.SqlBaseParser$PredicatedContext.accept(SqlBaseParser.java:14745) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitChildren(AstBuilder.scala:71) > at > org.apache.spark.sql.catalyst.parser.SqlBaseBaseVisitor.visitExpression(SqlBaseBaseVisitor.java:1343) > at > org.apache.spark.sql.catalyst.parser.SqlBaseParser$ExpressionContext.accept(SqlBaseParser.java:14606) > at > 
org.apache.spark.sql.catalyst.parser.AstBuilder.typedVisit(AstBuilder.scala:61) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.expression(AstBuilder.scala:1412) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.$anonfun$visitNamedExpression$1(AstBuilder.scala:1434) > at > org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:133) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitNamedExpression(AstBuilder.scala:1433) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.visitNamedExpression(AstBuilder.scala:57) > at > org.apache.spark.sql.catalyst.parser.SqlBaseParser$NamedExpressionContext.accept(SqlBaseParser.java:14124) > at > org.apache.spark.sql.catalyst.parser.AstBuilder.typedVisit(AstBuilder.scala:61) > at >
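The failure above happens because a bare `interval` in expression position enters the interval-literal grammar rule, which then demands at least one time unit. A practical consequence (and workaround) is that a backquoted identifier never takes that path, so `` SELECT `interval` FROM mytable `` parses as a column reference. A toy classifier illustrating the distinction (not Spark's parser; the function name is made up):

```python
# Toy illustration of why quoting sidesteps the bug: a backquoted token is
# always an identifier, while a bare "interval" starts an interval literal.
def classify_token(tok):
    if tok.startswith("`") and tok.endswith("`") and len(tok) > 1:
        return ("identifier", tok.strip("`"))
    if tok.lower() == "interval":
        # The parser then requires time units ("interval 1 day"), producing
        # the "at least one time unit should be given" error in the report.
        return ("interval-literal-start", tok)
    return ("identifier", tok)
```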
[jira] [Assigned] (SPARK-42602) Add reason string as an argument to TaskScheduler.cancelTasks
[ https://issues.apache.org/jira/browse/SPARK-42602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42602: Assignee: (was: Apache Spark) > Add reason string as an argument to TaskScheduler.cancelTasks > - > > Key: SPARK-42602 > URL: https://issues.apache.org/jira/browse/SPARK-42602 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: Bo Zhang >Priority: Major > > Currently tasks killed by `TaskScheduler.cancelTasks` will have a > `TaskEndReason` "TaskKilled (Stage cancelled)". We should do better at > differentiating reasons for stage cancellations (e.g. user-initiated or > caused by task failures in the stage).
[jira] [Assigned] (SPARK-42602) Add reason string as an argument to TaskScheduler.cancelTasks
[ https://issues.apache.org/jira/browse/SPARK-42602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42602: Assignee: Apache Spark > Add reason string as an argument to TaskScheduler.cancelTasks > - > > Key: SPARK-42602 > URL: https://issues.apache.org/jira/browse/SPARK-42602 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: Bo Zhang >Assignee: Apache Spark >Priority: Major > > Currently tasks killed by `TaskScheduler.cancelTasks` will have a > `TaskEndReason` "TaskKilled (Stage cancelled)". We should do better at > differentiating reasons for stage cancellations (e.g. user-initiated or > caused by task failures in the stage).
[jira] [Commented] (SPARK-42602) Add reason string as an argument to TaskScheduler.cancelTasks
[ https://issues.apache.org/jira/browse/SPARK-42602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693982#comment-17693982 ] Apache Spark commented on SPARK-42602: -- User 'bozhang2820' has created a pull request for this issue: https://github.com/apache/spark/pull/40194 > Add reason string as an argument to TaskScheduler.cancelTasks > - > > Key: SPARK-42602 > URL: https://issues.apache.org/jira/browse/SPARK-42602 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: Bo Zhang >Priority: Major > > Currently tasks killed by `TaskScheduler.cancelTasks` will have a > `TaskEndReason` "TaskKilled (Stage cancelled)". We should do better at > differentiating reasons for stage cancellations (e.g. user-initiated or > caused by task failures in the stage).
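The shape of the proposed change can be sketched in a few lines (illustrative pseudocode in Python, not Spark's Scala API): instead of a fixed "Stage cancelled" string, the cancellation path threads a caller-supplied reason into the task-end message.

```python
# Toy model of the reason threading SPARK-42602 proposes: cancelTasks takes
# a reason string and it surfaces in the TaskEndReason message, so
# user-initiated cancellations are distinguishable from failure-driven ones.
def task_end_reason(reason=None):
    return f"TaskKilled ({reason or 'Stage cancelled'})"
```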
[jira] [Commented] (SPARK-42599) Make `CompatibilitySuite` as a tool like `dev/mima`
[ https://issues.apache.org/jira/browse/SPARK-42599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693970#comment-17693970 ] Apache Spark commented on SPARK-42599: -- User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/40191 > Make `CompatibilitySuite` as a tool like `dev/mima` > --- > > Key: SPARK-42599 > URL: https://issues.apache.org/jira/browse/SPARK-42599 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.4.0, 3.5.0 >Reporter: Yang Jie >Priority: Major > > Testing `CompatibilitySuite` with Maven requires some pre-work (the sql and > connect-client-jvm modules must be built with Maven before the test), so running > `mvn package test` fails with the following errors: > > {code:java} > CompatibilitySuite: > - compatibility MiMa tests *** FAILED *** > java.lang.AssertionError: assertion failed: Failed to find the jar inside > folder: /home/bjorn/spark-3.4.0/connector/connect/client/jvm/target > at scala.Predef$.assert(Predef.scala:223) > at > org.apache.spark.sql.connect.client.util.IntegrationTestUtils$.findJar(IntegrationTestUtils.scala:67) > at > org.apache.spark.sql.connect.client.CompatibilitySuite.clientJar$lzycompute(CompatibilitySuite.scala:57) > at > org.apache.spark.sql.connect.client.CompatibilitySuite.clientJar(CompatibilitySuite.scala:53) > at > org.apache.spark.sql.connect.client.CompatibilitySuite.$anonfun$new$1(CompatibilitySuite.scala:69) > at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85) > at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83) > at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) > at org.scalatest.Transformer.apply(Transformer.scala:22) > at org.scalatest.Transformer.apply(Transformer.scala:20) > ... 
> - compatibility API tests: Dataset *** FAILED *** > java.lang.AssertionError: assertion failed: Failed to find the jar inside > folder: /home/bjorn/spark-3.4.0/connector/connect/client/jvm/target > at scala.Predef$.assert(Predef.scala:223) > at > org.apache.spark.sql.connect.client.util.IntegrationTestUtils$.findJar(IntegrationTestUtils.scala:67) > at > org.apache.spark.sql.connect.client.CompatibilitySuite.clientJar$lzycompute(CompatibilitySuite.scala:57) > at > org.apache.spark.sql.connect.client.CompatibilitySuite.clientJar(CompatibilitySuite.scala:53) > at > org.apache.spark.sql.connect.client.CompatibilitySuite.$anonfun$new$7(CompatibilitySuite.scala:110) > at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) > at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85) > at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83) > at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) > at org.scalatest.Transformer.apply(Transformer.scala:22) {code} >
[jira] [Assigned] (SPARK-42599) Make `CompatibilitySuite` as a tool like `dev/mima`
[ https://issues.apache.org/jira/browse/SPARK-42599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42599: Assignee: (was: Apache Spark) > Make `CompatibilitySuite` as a tool like `dev/mima` > --- > > Key: SPARK-42599 > URL: https://issues.apache.org/jira/browse/SPARK-42599 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.4.0, 3.5.0 >Reporter: Yang Jie >Priority: Major > > Using maven to test `CompatibilitySuite` requires some pre-work(need maven > build sql & > connect-client-jvm module before test), so when we run `mvn package test`, > there will be following errors: > > {code:java} > CompatibilitySuite: > - compatibility MiMa tests *** FAILED *** > java.lang.AssertionError: assertion failed: Failed to find the jar inside > folder: /home/bjorn/spark-3.4.0/connector/connect/client/jvm/target > at scala.Predef$.assert(Predef.scala:223) > at > org.apache.spark.sql.connect.client.util.IntegrationTestUtils$.findJar(IntegrationTestUtils.scala:67) > at > org.apache.spark.sql.connect.client.CompatibilitySuite.clientJar$lzycompute(CompatibilitySuite.scala:57) > at > org.apache.spark.sql.connect.client.CompatibilitySuite.clientJar(CompatibilitySuite.scala:53) > at > org.apache.spark.sql.connect.client.CompatibilitySuite.$anonfun$new$1(CompatibilitySuite.scala:69) > at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85) > at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83) > at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) > at org.scalatest.Transformer.apply(Transformer.scala:22) > at org.scalatest.Transformer.apply(Transformer.scala:20) > ... 
> - compatibility API tests: Dataset *** FAILED *** > java.lang.AssertionError: assertion failed: Failed to find the jar inside > folder: /home/bjorn/spark-3.4.0/connector/connect/client/jvm/target > at scala.Predef$.assert(Predef.scala:223) > at > org.apache.spark.sql.connect.client.util.IntegrationTestUtils$.findJar(IntegrationTestUtils.scala:67) > at > org.apache.spark.sql.connect.client.CompatibilitySuite.clientJar$lzycompute(CompatibilitySuite.scala:57) > at > org.apache.spark.sql.connect.client.CompatibilitySuite.clientJar(CompatibilitySuite.scala:53) > at > org.apache.spark.sql.connect.client.CompatibilitySuite.$anonfun$new$7(CompatibilitySuite.scala:110) > at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) > at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85) > at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83) > at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) > at org.scalatest.Transformer.apply(Transformer.scala:22) {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42599) Make `CompatibilitySuite` as a tool like `dev/mima`
[ https://issues.apache.org/jira/browse/SPARK-42599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42599: Assignee: Apache Spark > Make `CompatibilitySuite` as a tool like `dev/mima` > --- > > Key: SPARK-42599 > URL: https://issues.apache.org/jira/browse/SPARK-42599 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.4.0, 3.5.0 >Reporter: Yang Jie >Assignee: Apache Spark >Priority: Major > > Using maven to test `CompatibilitySuite` requires some pre-work(need maven > build sql & > connect-client-jvm module before test), so when we run `mvn package test`, > there will be following errors: > > {code:java} > CompatibilitySuite: > - compatibility MiMa tests *** FAILED *** > java.lang.AssertionError: assertion failed: Failed to find the jar inside > folder: /home/bjorn/spark-3.4.0/connector/connect/client/jvm/target > at scala.Predef$.assert(Predef.scala:223) > at > org.apache.spark.sql.connect.client.util.IntegrationTestUtils$.findJar(IntegrationTestUtils.scala:67) > at > org.apache.spark.sql.connect.client.CompatibilitySuite.clientJar$lzycompute(CompatibilitySuite.scala:57) > at > org.apache.spark.sql.connect.client.CompatibilitySuite.clientJar(CompatibilitySuite.scala:53) > at > org.apache.spark.sql.connect.client.CompatibilitySuite.$anonfun$new$1(CompatibilitySuite.scala:69) > at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85) > at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83) > at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) > at org.scalatest.Transformer.apply(Transformer.scala:22) > at org.scalatest.Transformer.apply(Transformer.scala:20) > ... 
> - compatibility API tests: Dataset *** FAILED *** > java.lang.AssertionError: assertion failed: Failed to find the jar inside > folder: /home/bjorn/spark-3.4.0/connector/connect/client/jvm/target > at scala.Predef$.assert(Predef.scala:223) > at > org.apache.spark.sql.connect.client.util.IntegrationTestUtils$.findJar(IntegrationTestUtils.scala:67) > at > org.apache.spark.sql.connect.client.CompatibilitySuite.clientJar$lzycompute(CompatibilitySuite.scala:57) > at > org.apache.spark.sql.connect.client.CompatibilitySuite.clientJar(CompatibilitySuite.scala:53) > at > org.apache.spark.sql.connect.client.CompatibilitySuite.$anonfun$new$7(CompatibilitySuite.scala:110) > at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) > at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85) > at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83) > at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) > at org.scalatest.Transformer.apply(Transformer.scala:22) {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
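For context, the assertion in the stack traces above fires inside `IntegrationTestUtils.findJar`, which refuses to continue when the expected client jar has not been built yet. The following is a rough Python sketch of that fail-fast lookup, purely illustrative (the real implementation is Scala code in `IntegrationTestUtils.scala`, and the names here are stand-ins):

```python
import glob
import os

def find_jar(folder: str, pattern: str = "*.jar") -> str:
    """Fail-fast jar lookup: if the module has not been built, the target
    folder contains no jar and we raise immediately, mirroring the
    'Failed to find the jar inside folder' assertion in the stack trace."""
    matches = sorted(glob.glob(os.path.join(folder, pattern)))
    assert matches, f"Failed to find the jar inside folder: {folder}"
    return matches[0]
```

Running the suite as a standalone tool, as the ticket proposes (in the style of `dev/mima`), would let the tool build the sql and connect-client-jvm modules first so that this lookup never fails during a plain `mvn package test`.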
[jira] [Commented] (SPARK-42528) Optimize PercentileHeap
[ https://issues.apache.org/jira/browse/SPARK-42528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693933#comment-17693933 ] Apache Spark commented on SPARK-42528: -- User 'alkis' has created a pull request for this issue: https://github.com/apache/spark/pull/40193 > Optimize PercentileHeap > --- > > Key: SPARK-42528 > URL: https://issues.apache.org/jira/browse/SPARK-42528 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.4.0 >Reporter: Alkis Evlogimenos >Assignee: Alkis Evlogimenos >Priority: Major > Fix For: 3.5.0 > > > It is not fast enough when used inside the scheduler for estimations which > slows down scheduling rate and as a result query execution time. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
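To make the performance concern concrete: a common way to keep a running percentile with O(log n) inserts and O(1) reads is two heaps split at the percentile boundary. The sketch below is an illustrative Python model of that general idea, not Spark's actual `PercentileHeap` implementation or the optimization in the linked pull request:

```python
import heapq
import math

class PercentileTracker:
    """Two-heap running percentile: `lo` (a max-heap via negated values)
    holds the smallest ceil(p * n) values, so the percentile is lo's top."""

    def __init__(self, percentile: float = 0.5):
        self.percentile = percentile
        self.lo = []  # max-heap of the lower portion (values negated)
        self.hi = []  # min-heap of the upper portion

    def insert(self, value: float) -> None:
        if self.lo and value > -self.lo[0]:
            heapq.heappush(self.hi, value)
        else:
            heapq.heappush(self.lo, -value)
        # Rebalance so lo always holds exactly ceil(p * n) elements.
        target = max(1, math.ceil(self.percentile * (len(self.lo) + len(self.hi))))
        while len(self.lo) > target:
            heapq.heappush(self.hi, -heapq.heappop(self.lo))
        while len(self.lo) < target:
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def current(self) -> float:
        # O(1) read: the percentile is the largest value in the lower heap.
        return -self.lo[0]
```

Inside a scheduler loop, `insert` is called once per observed duration and `current` once per estimation, so both operations sit on the hot path, which is why constant factors matter here.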
[jira] [Assigned] (SPARK-42600) currentDatabase Shall use NamespaceHelper instead of MultipartIdentifierHelper
[ https://issues.apache.org/jira/browse/SPARK-42600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42600: Assignee: Apache Spark > currentDatabase Shall use NamespaceHelper instead of > MultipartIdentifierHelper > --- > > Key: SPARK-42600 > URL: https://issues.apache.org/jira/browse/SPARK-42600 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.2 >Reporter: Kent Yao >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42600) currentDatabase Shall use NamespaceHelper instead of MultipartIdentifierHelper
[ https://issues.apache.org/jira/browse/SPARK-42600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42600: Assignee: (was: Apache Spark) > currentDatabase Shall use NamespaceHelper instead of > MultipartIdentifierHelper > --- > > Key: SPARK-42600 > URL: https://issues.apache.org/jira/browse/SPARK-42600 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.2 >Reporter: Kent Yao >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42600) currentDatabase Shall use NamespaceHelper instead of MultipartIdentifierHelper
[ https://issues.apache.org/jira/browse/SPARK-42600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693925#comment-17693925 ] Apache Spark commented on SPARK-42600: -- User 'yaooqinn' has created a pull request for this issue: https://github.com/apache/spark/pull/40192 > currentDatabase Shall use NamespaceHelper instead of > MultipartIdentifierHelper > --- > > Key: SPARK-42600 > URL: https://issues.apache.org/jira/browse/SPARK-42600 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.2 >Reporter: Kent Yao >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
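The distinction matters because a namespace can have several parts: namespace-style quoting backtick-quotes each part separately and joins the parts with dots, whereas treating the whole name as one identifier quotes it incorrectly. A hedged Python sketch of that quoting convention (the real helpers are Scala implicits in Spark's SQL module; this is only a model of the behavior):

```python
def quote_namespace(parts):
    """Quote each namespace part separately (doubling any embedded
    backticks) and join the quoted parts with dots,
    e.g. ['a', 'b.c'] -> `a`.`b.c` rather than `a.b.c`."""
    return ".".join("`" + p.replace("`", "``") + "`" for p in parts)
```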
[jira] [Commented] (SPARK-42554) Spark Connect Scala Client
[ https://issues.apache.org/jira/browse/SPARK-42554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693916#comment-17693916 ] Apache Spark commented on SPARK-42554: -- User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/40191 > Spark Connect Scala Client > -- > > Key: SPARK-42554 > URL: https://issues.apache.org/jira/browse/SPARK-42554 > Project: Spark > Issue Type: Epic > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: Herman van Hövell >Priority: Major > > This is the EPIC to track all the work for the Spark Connect Scala Client. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42554) Spark Connect Scala Client
[ https://issues.apache.org/jira/browse/SPARK-42554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42554: Assignee: Herman van Hövell (was: Apache Spark) > Spark Connect Scala Client > -- > > Key: SPARK-42554 > URL: https://issues.apache.org/jira/browse/SPARK-42554 > Project: Spark > Issue Type: Epic > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: Herman van Hövell >Priority: Major > > This is the EPIC to track all the work for the Spark Connect Scala Client. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42554) Spark Connect Scala Client
[ https://issues.apache.org/jira/browse/SPARK-42554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42554: Assignee: Apache Spark (was: Herman van Hövell) > Spark Connect Scala Client > -- > > Key: SPARK-42554 > URL: https://issues.apache.org/jira/browse/SPARK-42554 > Project: Spark > Issue Type: Epic > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: Apache Spark >Priority: Major > > This is the EPIC to track all the work for the Spark Connect Scala Client. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42597) UnwrapCastInBinaryComparison support unwrap timestamp type
[ https://issues.apache.org/jira/browse/SPARK-42597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693915#comment-17693915 ] Apache Spark commented on SPARK-42597: -- User 'wangyum' has created a pull request for this issue: https://github.com/apache/spark/pull/40190 > UnwrapCastInBinaryComparison support unwrap timestamp type > -- > > Key: SPARK-42597 > URL: https://issues.apache.org/jira/browse/SPARK-42597 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.5.0 >Reporter: Yuming Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42597) UnwrapCastInBinaryComparison support unwrap timestamp type
[ https://issues.apache.org/jira/browse/SPARK-42597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693914#comment-17693914 ] Apache Spark commented on SPARK-42597: -- User 'wangyum' has created a pull request for this issue: https://github.com/apache/spark/pull/40190 > UnwrapCastInBinaryComparison support unwrap timestamp type > -- > > Key: SPARK-42597 > URL: https://issues.apache.org/jira/browse/SPARK-42597 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.5.0 >Reporter: Yuming Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42597) UnwrapCastInBinaryComparison support unwrap timestamp type
[ https://issues.apache.org/jira/browse/SPARK-42597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42597: Assignee: Apache Spark > UnwrapCastInBinaryComparison support unwrap timestamp type > -- > > Key: SPARK-42597 > URL: https://issues.apache.org/jira/browse/SPARK-42597 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.5.0 >Reporter: Yuming Wang >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42597) UnwrapCastInBinaryComparison support unwrap timestamp type
[ https://issues.apache.org/jira/browse/SPARK-42597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42597: Assignee: (was: Apache Spark) > UnwrapCastInBinaryComparison support unwrap timestamp type > -- > > Key: SPARK-42597 > URL: https://issues.apache.org/jira/browse/SPARK-42597 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.5.0 >Reporter: Yuming Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
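The general idea behind the rule is rewriting `cast(col AS widerType) cmp literal` into a comparison on the raw column so the predicate can be pushed down; for timestamps the interesting part is adjusting the boundary. A toy Python sketch of one such rewrite, assuming the usual semantics that casting a date to a timestamp yields midnight of that date (this is an illustration of the boundary math, not Spark's `UnwrapCastInBinaryComparison` code):

```python
from datetime import datetime

def unwrap_date_lt_timestamp(ts: datetime):
    """Rewrite `cast(date_col AS timestamp) < ts` into a predicate on the
    raw date column: since casting date d gives midnight of d, the
    comparison holds for d <= ts.date() when ts has a time-of-day part,
    and for d < ts.date() when ts falls exactly on midnight."""
    has_time = (ts.hour, ts.minute, ts.second, ts.microsecond) != (0, 0, 0, 0)
    return ("<=" if has_time else "<", ts.date())
```

Removing the cast this way lets the comparison run directly on the stored column type, which is what enables filter pushdown to the data source.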
[jira] [Assigned] (SPARK-42598) Refactor TPCH schema to separate file similar to TPCDS for code reuse
[ https://issues.apache.org/jira/browse/SPARK-42598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42598: Assignee: (was: Apache Spark) > Refactor TPCH schema to separate file similar to TPCDS for code reuse > - > > Key: SPARK-42598 > URL: https://issues.apache.org/jira/browse/SPARK-42598 > Project: Spark > Issue Type: Test > Components: SQL >Affects Versions: 3.5.0 >Reporter: Kapil Singh >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42598) Refactor TPCH schema to separate file similar to TPCDS for code reuse
[ https://issues.apache.org/jira/browse/SPARK-42598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42598: Assignee: Apache Spark > Refactor TPCH schema to separate file similar to TPCDS for code reuse > - > > Key: SPARK-42598 > URL: https://issues.apache.org/jira/browse/SPARK-42598 > Project: Spark > Issue Type: Test > Components: SQL >Affects Versions: 3.5.0 >Reporter: Kapil Singh >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42598) Refactor TPCH schema to separate file similar to TPCDS for code reuse
[ https://issues.apache.org/jira/browse/SPARK-42598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693897#comment-17693897 ] Apache Spark commented on SPARK-42598: -- User 'Surbhi-Vijay' has created a pull request for this issue: https://github.com/apache/spark/pull/40171 > Refactor TPCH schema to separate file similar to TPCDS for code reuse > - > > Key: SPARK-42598 > URL: https://issues.apache.org/jira/browse/SPARK-42598 > Project: Spark > Issue Type: Test > Components: SQL >Affects Versions: 3.5.0 >Reporter: Kapil Singh >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-27483) Move the data source v2 fallback to v1 logic to an analyzer rule
[ https://issues.apache.org/jira/browse/SPARK-27483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27483: Assignee: (was: Apache Spark) > Move the data source v2 fallback to v1 logic to an analyzer rule > > > Key: SPARK-27483 > URL: https://issues.apache.org/jira/browse/SPARK-27483 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.0 >Reporter: Wenchen Fan >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-27483) Move the data source v2 fallback to v1 logic to an analyzer rule
[ https://issues.apache.org/jira/browse/SPARK-27483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693889#comment-17693889 ] Apache Spark commented on SPARK-27483: -- User 'WweiL' has created a pull request for this issue: https://github.com/apache/spark/pull/40189 > Move the data source v2 fallback to v1 logic to an analyzer rule > > > Key: SPARK-27483 > URL: https://issues.apache.org/jira/browse/SPARK-27483 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.0 >Reporter: Wenchen Fan >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-27483) Move the data source v2 fallback to v1 logic to an analyzer rule
[ https://issues.apache.org/jira/browse/SPARK-27483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27483: Assignee: Apache Spark > Move the data source v2 fallback to v1 logic to an analyzer rule > > > Key: SPARK-27483 > URL: https://issues.apache.org/jira/browse/SPARK-27483 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.0 >Reporter: Wenchen Fan >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42592) Document SS guide doc for supporting multiple stateful operators (especially chained aggregations)
[ https://issues.apache.org/jira/browse/SPARK-42592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42592: Assignee: (was: Apache Spark) > Document SS guide doc for supporting multiple stateful operators (especially > chained aggregations) > -- > > Key: SPARK-42592 > URL: https://issues.apache.org/jira/browse/SPARK-42592 > Project: Spark > Issue Type: Documentation > Components: Structured Streaming >Affects Versions: 3.5.0 >Reporter: Jungtaek Lim >Priority: Major > > We made a change on the guide doc for SPARK-40925 via SPARK-42105, but from > SPARK-42105 we only removed the section of "limitation of global watermark". > That said, we haven't provided any example of new functionality, especially > that users need to know about the change of SQL function (window) in chained > time window aggregations. > In this ticket, we will add the example of chained time window aggregations, > with introducing new functionality of SQL function. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42592) Document SS guide doc for supporting multiple stateful operators (especially chained aggregations)
[ https://issues.apache.org/jira/browse/SPARK-42592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693869#comment-17693869 ] Apache Spark commented on SPARK-42592: -- User 'HeartSaVioR' has created a pull request for this issue: https://github.com/apache/spark/pull/40188 > Document SS guide doc for supporting multiple stateful operators (especially > chained aggregations) > -- > > Key: SPARK-42592 > URL: https://issues.apache.org/jira/browse/SPARK-42592 > Project: Spark > Issue Type: Documentation > Components: Structured Streaming >Affects Versions: 3.5.0 >Reporter: Jungtaek Lim >Priority: Major > > We made a change on the guide doc for SPARK-40925 via SPARK-42105, but from > SPARK-42105 we only removed the section of "limitation of global watermark". > That said, we haven't provided any example of new functionality, especially > that users need to know about the change of SQL function (window) in chained > time window aggregations. > In this ticket, we will add the example of chained time window aggregations, > with introducing new functionality of SQL function. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42592) Document SS guide doc for supporting multiple stateful operators (especially chained aggregations)
[ https://issues.apache.org/jira/browse/SPARK-42592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42592: Assignee: Apache Spark > Document SS guide doc for supporting multiple stateful operators (especially > chained aggregations) > -- > > Key: SPARK-42592 > URL: https://issues.apache.org/jira/browse/SPARK-42592 > Project: Spark > Issue Type: Documentation > Components: Structured Streaming >Affects Versions: 3.5.0 >Reporter: Jungtaek Lim >Assignee: Apache Spark >Priority: Major > > We made a change on the guide doc for SPARK-40925 via SPARK-42105, but from > SPARK-42105 we only removed the section of "limitation of global watermark". > That said, we haven't provided any example of new functionality, especially > that users need to know about the change of SQL function (window) in chained > time window aggregations. > In this ticket, we will add the example of chained time window aggregations, > with introducing new functionality of SQL function. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
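To make "chained time window aggregations" concrete: a first aggregation buckets events into small tumbling windows, and a second aggregation treats each finished window as a single event (in Spark, `window()` applied to the `window_time()` of the first result) and rolls it up into larger windows. A toy Python model of that two-level roll-up, with plain dictionaries standing in for streaming state:

```python
from collections import defaultdict

def tumbling_window(ts: int, size: int):
    """Assign an event time (in seconds) to its tumbling window [start, end)."""
    start = (ts // size) * size
    return (start, start + size)

def windowed_counts(event_times, size):
    """First-level aggregation: count events per tumbling window."""
    counts = defaultdict(int)
    for ts in event_times:
        counts[tumbling_window(ts, size)] += 1
    return dict(counts)

def rewindow(window_counts, size):
    """Second-level aggregation: use each finished window's end - 1 as its
    event time (roughly what window_time() exposes) and roll the partial
    counts up into larger windows."""
    rolled = defaultdict(int)
    for (start, end), n in window_counts.items():
        rolled[tumbling_window(end - 1, size)] += n
    return dict(rolled)
```

The key detail the guide doc needs to convey is the middle step: the second aggregation must be given a well-defined event time derived from the first window, which is exactly what the SQL function change mentioned above provides.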
[jira] [Commented] (SPARK-42572) Logic error for StateStore.validateStateRowFormat
[ https://issues.apache.org/jira/browse/SPARK-42572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693835#comment-17693835 ] Apache Spark commented on SPARK-42572: -- User 'WweiL' has created a pull request for this issue: https://github.com/apache/spark/pull/40187 > Logic error for StateStore.validateStateRowFormat > - > > Key: SPARK-42572 > URL: https://issues.apache.org/jira/browse/SPARK-42572 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 3.4.0 >Reporter: Wei Liu >Priority: Major > > SPARK-42484 Changed the logic of whether to check state store format in > StateStore.validateStateRowFormat. Revert it and add unit test to make sure > this won't happen again -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42572) Logic error for StateStore.validateStateRowFormat
[ https://issues.apache.org/jira/browse/SPARK-42572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42572: Assignee: Apache Spark > Logic error for StateStore.validateStateRowFormat > - > > Key: SPARK-42572 > URL: https://issues.apache.org/jira/browse/SPARK-42572 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 3.4.0 >Reporter: Wei Liu >Assignee: Apache Spark >Priority: Major > > SPARK-42484 Changed the logic of whether to check state store format in > StateStore.validateStateRowFormat. Revert it and add unit test to make sure > this won't happen again -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42572) Logic error for StateStore.validateStateRowFormat
[ https://issues.apache.org/jira/browse/SPARK-42572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42572: Assignee: (was: Apache Spark) > Logic error for StateStore.validateStateRowFormat > - > > Key: SPARK-42572 > URL: https://issues.apache.org/jira/browse/SPARK-42572 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 3.4.0 >Reporter: Wei Liu >Priority: Major > > SPARK-42484 Changed the logic of whether to check state store format in > StateStore.validateStateRowFormat. Revert it and add unit test to make sure > this won't happen again -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42581) Add SparkSession implicits
[ https://issues.apache.org/jira/browse/SPARK-42581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693782#comment-17693782 ] Apache Spark commented on SPARK-42581: -- User 'hvanhovell' has created a pull request for this issue: https://github.com/apache/spark/pull/40186 > Add SparkSession implicits > -- > > Key: SPARK-42581 > URL: https://issues.apache.org/jira/browse/SPARK-42581 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: Herman van Hövell >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42586) Implement RuntimeConf
[ https://issues.apache.org/jira/browse/SPARK-42586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693777#comment-17693777 ] Apache Spark commented on SPARK-42586: -- User 'hvanhovell' has created a pull request for this issue: https://github.com/apache/spark/pull/40185 > Implement RuntimeConf > - > > Key: SPARK-42586 > URL: https://issues.apache.org/jira/browse/SPARK-42586 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: Herman van Hövell >Priority: Major > > Implement RuntimeConf for the Scala Client -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42569) Throw unsupported exceptions for non-supported API
[ https://issues.apache.org/jira/browse/SPARK-42569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693776#comment-17693776 ] Apache Spark commented on SPARK-42569: -- User 'amaliujia' has created a pull request for this issue: https://github.com/apache/spark/pull/40184 > Throw unsupported exceptions for non-supported API > -- > > Key: SPARK-42569 > URL: https://issues.apache.org/jira/browse/SPARK-42569 > Project: Spark > Issue Type: Task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42569) Throw unsupported exceptions for non-supported API
[ https://issues.apache.org/jira/browse/SPARK-42569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693775#comment-17693775 ] Apache Spark commented on SPARK-42569: -- User 'amaliujia' has created a pull request for this issue: https://github.com/apache/spark/pull/40184 > Throw unsupported exceptions for non-supported API > -- > > Key: SPARK-42569 > URL: https://issues.apache.org/jira/browse/SPARK-42569 > Project: Spark > Issue Type: Task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42587) Use wrapper versions for SBT and Maven in `connect` module tests
[ https://issues.apache.org/jira/browse/SPARK-42587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693772#comment-17693772 ] Apache Spark commented on SPARK-42587: -- User 'dongjoon-hyun' has created a pull request for this issue: https://github.com/apache/spark/pull/40183 > Use wrapper versions for SBT and Maven in `connect` module tests > > > Key: SPARK-42587 > URL: https://issues.apache.org/jira/browse/SPARK-42587 > Project: Spark > Issue Type: Test > Components: Connect, Tests >Affects Versions: 3.4.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Minor > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42588) collapse two adjacent windows with the equivalent partition/order expression
[ https://issues.apache.org/jira/browse/SPARK-42588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693766#comment-17693766 ] Apache Spark commented on SPARK-42588: -- User 'zml1206' has created a pull request for this issue: https://github.com/apache/spark/pull/40182 > collapse two adjacent windows with the equivalent partition/order expression > > > Key: SPARK-42588 > URL: https://issues.apache.org/jira/browse/SPARK-42588 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.3, 3.3.2 >Reporter: zhuml >Priority: Major > > Extend the CollapseWindow rule to collapse Window nodes with the equivalent > partition/order expressions > {code:java} > Seq((1, 1), (2, 2)).toDF("a", "b") > .withColumn("max_b", expr("max(b) OVER (PARTITION BY abs(a))")) > .withColumn("min_b", expr("min(b) OVER (PARTITION BY abs(a))")) > == Optimized Logical Plan == > before > Project [a#7, b#8, max_b#11, min_b#17] > +- Window [min(b#8) windowspecdefinition(_w0#19, > specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) > AS min_b#17], [_w0#19] >+- Project [a#7, b#8, max_b#11, abs(a#7) AS _w0#19] > +- Window [max(b#8) windowspecdefinition(_w0#13, > specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) > AS max_b#11], [_w0#13] > +- Project [_1#2 AS a#7, _2#3 AS b#8, abs(_1#2) AS _w0#13] > +- LocalRelation [_1#2, _2#3] > after > Project [a#7, b#8, max_b#11, min_b#17] > +- Window [max(b#8) windowspecdefinition(_w0#13, > specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) > AS max_b#11, min(b#8) windowspecdefinition(_w0#13, > specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) > AS min_b#17], [_w0#13] >+- Project [_1#2 AS a#7, _2#3 AS b#8, abs(_1#2) AS _w0#13] > +- LocalRelation [_1#2, _2#3] > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42588) collapse two adjacent windows with the equivalent partition/order expression
[ https://issues.apache.org/jira/browse/SPARK-42588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42588: Assignee: (was: Apache Spark) > collapse two adjacent windows with the equivalent partition/order expression > > > Key: SPARK-42588 > URL: https://issues.apache.org/jira/browse/SPARK-42588 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.3, 3.3.2 >Reporter: zhuml >Priority: Major > > Extend the CollapseWindow rule to collapse Window nodes with the equivalent > partition/order expressions > {code:java} > Seq((1, 1), (2, 2)).toDF("a", "b") > .withColumn("max_b", expr("max(b) OVER (PARTITION BY abs(a))")) > .withColumn("min_b", expr("min(b) OVER (PARTITION BY abs(a))")) > == Optimized Logical Plan == > before > Project [a#7, b#8, max_b#11, min_b#17] > +- Window [min(b#8) windowspecdefinition(_w0#19, > specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) > AS min_b#17], [_w0#19] >+- Project [a#7, b#8, max_b#11, abs(a#7) AS _w0#19] > +- Window [max(b#8) windowspecdefinition(_w0#13, > specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) > AS max_b#11], [_w0#13] > +- Project [_1#2 AS a#7, _2#3 AS b#8, abs(_1#2) AS _w0#13] > +- LocalRelation [_1#2, _2#3] > after > Project [a#7, b#8, max_b#11, min_b#17] > +- Window [max(b#8) windowspecdefinition(_w0#13, > specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) > AS max_b#11, min(b#8) windowspecdefinition(_w0#13, > specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) > AS min_b#17], [_w0#13] >+- Project [_1#2 AS a#7, _2#3 AS b#8, abs(_1#2) AS _w0#13] > +- LocalRelation [_1#2, _2#3] > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42588) collapse two adjacent windows with the equivalent partition/order expression
[ https://issues.apache.org/jira/browse/SPARK-42588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42588: Assignee: Apache Spark > collapse two adjacent windows with the equivalent partition/order expression > > > Key: SPARK-42588 > URL: https://issues.apache.org/jira/browse/SPARK-42588 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.3, 3.3.2 >Reporter: zhuml >Assignee: Apache Spark >Priority: Major > > Extend the CollapseWindow rule to collapse Window nodes with the equivalent > partition/order expressions > {code:java} > Seq((1, 1), (2, 2)).toDF("a", "b") > .withColumn("max_b", expr("max(b) OVER (PARTITION BY abs(a))")) > .withColumn("min_b", expr("min(b) OVER (PARTITION BY abs(a))")) > == Optimized Logical Plan == > before > Project [a#7, b#8, max_b#11, min_b#17] > +- Window [min(b#8) windowspecdefinition(_w0#19, > specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) > AS min_b#17], [_w0#19] >+- Project [a#7, b#8, max_b#11, abs(a#7) AS _w0#19] > +- Window [max(b#8) windowspecdefinition(_w0#13, > specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) > AS max_b#11], [_w0#13] > +- Project [_1#2 AS a#7, _2#3 AS b#8, abs(_1#2) AS _w0#13] > +- LocalRelation [_1#2, _2#3] > after > Project [a#7, b#8, max_b#11, min_b#17] > +- Window [max(b#8) windowspecdefinition(_w0#13, > specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) > AS max_b#11, min(b#8) windowspecdefinition(_w0#13, > specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) > AS min_b#17], [_w0#13] >+- Project [_1#2 AS a#7, _2#3 AS b#8, abs(_1#2) AS _w0#13] > +- LocalRelation [_1#2, _2#3] > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
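The CollapseWindow scenario reported above can be checked interactively. The sketch below (an illustration, not part of the issue; it assumes a running SparkSession bound to `spark`) rebuilds the reporter's example and prints the optimized plan, so one can count the Window nodes before and after the extended rule: both aggregates share the partition expression `abs(a)`, so a single Window node should suffice.

{code:java}
// Assumes a SparkSession in scope as `spark`.
import spark.implicits._
import org.apache.spark.sql.functions.expr

val df = Seq((1, 1), (2, 2)).toDF("a", "b")
  // Two window aggregates over the same derived partition key abs(a).
  .withColumn("max_b", expr("max(b) OVER (PARTITION BY abs(a))"))
  .withColumn("min_b", expr("min(b) OVER (PARTITION BY abs(a))"))

// Prints parsed/analyzed/optimized/physical plans; with the extended
// CollapseWindow rule the optimized plan should contain one Window node
// instead of two.
df.explain(true)
{code}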
[jira] [Commented] (SPARK-42589) Exclude `RelationalGroupedDataset.apply` from CompatibilitySuite
[ https://issues.apache.org/jira/browse/SPARK-42589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693764#comment-17693764 ] Apache Spark commented on SPARK-42589: -- User 'dongjoon-hyun' has created a pull request for this issue: https://github.com/apache/spark/pull/40181 > Exclude `RelationalGroupedDataset.apply` from CompatibilitySuite > > > Key: SPARK-42589 > URL: https://issues.apache.org/jira/browse/SPARK-42589 > Project: Spark > Issue Type: Test > Components: Connect, Tests >Affects Versions: 3.4.0 >Reporter: Dongjoon Hyun >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42589) Exclude `RelationalGroupedDataset.apply` from CompatibilitySuite
[ https://issues.apache.org/jira/browse/SPARK-42589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42589: Assignee: (was: Apache Spark) > Exclude `RelationalGroupedDataset.apply` from CompatibilitySuite > > > Key: SPARK-42589 > URL: https://issues.apache.org/jira/browse/SPARK-42589 > Project: Spark > Issue Type: Test > Components: Connect, Tests >Affects Versions: 3.4.0 >Reporter: Dongjoon Hyun >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42589) Exclude `RelationalGroupedDataset.apply` from CompatibilitySuite
[ https://issues.apache.org/jira/browse/SPARK-42589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42589: Assignee: Apache Spark > Exclude `RelationalGroupedDataset.apply` from CompatibilitySuite > > > Key: SPARK-42589 > URL: https://issues.apache.org/jira/browse/SPARK-42589 > Project: Spark > Issue Type: Test > Components: Connect, Tests >Affects Versions: 3.4.0 >Reporter: Dongjoon Hyun >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42587) Use wrapper versions for SBT and Maven in `connect` module tests
[ https://issues.apache.org/jira/browse/SPARK-42587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693749#comment-17693749 ] Apache Spark commented on SPARK-42587: -- User 'dongjoon-hyun' has created a pull request for this issue: https://github.com/apache/spark/pull/40180 > Use wrapper versions for SBT and Maven in `connect` module tests > > > Key: SPARK-42587 > URL: https://issues.apache.org/jira/browse/SPARK-42587 > Project: Spark > Issue Type: Test > Components: Connect, Tests >Affects Versions: 3.4.0 >Reporter: Dongjoon Hyun >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42587) Use wrapper versions for SBT and Maven in `connect` module tests
[ https://issues.apache.org/jira/browse/SPARK-42587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42587: Assignee: (was: Apache Spark) > Use wrapper versions for SBT and Maven in `connect` module tests > > > Key: SPARK-42587 > URL: https://issues.apache.org/jira/browse/SPARK-42587 > Project: Spark > Issue Type: Test > Components: Connect, Tests >Affects Versions: 3.4.0 >Reporter: Dongjoon Hyun >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42587) Use wrapper versions for SBT and Maven in `connect` module tests
[ https://issues.apache.org/jira/browse/SPARK-42587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42587: Assignee: Apache Spark > Use wrapper versions for SBT and Maven in `connect` module tests > > > Key: SPARK-42587 > URL: https://issues.apache.org/jira/browse/SPARK-42587 > Project: Spark > Issue Type: Test > Components: Connect, Tests >Affects Versions: 3.4.0 >Reporter: Dongjoon Hyun >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42560) Implement ColumnName
[ https://issues.apache.org/jira/browse/SPARK-42560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693730#comment-17693730 ] Apache Spark commented on SPARK-42560: -- User 'hvanhovell' has created a pull request for this issue: https://github.com/apache/spark/pull/40179 > Implement ColumnName > > > Key: SPARK-42560 > URL: https://issues.apache.org/jira/browse/SPARK-42560 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: Herman van Hövell >Priority: Major > > Implement ColumnName class for connect. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42583) Remove outer join if all aggregate functions are distinct
[ https://issues.apache.org/jira/browse/SPARK-42583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693649#comment-17693649 ] Apache Spark commented on SPARK-42583: -- User 'wangyum' has created a pull request for this issue: https://github.com/apache/spark/pull/40177 > Remove outer join if all aggregate functions are distinct > - > > Key: SPARK-42583 > URL: https://issues.apache.org/jira/browse/SPARK-42583 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.4.0 >Reporter: Yuming Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42583) Remove outer join if all aggregate functions are distinct
[ https://issues.apache.org/jira/browse/SPARK-42583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42583: Assignee: (was: Apache Spark) > Remove outer join if all aggregate functions are distinct > - > > Key: SPARK-42583 > URL: https://issues.apache.org/jira/browse/SPARK-42583 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.4.0 >Reporter: Yuming Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42583) Remove outer join if all aggregate functions are distinct
[ https://issues.apache.org/jira/browse/SPARK-42583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693648#comment-17693648 ] Apache Spark commented on SPARK-42583: -- User 'wangyum' has created a pull request for this issue: https://github.com/apache/spark/pull/40177 > Remove outer join if all aggregate functions are distinct > - > > Key: SPARK-42583 > URL: https://issues.apache.org/jira/browse/SPARK-42583 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.4.0 >Reporter: Yuming Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42583) Remove outer join if all aggregate functions are distinct
[ https://issues.apache.org/jira/browse/SPARK-42583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42583: Assignee: Apache Spark > Remove outer join if all aggregate functions are distinct > - > > Key: SPARK-42583 > URL: https://issues.apache.org/jira/browse/SPARK-42583 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.4.0 >Reporter: Yuming Wang >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
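The intuition behind SPARK-42583 can be shown with a small query; the sketch below is a hypothetical illustration (table and column names are assumptions, and a SparkSession `spark` is presumed in scope), not the patch's test case. Because the LEFT JOIN preserves every left-side row and distinct aggregates are insensitive to the duplicates a join can introduce, a query whose grouping keys and aggregates reference only the preserved side should produce the same result with the join removed.

{code:java}
// All aggregates are DISTINCT and, like the grouping key, touch only
// the left (preserved) side, so duplicated left rows from multiple
// t2 matches cannot change the answer: the join is redundant.
spark.sql("""
  SELECT t1.id, COUNT(DISTINCT t1.value)
  FROM t1
  LEFT JOIN t2 ON t1.id = t2.id
  GROUP BY t1.id
""").explain(true)
// With the optimization, the optimized plan should show an Aggregate
// over t1 alone, with no Join node.
{code}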
[jira] [Commented] (SPARK-42564) Implement Dataset.version and Dataset.time
[ https://issues.apache.org/jira/browse/SPARK-42564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693643#comment-17693643 ] Apache Spark commented on SPARK-42564: -- User 'panbingkun' has created a pull request for this issue: https://github.com/apache/spark/pull/40176 > Implement Dataset.version and Dataset.time > -- > > Key: SPARK-42564 > URL: https://issues.apache.org/jira/browse/SPARK-42564 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: BingKun Pan >Priority: Major > > Implement Dataset.version and Dataset.time > {code:java} > /** > * The version of Spark on which this application is running. > * > * @since 2.0.0 > */ > def version: String = SPARK_VERSION > /** > * Executes some code block and prints to stdout the time taken to execute > the block. This is > * available in Scala only and is used primarily for interactive testing and > debugging. > * > * @since 2.1.0 > */ > def time[T](f: => T): T = { > val start = System.nanoTime() > val ret = f > val end = System.nanoTime() > // scalastyle:off println > println(s"Time taken: ${NANOSECONDS.toMillis(end - start)} ms") > // scalastyle:on println > ret > } {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42564) Implement Dataset.version and Dataset.time
[ https://issues.apache.org/jira/browse/SPARK-42564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42564: Assignee: BingKun Pan (was: Apache Spark) > Implement Dataset.version and Dataset.time > -- > > Key: SPARK-42564 > URL: https://issues.apache.org/jira/browse/SPARK-42564 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: BingKun Pan >Priority: Major > > Implement Dataset.version and Dataset.time > {code:java} > /** > * The version of Spark on which this application is running. > * > * @since 2.0.0 > */ > def version: String = SPARK_VERSION > /** > * Executes some code block and prints to stdout the time taken to execute > the block. This is > * available in Scala only and is used primarily for interactive testing and > debugging. > * > * @since 2.1.0 > */ > def time[T](f: => T): T = { > val start = System.nanoTime() > val ret = f > val end = System.nanoTime() > // scalastyle:off println > println(s"Time taken: ${NANOSECONDS.toMillis(end - start)} ms") > // scalastyle:on println > ret > } {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42564) Implement Dataset.version and Dataset.time
[ https://issues.apache.org/jira/browse/SPARK-42564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42564: Assignee: Apache Spark (was: BingKun Pan) > Implement Dataset.version and Dataset.time > -- > > Key: SPARK-42564 > URL: https://issues.apache.org/jira/browse/SPARK-42564 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: Apache Spark >Priority: Major > > Implement Dataset.version and Dataset.time > {code:java} > /** > * The version of Spark on which this application is running. > * > * @since 2.0.0 > */ > def version: String = SPARK_VERSION > /** > * Executes some code block and prints to stdout the time taken to execute > the block. This is > * available in Scala only and is used primarily for interactive testing and > debugging. > * > * @since 2.1.0 > */ > def time[T](f: => T): T = { > val start = System.nanoTime() > val ret = f > val end = System.nanoTime() > // scalastyle:off println > println(s"Time taken: ${NANOSECONDS.toMillis(end - start)} ms") > // scalastyle:on println > ret > } {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42564) Implement Dataset.version and Dataset.time
[ https://issues.apache.org/jira/browse/SPARK-42564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693642#comment-17693642 ] Apache Spark commented on SPARK-42564: -- User 'panbingkun' has created a pull request for this issue: https://github.com/apache/spark/pull/40176 > Implement Dataset.version and Dataset.time > -- > > Key: SPARK-42564 > URL: https://issues.apache.org/jira/browse/SPARK-42564 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: BingKun Pan >Priority: Major > > Implement Dataset.version and Dataset.time > {code:java} > /** > * The version of Spark on which this application is running. > * > * @since 2.0.0 > */ > def version: String = SPARK_VERSION > /** > * Executes some code block and prints to stdout the time taken to execute > the block. This is > * available in Scala only and is used primarily for interactive testing and > debugging. > * > * @since 2.1.0 > */ > def time[T](f: => T): T = { > val start = System.nanoTime() > val ret = f > val end = System.nanoTime() > // scalastyle:off println > println(s"Time taken: ${NANOSECONDS.toMillis(end - start)} ms") > // scalastyle:on println > ret > } {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
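The two methods quoted in the SPARK-42564 description mirror the long-standing `SparkSession.version` (since 2.0.0) and `SparkSession.time` (since 2.1.0) helpers, here being ported to the Connect client. A usage sketch (assuming a session in scope as `spark`):

{code:java}
// Report the Spark version the application runs against.
println(spark.version)

// Time an arbitrary block; the elapsed milliseconds are printed to
// stdout and the block's result is returned unchanged.
val total = spark.time {
  (1 to 1000).sum
}
{code}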
[jira] [Commented] (SPARK-42580) Add initial typed Dataset APIs
[ https://issues.apache.org/jira/browse/SPARK-42580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693577#comment-17693577 ] Apache Spark commented on SPARK-42580: -- User 'hvanhovell' has created a pull request for this issue: https://github.com/apache/spark/pull/40175 > Add initial typed Dataset APIs > -- > > Key: SPARK-42580 > URL: https://issues.apache.org/jira/browse/SPARK-42580 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: Herman van Hövell >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42580) Add initial typed Dataset APIs
[ https://issues.apache.org/jira/browse/SPARK-42580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693578#comment-17693578 ] Apache Spark commented on SPARK-42580: -- User 'hvanhovell' has created a pull request for this issue: https://github.com/apache/spark/pull/40175 > Add initial typed Dataset APIs > -- > > Key: SPARK-42580 > URL: https://issues.apache.org/jira/browse/SPARK-42580 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: Herman van Hövell >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42573) Enable binary compatibility tests for SparkSession/Dataset/Column/functions
[ https://issues.apache.org/jira/browse/SPARK-42573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693566#comment-17693566 ] Apache Spark commented on SPARK-42573: -- User 'amaliujia' has created a pull request for this issue: https://github.com/apache/spark/pull/40174 > Enable binary compatibility tests for SparkSession/Dataset/Column/functions > --- > > Key: SPARK-42573 > URL: https://issues.apache.org/jira/browse/SPARK-42573 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.4.0 >Reporter: Zhen Li >Assignee: Zhen Li >Priority: Major > Fix For: 3.4.1 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42576) Add 2nd groupBy method to Dataset
[ https://issues.apache.org/jira/browse/SPARK-42576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42576: Assignee: Rui Wang (was: Apache Spark) > Add 2nd groupBy method to Dataset > - > > Key: SPARK-42576 > URL: https://issues.apache.org/jira/browse/SPARK-42576 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: Rui Wang >Priority: Major > > Dataset is missing a groupBy method: > {code:java} > /** > * Groups the Dataset using the specified columns, so that we can run > aggregation on them. > * See [[RelationalGroupedDataset]] for all the available aggregate functions. > * > * This is a variant of groupBy that can only group by existing columns using > column names > * (i.e. cannot construct expressions). > * > * {{{ > * // Compute the average for all numeric columns grouped by department. > * ds.groupBy("department").avg() > * > * // Compute the max age and average salary, grouped by department and > gender. > * ds.groupBy($"department", $"gender").agg(Map( > * "salary" -> "avg", > * "age" -> "max" > * )) > * }}} > * @group untypedrel > * @since 3.4.0 > */ > @scala.annotation.varargs > def groupBy(col1: String, cols: String*): RelationalGroupedDataset {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42576) Add 2nd groupBy method to Dataset
[ https://issues.apache.org/jira/browse/SPARK-42576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693557#comment-17693557 ] Apache Spark commented on SPARK-42576: -- User 'amaliujia' has created a pull request for this issue: https://github.com/apache/spark/pull/40173 > Add 2nd groupBy method to Dataset > - > > Key: SPARK-42576 > URL: https://issues.apache.org/jira/browse/SPARK-42576 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: Rui Wang >Priority: Major > > Dataset is missing a groupBy method: > {code:java} > /** > * Groups the Dataset using the specified columns, so that we can run > aggregation on them. > * See [[RelationalGroupedDataset]] for all the available aggregate functions. > * > * This is a variant of groupBy that can only group by existing columns using > column names > * (i.e. cannot construct expressions). > * > * {{{ > * // Compute the average for all numeric columns grouped by department. > * ds.groupBy("department").avg() > * > * // Compute the max age and average salary, grouped by department and > gender. > * ds.groupBy($"department", $"gender").agg(Map( > * "salary" -> "avg", > * "age" -> "max" > * )) > * }}} > * @group untypedrel > * @since 3.4.0 > */ > @scala.annotation.varargs > def groupBy(col1: String, cols: String*): RelationalGroupedDataset {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42576) Add 2nd groupBy method to Dataset
[ https://issues.apache.org/jira/browse/SPARK-42576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42576: Assignee: Apache Spark (was: Rui Wang) > Add 2nd groupBy method to Dataset > - > > Key: SPARK-42576 > URL: https://issues.apache.org/jira/browse/SPARK-42576 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: Apache Spark >Priority: Major > > Dataset is missing a groupBy method: > {code:java} > /** > * Groups the Dataset using the specified columns, so that we can run > aggregation on them. > * See [[RelationalGroupedDataset]] for all the available aggregate functions. > * > * This is a variant of groupBy that can only group by existing columns using > column names > * (i.e. cannot construct expressions). > * > * {{{ > * // Compute the average for all numeric columns grouped by department. > * ds.groupBy("department").avg() > * > * // Compute the max age and average salary, grouped by department and > gender. > * ds.groupBy($"department", $"gender").agg(Map( > * "salary" -> "avg", > * "age" -> "max" > * )) > * }}} > * @group untypedrel > * @since 3.4.0 > */ > @scala.annotation.varargs > def groupBy(col1: String, cols: String*): RelationalGroupedDataset {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42569) Throw unsupported exceptions for non-supported API
[ https://issues.apache.org/jira/browse/SPARK-42569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693552#comment-17693552 ] Apache Spark commented on SPARK-42569: -- User 'amaliujia' has created a pull request for this issue: https://github.com/apache/spark/pull/40172 > Throw unsupported exceptions for non-supported API > -- > > Key: SPARK-42569 > URL: https://issues.apache.org/jira/browse/SPARK-42569 > Project: Spark > Issue Type: Task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42574) DataFrame.toPandas should handle duplicated column names
[ https://issues.apache.org/jira/browse/SPARK-42574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42574: Assignee: Apache Spark > DataFrame.toPandas should handle duplicated column names > > > Key: SPARK-42574 > URL: https://issues.apache.org/jira/browse/SPARK-42574 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Takuya Ueshin >Assignee: Apache Spark >Priority: Major > > {code:python} > spark.sql("select 1 v, 1 v").toPandas() > {code} > should work. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42574) DataFrame.toPandas should handle duplicated column names
[ https://issues.apache.org/jira/browse/SPARK-42574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693405#comment-17693405 ] Apache Spark commented on SPARK-42574: -- User 'ueshin' has created a pull request for this issue: https://github.com/apache/spark/pull/40170 > DataFrame.toPandas should handle duplicated column names > > > Key: SPARK-42574 > URL: https://issues.apache.org/jira/browse/SPARK-42574 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Takuya Ueshin >Priority: Major > > {code:python} > spark.sql("select 1 v, 1 v").toPandas() > {code} > should work. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42574) DataFrame.toPandas should handle duplicated column names
[ https://issues.apache.org/jira/browse/SPARK-42574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42574: Assignee: (was: Apache Spark) > DataFrame.toPandas should handle duplicated column names > > > Key: SPARK-42574 > URL: https://issues.apache.org/jira/browse/SPARK-42574 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Takuya Ueshin >Priority: Major > > {code:python} > spark.sql("select 1 v, 1 v").toPandas() > {code} > should work. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42575) Replace `AnyFunSuite` with `ConnectFunSuite` for scala client tests
[ https://issues.apache.org/jira/browse/SPARK-42575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42575: Assignee: (was: Apache Spark) > Replace `AnyFunSuite` with `ConnectFunSuite` for scala client tests > --- > > Key: SPARK-42575 > URL: https://issues.apache.org/jira/browse/SPARK-42575 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.4.0 >Reporter: Zhen Li >Priority: Minor > > Make engineer's life easier. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42575) Replace `AnyFunSuite` with `ConnectFunSuite` for scala client tests
[ https://issues.apache.org/jira/browse/SPARK-42575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42575: Assignee: Apache Spark > Replace `AnyFunSuite` with `ConnectFunSuite` for scala client tests > --- > > Key: SPARK-42575 > URL: https://issues.apache.org/jira/browse/SPARK-42575 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.4.0 >Reporter: Zhen Li >Assignee: Apache Spark >Priority: Minor > > Make engineer's life easier. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42575) Replace `AnyFunSuite` with `ConnectFunSuite` for scala client tests
[ https://issues.apache.org/jira/browse/SPARK-42575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693404#comment-17693404 ] Apache Spark commented on SPARK-42575: -- User 'zhenlineo' has created a pull request for this issue: https://github.com/apache/spark/pull/40169 > Replace `AnyFunSuite` with `ConnectFunSuite` for scala client tests > --- > > Key: SPARK-42575 > URL: https://issues.apache.org/jira/browse/SPARK-42575 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.4.0 >Reporter: Zhen Li >Priority: Minor > > Make engineer's life easier. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42573) Enable binary compatibility tests for SparkSession/Dataset/Column/functions
[ https://issues.apache.org/jira/browse/SPARK-42573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693397#comment-17693397 ] Apache Spark commented on SPARK-42573: -- User 'zhenlineo' has created a pull request for this issue: https://github.com/apache/spark/pull/40168 > Enable binary compatibility tests for SparkSession/Dataset/Column/functions > --- > > Key: SPARK-42573 > URL: https://issues.apache.org/jira/browse/SPARK-42573 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.4.0 >Reporter: Zhen Li >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42573) Enable binary compatibility tests for SparkSession/Dataset/Column/functions
[ https://issues.apache.org/jira/browse/SPARK-42573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42573: Assignee: Apache Spark > Enable binary compatibility tests for SparkSession/Dataset/Column/functions > --- > > Key: SPARK-42573 > URL: https://issues.apache.org/jira/browse/SPARK-42573 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.4.0 >Reporter: Zhen Li >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42573) Enable binary compatibility tests for SparkSession/Dataset/Column/functions
[ https://issues.apache.org/jira/browse/SPARK-42573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42573: Assignee: (was: Apache Spark) > Enable binary compatibility tests for SparkSession/Dataset/Column/functions > --- > > Key: SPARK-42573 > URL: https://issues.apache.org/jira/browse/SPARK-42573 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.4.0 >Reporter: Zhen Li >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42561) Add TempView APIs to Dataset
[ https://issues.apache.org/jira/browse/SPARK-42561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42561: Assignee: Apache Spark (was: Rui Wang) > Add TempView APIs to Dataset > > > Key: SPARK-42561 > URL: https://issues.apache.org/jira/browse/SPARK-42561 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: Apache Spark >Priority: Major > > Add all temp view APIs to connect's Dataset. > {code:java} > /** > * Registers this Dataset as a temporary table using the given name. The > lifetime of this > * temporary table is tied to the [[SparkSession]] that was used to create > this Dataset. > * > * @group basic > * @since 1.6.0 > */ > @deprecated("Use createOrReplaceTempView(viewName) instead.", "2.0.0") > def registerTempTable(tableName: String): Unit > /** > * Creates a local temporary view using the given name. The lifetime of this > * temporary view is tied to the [[SparkSession]] that was used to create this > Dataset. > * > * Local temporary view is session-scoped. Its lifetime is the lifetime of the > session that > * created it, i.e. it will be automatically dropped when the session > terminates. It's not > * tied to any databases, i.e. we can't use `db1.view1` to reference a local > temporary view. > * > * @throws AnalysisException if the view name is invalid or already exists > * > * @group basic > * @since 2.0.0 > */ > @throws[AnalysisException] > def createTempView(viewName: String): Unit > /** > * Creates a local temporary view using the given name. The lifetime of this > * temporary view is tied to the [[SparkSession]] that was used to create this > Dataset. > * > * @group basic > * @since 2.0.0 > */ > def createOrReplaceTempView(viewName: String): Unit > /** > * Creates a global temporary view using the given name. The lifetime of this > * temporary view is tied to this Spark application. 
> * > * Global temporary view is cross-session. Its lifetime is the lifetime of the > Spark application, > * i.e. it will be automatically dropped when the application terminates. It's > tied to a system > * preserved database `global_temp`, and we must use the qualified name to > refer a global temp > * view, e.g. `SELECT * FROM global_temp.view1`. > * > * @throws AnalysisException if the view name is invalid or already exists > * > * @group basic > * @since 2.1.0 > */ > @throws[AnalysisException] > def createGlobalTempView(viewName: String): Unit > /** > * Creates or replaces a global temporary view using the given name. The > lifetime of this > * temporary view is tied to this Spark application. > * > * Global temporary view is cross-session. Its lifetime is the lifetime of the > Spark application, > * i.e. it will be automatically dropped when the application terminates. It's > tied to a system > * preserved database `global_temp`, and we must use the qualified name to > refer a global temp > * view, e.g. `SELECT * FROM global_temp.view1`. > * > * @group basic > * @since 2.2.0 > */ > def createOrReplaceGlobalTempView(viewName: String): Unit {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42561) Add TempView APIs to Dataset
[ https://issues.apache.org/jira/browse/SPARK-42561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693388#comment-17693388 ] Apache Spark commented on SPARK-42561: -- User 'amaliujia' has created a pull request for this issue: https://github.com/apache/spark/pull/40167 > Add TempView APIs to Dataset > > > Key: SPARK-42561 > URL: https://issues.apache.org/jira/browse/SPARK-42561 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: Rui Wang >Priority: Major > > Add all temp view APIs to connect's Dataset. > {code:java} > /** > * Registers this Dataset as a temporary table using the given name. The > lifetime of this > * temporary table is tied to the [[SparkSession]] that was used to create > this Dataset. > * > * @group basic > * @since 1.6.0 > */ > @deprecated("Use createOrReplaceTempView(viewName) instead.", "2.0.0") > def registerTempTable(tableName: String): Unit > /** > * Creates a local temporary view using the given name. The lifetime of this > * temporary view is tied to the [[SparkSession]] that was used to create this > Dataset. > * > * Local temporary view is session-scoped. Its lifetime is the lifetime of the > session that > * created it, i.e. it will be automatically dropped when the session > terminates. It's not > * tied to any databases, i.e. we can't use `db1.view1` to reference a local > temporary view. > * > * @throws AnalysisException if the view name is invalid or already exists > * > * @group basic > * @since 2.0.0 > */ > @throws[AnalysisException] > def createTempView(viewName: String): Unit > /** > * Creates a local temporary view using the given name. The lifetime of this > * temporary view is tied to the [[SparkSession]] that was used to create this > Dataset. > * > * @group basic > * @since 2.0.0 > */ > def createOrReplaceTempView(viewName: String): Unit > /** > * Creates a global temporary view using the given name. 
The lifetime of this > * temporary view is tied to this Spark application. > * > * Global temporary view is cross-session. Its lifetime is the lifetime of the > Spark application, > * i.e. it will be automatically dropped when the application terminates. It's > tied to a system > * preserved database `global_temp`, and we must use the qualified name to > refer a global temp > * view, e.g. `SELECT * FROM global_temp.view1`. > * > * @throws AnalysisException if the view name is invalid or already exists > * > * @group basic > * @since 2.1.0 > */ > @throws[AnalysisException] > def createGlobalTempView(viewName: String): Unit > /** > * Creates or replaces a global temporary view using the given name. The > lifetime of this > * temporary view is tied to this Spark application. > * > * Global temporary view is cross-session. Its lifetime is the lifetime of the > Spark application, > * i.e. it will be automatically dropped when the application terminates. It's > tied to a system > * preserved database `global_temp`, and we must use the qualified name to > refer a global temp > * view, e.g. `SELECT * FROM global_temp.view1`. > * > * @group basic > * @since 2.2.0 > */ > def createOrReplaceGlobalTempView(viewName: String): Unit {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42561) Add TempView APIs to Dataset
[ https://issues.apache.org/jira/browse/SPARK-42561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42561: Assignee: Rui Wang (was: Apache Spark) > Add TempView APIs to Dataset > > > Key: SPARK-42561 > URL: https://issues.apache.org/jira/browse/SPARK-42561 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: Rui Wang >Priority: Major > > Add all temp view APIs to connect's Dataset. > {code:java} > /** > * Registers this Dataset as a temporary table using the given name. The > lifetime of this > * temporary table is tied to the [[SparkSession]] that was used to create > this Dataset. > * > * @group basic > * @since 1.6.0 > */ > @deprecated("Use createOrReplaceTempView(viewName) instead.", "2.0.0") > def registerTempTable(tableName: String): Unit > /** > * Creates a local temporary view using the given name. The lifetime of this > * temporary view is tied to the [[SparkSession]] that was used to create this > Dataset. > * > * Local temporary view is session-scoped. Its lifetime is the lifetime of the > session that > * created it, i.e. it will be automatically dropped when the session > terminates. It's not > * tied to any databases, i.e. we can't use `db1.view1` to reference a local > temporary view. > * > * @throws AnalysisException if the view name is invalid or already exists > * > * @group basic > * @since 2.0.0 > */ > @throws[AnalysisException] > def createTempView(viewName: String): Unit > /** > * Creates a local temporary view using the given name. The lifetime of this > * temporary view is tied to the [[SparkSession]] that was used to create this > Dataset. > * > * @group basic > * @since 2.0.0 > */ > def createOrReplaceTempView(viewName: String): Unit > /** > * Creates a global temporary view using the given name. The lifetime of this > * temporary view is tied to this Spark application. 
> * > * Global temporary view is cross-session. Its lifetime is the lifetime of the > Spark application, > * i.e. it will be automatically dropped when the application terminates. It's > tied to a system > * preserved database `global_temp`, and we must use the qualified name to > refer a global temp > * view, e.g. `SELECT * FROM global_temp.view1`. > * > * @throws AnalysisException if the view name is invalid or already exists > * > * @group basic > * @since 2.1.0 > */ > @throws[AnalysisException] > def createGlobalTempView(viewName: String): Unit > /** > * Creates or replaces a global temporary view using the given name. The > lifetime of this > * temporary view is tied to this Spark application. > * > * Global temporary view is cross-session. Its lifetime is the lifetime of the > Spark application, > * i.e. it will be automatically dropped when the application terminates. It's > tied to a system > * preserved database `global_temp`, and we must use the qualified name to > refer a global temp > * view, e.g. `SELECT * FROM global_temp.view1`. > * > * @group basic > * @since 2.2.0 > */ > def createOrReplaceGlobalTempView(viewName: String): Unit {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42570) Fix DataFrameReader to use the default source
[ https://issues.apache.org/jira/browse/SPARK-42570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693383#comment-17693383 ] Apache Spark commented on SPARK-42570: -- User 'ueshin' has created a pull request for this issue: https://github.com/apache/spark/pull/40166 > Fix DataFrameReader to use the default source > - > > Key: SPARK-42570 > URL: https://issues.apache.org/jira/browse/SPARK-42570 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Takuya Ueshin >Priority: Major > > {code:python} > spark.read.load(path) > {code} > should work without specifying the format. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42570) Fix DataFrameReader to use the default source
[ https://issues.apache.org/jira/browse/SPARK-42570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42570: Assignee: (was: Apache Spark) > Fix DataFrameReader to use the default source > - > > Key: SPARK-42570 > URL: https://issues.apache.org/jira/browse/SPARK-42570 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Takuya Ueshin >Priority: Major > > {code:python} > spark.read.load(path) > {code} > should work without specifying the format. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42570) Fix DataFrameReader to use the default source
[ https://issues.apache.org/jira/browse/SPARK-42570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42570: Assignee: Apache Spark > Fix DataFrameReader to use the default source > - > > Key: SPARK-42570 > URL: https://issues.apache.org/jira/browse/SPARK-42570 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Takuya Ueshin >Assignee: Apache Spark >Priority: Major > > {code:python} > spark.read.load(path) > {code} > should work without specifying the format. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
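The fix SPARK-42570 asks for amounts to a fallback: when `spark.read.load(path)` is called without a format, the reader should consult the `spark.sql.sources.default` config (which is `parquet` out of the box). A hypothetical sketch of that resolution logic — the function name is illustrative, not Spark's actual Connect internals:

```python
def resolve_source(explicit_format, conf):
    """Pick the data source: an explicit format wins, otherwise fall back
    to the spark.sql.sources.default config, whose built-in default is
    parquet."""
    if explicit_format is not None:
        return explicit_format
    return conf.get("spark.sql.sources.default", "parquet")


assert resolve_source("json", {}) == "json"        # explicit format wins
assert resolve_source(None, {}) == "parquet"       # built-in default
assert resolve_source(None, {"spark.sql.sources.default": "orc"}) == "orc"
```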
[jira] [Assigned] (SPARK-42568) SparkConnectStreamHandler should manage configs properly while creating plans.
[ https://issues.apache.org/jira/browse/SPARK-42568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42568: Assignee: Apache Spark > SparkConnectStreamHandler should manage configs properly while creating plans. > -- > > Key: SPARK-42568 > URL: https://issues.apache.org/jira/browse/SPARK-42568 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Takuya Ueshin >Assignee: Apache Spark >Priority: Major > > Some components for planning need to check configs in {{SQLConf.get}} while > building the plan, but currently it's unavailable. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42569) Throw unsupported exceptions for non-supported API
[ https://issues.apache.org/jira/browse/SPARK-42569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693362#comment-17693362 ] Apache Spark commented on SPARK-42569: -- User 'amaliujia' has created a pull request for this issue: https://github.com/apache/spark/pull/40164 > Throw unsupported exceptions for non-supported API > -- > > Key: SPARK-42569 > URL: https://issues.apache.org/jira/browse/SPARK-42569 > Project: Spark > Issue Type: Task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42568) SparkConnectStreamHandler should manage configs properly while creating plans.
[ https://issues.apache.org/jira/browse/SPARK-42568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42568: Assignee: (was: Apache Spark) > SparkConnectStreamHandler should manage configs properly while creating plans. > -- > > Key: SPARK-42568 > URL: https://issues.apache.org/jira/browse/SPARK-42568 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Takuya Ueshin >Priority: Major > > Some components for planning need to check configs in {{SQLConf.get}} while > building the plan, but currently it's unavailable. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42569) Throw unsupported exceptions for non-supported API
[ https://issues.apache.org/jira/browse/SPARK-42569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42569: Assignee: Rui Wang (was: Apache Spark) > Throw unsupported exceptions for non-supported API > -- > > Key: SPARK-42569 > URL: https://issues.apache.org/jira/browse/SPARK-42569 > Project: Spark > Issue Type: Task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42569) Throw unsupported exceptions for non-supported API
[ https://issues.apache.org/jira/browse/SPARK-42569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42569: Assignee: Apache Spark (was: Rui Wang) > Throw unsupported exceptions for non-supported API > -- > > Key: SPARK-42569 > URL: https://issues.apache.org/jira/browse/SPARK-42569 > Project: Spark > Issue Type: Task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Rui Wang >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
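The intent behind SPARK-42569 is that Connect-side methods without server support should fail loudly and consistently rather than silently no-op. A minimal sketch, assuming a shared helper (the helper name is hypothetical, not Spark's actual code):

```python
def unsupported(method_name):
    """Raise a consistent error for APIs Spark Connect does not support yet."""
    raise NotImplementedError(
        f"{method_name} is not supported in Spark Connect")


# A stubbed-out method would delegate to the helper:
try:
    unsupported("Dataset.someUnsupportedMethod")
    raised = False
except NotImplementedError as e:
    raised = True
    assert "not supported" in str(e)
assert raised
```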
[jira] [Commented] (SPARK-42568) SparkConnectStreamHandler should manage configs properly while creating plans.
[ https://issues.apache.org/jira/browse/SPARK-42568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693363#comment-17693363 ] Apache Spark commented on SPARK-42568: -- User 'ueshin' has created a pull request for this issue: https://github.com/apache/spark/pull/40165 > SparkConnectStreamHandler should manage configs properly while creating plans. > -- > > Key: SPARK-42568 > URL: https://issues.apache.org/jira/browse/SPARK-42568 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Takuya Ueshin >Priority: Major > > Some components for planning need to check configs in {{SQLConf.get}} while > building the plan, but currently it's unavailable. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
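The problem SPARK-42568 describes — planner components reading `SQLConf.get` while the handler builds the plan — is the classic "install the right config for the duration of an operation" pattern. An illustrative Python sketch using a thread-local and a context manager (these names are ours, not Spark's actual implementation):

```python
import threading
from contextlib import contextmanager

_active = threading.local()


def get_conf():
    """Stand-in for SQLConf.get: the config active on this thread."""
    return getattr(_active, "conf", {})


@contextmanager
def with_conf(conf):
    """Install `conf` as the active config while planning, then restore."""
    prev = getattr(_active, "conf", None)
    _active.conf = conf
    try:
        yield
    finally:
        _active.conf = prev if prev is not None else {}


# Inside the handler: plan with the session's conf installed.
with with_conf({"spark.sql.session.timeZone": "UTC"}):
    assert get_conf()["spark.sql.session.timeZone"] == "UTC"

# Outside the handler: the config is no longer visible.
assert "spark.sql.session.timeZone" not in get_conf()
```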
[jira] [Assigned] (SPARK-42567) Track state store provider load time and log warning if it exceeds a threshold
[ https://issues.apache.org/jira/browse/SPARK-42567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42567: Assignee: (was: Apache Spark) > Track state store provider load time and log warning if it exceeds a threshold > -- > > Key: SPARK-42567 > URL: https://issues.apache.org/jira/browse/SPARK-42567 > Project: Spark > Issue Type: Task > Components: Structured Streaming >Affects Versions: 3.4.1 >Reporter: Anish Shrigondekar >Priority: Major > > Track state store provider load time and log warning if it exceeds a threshold > > In some cases, we see that the filesystem initialization might take time for > the first time that we create the provider and initialize it. This change > will log the time taken if it exceeds a certain threshold -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42567) Track state store provider load time and log warning if it exceeds a threshold
[ https://issues.apache.org/jira/browse/SPARK-42567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42567: Assignee: Apache Spark > Track state store provider load time and log warning if it exceeds a threshold > -- > > Key: SPARK-42567 > URL: https://issues.apache.org/jira/browse/SPARK-42567 > Project: Spark > Issue Type: Task > Components: Structured Streaming >Affects Versions: 3.4.1 >Reporter: Anish Shrigondekar >Assignee: Apache Spark >Priority: Major > > Track state store provider load time and log warning if it exceeds a threshold > > In some cases, we see that the filesystem initialization might take time for > the first time that we create the provider and initialize it. This change > will log the time taken if it exceeds a certain threshold -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
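The change SPARK-42567 describes — time the provider load and warn only past a threshold — can be sketched as follows. The names and the threshold value are illustrative assumptions, not Spark's actual code:

```python
import logging
import time

log = logging.getLogger("StateStore")

# Assumed threshold for illustration; the real value would be configurable.
LOAD_WARN_THRESHOLD_MS = 2000


def load_provider(factory):
    """Create a state store provider, logging a warning if the load
    (e.g. slow filesystem initialization on first use) takes too long."""
    start = time.monotonic()
    provider = factory()
    elapsed_ms = (time.monotonic() - start) * 1000
    if elapsed_ms > LOAD_WARN_THRESHOLD_MS:
        log.warning("Loading state store provider took %.0f ms", elapsed_ms)
    return provider


provider = load_provider(lambda: object())  # fast path: no warning emitted
assert provider is not None
```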