[jira] [Created] (SPARK-40766) Upgrade the guava defined in `plugins.sbt` to `31.0.1-jre`
Yang Jie created SPARK-40766:

Summary: Upgrade the guava defined in `plugins.sbt` to `31.0.1-jre`
Key: SPARK-40766
URL: https://issues.apache.org/jira/browse/SPARK-40766
Project: Spark
Issue Type: Improvement
Components: Build
Affects Versions: 3.4.0
Reporter: Yang Jie

SPARK-40071 upgraded checkstyle to 9.3, and checkstyle 9.3 uses guava 31.0.1-jre.

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-38774) impl Series.autocorr
[ https://issues.apache.org/jira/browse/SPARK-38774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17616201#comment-17616201 ]

Apache Spark commented on SPARK-38774:

User 'zhengruifeng' has created a pull request for this issue: https://github.com/apache/spark/pull/38216

Key: SPARK-38774
URL: https://issues.apache.org/jira/browse/SPARK-38774
Project: Spark
Issue Type: Sub-task
Components: PySpark
Affects Versions: 3.4.0
Reporter: Ruifeng Zheng
Assignee: Ruifeng Zheng
Priority: Major
Fix For: 3.4.0
[jira] [Commented] (SPARK-38774) impl Series.autocorr
[ https://issues.apache.org/jira/browse/SPARK-38774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17616200#comment-17616200 ]

Apache Spark commented on SPARK-38774:

User 'zhengruifeng' has created a pull request for this issue: https://github.com/apache/spark/pull/38216

Key: SPARK-38774
URL: https://issues.apache.org/jira/browse/SPARK-38774
Project: Spark
Issue Type: Sub-task
Components: PySpark
Affects Versions: 3.4.0
Reporter: Ruifeng Zheng
Assignee: Ruifeng Zheng
Priority: Major
Fix For: 3.4.0
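For readers unfamiliar with the API being implemented above: `Series.autocorr(lag)` is the Pearson correlation between a series and a copy of itself shifted by `lag`. The following pure-Python sketch illustrates the semantics only; it is not the pandas-on-Spark implementation from the linked pull request.

```python
from math import sqrt

def autocorr(values, lag=1):
    """Pearson correlation of `values` with itself shifted by `lag`."""
    x = values[:-lag] if lag else values[:]
    y = values[lag:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# A linearly increasing series is perfectly correlated with its lagged self,
# so this prints a value of (approximately, up to float rounding) 1.0.
print(autocorr([1.0, 2.0, 3.0, 4.0, 5.0], lag=1))
```

An alternating series such as `[1, 2, 1, 2, 1]` gives an autocorrelation near -1.0 at lag 1, which is a quick way to sanity-check the sign convention.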
[jira] [Assigned] (SPARK-40654) Protobuf support MVP with descriptor files
[ https://issues.apache.org/jira/browse/SPARK-40654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gengliang Wang reassigned SPARK-40654:

Assignee: Sandish Kumar HN

Key: SPARK-40654
URL: https://issues.apache.org/jira/browse/SPARK-40654
Project: Spark
Issue Type: Improvement
Components: Structured Streaming
Affects Versions: 3.4.0
Reporter: Raghu Angadi
Assignee: Sandish Kumar HN
Priority: Major
Fix For: 3.4.0

This is the MVP implementation of protobuf support with descriptor files. Currently in PR https://github.com/apache/spark/pull/37972
[jira] [Updated] (SPARK-40654) Protobuf support MVP with descriptor files
[ https://issues.apache.org/jira/browse/SPARK-40654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gengliang Wang updated SPARK-40654:

Affects Version/s: 3.4.0 (was: 3.3.0)

Key: SPARK-40654
URL: https://issues.apache.org/jira/browse/SPARK-40654
Project: Spark
Issue Type: Improvement
Components: Structured Streaming
Affects Versions: 3.4.0
Reporter: Raghu Angadi
Priority: Major
Fix For: 3.4.0

This is the MVP implementation of protobuf support with descriptor files. Currently in PR https://github.com/apache/spark/pull/37972
[jira] [Commented] (SPARK-40765) Optimize redundant fs operations in `CommandUtils#calculateSingleLocationSize#getPathSize` method
[ https://issues.apache.org/jira/browse/SPARK-40765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17616171#comment-17616171 ]

Apache Spark commented on SPARK-40765:

User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/38214

Key: SPARK-40765
URL: https://issues.apache.org/jira/browse/SPARK-40765
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.4.0
Reporter: Yang Jie
Priority: Minor

{code:java}
def getPathSize(fs: FileSystem, path: Path): Long = {
  val fileStatus = fs.getFileStatus(path)
  val size = if (fileStatus.isDirectory) {
    fs.listStatus(path).map { status =>
      if (isDataPath(status.getPath, stagingDir)) {
        getPathSize(fs, status.getPath)
      } else {
        0L
      }
    }.sum
  } else {
    fileStatus.getLen
  }
  size
}
{code}

Change the input parameter from `Path` to `FileStatus`, so there is no need to call `fs.getFileStatus(path)` in each recursive call.
[jira] [Assigned] (SPARK-40765) Optimize redundant fs operations in `CommandUtils#calculateSingleLocationSize#getPathSize` method
[ https://issues.apache.org/jira/browse/SPARK-40765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-40765:

Assignee: Apache Spark

Key: SPARK-40765
URL: https://issues.apache.org/jira/browse/SPARK-40765
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.4.0
Reporter: Yang Jie
Assignee: Apache Spark
Priority: Minor
[jira] [Assigned] (SPARK-40765) Optimize redundant fs operations in `CommandUtils#calculateSingleLocationSize#getPathSize` method
[ https://issues.apache.org/jira/browse/SPARK-40765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-40765:

Assignee: (was: Apache Spark)

Key: SPARK-40765
URL: https://issues.apache.org/jira/browse/SPARK-40765
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.4.0
Reporter: Yang Jie
Priority: Minor
[jira] [Created] (SPARK-40765) Optimize redundant fs operations in `CommandUtils#calculateSingleLocationSize#getPathSize` method
Yang Jie created SPARK-40765:

Summary: Optimize redundant fs operations in `CommandUtils#calculateSingleLocationSize#getPathSize` method
Key: SPARK-40765
URL: https://issues.apache.org/jira/browse/SPARK-40765
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.4.0
Reporter: Yang Jie

{code:java}
def getPathSize(fs: FileSystem, path: Path): Long = {
  val fileStatus = fs.getFileStatus(path)
  val size = if (fileStatus.isDirectory) {
    fs.listStatus(path).map { status =>
      if (isDataPath(status.getPath, stagingDir)) {
        getPathSize(fs, status.getPath)
      } else {
        0L
      }
    }.sum
  } else {
    fileStatus.getLen
  }
  size
}
{code}

Change the input parameter from `Path` to `FileStatus`, so there is no need to call `fs.getFileStatus(path)` in each recursive call.
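The redundant-stat pattern above has a familiar analogue in Python's standard library, sketched below as an illustration of the optimization idea (this is not Spark's code): recursing with an already-obtained stat result instead of a bare path means each entry is stat'ed at most once, just as passing a `FileStatus` instead of a `Path` avoids the extra `fs.getFileStatus(path)` per recursive call.

```python
import os
import stat

def tree_size(path, st=None):
    """Total size in bytes of the regular files under `path`.

    The caller may pass an already-known os.stat result for `path`; during
    recursion we reuse the stat info that os.scandir caches in its DirEntry
    objects (on most platforms), mirroring the Path -> FileStatus change.
    """
    st = st if st is not None else os.stat(path)
    if stat.S_ISDIR(st.st_mode):
        total = 0
        with os.scandir(path) as entries:
            for entry in entries:
                # entry.stat() reuses information gathered while listing the
                # directory instead of issuing a second filesystem call.
                total += tree_size(entry.path, entry.stat())
        return total
    return st.st_size
```

The design point is the same in both languages: the listing operation already returns per-entry metadata, so re-fetching it per entry only adds filesystem round-trips.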
[jira] [Commented] (SPARK-40704) Pyspark incompatible with Pypy
[ https://issues.apache.org/jira/browse/SPARK-40704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17616168#comment-17616168 ]

Hyukjin Kwon commented on SPARK-40704:

[~ggbaker] Please go ahead and upgrade cloudpickle. See also https://github.com/apache/spark/pull/34705

Key: SPARK-40704
URL: https://issues.apache.org/jira/browse/SPARK-40704
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 3.3.0
Reporter: Greg Baker
Priority: Major

Starting Spark with a recent PyPy (>3.6) fails because of an incompatibility between its pickle implementation and cloudpickle:

{quote}
% PYSPARK_PYTHON=pypy3 ./bin/pyspark
...
ModuleNotFoundError: No module named '_pickle'
{quote}

It seems to be related to [this cloudpickle issue|https://github.com/cloudpipe/cloudpickle/issues/455], which has been fixed upstream. I was able to work around it by replacing the Spark-provided cloudpickle (python/pyspark/cloudpickle) with the code from their git repo (and deleting pyspark.zip to purge that copy).
[jira] [Created] (SPARK-40764) Extract partitioning through all children output expressions
Yuming Wang created SPARK-40764:

Summary: Extract partitioning through all children output expressions
Key: SPARK-40764
URL: https://issues.apache.org/jira/browse/SPARK-40764
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.4.0
Reporter: Yuming Wang

{code:sql}
WITH web_tv as (
  select ws_item_sk item_sk,
         d_date,
         sum(ws_sales_price) sumws,
         row_number() over (partition by ws_item_sk order by d_date) rk
  from web_sales, date_dim
  where ws_sold_date_sk = d_date_sk
    and d_month_seq between 1212 and 1212 + 11
    and ws_item_sk is not NULL
  group by ws_item_sk, d_date),
web_v1 as (
  select v1.item_sk, v1.d_date, v1.sumws, sum(v2.sumws) cume_sales
  from web_tv v1, web_tv v2
  where v1.item_sk = v2.item_sk
    and v1.rk >= v2.rk
  group by v1.item_sk, v1.d_date, v1.sumws)
select * from web_v1
{code}

Before:

{noformat}
== Physical Plan ==
*(13) HashAggregate(keys=[item_sk#1, d_date#2, sumws#3], functions=[sum(sumws#4)], output=[item_sk#1, d_date#2, sumws#3, cume_sales#5])
+- *(13) HashAggregate(keys=[item_sk#1, d_date#2, sumws#3], functions=[partial_sum(sumws#4)], output=[item_sk#1, d_date#2, sumws#3, sum#132, isEmpty#133])
   +- *(13) Project [item_sk#1, d_date#2, sumws#3, sumws#4]
      +- *(13) SortMergeJoin [item_sk#1], [item_sk#6], Inner, (rk#7 >= rk#8)
         :- *(6) Sort [item_sk#1 ASC NULLS FIRST], false, 0
         :  +- Exchange hashpartitioning(item_sk#1, 5), ENSURE_REQUIREMENTS, [plan_id=1]
         :     +- *(5) Project [item_sk#1, d_date#2, sumws#3, rk#7]
         :        +- Window [row_number() windowspecdefinition(ws_item_sk#9, d_date#2 ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rk#7], [ws_item_sk#9], [d_date#2 ASC NULLS FIRST]
         :           +- *(4) Sort [ws_item_sk#9 ASC NULLS FIRST, d_date#2 ASC NULLS FIRST], false, 0
         :              +- Exchange hashpartitioning(ws_item_sk#9, 5), ENSURE_REQUIREMENTS, [plan_id=2]
         :                 +- *(3) HashAggregate(keys=[ws_item_sk#9, d_date#2], functions=[sum(UnscaledValue(ws_sales_price#10))], output=[item_sk#1, d_date#2, sumws#3, ws_item_sk#9])
         :                    +- Exchange hashpartitioning(ws_item_sk#9, d_date#2, 5), ENSURE_REQUIREMENTS, [plan_id=3]
         :                       +- *(2) HashAggregate(keys=[ws_item_sk#9, d_date#2], functions=[partial_sum(UnscaledValue(ws_sales_price#10))], output=[ws_item_sk#9, d_date#2, sum#134])
         :                          +- *(2) Project [ws_item_sk#9, ws_sales_price#10, d_date#2]
         :                             +- *(2) BroadcastHashJoin [ws_sold_date_sk#11], [d_date_sk#12], Inner, BuildRight, false
         :                                :- *(2) Filter isnotnull(ws_item_sk#9)
         :                                :  +- *(2) ColumnarToRow
         :                                :     +- FileScan parquet spark_catalog.default.web_sales[ws_item_sk#9,ws_sales_price#10,ws_sold_date_sk#11] Batched: true, DataFilters: [isnotnull(ws_item_sk#9)], Format: Parquet, Location: InMemoryFileIndex(0 paths)[], PartitionFilters: [isnotnull(ws_sold_date_sk#11)], PushedFilters: [IsNotNull(ws_item_sk)], ReadSchema: struct
         :                                +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=4]
         :                                   +- *(1) Project [d_date_sk#12, d_date#2]
         :                                      +- *(1) Filter (((isnotnull(d_month_seq#44) AND (d_month_seq#44 >= 1212)) AND (d_month_seq#44 <= 1223)) AND isnotnull(d_date_sk#12))
         :                                         +- *(1) ColumnarToRow
         :                                            +- FileScan parquet spark_catalog.default.date_dim[d_date_sk#12,d_date#2,d_month_seq#44] Batched: true, DataFilters: [isnotnull(d_month_seq#44), (d_month_seq#44 >= 1212), (d_month_seq#44 <= 1223), isnotnull(d_date_..., Format: Parquet, Location: InMemoryFileIndex(1 paths)[file:/Users/yumwang/spark/parquet-1.13/launcher/spark-warehouse/org.ap..., PartitionFilters: [], PushedFilters: [IsNotNull(d_month_seq), GreaterThanOrEqual(d_month_seq,1212), LessThanOrEqual(d_month_seq,1223),..., ReadSchema: struct
         +- *(12) Sort [item_sk#6 ASC NULLS FIRST], false, 0
            +- Exchange hashpartitioning(item_sk#6, 5), ENSURE_REQUIREMENTS, [plan_id=5]
               +- *(11) Project [item_sk#1 AS item_sk#6, sumws#3 AS sumws#4, rk#8]
                  +- Window [row_number() windowspecdefinition(ws_item_sk#70, d_date#71 ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rk#8], [ws_item_sk#70], [d_date#71 ASC NULLS
{noformat}
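The improvement requested above can be illustrated with a toy model (a sketch of the idea only, not Spark's planner code): when a Project aliases a child output, as in `item_sk#1 AS item_sk#6` in the plan, the child's hash partitioning on `item_sk#1` should also be advertised under the alias, so that a downstream requirement keyed on `item_sk#6` can be satisfied without an extra `Exchange hashpartitioning(item_sk#6, ...)`.

```python
def project_partitionings(child_partitioning, aliases):
    """Expand a child's partitioning key tuple through a projection's
    alias map ({alias_name: source_name}), returning every equivalent
    partitioning the projection's output can claim to satisfy."""
    out = {tuple(child_partitioning)}
    for i, key in enumerate(child_partitioning):
        for alias, source in aliases.items():
            if source == key:
                rewritten = list(child_partitioning)
                rewritten[i] = alias
                out.add(tuple(rewritten))
    return out

# The Project emits item_sk#1 under the new name item_sk#6, so a child
# hash-partitioned on item_sk#1 also satisfies hashing on item_sk#6.
parts = project_partitionings(("item_sk#1",), {"item_sk#6": "item_sk#1"})
assert ("item_sk#1",) in parts
assert ("item_sk#6",) in parts
```

In the quoted plan, recognizing this equivalence would let the join consume the child's existing partitioning instead of inserting the extra shuffle before the second `Sort`.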
[jira] [Resolved] (SPARK-40722) How to set the BlockManager hostname to an IP address
[ https://issues.apache.org/jira/browse/SPARK-40722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-40722.

Resolution: Invalid

Key: SPARK-40722
URL: https://issues.apache.org/jira/browse/SPARK-40722
Project: Spark
Issue Type: Question
Components: Block Manager
Affects Versions: 2.4.3
Reporter: Chen Xia
Priority: Major

{code:java}
2022-10-09 17:22:42.517 [INFO ] [YARN application state monitor] o.a.s.u.SparkUI (54) [logInfo] - Stopped Spark web UI at http://linkis-demo-cg-engineconnmanager-76778ff4b5-sf9xz.linkis-demo-cg-engineconnmanager-headless.linkis.svc.cluster.local:4040
2022-10-09 17:46:09.854 [INFO ] [main] o.a.s.s.BlockManager (54) [logInfo] - Initialized BlockManager: BlockManagerId(driver, linkis-demo-cg-engineconnmanager-76778ff4b5-sf9xz.linkis-demo-cg-engineconnmanager-headless.linkis.svc.cluster.local, 38798, None)
{code}

I want to replace the canonical host name (linkis-demo-cg-engineconnmanager-76778ff4b5-sf9xz.linkis-demo-cg-engineconnmanager-headless.linkis.svc.cluster.local) with an IP address such as 10.10.10.10.
[jira] [Commented] (SPARK-40722) How to set the BlockManager hostname to an IP address
[ https://issues.apache.org/jira/browse/SPARK-40722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17616167#comment-17616167 ]

Hyukjin Kwon commented on SPARK-40722:

[~casion] For questions, it's better to interact on the Spark dev mailing list. Let's discuss there before filing it as an issue.

Key: SPARK-40722
URL: https://issues.apache.org/jira/browse/SPARK-40722
Project: Spark
Issue Type: Question
Components: Block Manager
Affects Versions: 2.4.3
Reporter: Chen Xia
Priority: Major

{code:java}
2022-10-09 17:22:42.517 [INFO ] [YARN application state monitor] o.a.s.u.SparkUI (54) [logInfo] - Stopped Spark web UI at http://linkis-demo-cg-engineconnmanager-76778ff4b5-sf9xz.linkis-demo-cg-engineconnmanager-headless.linkis.svc.cluster.local:4040
2022-10-09 17:46:09.854 [INFO ] [main] o.a.s.s.BlockManager (54) [logInfo] - Initialized BlockManager: BlockManagerId(driver, linkis-demo-cg-engineconnmanager-76778ff4b5-sf9xz.linkis-demo-cg-engineconnmanager-headless.linkis.svc.cluster.local, 38798, None)
{code}

I want to replace the canonical host name with an IP address such as 10.10.10.10.
[jira] [Updated] (SPARK-40736) Spark 3.3.0 doesn't work with Hive 3.1.2
[ https://issues.apache.org/jira/browse/SPARK-40736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-40736:

Component/s: SQL (was: Spark Core)

Key: SPARK-40736
URL: https://issues.apache.org/jira/browse/SPARK-40736
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.3.0
Reporter: Pratik Malani
Priority: Major
Labels: Hive, spark

Hive 2.3.9 is affected by CVE-2021-34538, so we are trying to use Hive 3.1.2. Using Spark 3.3.0 with Hadoop 3.3.4 and Hive 3.1.2, we get the error below when starting the Thrift server:

{noformat}
Exception in thread "main" java.lang.IllegalAccessError: tried to access class org.apache.hive.service.server.HiveServer2$ServerOptionsProcessor from class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$
        at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:92)
        at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
{noformat}

Using the command below to start the Thrift server:

spark-class org.apache.spark.deploy.SparkSubmit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 spark-internal

SPARK_HOME is set correctly. The same works well with Hive 2.3.9, but fails when we upgrade to Hive 3.1.2.
[jira] [Updated] (SPARK-40736) Spark 3.3.0 doesn't work with Hive 3.1.2
[ https://issues.apache.org/jira/browse/SPARK-40736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-40736:

Labels: Hive spark (was: Hive spark spark3.0)

Key: SPARK-40736
URL: https://issues.apache.org/jira/browse/SPARK-40736
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 3.3.0
Reporter: Pratik Malani
Priority: Major
Labels: Hive, spark
[jira] [Updated] (SPARK-40736) Spark 3.3.0 doesn't work with Hive 3.1.2
[ https://issues.apache.org/jira/browse/SPARK-40736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-40736:

Fix Version/s: (was: 3.3.1)

Key: SPARK-40736
URL: https://issues.apache.org/jira/browse/SPARK-40736
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 3.3.0
Reporter: Pratik Malani
Priority: Major
Labels: Hive, spark, spark3.0
[jira] [Resolved] (SPARK-40741) Spark bin/beeline handles distribute by / sort by statements poorly; the output is incorrect
[ https://issues.apache.org/jira/browse/SPARK-40741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-40741.

Resolution: Incomplete

Key: SPARK-40741
URL: https://issues.apache.org/jira/browse/SPARK-40741
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.1.0
Environment: spark 3.1, hive 3.0
Reporter: kaiqingli
Priority: Major

When using distribute by ... sort by ... in SQL, the result produced through spark/bin/beeline is wrong, while hive/beeline produces the correct result. The concrete scenario: first split the array data with posexplode, then sort by the split index, then collect_list; the rebuilt result does not match the original array. The SQL is as follows:

{code:sql}
select id,
       samplingtimesec,
       array_data = new_array_data flag,
       array_data,
       new_array_data
from (
  select id,
         samplingtimesec,
         array_data,
         concat('[', concat_ws(',', collect_list(cell_voltage)), ']') new_array_data
  from (
    select id, samplingtimesec, array_data, cell_index, cell_voltage
    from (
      select id,
             samplingtimesec,
             array_data, -- format: [1,2,3,4,5]
             row_number() over (partition by id, samplingtimesec order by samplingtimesec) r -- deduplicate
      from table
      WHERE dt = '20221007'
        and samplingtimesec <= 166507920
    ) tmp
    lateral view posexplode(split(replace(replace(array_data, '[', ''), ']', ''), ',')) v0 as cell_index, cell_voltage
    where r = 1
    distribute by id, samplingtimesec sort by cell_index
  ) tmp
  group by id, samplingtimesec, array_data
) tmp
where array_data != new_array_data;
{code}

For the SQL above, hive/beeline returns 0 rows; spark/beeline returns a non-zero number of rows.
[jira] [Commented] (SPARK-40741) Spark bin/beeline handles distribute by / sort by statements poorly; the output is incorrect
[ https://issues.apache.org/jira/browse/SPARK-40741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17616166#comment-17616166 ]

Hyukjin Kwon commented on SPARK-40741:

[~lkqqingcao] It would be great to file the issue in English, because there are many maintainers who don't speak other languages.

Key: SPARK-40741
URL: https://issues.apache.org/jira/browse/SPARK-40741
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.1.0
Environment: spark 3.1, hive 3.0
Reporter: kaiqingli
Priority: Major
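A plausible explanation of the mismatch reported in SPARK-40741, sketched as a toy model (an illustration only, not either engine's implementation): collect_list gathers values in whatever order rows reach the aggregation, and `sort by` only orders rows within each partition, so if a later exchange or aggregation step does not preserve that per-partition order, the rebuilt array can differ from the original even though the row set is identical.

```python
from collections import defaultdict

def collect_list(rows):
    """Group (key, value) rows in arrival order, like SQL collect_list."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return dict(groups)

rows = [("id1", "a"), ("id1", "b"), ("id1", "c")]
# Same rows, different arrival order (e.g. after a shuffle that does not
# preserve the per-partition sort):
shuffled = [rows[2], rows[0], rows[1]]

assert collect_list(rows)["id1"] == ["a", "b", "c"]
assert collect_list(shuffled)["id1"] == ["c", "a", "b"]  # order changed
```

This is why aggregating a sorted exploded array back with collect_list is only reliable when the engine guarantees the pre-aggregation order, which is exactly the guarantee the two beeline paths appear to differ on.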
[jira] [Assigned] (SPARK-8731) Beeline doesn't work with -e option when started in background
[ https://issues.apache.org/jira/browse/SPARK-8731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kent Yao reassigned SPARK-8731:

Assignee: Apache Spark (was: Kent Yao)

Key: SPARK-8731
URL: https://issues.apache.org/jira/browse/SPARK-8731
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.2.0
Reporter: Wang Yiguang
Assignee: Apache Spark
Priority: Minor
Fix For: 3.4.0, 3.2.3, 3.3.2

Beeline stops when run in the background like this:

beeline -e "some query" &

It doesn't work even with the -f switch. For example, this works:

beeline -u "jdbc:hive2://0.0.0.0:8000" -e "show databases;"

but this does not:

beeline -u "jdbc:hive2://0.0.0.0:8000" -e "show databases;" &
[jira] [Resolved] (SPARK-8731) Beeline doesn't work with -e option when started in background
[ https://issues.apache.org/jira/browse/SPARK-8731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kent Yao resolved SPARK-8731.

Fix Version/s: 3.4.0, 3.2.3, 3.3.2
Resolution: Fixed

Key: SPARK-8731
URL: https://issues.apache.org/jira/browse/SPARK-8731
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.2.0
Reporter: Wang Yiguang
Assignee: Kent Yao
Priority: Minor
Fix For: 3.4.0, 3.2.3, 3.3.2

Beeline stops when run in the background like this:

beeline -e "some query" &

It doesn't work even with the -f switch. For example, this works:

beeline -u "jdbc:hive2://0.0.0.0:8000" -e "show databases;"

but this does not:

beeline -u "jdbc:hive2://0.0.0.0:8000" -e "show databases;" &
[jira] [Reopened] (SPARK-8731) Beeline doesn't work with -e option when started in background
[ https://issues.apache.org/jira/browse/SPARK-8731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kent Yao reopened SPARK-8731:

Assignee: Kent Yao

Key: SPARK-8731
URL: https://issues.apache.org/jira/browse/SPARK-8731
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.2.0
Reporter: Wang Yiguang
Assignee: Kent Yao
Priority: Minor

Beeline stops when run in the background like this:

beeline -e "some query" &

It doesn't work even with the -f switch. For example, this works:

beeline -u "jdbc:hive2://0.0.0.0:8000" -e "show databases;"

but this does not:

beeline -u "jdbc:hive2://0.0.0.0:8000" -e "show databases;" &
[jira] [Commented] (SPARK-40655) Protobuf functions in Python
[ https://issues.apache.org/jira/browse/SPARK-40655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17616155#comment-17616155 ] Apache Spark commented on SPARK-40655: -- User 'SandishKumarHN' has created a pull request for this issue: https://github.com/apache/spark/pull/38212 > Protobuf functions in Python > - > > Key: SPARK-40655 > URL: https://issues.apache.org/jira/browse/SPARK-40655 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming >Affects Versions: 3.3.0 >Reporter: Raghu Angadi >Priority: Major > > Add Python support for Protobuf functions -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-40655) Protobuf functions in Python
[ https://issues.apache.org/jira/browse/SPARK-40655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17616156#comment-17616156 ] Apache Spark commented on SPARK-40655: -- User 'SandishKumarHN' has created a pull request for this issue: https://github.com/apache/spark/pull/38212 > Protobuf functions in Python > - > > Key: SPARK-40655 > URL: https://issues.apache.org/jira/browse/SPARK-40655 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming >Affects Versions: 3.3.0 >Reporter: Raghu Angadi >Priority: Major > > Add Python support for Protobuf functions -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-40746) Make Dockerfile build workflow work in apache repo
[ https://issues.apache.org/jira/browse/SPARK-40746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yikun Jiang updated SPARK-40746: Summary: Make Dockerfile build workflow work in apache repo (was: Switch workflow trigger event to `pull_request_target`) > Make Dockerfile build workflow work in apache repo > -- > > Key: SPARK-40746 > URL: https://issues.apache.org/jira/browse/SPARK-40746 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: 3.4.0 >Reporter: Yikun Jiang >Assignee: Yikun Jiang >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-40746) Switch workflow trigger event to `pull_request_target`
[ https://issues.apache.org/jira/browse/SPARK-40746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yikun Jiang resolved SPARK-40746. - Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 7 [https://github.com/apache/spark-docker/pull/7] > Switch workflow trigger event to `pull_request_target` > -- > > Key: SPARK-40746 > URL: https://issues.apache.org/jira/browse/SPARK-40746 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: 3.4.0 >Reporter: Yikun Jiang >Assignee: Yikun Jiang >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-40746) Switch workflow trigger event to `pull_request_target`
[ https://issues.apache.org/jira/browse/SPARK-40746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yikun Jiang reassigned SPARK-40746: --- Assignee: Yikun Jiang > Switch workflow trigger event to `pull_request_target` > -- > > Key: SPARK-40746 > URL: https://issues.apache.org/jira/browse/SPARK-40746 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: 3.4.0 >Reporter: Yikun Jiang >Assignee: Yikun Jiang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-40757) Add PULL_REQUEST_TEMPLATE for spark-docker
[ https://issues.apache.org/jira/browse/SPARK-40757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yikun Jiang resolved SPARK-40757. - Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 8 [https://github.com/apache/spark-docker/pull/8] > Add PULL_REQUEST_TEMPLATE for spark-docker > -- > > Key: SPARK-40757 > URL: https://issues.apache.org/jira/browse/SPARK-40757 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: 3.4.0 >Reporter: Yikun Jiang >Assignee: Yikun Jiang >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-40757) Add PULL_REQUEST_TEMPLATE for spark-docker
[ https://issues.apache.org/jira/browse/SPARK-40757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yikun Jiang reassigned SPARK-40757: --- Assignee: Yikun Jiang > Add PULL_REQUEST_TEMPLATE for spark-docker > -- > > Key: SPARK-40757 > URL: https://issues.apache.org/jira/browse/SPARK-40757 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: 3.4.0 >Reporter: Yikun Jiang >Assignee: Yikun Jiang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-40654) Protobuf support MVP with descriptor files
[ https://issues.apache.org/jira/browse/SPARK-40654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-40654. Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37972 [https://github.com/apache/spark/pull/37972] > Protobuf support MVP with descriptor files > -- > > Key: SPARK-40654 > URL: https://issues.apache.org/jira/browse/SPARK-40654 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming >Affects Versions: 3.3.0 >Reporter: Raghu Angadi >Priority: Major > Fix For: 3.4.0 > > > This is the MVP implementation of protobuf support with descriptor files. > Currently in PR https://github.com/apache/spark/pull/37972 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-40763) Should expose driver service name to config for user features
[ https://issues.apache.org/jira/browse/SPARK-40763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40763: Assignee: (was: Apache Spark) > Should expose driver service name to config for user features > - > > Key: SPARK-40763 > URL: https://issues.apache.org/jira/browse/SPARK-40763 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 3.3.0 >Reporter: binjie yang >Priority: Minor > > Currently on Kubernetes, a user's feature step, which builds the user's > Kubernetes resources while spark-submit creates the Spark pods, cannot > perceive some Spark resource info, such as the Spark driver service name. > > Users may want some Spark pod info exposed to build their custom resources, > such as an Ingress. > > We want a way to expose the Spark driver service name, which is currently > generated from a clock and a UUID. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-40763) Should expose driver service name to config for user features
[ https://issues.apache.org/jira/browse/SPARK-40763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40763: Assignee: Apache Spark > Should expose driver service name to config for user features > - > > Key: SPARK-40763 > URL: https://issues.apache.org/jira/browse/SPARK-40763 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 3.3.0 >Reporter: binjie yang >Assignee: Apache Spark >Priority: Minor > > Currently on Kubernetes, a user's feature step, which builds the user's > Kubernetes resources while spark-submit creates the Spark pods, cannot > perceive some Spark resource info, such as the Spark driver service name. > > Users may want some Spark pod info exposed to build their custom resources, > such as an Ingress. > > We want a way to expose the Spark driver service name, which is currently > generated from a clock and a UUID. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-40763) Should expose driver service name to config for user features
binjie yang created SPARK-40763: --- Summary: Should expose driver service name to config for user features Key: SPARK-40763 URL: https://issues.apache.org/jira/browse/SPARK-40763 Project: Spark Issue Type: Improvement Components: Kubernetes Affects Versions: 3.3.0 Reporter: binjie yang Currently on Kubernetes, a user's feature step, which builds the user's Kubernetes resources while spark-submit creates the Spark pods, cannot perceive some Spark resource info, such as the Spark driver service name. Users may want some Spark pod info exposed to build their custom resources, such as an Ingress. We want a way to expose the Spark driver service name, which is currently generated from a clock and a UUID. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-40763) Should expose driver service name to config for user features
[ https://issues.apache.org/jira/browse/SPARK-40763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17616138#comment-17616138 ] Apache Spark commented on SPARK-40763: -- User 'zwangsheng' has created a pull request for this issue: https://github.com/apache/spark/pull/38202 > Should expose driver service name to config for user features > - > > Key: SPARK-40763 > URL: https://issues.apache.org/jira/browse/SPARK-40763 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 3.3.0 >Reporter: binjie yang >Priority: Minor > > Currently on Kubernetes, a user's feature step, which builds the user's > Kubernetes resources while spark-submit creates the Spark pods, cannot > perceive some Spark resource info, such as the Spark driver service name. > > Users may want some Spark pod info exposed to build their custom resources, > such as an Ingress. > > We want a way to expose the Spark driver service name, which is currently > generated from a clock and a UUID. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37940) Use error classes in the compilation errors of partitions
[ https://issues.apache.org/jira/browse/SPARK-37940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17616110#comment-17616110 ] BingKun Pan commented on SPARK-37940: - [~maxgekk] okay, I will work on this! > Use error classes in the compilation errors of partitions > - > > Key: SPARK-37940 > URL: https://issues.apache.org/jira/browse/SPARK-37940 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Priority: Major > > Migrate the following errors in QueryCompilationErrors: > * unsupportedIfNotExistsError > * nonPartitionColError > * missingStaticPartitionColumn > * alterV2TableSetLocationWithPartitionNotSupportedError > * invalidPartitionSpecError > * partitionNotSpecifyLocationUriError > * describeDoesNotSupportPartitionForV2TablesError > * tableDoesNotSupportPartitionManagementError > * tableDoesNotSupportAtomicPartitionManagementError > * alterTableRecoverPartitionsNotSupportedForV2TablesError > * partitionColumnNotSpecifiedError > * invalidPartitionColumnError > * multiplePartitionColumnValuesSpecifiedError > * cannotUseDataTypeForPartitionColumnError > * cannotUseAllColumnsForPartitionColumnsError > * partitionColumnNotFoundInSchemaError > * mismatchedTablePartitionColumnError > onto use error classes. Throw an implementation of SparkThrowable. Also write > a test per every error in QueryCompilationErrorsSuite. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-40457) upgrade jackson data mapper to latest
[ https://issues.apache.org/jira/browse/SPARK-40457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17616103#comment-17616103 ] Hyukjin Kwon commented on SPARK-40457: -- We're going to drop Hadoop 2 from Apache Spark 3.4. Is this still an issue? > upgrade jackson data mapper to latest > -- > > Key: SPARK-40457 > URL: https://issues.apache.org/jira/browse/SPARK-40457 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Bilna >Priority: Major > > Upgrade jackson-mapper-asl to the latest to resolve CVE-2019-10172 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38959) DataSource V2: Support runtime group filtering in row-level commands
[ https://issues.apache.org/jira/browse/SPARK-38959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-38959: - Assignee: Anton Okolnychyi > DataSource V2: Support runtime group filtering in row-level commands > > > Key: SPARK-38959 > URL: https://issues.apache.org/jira/browse/SPARK-38959 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Anton Okolnychyi >Assignee: Anton Okolnychyi >Priority: Major > > Spark has to support runtime filtering of groups during row-level operations > for group-based sources. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38959) DataSource V2: Support runtime group filtering in row-level commands
[ https://issues.apache.org/jira/browse/SPARK-38959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-38959. --- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 36304 [https://github.com/apache/spark/pull/36304] > DataSource V2: Support runtime group filtering in row-level commands > > > Key: SPARK-38959 > URL: https://issues.apache.org/jira/browse/SPARK-38959 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Anton Okolnychyi >Assignee: Anton Okolnychyi >Priority: Major > Fix For: 3.4.0 > > > Spark has to support runtime filtering of groups during row-level operations > for group-based sources. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-40416) Add error classes for subquery expression CheckAnalysis failures
[ https://issues.apache.org/jira/browse/SPARK-40416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17616057#comment-17616057 ] Apache Spark commented on SPARK-40416: -- User 'allisonwang-db' has created a pull request for this issue: https://github.com/apache/spark/pull/38210 > Add error classes for subquery expression CheckAnalysis failures > > > Key: SPARK-40416 > URL: https://issues.apache.org/jira/browse/SPARK-40416 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Daniel >Assignee: Daniel >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-40416) Add error classes for subquery expression CheckAnalysis failures
[ https://issues.apache.org/jira/browse/SPARK-40416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17616056#comment-17616056 ] Apache Spark commented on SPARK-40416: -- User 'allisonwang-db' has created a pull request for this issue: https://github.com/apache/spark/pull/38210 > Add error classes for subquery expression CheckAnalysis failures > > > Key: SPARK-40416 > URL: https://issues.apache.org/jira/browse/SPARK-40416 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Daniel >Assignee: Daniel >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-40658) Protobuf v2 & v3 support
[ https://issues.apache.org/jira/browse/SPARK-40658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17616010#comment-17616010 ] Raghu Angadi commented on SPARK-40658: -- > If we can get away without specifying V2 or V3 or ANY, that would be the > simplest. Agreed. Better to start without such an option. It might matter in a small number of corner cases, as discussed above. Let's see if we can find a good policy without making the user choose. > Protobuf v2 & v3 support > > > Key: SPARK-40658 > URL: https://issues.apache.org/jira/browse/SPARK-40658 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming >Affects Versions: 3.3.0 >Reporter: Raghu Angadi >Priority: Major > > We want to ensure Protobuf functions support both Protobuf version 2 and > version 3 schemas (e.g. descriptor file or compiled classes with v2 and v3). > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33782) Place spark.files, spark.jars and spark.files under the current working directory on the driver in K8S cluster mode
[ https://issues.apache.org/jira/browse/SPARK-33782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615997#comment-17615997 ] pralabhkumar commented on SPARK-33782: -- [~hyukjin.kwon] Can you please help review the PR? It would be of great help. > Place spark.files, spark.jars and spark.files under the current working > directory on the driver in K8S cluster mode > --- > > Key: SPARK-33782 > URL: https://issues.apache.org/jira/browse/SPARK-33782 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 3.2.0 >Reporter: Hyukjin Kwon >Priority: Major > > In YARN cluster mode, the passed files are able to be accessed in the > current working directory. Looks like this is not the case in Kubernetes > cluster mode. > By doing this, users can, for example, leverage PEX to manage Python > dependencies in Apache Spark: > {code} > pex pyspark==3.0.1 pyarrow==0.15.1 pandas==0.25.3 -o myarchive.pex > PYSPARK_PYTHON=./myarchive.pex spark-submit --files myarchive.pex > {code} > See also https://github.com/apache/spark/pull/30735/files#r540935585. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-21195) MetricSystem should pick up dynamically registered metrics in sources
[ https://issues.apache.org/jira/browse/SPARK-21195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615987#comment-17615987 ] Apache Spark commented on SPARK-21195: -- User 'robert3005' has created a pull request for this issue: https://github.com/apache/spark/pull/38209 > MetricSystem should pick up dynamically registered metrics in sources > - > > Key: SPARK-21195 > URL: https://issues.apache.org/jira/browse/SPARK-21195 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 2.1.1 >Reporter: Robert Kruszewski >Priority: Minor > Labels: bulk-closed > > Currently when MetricsSystem registers a source it only picks up currently > registered metrics. It's quite cumbersome and leads to a lot of boilerplate > to preregister all metrics especially with systems that use instrumentation. > This change proposes to teach MetricsSystem to watch metrics added to sources > and dynamically register them -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
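The SPARK-21195 proposal above (a metrics system that also picks up metrics registered *after* a source is attached) can be sketched roughly as follows. All names here (`WatchableRegistry`, `SimpleMetricsSystem`) are illustrative stand-ins, not Spark's or Dropwizard's actual APIs:

```scala
import scala.collection.mutable

// A registry that notifies listeners whenever a metric is added, so a
// metrics system can subscribe instead of taking a one-time snapshot.
class WatchableRegistry {
  private val metrics = mutable.Map[String, () => Long]()
  private val listeners = mutable.Buffer[String => Unit]()

  def onMetricAdded(listener: String => Unit): Unit = listeners += listener

  def register(name: String, gauge: () => Long): Unit = {
    metrics(name) = gauge
    listeners.foreach(_(name))  // fire for late registrations too
  }

  def names: Set[String] = metrics.keySet.toSet
}

class SimpleMetricsSystem {
  val reported = mutable.Set[String]()

  // Snapshot the current metrics AND subscribe to future registrations.
  def registerSource(registry: WatchableRegistry): Unit = {
    registry.names.foreach(reported += _)
    registry.onMetricAdded(name => reported += name)
  }
}

val registry = new WatchableRegistry
registry.register("jvm.heap.used", () => 42L)

val system = new SimpleMetricsSystem
system.registerSource(registry)

// Registered after the source was attached; still picked up.
registry.register("jvm.threads.live", () => 7L)
println(system.reported.toList.sorted)
```

Without the listener subscription, only `jvm.heap.used` would be reported, which is exactly the preregistration boilerplate the ticket complains about.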
[jira] [Assigned] (SPARK-40762) Check error classes in ErrorParserSuite
[ https://issues.apache.org/jira/browse/SPARK-40762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40762: Assignee: Apache Spark (was: Max Gekk) > Check error classes in ErrorParserSuite > --- > > Key: SPARK-40762 > URL: https://issues.apache.org/jira/browse/SPARK-40762 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Assignee: Apache Spark >Priority: Major > > Check error classes in ErrorParserSuite by using checkError(). For instance, > replace > {code:scala} > intercept("select *\nfrom r\norder by q\ncluster by q", 3, 0, 11, > "Combination of ORDER BY/SORT BY/DISTRIBUTE BY/CLUSTER BY is not > supported", > "^^^") > {code} > by > {code:scala} > checkError( > exception = parseException("select *\nfrom r\norder by q\ncluster by > q"), > errorClass = "_LEGACY_ERROR_TEMP_0011", > parameters = Map.empty, > context = ExpectedContext(fragment = "order by q\ncluster by q", start > = 16, stop = 38)) > {code} > at > https://github.com/apache/spark/blob/96ed6dc3e38b22b4aefc41e20ba7c953f8f2251e/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ErrorParserSuite.scala#L79-L83 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-40762) Check error classes in ErrorParserSuite
[ https://issues.apache.org/jira/browse/SPARK-40762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615973#comment-17615973 ] Apache Spark commented on SPARK-40762: -- User 'MaxGekk' has created a pull request for this issue: https://github.com/apache/spark/pull/38204 > Check error classes in ErrorParserSuite > --- > > Key: SPARK-40762 > URL: https://issues.apache.org/jira/browse/SPARK-40762 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > > Check error classes in ErrorParserSuite by using checkError(). For instance, > replace > {code:scala} > intercept("select *\nfrom r\norder by q\ncluster by q", 3, 0, 11, > "Combination of ORDER BY/SORT BY/DISTRIBUTE BY/CLUSTER BY is not > supported", > "^^^") > {code} > by > {code:scala} > checkError( > exception = parseException("select *\nfrom r\norder by q\ncluster by > q"), > errorClass = "_LEGACY_ERROR_TEMP_0011", > parameters = Map.empty, > context = ExpectedContext(fragment = "order by q\ncluster by q", start > = 16, stop = 38)) > {code} > at > https://github.com/apache/spark/blob/96ed6dc3e38b22b4aefc41e20ba7c953f8f2251e/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ErrorParserSuite.scala#L79-L83 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-40762) Check error classes in ErrorParserSuite
[ https://issues.apache.org/jira/browse/SPARK-40762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615974#comment-17615974 ] Apache Spark commented on SPARK-40762: -- User 'MaxGekk' has created a pull request for this issue: https://github.com/apache/spark/pull/38204 > Check error classes in ErrorParserSuite > --- > > Key: SPARK-40762 > URL: https://issues.apache.org/jira/browse/SPARK-40762 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > > Check error classes in ErrorParserSuite by using checkError(). For instance, > replace > {code:scala} > intercept("select *\nfrom r\norder by q\ncluster by q", 3, 0, 11, > "Combination of ORDER BY/SORT BY/DISTRIBUTE BY/CLUSTER BY is not > supported", > "^^^") > {code} > by > {code:scala} > checkError( > exception = parseException("select *\nfrom r\norder by q\ncluster by > q"), > errorClass = "_LEGACY_ERROR_TEMP_0011", > parameters = Map.empty, > context = ExpectedContext(fragment = "order by q\ncluster by q", start > = 16, stop = 38)) > {code} > at > https://github.com/apache/spark/blob/96ed6dc3e38b22b4aefc41e20ba7c953f8f2251e/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ErrorParserSuite.scala#L79-L83 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-40762) Check error classes in ErrorParserSuite
[ https://issues.apache.org/jira/browse/SPARK-40762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40762: Assignee: Max Gekk (was: Apache Spark) > Check error classes in ErrorParserSuite > --- > > Key: SPARK-40762 > URL: https://issues.apache.org/jira/browse/SPARK-40762 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > > Check error classes in ErrorParserSuite by using checkError(). For instance, > replace > {code:scala} > intercept("select *\nfrom r\norder by q\ncluster by q", 3, 0, 11, > "Combination of ORDER BY/SORT BY/DISTRIBUTE BY/CLUSTER BY is not > supported", > "^^^") > {code} > by > {code:scala} > checkError( > exception = parseException("select *\nfrom r\norder by q\ncluster by > q"), > errorClass = "_LEGACY_ERROR_TEMP_0011", > parameters = Map.empty, > context = ExpectedContext(fragment = "order by q\ncluster by q", start > = 16, stop = 38)) > {code} > at > https://github.com/apache/spark/blob/96ed6dc3e38b22b4aefc41e20ba7c953f8f2251e/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ErrorParserSuite.scala#L79-L83 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37946) Use error classes in the execution errors related to partitions
[ https://issues.apache.org/jira/browse/SPARK-37946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-37946: Assignee: (was: Apache Spark) > Use error classes in the execution errors related to partitions > --- > > Key: SPARK-37946 > URL: https://issues.apache.org/jira/browse/SPARK-37946 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Priority: Major > > Migrate the following errors in QueryExecutionErrors: > * unableToDeletePartitionPathError > * unableToCreatePartitionPathError > * unableToRenamePartitionPathError > * notADatasourceRDDPartitionError > * cannotClearPartitionDirectoryError > * failedToCastValueToDataTypeForPartitionColumnError > * unsupportedPartitionTransformError > * cannotCreateJDBCTableWithPartitionsError > * requestedPartitionsMismatchTablePartitionsError > * dynamicPartitionKeyNotAmongWrittenPartitionPathsError > * cannotRemovePartitionDirError > * alterTableWithDropPartitionAndPurgeUnsupportedError > * invalidPartitionFilterError > * getPartitionMetadataByFilterError > * illegalLocationClauseForViewPartitionError > * partitionColumnNotFoundInSchemaError > * cannotAddMultiPartitionsOnNonatomicPartitionTableError > * cannotDropMultiPartitionsOnNonatomicPartitionTableError > * truncateMultiPartitionUnsupportedError > * dynamicPartitionOverwriteUnsupportedByTableError > * writePartitionExceedConfigSizeWhenDynamicPartitionError > onto use error classes. Throw an implementation of SparkThrowable. Also write > a test per every error in QueryExecutionErrorsSuite. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-40760) Migrate type check failures of interval expressions onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-40760: Assignee: (was: Max Gekk) > Migrate type check failures of interval expressions onto error classes > -- > > Key: SPARK-40760 > URL: https://issues.apache.org/jira/browse/SPARK-40760 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in the interval > expressions: > 1. Average > https://github.com/apache/spark/blob/47d119dfc1a06ee2d520396129b4f09bc22d3fb7/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TypeUtils.scala#L78 > 2. ApproxCountDistinctForIntervals (3): > https://github.com/apache/spark/blob/08123a3795683238352e5bf55452de381349fdd9/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervals.scala#L80-L91 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-40761) Migrate type check failures of percentile expressions onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-40761: Assignee: (was: Max Gekk) > Migrate type check failures of percentile expressions onto error classes > > > Key: SPARK-40761 > URL: https://issues.apache.org/jira/browse/SPARK-40761 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in the percentile > expressions: > 1. Percentile (4): > https://github.com/apache/spark/blob/3f3201a7882b817a8a3ecbfeb369dde01e7689d8/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala#L122-L129 > 2. PercentileBase (3): > https://github.com/apache/spark/blob/8559f88f8ed9d22751975150a6d5735653a1e528/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/percentiles.scala#L87-L93 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-40759) Migrate type check failures of time window onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-40759: Assignee: (was: Max Gekk) > Migrate type check failures of time window onto error classes > - > > Key: SPARK-40759 > URL: https://issues.apache.org/jira/browse/SPARK-40759 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in the time > window expressions: > 1. TimeWindow (4): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/TimeWindow.scala#L117-L127 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
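The migration these tickets describe (replacing free-form `TypeCheckFailure` messages with structured `DataTypeMismatch` results) can be illustrated with a simplified, self-contained sketch; the class shapes below are stand-ins, not Spark's actual definitions:

```scala
// Stand-ins for Spark's type-check result hierarchy (simplified).
sealed trait TypeCheckResult
case object TypeCheckSuccess extends TypeCheckResult
final case class DataTypeMismatch(
    errorSubClass: String,
    messageParameters: Map[String, String]) extends TypeCheckResult

// Before the migration this would return a free-form failure string; after,
// it returns a stable error sub-class plus parameters that tests can assert
// on and that message rendering can localize centrally.
def checkWindowDuration(inputType: String): TypeCheckResult =
  if (inputType == "interval") TypeCheckSuccess
  else DataTypeMismatch(
    errorSubClass = "UNEXPECTED_INPUT_TYPE",
    messageParameters = Map(
      "requiredType" -> "interval",
      "inputType" -> inputType))

println(checkWindowDuration("string"))
```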
[jira] [Created] (SPARK-40762) Check error classes in ErrorParserSuite
Max Gekk created SPARK-40762: Summary: Check error classes in ErrorParserSuite Key: SPARK-40762 URL: https://issues.apache.org/jira/browse/SPARK-40762 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Assignee: Max Gekk Check error classes in ErrorParserSuite by using checkError(). For instance, replace {code:scala} intercept("select *\nfrom r\norder by q\ncluster by q", 3, 0, 11, "Combination of ORDER BY/SORT BY/DISTRIBUTE BY/CLUSTER BY is not supported", "^^^") {code} by {code:scala} checkError( exception = parseException("select *\nfrom r\norder by q\ncluster by q"), errorClass = "_LEGACY_ERROR_TEMP_0011", parameters = Map.empty, context = ExpectedContext(fragment = "order by q\ncluster by q", start = 16, stop = 38)) {code} at https://github.com/apache/spark/blob/96ed6dc3e38b22b4aefc41e20ba7c953f8f2251e/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ErrorParserSuite.scala#L79-L83 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-40361) Migrate arithmetic type check failures onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615962#comment-17615962 ] Apache Spark commented on SPARK-40361: -- User 'lvshaokang' has created a pull request for this issue: https://github.com/apache/spark/pull/38208 > Migrate arithmetic type check failures onto error classes > - > > Key: SPARK-40361 > URL: https://issues.apache.org/jira/browse/SPARK-40361 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in arithmetic > expressions: > 1. Least (2): > https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala#L1188-L1191 > 2. Greatest (2): > https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala#L1266-L1269 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-40361) Migrate arithmetic type check failures onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40361: Assignee: (was: Apache Spark) > Migrate arithmetic type check failures onto error classes > - > > Key: SPARK-40361 > URL: https://issues.apache.org/jira/browse/SPARK-40361 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in arithmetic > expressions: > 1. Least (2): > https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala#L1188-L1191 > 2. Greatest (2): > https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala#L1266-L1269 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-40361) Migrate arithmetic type check failures onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40361: Assignee: Apache Spark > Migrate arithmetic type check failures onto error classes > - > > Key: SPARK-40361 > URL: https://issues.apache.org/jira/browse/SPARK-40361 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Assignee: Apache Spark >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in arithmetic > expressions: > 1. Least (2): > https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala#L1188-L1191 > 2. Greatest (2): > https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala#L1266-L1269 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-40361) Migrate arithmetic type check failures onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615961#comment-17615961 ] Apache Spark commented on SPARK-40361: -- User 'lvshaokang' has created a pull request for this issue: https://github.com/apache/spark/pull/38208 > Migrate arithmetic type check failures onto error classes > - > > Key: SPARK-40361 > URL: https://issues.apache.org/jira/browse/SPARK-40361 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in arithmetic > expressions: > 1. Least (2): > https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala#L1188-L1191 > 2. Greatest (2): > https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala#L1266-L1269 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-40745) Reduce the shuffle size of ALS in mllib
[ https://issues.apache.org/jira/browse/SPARK-40745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40745. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 38203 [https://github.com/apache/spark/pull/38203] > Reduce the shuffle size of ALS in mllib > --- > > Key: SPARK-40745 > URL: https://issues.apache.org/jira/browse/SPARK-40745 > Project: Spark > Issue Type: Improvement > Components: MLlib >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-40745) Reduce the shuffle size of ALS in mllib
[ https://issues.apache.org/jira/browse/SPARK-40745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen reassigned SPARK-40745: Assignee: Ruifeng Zheng > Reduce the shuffle size of ALS in mllib > --- > > Key: SPARK-40745 > URL: https://issues.apache.org/jira/browse/SPARK-40745 > Project: Spark > Issue Type: Improvement > Components: MLlib >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-40761) Migrate type check failures of percentile expressions onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-40761: - Description: Replace TypeCheckFailure by DataTypeMismatch in type checks in the percentile expressions: 1. Percentile (4): https://github.com/apache/spark/blob/3f3201a7882b817a8a3ecbfeb369dde01e7689d8/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala#L122-L129 2. PercentileBase (3): https://github.com/apache/spark/blob/8559f88f8ed9d22751975150a6d5735653a1e528/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/percentiles.scala#L87-L93 was: Replace TypeCheckFailure by DataTypeMismatch in type checks in the interval expressions: 1. Average https://github.com/apache/spark/blob/47d119dfc1a06ee2d520396129b4f09bc22d3fb7/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TypeUtils.scala#L78 2. ApproxCountDistinctForIntervals (3): https://github.com/apache/spark/blob/08123a3795683238352e5bf55452de381349fdd9/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervals.scala#L80-L91 > Migrate type check failures of percentile expressions onto error classes > > > Key: SPARK-40761 > URL: https://issues.apache.org/jira/browse/SPARK-40761 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in the percentile > expressions: > 1. Percentile (4): > https://github.com/apache/spark/blob/3f3201a7882b817a8a3ecbfeb369dde01e7689d8/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala#L122-L129 > 2. PercentileBase (3): > https://github.com/apache/spark/blob/8559f88f8ed9d22751975150a6d5735653a1e528/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/percentiles.scala#L87-L93
[jira] [Created] (SPARK-40761) Migrate type check failures of percentile expressions onto error classes
Max Gekk created SPARK-40761: Summary: Migrate type check failures of percentile expressions onto error classes Key: SPARK-40761 URL: https://issues.apache.org/jira/browse/SPARK-40761 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Assignee: Max Gekk Replace TypeCheckFailure by DataTypeMismatch in type checks in the interval expressions: 1. Average https://github.com/apache/spark/blob/47d119dfc1a06ee2d520396129b4f09bc22d3fb7/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TypeUtils.scala#L78 2. ApproxCountDistinctForIntervals (3): https://github.com/apache/spark/blob/08123a3795683238352e5bf55452de381349fdd9/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervals.scala#L80-L91 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-40457) upgrade jackson data mapper to latest
[ https://issues.apache.org/jira/browse/SPARK-40457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615907#comment-17615907 ] Bilna commented on SPARK-40457: --- This link: https://github.com/bjornjorgensen/spark/security/dependabot/1 is giving 404 > upgrade jackson data mapper to latest > -- > > Key: SPARK-40457 > URL: https://issues.apache.org/jira/browse/SPARK-40457 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Bilna >Priority: Major > > Upgrade jackson-mapper-asl to the latest to resolve CVE-2019-10172 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-40760) Migrate type check failures of interval expressions onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-40760: - Description: Replace TypeCheckFailure by DataTypeMismatch in type checks in the interval expressions: 1. Average https://github.com/apache/spark/blob/47d119dfc1a06ee2d520396129b4f09bc22d3fb7/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TypeUtils.scala#L78 2. ApproxCountDistinctForIntervals (3): https://github.com/apache/spark/blob/08123a3795683238352e5bf55452de381349fdd9/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervals.scala#L80-L91 was: Replace TypeCheckFailure by DataTypeMismatch in type checks in the time window expressions: 1. TimeWindow (4): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/TimeWindow.scala#L117-L127 > Migrate type check failures of interval expressions onto error classes > -- > > Key: SPARK-40760 > URL: https://issues.apache.org/jira/browse/SPARK-40760 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in the interval > expressions: > 1. Average > https://github.com/apache/spark/blob/47d119dfc1a06ee2d520396129b4f09bc22d3fb7/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TypeUtils.scala#L78 > 2. ApproxCountDistinctForIntervals (3): > https://github.com/apache/spark/blob/08123a3795683238352e5bf55452de381349fdd9/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervals.scala#L80-L91 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-40753) Fix bug in test case for catalog directory operation
[ https://issues.apache.org/jira/browse/SPARK-40753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xiaoping.huang updated SPARK-40753: --- Component/s: (was: Tests) > Fix bug in test case for catalog directory operation > > > Key: SPARK-40753 > URL: https://issues.apache.org/jira/browse/SPARK-40753 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.0 >Reporter: xiaoping.huang >Priority: Minor > > The implementation classes of ExternalCatalog perform folder operations when > executing operations such as create/drop database/table/partition. The test > case creates the folder in advance when obtaining the DB/Partition path URI, > which makes the test result less convincing.
[jira] [Created] (SPARK-40760) Migrate type check failures of interval expressions onto error classes
Max Gekk created SPARK-40760: Summary: Migrate type check failures of interval expressions onto error classes Key: SPARK-40760 URL: https://issues.apache.org/jira/browse/SPARK-40760 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Assignee: Max Gekk Replace TypeCheckFailure by DataTypeMismatch in type checks in the time window expressions: 1. TimeWindow (4): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/TimeWindow.scala#L117-L127 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-40759) Migrate type check failures of time window onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-40759: - Description: Replace TypeCheckFailure by DataTypeMismatch in type checks in the time window expressions: 1. TimeWindow (4): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/TimeWindow.scala#L117-L127 was: Replace TypeCheckFailure by DataTypeMismatch in type checks in the string and regexp expressions: 1. Elt (3): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala#L276-L284 2. RegExpReplace (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala#L597-L604 > Migrate type check failures of time window onto error classes > - > > Key: SPARK-40759 > URL: https://issues.apache.org/jira/browse/SPARK-40759 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in the time > window expressions: > 1. TimeWindow (4): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/TimeWindow.scala#L117-L127 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-40759) Migrate type check failures of time window onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-40759: Assignee: Max Gekk > Migrate type check failures of time window onto error classes > - > > Key: SPARK-40759 > URL: https://issues.apache.org/jira/browse/SPARK-40759 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in the string and > regexp expressions: > 1. Elt (3): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala#L276-L284 > 2. RegExpReplace (2): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala#L597-L604 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-40759) Migrate type check failures of time window onto error classes
Max Gekk created SPARK-40759: Summary: Migrate type check failures of time window onto error classes Key: SPARK-40759 URL: https://issues.apache.org/jira/browse/SPARK-40759 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Replace TypeCheckFailure by DataTypeMismatch in type checks in the string and regexp expressions: 1. Elt (3): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala#L276-L284 2. RegExpReplace (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala#L597-L604 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-40758) Upgrade Apache zookeeper to get rid of CVE-2020-10663
Bilna created SPARK-40758: - Summary: Upgrade Apache zookeeper to get rid of CVE-2020-10663 Key: SPARK-40758 URL: https://issues.apache.org/jira/browse/SPARK-40758 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.3.0 Reporter: Bilna In order to resolve security vulnerability CVE-2020-10663, upgrade Apache zookeeper to 3.8.0 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
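A dependency bump like this is typically a one-line change in the build definition. A hedged sbt-style sketch (Spark's actual build pins versions in its Maven pom, so the exact file and mechanism differ; this only illustrates the shape of the change):

```scala
// build.sbt sketch (assumption: ZooKeeper is resolved transitively here);
// force 3.8.0 to pick up the fix for CVE-2020-10663.
dependencyOverrides += "org.apache.zookeeper" % "zookeeper" % "3.8.0"
```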
[jira] [Updated] (SPARK-40756) Migrate type check failures of string expressions onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-40756: - Description: Replace TypeCheckFailure by DataTypeMismatch in type checks in the string and regexp expressions: 1. Elt (3): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala#L276-L284 2. RegExpReplace (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala#L597-L604 was: Replace TypeCheckFailure by DataTypeMismatch in type checks in the conditional expressions: 1. If (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L61-L67 2. CaseWhen (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L175-L183 3. InSubquery (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L378-L396 4. In (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L453 > Migrate type check failures of string expressions onto error classes > > > Key: SPARK-40756 > URL: https://issues.apache.org/jira/browse/SPARK-40756 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in the string and > regexp expressions: > 1. Elt (3): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala#L276-L284 > 2. RegExpReplace (2): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala#L597-L604
[jira] [Created] (SPARK-40757) Add PULL_REQUEST_TEMPLATE for spark-docker
Yikun Jiang created SPARK-40757: --- Summary: Add PULL_REQUEST_TEMPLATE for spark-docker Key: SPARK-40757 URL: https://issues.apache.org/jira/browse/SPARK-40757 Project: Spark Issue Type: Sub-task Components: Project Infra Affects Versions: 3.4.0 Reporter: Yikun Jiang -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-40756) Migrate type check failures of string expressions onto error classes
Max Gekk created SPARK-40756: Summary: Migrate type check failures of string expressions onto error classes Key: SPARK-40756 URL: https://issues.apache.org/jira/browse/SPARK-40756 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Replace TypeCheckFailure by DataTypeMismatch in type checks in the conditional expressions: 1. If (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L61-L67 2. CaseWhen (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L175-L183 3. InSubquery (2); https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L378-L396 4. In (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L453 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-40748) Migrate type check failures of conditions onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-40748: - Description: Replace TypeCheckFailure by DataTypeMismatch in type checks in the conditional expressions: 1. If (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L61-L67 2. CaseWhen (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L175-L183 3. InSubquery (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L378-L396 4. In (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L453 was: Replace TypeCheckFailure by DataTypeMismatch in type checks in the conditional expressions: 1. If (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L61-L67 2. CaseWhen (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L175-L183 > Migrate type check failures of conditions onto error classes > > > Key: SPARK-40748 > URL: https://issues.apache.org/jira/browse/SPARK-40748 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in the > conditional expressions: > 1. If (2): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L61-L67 > 2. CaseWhen (2): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L175-L183 > 3. InSubquery (2): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L378-L396 > 4. In (1): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L453
[jira] [Updated] (SPARK-40755) Migrate type check failures of number formatting onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-40755: - Description: Replace TypeCheckFailure by DataTypeMismatch in type checks in the number formatting or parsing expressions: 1. ToNumber (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/numberFormatExpressions.scala#L83 2. ToCharacter (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/numberFormatExpressions.scala#L227 3. ToNumberParser (1): https://github.com/apache/spark/blob/5556cfc59aa97a3ad4ea0baacebe19859ec0bcb7/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ToNumberParser.scala#L262 was: Replace TypeCheckFailure by DataTypeMismatch in type checks in the misc expressions: 1. Coalesce (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/nullExpressions.scala#L60 2. SortOrder (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala#L75 3. UnwrapUDT (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/UnwrapUDT.scala#L36 4. ParseUrl (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/urlExpressions.scala#L185 5. XPathExtract (1): https://github.com/apache/spark/blob/a241256ed0778005245253fb147db8a16105f75c/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xml/xpath.scala#L45 > Migrate type check failures of number formatting onto error classes > --- > > Key: SPARK-40755 > URL: https://issues.apache.org/jira/browse/SPARK-40755 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in the number > formatting or parsing expressions: > 1. ToNumber (1): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/numberFormatExpressions.scala#L83 > 2. ToCharacter (1): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/numberFormatExpressions.scala#L227 > 3. ToNumberParser (1): > https://github.com/apache/spark/blob/5556cfc59aa97a3ad4ea0baacebe19859ec0bcb7/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ToNumberParser.scala#L262
[jira] [Created] (SPARK-40755) Migrate type check failures of number formatting onto error classes
Max Gekk created SPARK-40755: Summary: Migrate type check failures of number formatting onto error classes Key: SPARK-40755 URL: https://issues.apache.org/jira/browse/SPARK-40755 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Replace TypeCheckFailure by DataTypeMismatch in type checks in the misc expressions: 1. Coalesce(1) https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/nullExpressions.scala#L60 2. SortOrder(1) https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala#L75 3. UnwrapUDT(1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/UnwrapUDT.scala#L36 4. ParseUrl(1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/urlExpressions.scala#L185 5. XPathExtract(1) https://github.com/apache/spark/blob/a241256ed0778005245253fb147db8a16105f75c/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xml/xpath.scala#L45 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-40753) Fix bug in test case for catalog directory operation
[ https://issues.apache.org/jira/browse/SPARK-40753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615875#comment-17615875 ] Apache Spark commented on SPARK-40753: -- User 'huangxiaopingRD' has created a pull request for this issue: https://github.com/apache/spark/pull/38206 > Fix bug in test case for catalog directory operation > > > Key: SPARK-40753 > URL: https://issues.apache.org/jira/browse/SPARK-40753 > Project: Spark > Issue Type: Bug > Components: SQL, Tests >Affects Versions: 3.4.0 >Reporter: xiaoping.huang >Priority: Minor > > The ExternalCatalog implementation classes perform folder operations when > executing operations such as create/drop database/table/partition. The > test case creates the folder in advance when obtaining the DB/Partition path > URI, so the test results are not convincing.
[jira] [Assigned] (SPARK-40753) Fix bug in test case for catalog directory operation
[ https://issues.apache.org/jira/browse/SPARK-40753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40753: Assignee: Apache Spark > Fix bug in test case for catalog directory operation > > > Key: SPARK-40753 > URL: https://issues.apache.org/jira/browse/SPARK-40753 > Project: Spark > Issue Type: Bug > Components: SQL, Tests >Affects Versions: 3.4.0 >Reporter: xiaoping.huang >Assignee: Apache Spark >Priority: Minor > > The ExternalCatalog implementation classes perform folder operations when > executing operations such as create/drop database/table/partition. The > test case creates the folder in advance when obtaining the DB/Partition path > URI, so the test results are not convincing.
[jira] [Assigned] (SPARK-40753) Fix bug in test case for catalog directory operation
[ https://issues.apache.org/jira/browse/SPARK-40753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40753: Assignee: (was: Apache Spark) > Fix bug in test case for catalog directory operation > > > Key: SPARK-40753 > URL: https://issues.apache.org/jira/browse/SPARK-40753 > Project: Spark > Issue Type: Bug > Components: SQL, Tests >Affects Versions: 3.4.0 >Reporter: xiaoping.huang >Priority: Minor > > The ExternalCatalog implementation classes perform folder operations when > executing operations such as create/drop database/table/partition. The > test case creates the folder in advance when obtaining the DB/Partition path > URI, so the test results are not convincing.
[jira] [Commented] (SPARK-40753) Fix bug in test case for catalog directory operation
[ https://issues.apache.org/jira/browse/SPARK-40753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615874#comment-17615874 ] Apache Spark commented on SPARK-40753: -- User 'huangxiaopingRD' has created a pull request for this issue: https://github.com/apache/spark/pull/38206 > Fix bug in test case for catalog directory operation > > > Key: SPARK-40753 > URL: https://issues.apache.org/jira/browse/SPARK-40753 > Project: Spark > Issue Type: Bug > Components: SQL, Tests >Affects Versions: 3.4.0 >Reporter: xiaoping.huang >Priority: Minor > > The ExternalCatalog implementation classes perform folder operations when > executing operations such as create/drop database/table/partition. The > test case creates the folder in advance when obtaining the DB/Partition path > URI, so the test results are not convincing.
[jira] [Created] (SPARK-40754) Add LICENSE and NOTICE for apache/spark-docker
Yikun Jiang created SPARK-40754: --- Summary: Add LICENSE and NOTICE for apache/spark-docker Key: SPARK-40754 URL: https://issues.apache.org/jira/browse/SPARK-40754 Project: Spark Issue Type: Sub-task Components: Documentation Affects Versions: 3.4.0 Reporter: Yikun Jiang
[jira] [Updated] (SPARK-40753) Fix bug in test case for catalog directory operation
[ https://issues.apache.org/jira/browse/SPARK-40753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xiaoping.huang updated SPARK-40753: --- Description: The ExternalCatalog implementation classes perform folder operations when executing operations such as create/drop database/table/partition. The test case creates the folder in advance when obtaining the DB/Partition path URI, so the test results are not convincing. (was: The ExternalCatalog implementation classes perform folder operations when executing operations such as create/drop database/table/partition. The test case creates the folder in advance in order to obtain the DB/Partition path URI, so the test results are not convincing.) > Fix bug in test case for catalog directory operation > > > Key: SPARK-40753 > URL: https://issues.apache.org/jira/browse/SPARK-40753 > Project: Spark > Issue Type: Bug > Components: SQL, Tests >Affects Versions: 3.4.0 >Reporter: xiaoping.huang >Priority: Minor > > The ExternalCatalog implementation classes perform folder operations when > executing operations such as create/drop database/table/partition. The > test case creates the folder in advance when obtaining the DB/Partition path > URI, so the test results are not convincing.
[jira] [Created] (SPARK-40753) Fix bug in test case for catalog directory operation
xiaoping.huang created SPARK-40753: -- Summary: Fix bug in test case for catalog directory operation Key: SPARK-40753 URL: https://issues.apache.org/jira/browse/SPARK-40753 Project: Spark Issue Type: Bug Components: SQL, Tests Affects Versions: 3.4.0 Reporter: xiaoping.huang The ExternalCatalog implementation classes perform folder operations when executing operations such as create/drop database/table/partition. The test case creates the folder in advance in order to obtain the DB/Partition path URI, so the test results are not convincing.
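The pattern the report asks for can be sketched without any Spark dependency: derive the expected location path first, assert that it does not pre-exist, and only then run the operation under test, so the final assertion actually proves the catalog created the directory. All names below (`expectedDbPath`, `createDatabase`) are hypothetical stand-ins; `ExternalCatalog`'s real API differs.

```scala
import java.nio.file.{Files, Path}

// Derive the expected DB location; deliberately does NOT create the folder,
// which is the bug the ticket describes in the old test setup.
def expectedDbPath(warehouse: Path, db: String): Path =
  warehouse.resolve(s"$db.db")

// Stand-in for the catalog operation under test.
def createDatabase(warehouse: Path, db: String): Unit =
  Files.createDirectories(expectedDbPath(warehouse, db))

// The test: precondition guarantees the directory was not made in advance,
// so the final check is convincing evidence that createDatabase made it.
def testCreateDatabase(): Boolean = {
  val warehouse = Files.createTempDirectory("warehouse")
  val dbPath = expectedDbPath(warehouse, "testdb")
  assert(!Files.exists(dbPath)) // must not pre-exist
  createDatabase(warehouse, "testdb")
  Files.isDirectory(dbPath)     // true only if the operation created it
}
```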
[jira] [Commented] (SPARK-33598) Support Java Class with circular references
[ https://issues.apache.org/jira/browse/SPARK-33598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615853#comment-17615853 ] Vivek Sharma commented on SPARK-33598: -- Thanks [~jacklzg], I see the PR; however, it seems there are failing tests with this change, and of course someone on the Apache Spark PMC / a committer has to review it and get it to a mergeable state. But yes, I agree: I am also facing an issue converting a Dataset into a Dataset due to the circular reference of class com.google.protobuf.Descriptors$Descriptor that is present in any Google protobuf generated code. > Support Java Class with circular references > --- > > Key: SPARK-33598 > URL: https://issues.apache.org/jira/browse/SPARK-33598 > Project: Spark > Issue Type: Improvement > Components: Java API >Affects Versions: 3.1.2 >Reporter: jacklzg >Priority: Minor > > If the target Java data class has a circular reference, Spark will fail fast > when creating the Dataset or running Encoders. > > For example, with a protobuf class there is a reference to Descriptor, so there > is no way to build a dataset from the protobuf class. > Calling this line > {color:#7a869a}Encoders.bean(ProtoBuffOuterClass.ProtoBuff.class);{color} > > will throw immediately: > > {quote}Exception in thread "main" java.lang.UnsupportedOperationException: > Cannot have circular references in bean class, but got the circular reference > of class class com.google.protobuf.Descriptors$Descriptor > {quote} > > Can we add a parameter, for example, > > {code:java} > Encoders.bean(Class clas, List fieldsToIgnore);{code} > > or > > {code:java} > Encoders.bean(Class clas, boolean skipCircularRefField);{code} > > which, instead of throwing an exception at > [https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala#L556], > would skip the field. 
> > {code:java} > if (seenTypeSet.contains(t)) { >   if (skipCircularRefField) { >     // just skip this field >   } else { >     throw new UnsupportedOperationException( >       s"cannot have circular references in class, but got the circular reference of class $t") >   } > } > {code} > > 
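The proposed skip behaviour can be sketched as a standalone field walk. This is a minimal, Spark-free sketch: `serializableFields` is a hypothetical helper; only the `seenTypeSet` variable name and the exception message come from the issue, and the real `ScalaReflection` traversal is considerably more involved.

```scala
// Walk a class's declared fields, tracking already-seen types so that a cyclic
// field is either skipped (the proposed flag) or rejected (current behaviour).
def serializableFields(cls: Class[_],
                       seenTypeSet: Set[Class[_]] = Set.empty,
                       skipCircularRefField: Boolean = false): List[String] =
  cls.getDeclaredFields.toList.flatMap { f =>
    val t = f.getType
    if (t.isPrimitive || t == classOf[String]) List(f.getName)
    else if (seenTypeSet.contains(t)) {
      if (skipCircularRefField) Nil // just skip this field
      else throw new UnsupportedOperationException(
        s"cannot have circular references in class, but got the circular reference of class $t")
    } else f.getName :: serializableFields(t, seenTypeSet + cls, skipCircularRefField)
  }

// A self-referencing bean-like class triggers the cycle.
class Node(var value: Int, var next: Node)
```

With `skipCircularRefField = false` the walk over `classOf[Node]` throws the quoted `UnsupportedOperationException`; with `true` it terminates and simply drops the cyclic field.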
[jira] [Comment Edited] (SPARK-37940) Use error classes in the compilation errors of partitions
[ https://issues.apache.org/jira/browse/SPARK-37940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615820#comment-17615820 ] Max Gekk edited comment on SPARK-37940 at 10/11/22 1:50 PM: [~panbingkun] Would you like to work on this? was (Author: maxgekk): @panbingkun Would you like to work on this? > Use error classes in the compilation errors of partitions > - > > Key: SPARK-37940 > URL: https://issues.apache.org/jira/browse/SPARK-37940 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Priority: Major > > Migrate the following errors in QueryCompilationErrors: > * unsupportedIfNotExistsError > * nonPartitionColError > * missingStaticPartitionColumn > * alterV2TableSetLocationWithPartitionNotSupportedError > * invalidPartitionSpecError > * partitionNotSpecifyLocationUriError > * describeDoesNotSupportPartitionForV2TablesError > * tableDoesNotSupportPartitionManagementError > * tableDoesNotSupportAtomicPartitionManagementError > * alterTableRecoverPartitionsNotSupportedForV2TablesError > * partitionColumnNotSpecifiedError > * invalidPartitionColumnError > * multiplePartitionColumnValuesSpecifiedError > * cannotUseDataTypeForPartitionColumnError > * cannotUseAllColumnsForPartitionColumnsError > * partitionColumnNotFoundInSchemaError > * mismatchedTablePartitionColumnError > onto use error classes. Throw an implementation of SparkThrowable. Also write > a test per every error in QueryCompilationErrorsSuite. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37940) Use error classes in the compilation errors of partitions
[ https://issues.apache.org/jira/browse/SPARK-37940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615820#comment-17615820 ] Max Gekk commented on SPARK-37940: -- @panbingkun Would you like to work on this? > Use error classes in the compilation errors of partitions > - > > Key: SPARK-37940 > URL: https://issues.apache.org/jira/browse/SPARK-37940 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Priority: Major > > Migrate the following errors in QueryCompilationErrors: > * unsupportedIfNotExistsError > * nonPartitionColError > * missingStaticPartitionColumn > * alterV2TableSetLocationWithPartitionNotSupportedError > * invalidPartitionSpecError > * partitionNotSpecifyLocationUriError > * describeDoesNotSupportPartitionForV2TablesError > * tableDoesNotSupportPartitionManagementError > * tableDoesNotSupportAtomicPartitionManagementError > * alterTableRecoverPartitionsNotSupportedForV2TablesError > * partitionColumnNotSpecifiedError > * invalidPartitionColumnError > * multiplePartitionColumnValuesSpecifiedError > * cannotUseDataTypeForPartitionColumnError > * cannotUseAllColumnsForPartitionColumnsError > * partitionColumnNotFoundInSchemaError > * mismatchedTablePartitionColumnError > onto use error classes. Throw an implementation of SparkThrowable. Also write > a test per every error in QueryCompilationErrorsSuite. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
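The shape of such a migration, including the `checkError()`-style test the ticket asks for, can be modelled in a few lines. Spark's real `SparkThrowable`, `AnalysisException` and `checkError()` differ in detail, and the error-class and parameter names below are illustrative only.

```scala
// Simplified model of an exception carrying an error class plus parameters.
class SparkAnalysisException(val errorClass: String,
                             val messageParameters: Map[String, String])
    extends Exception(s"[$errorClass] " +
      messageParameters.map { case (k, v) => s"$k=$v" }.mkString(", "))

// checkError-style assertion helper: match on structure, not message text.
def checkError(exception: SparkAnalysisException,
               errorClass: String,
               parameters: Map[String, String]): Unit = {
  assert(exception.errorClass == errorClass)
  assert(exception.messageParameters == parameters)
}

// What a migrated helper such as missingStaticPartitionColumn might return
// (hypothetical class and parameter names):
def missingStaticPartitionColumn(staticName: String): SparkAnalysisException =
  new SparkAnalysisException("MISSING_STATIC_PARTITION_COLUMN",
    Map("staticName" -> staticName))
```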
[jira] [Updated] (SPARK-40752) Migrate type check failures of misc expressions onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-40752: - Description: Replace TypeCheckFailure by DataTypeMismatch in type checks in the misc expressions: 1. Coalesce(1) https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/nullExpressions.scala#L60 2. SortOrder(1) https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala#L75 3. UnwrapUDT(1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/UnwrapUDT.scala#L36 4. ParseUrl(1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/urlExpressions.scala#L185 5. XPathExtract(1) https://github.com/apache/spark/blob/a241256ed0778005245253fb147db8a16105f75c/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xml/xpath.scala#L45 was: Replace TypeCheckFailure by DataTypeMismatch in type checks in the high-order functions expressions: 1. ArraySort (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala#L403-L407 2. ArrayAggregate (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala#L807 3. 
MapZipWith (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala#L1028 > Migrate type check failures of misc expressions onto error classes > -- > > Key: SPARK-40752 > URL: https://issues.apache.org/jira/browse/SPARK-40752 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in the misc > expressions: > 1. Coalesce(1) > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/nullExpressions.scala#L60 > 2. SortOrder(1) > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala#L75 > 3. UnwrapUDT(1): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/UnwrapUDT.scala#L36 > 4. ParseUrl(1): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/urlExpressions.scala#L185 > 5. XPathExtract(1) > https://github.com/apache/spark/blob/a241256ed0778005245253fb147db8a16105f75c/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xml/xpath.scala#L45 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-40752) Migrate type check failures of misc expressions onto error classes
Max Gekk created SPARK-40752: Summary: Migrate type check failures of misc expressions onto error classes Key: SPARK-40752 URL: https://issues.apache.org/jira/browse/SPARK-40752 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Replace TypeCheckFailure by DataTypeMismatch in type checks in the high-order functions expressions: 1. ArraySort (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala#L403-L407 2. ArrayAggregate (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala#L807 3. MapZipWith (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala#L1028 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-40751) Migrate type check failures of high order functions onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-40751: - Description: Replace TypeCheckFailure by DataTypeMismatch in type checks in the high-order functions expressions: 1. ArraySort (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala#L403-L407 2. ArrayAggregate (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala#L807 3. MapZipWith (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala#L1028 was: Replace TypeCheckFailure by DataTypeMismatch in type checks in the math expressions: 1. HashExpression (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala#L271-L275 2. RoundBase (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala#L1484 > Migrate type check failures of high order functions onto error classes > -- > > Key: SPARK-40751 > URL: https://issues.apache.org/jira/browse/SPARK-40751 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in the high-order > functions expressions: > 1. ArraySort (2): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala#L403-L407 > 2. 
ArrayAggregate (1): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala#L807 > 3. MapZipWith (1): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala#L1028 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-40751) Migrate type check failures of high order functions onto error classes
Max Gekk created SPARK-40751: Summary: Migrate type check failures of high order functions onto error classes Key: SPARK-40751 URL: https://issues.apache.org/jira/browse/SPARK-40751 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Replace TypeCheckFailure by DataTypeMismatch in type checks in the math expressions: 1. HashExpression (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala#L271-L275 2. RoundBase (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala#L1484 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-40749) Migrate type check failures of generators onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-40749: - Description: Replace TypeCheckFailure by DataTypeMismatch in type checks in the generator expressions: 1. Stack (3): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L163-L170 2. ExplodeBase (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L299 3. Inline (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L441 was: Replace TypeCheckFailure by DataTypeMismatch in type checks in the complex type creator expressions: 1. Stack (3): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L163-L170 2. ExplodeBase (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L299 3. Inline (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L441 > Migrate type check failures of generators onto error classes > > > Key: SPARK-40749 > URL: https://issues.apache.org/jira/browse/SPARK-40749 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in the generator > expressions: > 1. 
Stack (3): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L163-L170 > 2. ExplodeBase (1): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L299 > 3. Inline (1): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L441 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-40748) Migrate type check failures of conditions onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-40748: - Description: Replace TypeCheckFailure by DataTypeMismatch in type checks in the conditional expressions: 1. If (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L61-L67 2. CaseWhen (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L175-L183 was: Replace TypeCheckFailure by DataTypeMismatch in type checks in the complex type creator expressions: 1. If (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L61-L67 2. CaseWhen (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L175-L183 > Migrate type check failures of conditions onto error classes > > > Key: SPARK-40748 > URL: https://issues.apache.org/jira/browse/SPARK-40748 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in the > conditional expressions: > 1. If (2): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L61-L67 > 2. 
CaseWhen (2): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L175-L183 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-40750) Migrate type check failures of math expressions onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-40750: - Description: Replace TypeCheckFailure by DataTypeMismatch in type checks in the math expressions: 1. HashExpression (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala#L271-L275 2. RoundBase (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala#L1484 was: Replace TypeCheckFailure by DataTypeMismatch in type checks in the complex type creator expressions: 1. Stack (3): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L163-L170 2. ExplodeBase (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L299 3. Inline (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L441 > Migrate type check failures of math expressions onto error classes > -- > > Key: SPARK-40750 > URL: https://issues.apache.org/jira/browse/SPARK-40750 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in the math > expressions: > 1. HashExpression (2): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala#L271-L275 > 2. 
RoundBase (1): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala#L1484 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-40750) Migrate type check failures of math expressions onto error classes
Max Gekk created SPARK-40750: Summary: Migrate type check failures of math expressions onto error classes Key: SPARK-40750 URL: https://issues.apache.org/jira/browse/SPARK-40750 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Replace TypeCheckFailure by DataTypeMismatch in type checks in the complex type creator expressions: 1. Stack (3): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L163-L170 2. ExplodeBase (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L299 3. Inline (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L441 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Reopened] (SPARK-37935) Migrate onto error classes
[ https://issues.apache.org/jira/browse/SPARK-37935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reopened SPARK-37935: -- Closed by sub-task PR https://github.com/apache/spark/pull/38051 mistakenly. > Migrate onto error classes > -- > > Key: SPARK-37935 > URL: https://issues.apache.org/jira/browse/SPARK-37935 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > Fix For: 3.4.0 > > > The PR https://github.com/apache/spark/pull/32850 introduced error classes as > a part of the error messages framework > (https://issues.apache.org/jira/browse/SPARK-33539). Need to migrate all > exceptions from QueryExecutionErrors, QueryCompilationErrors and > QueryParsingErrors on the error classes using instances of SparkThrowable, > and carefully test every error class by writing tests in dedicated test > suites: > * QueryExecutionErrorsSuite for the errors that are occurred during query > execution > * QueryCompilationErrorsSuite ... query compilation or eagerly executing > commands > * QueryParsingErrorsSuite ... parsing errors > Here is an example https://github.com/apache/spark/pull/35157 of how an > existing Java exception can be replaced, and testing of related error > classes.At the end, we should migrate all exceptions from the files > Query.*Errors.scala and cover all error classes from the error-classes.json > file by tests. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-40627) Eliminate error sub-classes
[ https://issues.apache.org/jira/browse/SPARK-40627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-40627. -- Assignee: Max Gekk Resolution: Fixed Resolved by https://github.com/apache/spark/pull/38051 > Eliminate error sub-classes > --- > > Key: SPARK-40627 > URL: https://issues.apache.org/jira/browse/SPARK-40627 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > > Remove the error sub-class field from Spark exceptions, `checkError()` and > from the JSON format. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-40749) Migrate type check failures of generators onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-40749: - Description: Replace TypeCheckFailure by DataTypeMismatch in type checks in the complex type creator expressions: 1. Stack (3): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L163-L170 2. ExplodeBase (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L299 3. Inline (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L441 was: Replace TypeCheckFailure by DataTypeMismatch in type checks in the complex type creator expressions: 1. If (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L61-L67 2. CaseWhen (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L175-L183 > Migrate type check failures of generators onto error classes > > > Key: SPARK-40749 > URL: https://issues.apache.org/jira/browse/SPARK-40749 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in the complex > type creator expressions: > 1. Stack (3): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L163-L170 > 2. 
ExplodeBase (1): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L299 > 3. Inline (1): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L441
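The before/after shape of the TypeCheckFailure-to-DataTypeMismatch migration described in these issues can be sketched standalone. These Java classes are simplified models of Catalyst's internal Scala `TypeCheckResult` hierarchy, and the arity check is a hypothetical stand-in for Stack's real validation in generators.scala:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model of Catalyst's TypeCheckResult hierarchy.
abstract class TypeCheckResult {}
class TypeCheckSuccess extends TypeCheckResult {}

// Old style: an unstructured, pre-rendered failure message.
class TypeCheckFailure extends TypeCheckResult {
    final String message;
    TypeCheckFailure(String message) { this.message = message; }
}

// New style: a structured error sub-class plus message parameters that
// can be rendered from the error-classes.json templates.
class DataTypeMismatch extends TypeCheckResult {
    final String errorSubClass;
    final Map<String, String> messageParameters;
    DataTypeMismatch(String errorSubClass, Map<String, String> messageParameters) {
        this.errorSubClass = errorSubClass;
        this.messageParameters = messageParameters;
    }
}

public class MigrationDemo {
    // Hypothetical arity check loosely resembling Stack's validation
    // (the real check also validates the column data types).
    static TypeCheckResult checkStackArity(int numArgs) {
        if (numArgs < 1) {
            Map<String, String> params = new LinkedHashMap<>();
            params.put("functionName", "stack");
            params.put("expectedNum", "> 0");
            params.put("actualNum", Integer.toString(numArgs));
            return new DataTypeMismatch("WRONG_NUM_ARGS", params);
        }
        return new TypeCheckSuccess();
    }

    public static void main(String[] args) {
        TypeCheckResult bad = checkStackArity(0);
        if (!(bad instanceof DataTypeMismatch)) throw new AssertionError();
        DataTypeMismatch m = (DataTypeMismatch) bad;
        System.out.println("[" + m.errorSubClass + "] " + m.messageParameters);
    }
}
```

The gain is the same as for SparkThrowable generally: callers and tests match on the error sub-class and parameters, not on a hand-assembled English sentence.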
[jira] [Resolved] (SPARK-40735) Consistently invoke bash with /usr/bin/env bash in scripts to make code more portable
[ https://issues.apache.org/jira/browse/SPARK-40735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40735. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 38191 [https://github.com/apache/spark/pull/38191] > Consistently invoke bash with /usr/bin/env bash in scripts to make code more > portable > - > > Key: SPARK-40735 > URL: https://issues.apache.org/jira/browse/SPARK-40735 > Project: Spark > Issue Type: Improvement > Components: Connect, Kubernetes, R, Spark Core, SQL >Affects Versions: 3.4.0 >Reporter: xiaoping.huang >Assignee: xiaoping.huang >Priority: Minor > Fix For: 3.4.0 > >
[jira] [Assigned] (SPARK-40735) Consistently invoke bash with /usr/bin/env bash in scripts to make code more portable
[ https://issues.apache.org/jira/browse/SPARK-40735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen reassigned SPARK-40735: Assignee: xiaoping.huang > Consistently invoke bash with /usr/bin/env bash in scripts to make code more > portable > - > > Key: SPARK-40735 > URL: https://issues.apache.org/jira/browse/SPARK-40735 > Project: Spark > Issue Type: Improvement > Components: Connect, Kubernetes, R, Spark Core, SQL >Affects Versions: 3.4.0 >Reporter: xiaoping.huang >Assignee: xiaoping.huang >Priority: Minor >
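The portability point behind SPARK-40735: `#!/bin/bash` hard-codes one filesystem location, while `#!/usr/bin/env bash` resolves bash via PATH, which also covers systems (e.g. FreeBSD or NixOS) where bash is not installed at /bin/bash. A minimal sketch of the preferred form:

```shell
#!/usr/bin/env bash
# Portable shebang: env looks bash up on PATH instead of assuming the
# interpreter lives at /bin/bash.
set -euo pipefail
echo "bash resolved at: $(command -v bash)"
```

One caveat of this form is that the shebang line can no longer pass extra arguments to bash portably, which is why options like `-euo pipefail` go into a `set` line instead.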
[jira] [Created] (SPARK-40749) Migrate type check failures of generators onto error classes
Max Gekk created SPARK-40749: Summary: Migrate type check failures of generators onto error classes Key: SPARK-40749 URL: https://issues.apache.org/jira/browse/SPARK-40749 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Replace TypeCheckFailure by DataTypeMismatch in type checks in the complex type creator expressions: 1. If (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L61-L67 2. CaseWhen (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L175-L183
[jira] [Updated] (SPARK-40748) Migrate type check failures of conditions onto error classes
[ https://issues.apache.org/jira/browse/SPARK-40748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-40748: - Description: Replace TypeCheckFailure by DataTypeMismatch in type checks in the complex type creator expressions: 1. If (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L61-L67 2. CaseWhen (2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L175-L183 was: Replace TypeCheckFailure by DataTypeMismatch in type checks in the complex type creator expressions: 1. CreateMap(3): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala#L205-L214 2. CreateNamedStruct(3): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala#L445-L457 3. UpdateFields(2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala#L670-L673 > Migrate type check failures of conditions onto error classes > > > Key: SPARK-40748 > URL: https://issues.apache.org/jira/browse/SPARK-40748 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Replace TypeCheckFailure by DataTypeMismatch in type checks in the complex > type creator expressions: > 1. If (2): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L61-L67 > 2. 
CaseWhen (2): > https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L175-L183
[jira] [Created] (SPARK-40748) Migrate type check failures of conditions onto error classes
Max Gekk created SPARK-40748: Summary: Migrate type check failures of conditions onto error classes Key: SPARK-40748 URL: https://issues.apache.org/jira/browse/SPARK-40748 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Replace TypeCheckFailure by DataTypeMismatch in type checks in the complex type creator expressions: 1. CreateMap(3): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala#L205-L214 2. CreateNamedStruct(3): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala#L445-L457 3. UpdateFields(2): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala#L670-L673
[jira] [Updated] (SPARK-40732) Double type precision problem
[ https://issues.apache.org/jira/browse/SPARK-40732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-40732: - Summary: Double type precision problem (was: Floating point precision changes) > Double type precision problem > - > > Key: SPARK-40732 > URL: https://issues.apache.org/jira/browse/SPARK-40732 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Yang Jie >Priority: Major > > Some cases in SQLQueryTestSuite (sql/core) and > ThriftServerQueryTestSuite (sql/hive-thriftserver) fail for this reason, > for example: > > {code:java} > SQLQueryTestSuite- > try_aggregates.sql *** FAILED *** > try_aggregates.sql > Expected "4.61168601842738[79]E18", but got "4.61168601842738[8]E18" Result > did not match for query #20 > SELECT try_avg(col) FROM VALUES (9223372036854775807L), (1L) AS tab(col) > (SQLQueryTestSuite.scala:495) > {code} > {code:java} > ThriftServerQueryTestSuite- try_aggregates.sql *** FAILED *** > Expected "4.61168601842738[79]E18", but got "4.61168601842738[8]E18" Result > did not match for query #20 > SELECT try_avg(col) FROM VALUES (9223372036854775807L), (1L) AS tab(col) > (ThriftServerQueryTestSuite.scala:222)- try_arithmetic.sql *** FAILED *** > Expected "-4.65661287307739[26]E-10", but got "-4.65661287307739[3]E-10" > Result did not match for query #26 > SELECT try_divide(1, (2147483647 + 1)) > (ThriftServerQueryTestSuite.scala:222)- datetime-formatting.sql *** FAILED *** > Expected "...-05-31 19:40:35.123 [3 > 1969-12-31 15:00:00 3 > 1970-12-31 04:59:59.999 3 > 1996-03-31 07:03:33.123 3 > 2018-11-17 05:33:33.123 3 > 2019-12-31 09:33:33.123 3] > 2100-01-01 01:33:33...", but got "...-05-31 19:40:35.123 [5 > 1969-12-31 15:00:00 5 > 1970-12-31 04:59:59.999 5 > 1996-03-31 07:03:33.123 5 > 2018-11-17 05:33:33.123 3 > 2019-12-31 09:33:33.123 5] > 2100-01-01 01:33:33..."
Result did not match for query #8 > select col, date_format(col, 'F') from v > (ThriftServerQueryTestSuite.scala:222) {code}
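For the try_avg failure quoted above, both printed strings denote the same Double value; only the decimal rendering differs (plausibly a change in the JDK's `Double.toString` shortest-representation behavior, though that attribution is an assumption, not something stated in the issue). A quick standalone check:

```java
public class PrecisionDemo {
    public static void main(String[] args) {
        // try_avg over (Long.MAX_VALUE, 1L) is computed in double arithmetic.
        // Long.MAX_VALUE rounds to 2^63 as a double, so the average is exactly 2^62.
        double avg = ((double) Long.MAX_VALUE + 1.0) / 2.0;
        if (avg != 0x1p62) throw new AssertionError(); // 0x1p62 == 2^62

        // Both digit strings from the failing test parse back to that same
        // double; they are just two decimal renderings of one binary value.
        if (Double.parseDouble("4.6116860184273879E18") != avg) throw new AssertionError();
        if (Double.parseDouble("4.611686018427388E18") != avg) throw new AssertionError();
        System.out.println("both renderings round-trip to the same double");
    }
}
```

So the computed value did not change; the golden files simply pinned one particular string form of it, which is why only the expected-output text needed updating.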