[jira] [Created] (SPARK-40994) Add code example for JDBC data source with partitionColumn
Cheng Su created SPARK-40994: Summary: Add code example for JDBC data source with partitionColumn Key: SPARK-40994 URL: https://issues.apache.org/jira/browse/SPARK-40994 Project: Spark Issue Type: Documentation Components: Documentation, SQL Affects Versions: 3.4.0 Reporter: Cheng Su We should add a code example for the JDBC data source with partitionColumn to our documentation - [https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html] - to better guide users. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
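Sketching what such a doc example needs to convey: the JDBC source turns (partitionColumn, lowerBound, upperBound, numPartitions) into one range predicate per partition, and each partition issues its own query. A rough Python model of that stride logic (illustrative only, not Spark's actual JDBCRelation code):

```python
def jdbc_partition_predicates(column, lower_bound, upper_bound, num_partitions):
    """Generate one WHERE clause per partition, roughly mirroring the stride
    logic the JDBC source uses (illustrative sketch, not Spark's code)."""
    stride = (upper_bound - lower_bound) // num_partitions
    if num_partitions <= 1 or stride == 0:
        return []  # single partition: read the whole table, no predicate
    predicates = []
    current = lower_bound
    for i in range(num_partitions):
        if i == 0:
            # first partition also picks up everything below lowerBound and NULLs
            predicates.append(f"{column} < {current + stride} OR {column} IS NULL")
        elif i == num_partitions - 1:
            # last partition is open-ended above
            predicates.append(f"{column} >= {current}")
        else:
            predicates.append(f"{column} >= {current} AND {column} < {current + stride}")
        current += stride
    return predicates
```

Note that lowerBound and upperBound only shape the partition stride; they do not filter rows out of the result.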
[jira] [Created] (SPARK-39849) Dataset.as(StructType) fills missing new columns with null value
Cheng Su created SPARK-39849: Summary: Dataset.as(StructType) fills missing new columns with null value Key: SPARK-39849 URL: https://issues.apache.org/jira/browse/SPARK-39849 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.4.0 Reporter: Cheng Su As a follow-up of [https://github.com/apache/spark/pull/37011#discussion_r917700960], it would be great to fill missing new columns with null values, instead of failing loudly. Note this would only work for nullable columns.
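A minimal model of the proposed behavior (helper names are hypothetical, this is not Spark's implementation): columns missing from the input are filled with null when the target schema marks them nullable, and the operation still fails loudly for non-nullable ones.

```python
def conform_row(row, target_schema):
    """Conform a row (dict) to a target schema given as a list of
    (name, nullable) pairs -- an illustrative model of the
    Dataset.as(StructType) behavior proposed in this ticket."""
    out = {}
    for name, nullable in target_schema:
        if name in row:
            out[name] = row[name]
        elif nullable:
            out[name] = None  # missing nullable column: fill with null
        else:
            raise ValueError(f"missing non-nullable column: {name}")
    return out
```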
[jira] [Commented] (SPARK-37333) Specify the required distribution at V1Write
[ https://issues.apache.org/jira/browse/SPARK-37333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17568643#comment-17568643 ] Cheng Su commented on SPARK-37333: -- Just FYI, I am working on this this week. The motivation is to support shuffling on bucket columns when writing Hive bucketed tables. cc [~cloud_fan], [~allisonwang-db] and [~ulysses] FYI. > Specify the required distribution at V1Write > > > Key: SPARK-37333 > URL: https://issues.apache.org/jira/browse/SPARK-37333 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.4.0 >Reporter: XiDuo You >Priority: Major > > An improvement of SPARK-37287. > We can specify the distribution at V1Write, e.g. if the write uses dynamic > partitioning, we may expect an output partitioning based on the dynamic partition > columns.
[jira] [Updated] (SPARK-37333) Specify the required distribution at V1Write
[ https://issues.apache.org/jira/browse/SPARK-37333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-37333: - Affects Version/s: 3.4.0 (was: 3.3.0) > Specify the required distribution at V1Write > > > Key: SPARK-37333 > URL: https://issues.apache.org/jira/browse/SPARK-37333 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.4.0 >Reporter: XiDuo You >Priority: Major > > An improvement of SPARK-37287. > We can specify the distribution at V1Write, e.g. if the write uses dynamic > partitioning, we may expect an output partitioning based on the dynamic partition > columns.
[jira] [Created] (SPARK-39777) Remove Hive bucketing incompatibility doc
Cheng Su created SPARK-39777: Summary: Remove Hive bucketing incompatibility doc Key: SPARK-39777 URL: https://issues.apache.org/jira/browse/SPARK-39777 Project: Spark Issue Type: Documentation Components: Documentation Affects Versions: 3.4.0 Reporter: Cheng Su We have supported Hive bucketing (with the Hive hash function) since Spark 3.3.0, so we should update the documentation to reflect that we are no longer incompatible with Hive bucketing.
[jira] [Created] (SPARK-39751) Better naming for hash aggregate key probing metric
Cheng Su created SPARK-39751: Summary: Better naming for hash aggregate key probing metric Key: SPARK-39751 URL: https://issues.apache.org/jira/browse/SPARK-39751 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.4.0 Reporter: Cheng Su Hash aggregate has a SQL metric to record the average number of probes per key, but it has a very obscure name, "avg hash probe bucket list iters". We should give it a better name to avoid confusing users.
[jira] [Commented] (SPARK-34960) Aggregate (Min/Max/Count) push down for ORC
[ https://issues.apache.org/jira/browse/SPARK-34960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17504500#comment-17504500 ] Cheng Su commented on SPARK-34960: -- Thanks [~tgraves] and [~ahussein] for commenting. Yes, if any ORC file of the table is missing statistics in its file footer, a Spark query with aggregate push down will fail loudly. I agree this is not good for the user experience, and we are planning to work on a runtime fallback to read from the real rows in the ORC file when statistics are missing. For now, if you have any concerns about the feature, feel free to not enable it in your environment; that's also why we disable the feature by default, to avoid failing any existing Spark workload. I will create a PR to add more documentation mentioning the behavior, i.e. the query fails if any file is missing statistics. For Spark 3.4/the next release, the runtime fallback logic will probably be added, as it's too tight to work on it for Spark 3.3 (we are doing the branch cut this month), and we have a similar problem for Parquet aggregate push down as well. > Aggregate (Min/Max/Count) push down for ORC > --- > > Key: SPARK-34960 > URL: https://issues.apache.org/jira/browse/SPARK-34960 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Assignee: Cheng Su >Priority: Minor > Fix For: 3.3.0 > > Attachments: file_no_stats-orc.tar.gz > > > Similar to Parquet (https://issues.apache.org/jira/browse/SPARK-34952), we > can also push down certain aggregations into ORC. ORC exposes column > statistics in the interface `org.apache.orc.Reader` > ([https://github.com/apache/orc/blob/master/java/core/src/java/org/apache/orc/Reader.java#L118]), > which Spark can utilize for aggregate push down.
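For illustration, the push-down idea and the fail-loudly behavior described above can be modeled in a few lines (a sketch, not Spark's ORC reader; it assumes each file's footer statistics are a simple dict, and `None` stands for a file with no statistics):

```python
def pushdown_min_max_count(file_stats, agg):
    """Answer MIN/MAX/COUNT from per-file footer statistics without
    scanning any rows; fail loudly when any file lacks statistics,
    matching the behavior discussed in the comment above."""
    for stats in file_stats:
        if stats is None:
            raise RuntimeError("file missing footer statistics; "
                               "aggregate push down cannot be used")
    if agg == "count":
        return sum(s["count"] for s in file_stats)
    if agg == "min":
        return min(s["min"] for s in file_stats)
    if agg == "max":
        return max(s["max"] for s in file_stats)
    raise ValueError(f"unsupported aggregate: {agg}")
```

The planned runtime fallback would replace the `raise` with a scan of the offending file's rows.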
[jira] [Created] (SPARK-38354) Add hash probes metrics for shuffled hash join
Cheng Su created SPARK-38354: Summary: Add hash probes metrics for shuffled hash join Key: SPARK-38354 URL: https://issues.apache.org/jira/browse/SPARK-38354 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su For `HashAggregate` there's a SQL metric to track the number of hash probes per looked-up key. It would be better to add a similar metric for shuffled hash join as well, to get some idea of hash probing performance.
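To make the metric concrete, here is a toy open-addressing hash map that records probes per lookup - the quantity an "avg hash probes per key" metric reports (illustrative only, not Spark's BytesToBytesMap):

```python
class ProbeCountingMap:
    """Toy open-addressing (linear probing) hash map that tracks how many
    slots each lookup touches, to illustrate the probing metric."""

    def __init__(self, capacity=16):
        self.slots = [None] * capacity
        self.total_probes = 0
        self.lookups = 0

    def _find(self, key):
        # Probe slots linearly until we hit the key or an empty slot.
        i = hash(key) % len(self.slots)
        probes = 1
        while self.slots[i] is not None and self.slots[i][0] != key:
            i = (i + 1) % len(self.slots)
            probes += 1
        self.total_probes += probes
        self.lookups += 1
        return i

    def put(self, key, value):
        self.slots[self._find(key)] = (key, value)

    def avg_probes_per_key(self):
        # 1.0 means every key landed on its home slot; higher means collisions.
        return self.total_probes / self.lookups
```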
[jira] [Created] (SPARK-38018) Fix ColumnVectorUtils.populate to handle CalendarIntervalType correctly
Cheng Su created SPARK-38018: Summary: Fix ColumnVectorUtils.populate to handle CalendarIntervalType correctly Key: SPARK-38018 URL: https://issues.apache.org/jira/browse/SPARK-38018 Project: Spark Issue Type: Bug Components: Spark Core, SQL Affects Versions: 3.3.0 Reporter: Cheng Su `ColumnVectorUtils.populate()` does not handle the CalendarInterval type correctly - [https://github.com/apache/spark/blob/master/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnVectorUtils.java#L93-L94]. A CalendarInterval is in the format (months: int, days: int, microseconds: long) ([https://github.com/apache/spark/blob/master/common/unsafe/src/main/java/org/apache/spark/unsafe/types/CalendarInterval.java#L58]). However, the function above misses the `days` field and sets the `microseconds` field in the wrong position. `ColumnVectorUtils.populate()` is used by the Parquet ([https://github.com/apache/spark/blob/master/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedParquetRecordReader.java#L258]) and ORC ([https://github.com/apache/spark/blob/master/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.java#L171]) vectorized readers to read partition columns. So technically Spark can potentially produce wrong results when reading a table with a CalendarInterval partition column. However, I also notice Spark explicitly disallows writing data with the CalendarInterval type ([https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala#L586]), so it might not be a big deal for users. But it's worth fixing anyway. Caveat: I found the bug when reading through the related code path, and I don't have production experience with CalendarInterval partition columns. I think it should be an obvious fix, unless anyone more experienced can find some historical context.
The code was introduced a long time ago and I couldn't find any more info on why it was implemented as it is ([https://github.com/apache/spark/pull/11435]).
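A sketch of what the corrected populate logic has to do for a CalendarInterval partition value: replicate all three child fields into the batch (the buggy version dropped `days` and wrote `microseconds` into the wrong child). The column-vector representation here is a hypothetical dict of per-field lists, not Spark's ColumnVector API:

```python
from dataclasses import dataclass

@dataclass
class CalendarInterval:
    months: int
    days: int
    microseconds: int

def populate_interval_column(vector, value, num_rows):
    """Replicate a constant CalendarInterval partition value into a
    columnar batch, setting all three child columns (illustrative)."""
    for i in range(num_rows):
        vector["months"][i] = value.months
        vector["days"][i] = value.days  # the field the buggy code missed
        vector["microseconds"][i] = value.microseconds
```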
[jira] [Created] (SPARK-38015) Mark legacy file naming functions as deprecated in FileCommitProtocol
Cheng Su created SPARK-38015: Summary: Mark legacy file naming functions as deprecated in FileCommitProtocol Key: SPARK-38015 URL: https://issues.apache.org/jira/browse/SPARK-38015 Project: Spark Issue Type: Sub-task Components: Spark Core Affects Versions: 3.3.0 Reporter: Cheng Su [FileCommitProtocol|https://github.com/apache/spark/blob/6bbfb45ffe75aa6c27a7bf3c3385a596637d1822/core/src/main/scala/org/apache/spark/internal/io/FileCommitProtocol.scala] is the class that commits Spark job output (staging file & directory renaming, etc.). During Spark 3.2 development, we added new functions to this class to allow more flexible output file naming (the PR detail is [here|https://github.com/apache/spark/pull/33012]). We didn't delete the existing file naming functions (newTaskTempFile(ext) & newTaskTempFileAbsPath(ext)), because we were aware that many other downstream projects and codebases have already implemented their own custom FileCommitProtocol. Deleting the existing functions would be a breaking change for them when upgrading their Spark version, and we would like to avoid this unpleasant surprise for anyone if possible. But we also need to clean up legacy code as we evolve our codebase. So for the next step, I would like to propose: * Spark 3.3 (now): Add a @deprecated annotation to the legacy functions in FileCommitProtocol - [newTaskTempFile(ext)|https://github.com/apache/spark/blob/6bbfb45ffe75aa6c27a7bf3c3385a596637d1822/core/src/main/scala/org/apache/spark/internal/io/FileCommitProtocol.scala#L98] & [newTaskTempFileAbsPath(ext)|https://github.com/apache/spark/blob/6bbfb45ffe75aa6c27a7bf3c3385a596637d1822/core/src/main/scala/org/apache/spark/internal/io/FileCommitProtocol.scala#L135]. * Next Spark major release (or whenever people feel comfortable): delete the legacy functions mentioned above from our codebase.
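The deprecate-then-remove plan above can be illustrated with a Python analogue (the real change is a Scala @deprecated annotation on FileCommitProtocol; the function name and return value below are illustrative only):

```python
import warnings

def new_task_temp_file(ext):
    """Legacy-style API kept for compatibility; emits a DeprecationWarning
    so downstream implementers get advance notice before removal in a
    later major release. Python analogue of the Scala proposal above."""
    warnings.warn(
        "new_task_temp_file(ext) is deprecated; use the newer file-naming "
        "variant instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return f"part-00000{ext}"
```

Callers keep working for now, but see the warning in their logs well before the function disappears.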
[jira] [Created] (SPARK-38006) Clean up duplicated planner logic for window operator
Cheng Su created SPARK-38006: Summary: Clean up duplicated planner logic for window operator Key: SPARK-38006 URL: https://issues.apache.org/jira/browse/SPARK-38006 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su Both `WindowExec.scala` and `WindowInPandasExec.scala` have some duplicated logic related to query planning (`output`, `requiredChildDistribution/Ordering`, `outputPartitioning/Ordering`). We can move this logic into their common parent `WindowExecBase` to reduce duplication.
[jira] [Commented] (SPARK-18591) Replace hash-based aggregates with sort-based ones if inputs already sorted
[ https://issues.apache.org/jira/browse/SPARK-18591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17480300#comment-17480300 ] Cheng Su commented on SPARK-18591: -- Just FYI, this Jira should be fixed by https://issues.apache.org/jira/browse/SPARK-37455 . The related code is already merged and should be released in the next Spark release - Spark 3.3.0. > Replace hash-based aggregates with sort-based ones if inputs already sorted > --- > > Key: SPARK-18591 > URL: https://issues.apache.org/jira/browse/SPARK-18591 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.0.2 >Reporter: Takeshi Yamamuro >Priority: Major > Labels: bulk-closed > > Spark currently uses sort-based aggregates only under limited conditions: the > cases where Spark cannot use partial aggregates and hash-based ones. > However, if the input ordering already satisfies the requirements of > sort-based aggregates, the sort-based ones can be faster than the others. > {code} > ./bin/spark-shell --conf spark.sql.shuffle.partitions=1 > val df = spark.range(1000).selectExpr("id AS key", "id % 10 AS > value").sort($"key").cache > def timer[R](block: => R): R = { > val t0 = System.nanoTime() > val result = block > val t1 = System.nanoTime() > println("Elapsed time: " + ((t1 - t0) / 1e9) + "s") > result > } > timer { > df.groupBy("key").count().count > } > // codegen'd hash aggregate > Elapsed time: 7.116962977s > // non-codegen'd sort aggregate > Elapsed time: 3.088816662s > {code} > If codegen'd sort-based aggregates are supported in SPARK-16844, this seems > to make the performance gap bigger; > {code} > - codegen'd sort aggregate > Elapsed time: 1.645234684s > {code} > Therefore, it'd be better to use sort-based ones in this case.
[jira] [Created] (SPARK-37983) Backout agg build time metrics from sort aggregate
Cheng Su created SPARK-37983: Summary: Backout agg build time metrics from sort aggregate Key: SPARK-37983 URL: https://issues.apache.org/jira/browse/SPARK-37983 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su This is a follow-up of https://issues.apache.org/jira/browse/SPARK-37564 . I realized the agg build time metric for sort aggregate is actually not recorded correctly. We don't have a hash build phase for sort aggregate, so there is really no way to measure a so-called build time for sort aggregate. So here I back out the change introduced in [https://github.com/apache/spark/pull/34826] for the agg build time metric.
[jira] [Created] (SPARK-37813) ORC read benchmark should enable vectorization for nested column
Cheng Su created SPARK-37813: Summary: ORC read benchmark should enable vectorization for nested column Key: SPARK-37813 URL: https://issues.apache.org/jira/browse/SPARK-37813 Project: Spark Issue Type: Test Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su We added an initial struct benchmark for ORC in [https://github.com/apache/spark/pull/35100]. However, the benchmark does not enable vectorization for nested columns. For a fair comparison between vectorized and non-vectorized reads, we should enable vectorization for nested columns.
[jira] [Created] (SPARK-37726) Add spill size metrics for sort merge join
Cheng Su created SPARK-37726: Summary: Add spill size metrics for sort merge join Key: SPARK-37726 URL: https://issues.apache.org/jira/browse/SPARK-37726 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su Sort merge join allows the buffered side to spill if it is too large to hold in memory. It would be good to add a "spill size" SQL metric to sort merge join, to track how often spills happen and how large they are when they do.
[jira] [Updated] (SPARK-19256) Hive bucketing write support
[ https://issues.apache.org/jira/browse/SPARK-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-19256: - Affects Version/s: 3.3.0 > Hive bucketing write support > > > Key: SPARK-19256 > URL: https://issues.apache.org/jira/browse/SPARK-19256 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 2.1.0, 2.2.0, 2.3.0, 2.4.0, 3.0.0, 3.1.0, 3.2.0, 3.3.0 >Reporter: Tejas Patil >Priority: Minor > > Update (2020 by Cheng Su): > We use this JIRA to track progress for Hive bucketing write support in Spark. > The goal is for Spark to write Hive bucketed tables, to be compatible with > other compute engines (Hive and Presto). > > Current status for Hive bucketed tables in Spark: > No support for reading Hive bucketed tables: reads bucketed tables as > non-bucketed tables. > Wrong behavior for writing Hive ORC and Parquet bucketed tables: writes > orc/parquet bucketed tables as non-bucketed tables (code path: > InsertIntoHadoopFsRelationCommand -> FileFormatWriter). > Writing Hive non-ORC/Parquet bucketed tables is disallowed: throws an exception > by default when writing non-orc/parquet bucketed tables (code path: > InsertIntoHiveTable); the exception can be disabled by setting the configs > `hive.enforce.bucketing`=false and `hive.enforce.sorting`=false, which will > write as a non-bucketed table. > > Current status for Hive bucketed tables in Hive: > Hive 3.0.0 and after: supports writing bucketed tables with Hive murmur3hash > (https://issues.apache.org/jira/browse/HIVE-18910). > Hive 1.x.y and 2.x.y: support writing bucketed tables with Hive hivehash. > Hive on Tez: supports zero and multiple files per bucket > (https://issues.apache.org/jira/browse/HIVE-14014). More code pointers on the > read path - > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java#L183-L212].
> > Current status for Hive bucketed tables in Presto (take presto-sql here): > Supports writing bucketed tables with Hive murmur3hash and hivehash > ([https://github.com/prestosql/presto/pull/1697]). > Supports zero and multiple files per bucket > ([https://github.com/prestosql/presto/pull/822]). > > TL;DR is to achieve Hive bucketed table compatibility across Spark, Presto and > Hive. With this JIRA, we need to add support for writing Hive bucketed tables > with Hive murmur3hash (for Hive 3.x.y) and hivehash (for Hive 1.x.y and > 2.x.y). > > Allowing Spark to efficiently read Hive bucketed tables needs a more radical > change, so we decided to wait until data source v2 supports bucketing and do > the read path on data source v2. The read path will not be covered by this JIRA. > > Original description (2017 by Tejas Patil): > JIRA to track design discussions and tasks related to Hive bucketing support > in Spark. > Proposal : > [https://docs.google.com/document/d/1a8IDh23RAkrkg9YYAeO51F4aGO8-xAlupKwdshve2fc/edit?usp=sharing]
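For context, the bucketing step itself is simple - the compatibility work tracked here is about which hash function (hivehash vs. murmur3hash) feeds it. A sketch of the bucket-id assignment (illustrative, not Hive's exact code):

```python
def hive_bucket_id(hash_value, num_buckets):
    """Map a row's hash to a bucket file id the way Hive-style bucketing
    does: clamp to non-negative, then take modulo bucket count. Engines
    agree on bucket layout only if they agree on the hash producing
    `hash_value` -- that agreement is the point of this JIRA."""
    return (hash_value & 0x7FFFFFFF) % num_buckets
```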
[jira] [Created] (SPARK-37564) Support sort aggregate code-gen without grouping keys
Cheng Su created SPARK-37564: Summary: Support sort aggregate code-gen without grouping keys Key: SPARK-37564 URL: https://issues.apache.org/jira/browse/SPARK-37564 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su We can support `SortAggregateExec` code-gen without grouping keys. When there's no grouping key, sort aggregate should share the same execution logic as hash aggregate. This refactors the code-gen without grouping keys from `HashAggregateExec` into a base class `AggregateCodegenSupport`, so `SortAggregateExec` can share the same implementation by extending `AggregateCodegenSupport`. The implementation of `SortAggregateExec` code-gen with grouping keys will be added in a follow-up PR.
[jira] [Created] (SPARK-37557) Replace object hash with sort aggregate if child is already sorted
Cheng Su created SPARK-37557: Summary: Replace object hash with sort aggregate if child is already sorted Key: SPARK-37557 URL: https://issues.apache.org/jira/browse/SPARK-37557 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su This is a follow-up of [https://github.com/apache/spark/pull/34702#discussion_r762743589], where we can replace object hash aggregate with sort aggregate as well.
[jira] [Updated] (SPARK-34287) Spark Aggregate improvement
[ https://issues.apache.org/jira/browse/SPARK-34287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-34287: - Summary: Spark Aggregate improvement (was: Aggregation improvement) > Spark Aggregate improvement > --- > > Key: SPARK-34287 > URL: https://issues.apache.org/jira/browse/SPARK-34287 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Priority: Minor > > Creating this umbrella Jira to track overall progress for Spark aggregate > improvement. See each individual sub-task for details.
[jira] [Updated] (SPARK-34287) Spark aggregate improvement
[ https://issues.apache.org/jira/browse/SPARK-34287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-34287: - Summary: Spark aggregate improvement (was: Spark Aggregate improvement) > Spark aggregate improvement > --- > > Key: SPARK-34287 > URL: https://issues.apache.org/jira/browse/SPARK-34287 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Priority: Minor > > Creating this umbrella Jira to track overall progress for Spark aggregate > improvement. See each individual sub-task for details.
[jira] [Updated] (SPARK-34287) Aggregation improvement
[ https://issues.apache.org/jira/browse/SPARK-34287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-34287: - Description: Creating this umbrella Jira to track overall progress for Spark aggregate improvement. See each individual sub-task for details. (was: Creating this umbrella Jira to track overall progress for object hash aggregation and sort aggregation improvement in execution area. See each individual sub-task for details.) > Aggregation improvement > --- > > Key: SPARK-34287 > URL: https://issues.apache.org/jira/browse/SPARK-34287 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Priority: Minor > > Creating this umbrella Jira to track overall progress for Spark aggregate > improvement. See each individual sub-task for details.
[jira] [Updated] (SPARK-34287) Aggregation improvement
[ https://issues.apache.org/jira/browse/SPARK-34287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-34287: - Summary: Aggregation improvement (was: Object hash and sort aggregation improvement) > Aggregation improvement > --- > > Key: SPARK-34287 > URL: https://issues.apache.org/jira/browse/SPARK-34287 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Priority: Minor > > Creating this umbrella Jira to track overall progress for object hash > aggregation and sort aggregation improvement in execution area. See each > individual sub-task for details.
[jira] [Created] (SPARK-37455) Replace hash with sort aggregate if child is already sorted
Cheng Su created SPARK-37455: Summary: Replace hash with sort aggregate if child is already sorted Key: SPARK-37455 URL: https://issues.apache.org/jira/browse/SPARK-37455 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su In the query plan, if the child of a hash aggregate is already sorted on the group-by columns, we can replace the hash aggregate with a sort aggregate for better performance, as sort aggregate does not have the hashing overhead of hash aggregate.
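The win comes from streaming over already-sorted input instead of maintaining a hash table: each group finishes as soon as the key changes. A minimal sketch of a sort-based count aggregate (illustrative, not Spark's SortAggregateExec):

```python
def sort_aggregate_count(sorted_keys):
    """Count rows per group over input already sorted on the group-by key.
    No hash table is needed: just compare each key with the previous one
    and emit the finished group on a key change (illustrative sketch;
    assumes keys are comparable and not None)."""
    results = []
    current, count = None, 0
    for key in sorted_keys:
        if key == current:
            count += 1
        else:
            if count:
                results.append((current, count))  # previous group is complete
            current, count = key, 1
    if count:
        results.append((current, count))  # flush the last group
    return results
```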
[jira] [Updated] (SPARK-34287) Object hash and sort aggregation improvement
[ https://issues.apache.org/jira/browse/SPARK-34287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-34287: - Summary: Object hash and sort aggregation improvement (was: Object hash and sort aggregation execution improvement) > Object hash and sort aggregation improvement > > > Key: SPARK-34287 > URL: https://issues.apache.org/jira/browse/SPARK-34287 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Priority: Minor > > Creating this umbrella Jira to track overall progress for object hash > aggregation and sort aggregation improvement in execution area. See each > individual sub-task for details.
[jira] [Created] (SPARK-37370) Add SQL configs to control newly added join code-gen in 3.3
Cheng Su created SPARK-37370: Summary: Add SQL configs to control newly added join code-gen in 3.3 Key: SPARK-37370 URL: https://issues.apache.org/jira/browse/SPARK-37370 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su During Spark 3.3 development, we added code-gen for FULL OUTER shuffled hash join, FULL OUTER sort merge join, and existence sort merge join. Given that the join test coverage is not high, we would love to avoid any regression in the upcoming release. So here we introduce three internal configs to allow users and developers to disable the code-gen in case we find any bug after release.
[jira] [Created] (SPARK-37366) Add benchmark for Z-order
Cheng Su created SPARK-37366: Summary: Add benchmark for Z-order Key: SPARK-37366 URL: https://issues.apache.org/jira/browse/SPARK-37366 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su We should add a benchmark of Z-order for Parquet and ORC after the other sub-tasks have finished.
[jira] [Updated] (SPARK-37361) Introduce Z-order for efficient data skipping
[ https://issues.apache.org/jira/browse/SPARK-37361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-37361: - Description: This is the umbrella Jira to track the progress of introducing Z-order in Spark. Z-order sorts tuples in a way that allows efficient data skipping for columnar file formats (Parquet and ORC). For queries with a filter on a combination of multiple columns, for example: {code:java} SELECT * FROM table WHERE x = 0 OR y = 0 {code} Parquet/ORC cannot skip files/row-groups efficiently when reading, even though the table is sorted (locally or globally) on any of the columns. However, when the table is Z-order sorted on multiple columns, Parquet/ORC can skip files/row-groups efficiently when reading. We should add the feature to Spark so OSS Spark users benefit when running these queries. Reference for other systems supporting Z-order: * Databricks Delta Lake: ** [https://databricks.com/blog/2018/07/31/processing-petabytes-of-data-in-seconds-with-databricks-delta.html] ** [https://docs.databricks.com/spark/latest/spark-sql/language-manual/delta-optimize.html] * Presto: ** [https://github.com/prestodb/presto/blob/master/presto-hive-common/src/main/java/com/facebook/presto/hive/zorder/ZOrder.java] * Impala: ** https://issues.apache.org/jira/browse/IMPALA-8755 * AWS: ** [https://aws.amazon.com/blogs/database/z-order-indexing-for-multifaceted-queries-in-amazon-dynamodb-part-1/] ** [https://aws.amazon.com/blogs/database/z-order-indexing-for-multifaceted-queries-in-amazon-dynamodb-part-2/] ** [https://cdn2.hubspot.net/hubfs/392937/Whitepaper/WP/interleaved_keys_v12_1.pdf] was: This is the umbrella Jira to track the progress of introducing Z-order in Spark. Z-order sorts tuples in a way that allows efficient data skipping for columnar file formats (Parquet and ORC). 
For queries with a filter on a combination of multiple columns, for example: {code:java} SELECT * FROM table WHERE x = 0 OR y = 0 {code} Parquet/ORC cannot skip files/row-groups efficiently when reading, even though the table is sorted (locally or globally) on any of the columns. However, when the table is Z-order sorted on multiple columns, Parquet/ORC can skip files/row-groups efficiently when reading. We should add the feature to Spark so OSS Spark users benefit when running these queries. Reference for other systems supporting Z-order: * Databricks Delta Lake: ** [https://databricks.com/blog/2018/07/31/processing-petabytes-of-data-in-seconds-with-databricks-delta.html] ** [https://docs.databricks.com/spark/latest/spark-sql/language-manual/delta-optimize.html] * Presto: ** [https://github.com/prestodb/presto/blob/master/presto-hive-common/src/main/java/com/facebook/presto/hive/zorder/ZOrder.java] * AWS: ** > Introduce Z-order for efficient data skipping > - > > Key: SPARK-37361 > URL: https://issues.apache.org/jira/browse/SPARK-37361 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 3.3.0 >Reporter: Cheng Su >Priority: Minor > > This is the umbrella Jira to track the progress of introducing Z-order in > Spark. Z-order sorts tuples in a way that allows efficient data skipping > for columnar file formats (Parquet and ORC). > For queries with a filter on a combination of multiple columns, for example: > {code:java} > SELECT * > FROM table > WHERE x = 0 OR y = 0 > {code} > Parquet/ORC cannot skip files/row-groups efficiently when reading, even though > the table is sorted (locally or globally) on any of the columns. However, when > the table is Z-order sorted on multiple columns, Parquet/ORC can skip > files/row-groups efficiently when reading. > We should add the feature to Spark so OSS Spark users benefit when > running these queries. 
> > Reference for other systems support Z-order: > * Databricks Delta Lake: > ** > [https://databricks.com/blog/2018/07/31/processing-petabytes-of-data-in-seconds-with-databricks-delta.html] > ** > [https://docs.databricks.com/spark/latest/spark-sql/language-manual/delta-optimize.html] > * Presto: > ** > [https://github.com/prestodb/presto/blob/master/presto-hive-common/src/main/java/com/facebook/presto/hive/zorder/ZOrder.java] > * Impala: > ** https://issues.apache.org/jira/browse/IMPALA-8755 > * AWS: > ** > [https://aws.amazon.com/blogs/database/z-order-indexing-for-multifaceted-queries-in-amazon-dynamodb-part-1/] > ** > [https://aws.amazon.com/blogs/database/z-order-indexing-for-multifaceted-queries-in-amazon-dynamodb-part-2/] > ** > [https://cdn2.hubspot.net/hubfs/392937/Whitepaper/WP/interleaved_keys_v12_1.pdf] > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache
[jira] [Commented] (SPARK-37361) Introduce Z-order for efficient data skipping
[ https://issues.apache.org/jira/browse/SPARK-37361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17445584#comment-17445584 ] Cheng Su commented on SPARK-37361: - Just FYI, I am working on each sub-task now. Thanks.
-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37365) Add ZORDER BY syntax and plan rule
Cheng Su created SPARK-37365: Summary: Add ZORDER BY syntax and plan rule Key: SPARK-37365 URL: https://issues.apache.org/jira/browse/SPARK-37365 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su We can introduce a special `ZORDER BY (x, y, ...)` syntax in Spark to give users a better interface. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37364) Add code-gen evaluation for Z-order expression
Cheng Su created SPARK-37364: Summary: Add code-gen evaluation for Z-order expression Key: SPARK-37364 URL: https://issues.apache.org/jira/browse/SPARK-37364 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su This Jira is to add code-gen support for the Z-order expression. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37363) Support string type in Z-order expression
Cheng Su created SPARK-37363: Summary: Support string type in Z-order expression Key: SPARK-37363 URL: https://issues.apache.org/jira/browse/SPARK-37363 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su This Jira is to support string type in Z-order expression. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37362) Support float type in Z-order expression
Cheng Su created SPARK-37362: Summary: Support float type in Z-order expression Key: SPARK-37362 URL: https://issues.apache.org/jira/browse/SPARK-37362 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su This Jira is to support float type in Z-order expression. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
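Supporting float in a Z-order expression typically requires first mapping the float's IEEE-754 bits to an unsigned integer whose unsigned ordering matches the numeric ordering, before the bits are interleaved. A minimal sketch of that standard trick (an illustration, not Spark's actual code):

```python
import struct

def float_to_sortable_bits(f):
    """Map a 32-bit float to an unsigned int whose unsigned order matches
    the float's numeric order: invert all bits for negative values, and
    flip just the sign bit for non-negative ones."""
    (bits,) = struct.unpack(">I", struct.pack(">f", f))
    if bits & 0x80000000:            # negative float: invert every bit
        return bits ^ 0xFFFFFFFF
    return bits | 0x80000000         # non-negative: set the sign bit

# Unsigned order of the mapped bits agrees with numeric float order.
vals = [-3.5, -0.0, 0.0, 1.25, 100.0]
assert sorted(vals) == sorted(vals, key=float_to_sortable_bits)
```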
[jira] [Updated] (SPARK-31585) Support Z-order curve expression
[ https://issues.apache.org/jira/browse/SPARK-31585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-31585: - Summary: Support Z-order curve expression (was: Support Z-order curve)
> Support Z-order curve expression
> Key: SPARK-31585
> URL: https://issues.apache.org/jira/browse/SPARK-31585
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 3.1.0
> Reporter: Yuming Wang
> Priority: Major
> Attachments: lexicalorder-AUCT_END_DT.png, lexicalorder.png, zorder-AUCT_END_DT.png, zorder.png
> Z-ordering is a technique that allows you to map multidimensional data to a single dimension. We can use this feature to improve query performance.
> More details:
> https://en.wikipedia.org/wiki/Z-order_curve
> https://aws.amazon.com/blogs/database/z-order-indexing-for-multifaceted-queries-in-amazon-dynamodb-part-1/
> https://aws.amazon.com/blogs/aws/quickly-filter-data-in-amazon-redshift-using-interleaved-sorting/
> https://docs.oracle.com/database/121/DWHSG/attcluster.htm#DWHSG-GUID-7B007A3C-53C2-4437-9E71-9ECECF8B4FAB
> Benchmark result: these 2 tables are ordered and z-ordered by AUCT_END_DT, AUCT_START_DT.
> Filter on the AUCT_START_DT column: !zorder.png! !lexicalorder.png!
> Filter on the auct_end_dt column: !zorder-AUCT_END_DT.png! !lexicalorder-AUCT_END_DT.png!
-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-31585) Support Z-order curve expression
[ https://issues.apache.org/jira/browse/SPARK-31585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-31585: - Parent: SPARK-37361 Issue Type: Sub-task (was: New Feature)
-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37361) Introduce Z-order for efficient data skipping
[ https://issues.apache.org/jira/browse/SPARK-37361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-37361: - Description: This is the umbrella Jira to track the progress of introducing Z-order in Spark. Z-order enables to sort tuples in a way, to allow efficiently data skipping for columnar file format (Parquet and ORC). For query with filter on combination of multiple columns, example: {code:java} SELECT * FROM table WHERE x = 0 OR y = 0 {code} Parquet/ORC cannot skip file/row-groups efficiently when reading, even though the table is sorted (locally or globally) on any columns. However when table is Z-order sorted on multiple columns, Parquet/ORC can skip file/row-groups efficiently when reading. We should add the feature in Spark to allow OSS Spark users benefitted in running these queries. Reference: Databricks Delta Lake added similar support with Z-order ([https://databricks.com/blog/2018/07/31/processing-petabytes-of-data-in-seconds-with-databricks-delta.html], [https://docs.databricks.com/spark/latest/spark-sql/language-manual/delta-optimize.html] ) was: This is the umbrella Jira to track the progress of introducing Z-order in Spark. Z-order enables to sort tuples in a way, to allow efficiently data skipping for columnar file format (Parquet and ORC). For query with filter on combination of multiple columns, example: {code:java} SELECT * FROM table WHERE x = 0 OR y = 0 {code} Parquet/ORC cannot skip file/row-groups efficiently when reading, even though the table is sorted (locally or globally) on any columns. However when table is Z-order sorted on multiple columns, Parquet/ORC can skip file/row-groups efficiently when reading. We should add the feature in Spark to allow OSS Spark users benefitted in running these queries. 
Reference: Databricks Delta Lake added similar support with Z-order ([https://databricks.com/blog/2018/07/31/processing-petabytes-of-data-in-seconds-with-databricks-delta.html], [https://docs.databricks.com/spark/latest/spark-sql/language-manual/delta-optimize.html] ) > Introduce Z-order for efficient data skipping > - > > Key: SPARK-37361 > URL: https://issues.apache.org/jira/browse/SPARK-37361 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 3.3.0 >Reporter: Cheng Su >Priority: Minor > > This is the umbrella Jira to track the progress of introducing Z-order in > Spark. Z-order enables to sort tuples in a way, to allow efficiently data > skipping for columnar file format (Parquet and ORC). > For query with filter on combination of multiple columns, example: > {code:java} > SELECT * > FROM table > WHERE x = 0 OR y = 0 > {code} > Parquet/ORC cannot skip file/row-groups efficiently when reading, even though > the table is sorted (locally or globally) on any columns. However when table > is Z-order sorted on multiple columns, Parquet/ORC can skip file/row-groups > efficiently when reading. > We should add the feature in Spark to allow OSS Spark users benefitted in > running these queries. > > Reference: > Databricks Delta Lake added similar support with Z-order > ([https://databricks.com/blog/2018/07/31/processing-petabytes-of-data-in-seconds-with-databricks-delta.html], > > [https://docs.databricks.com/spark/latest/spark-sql/language-manual/delta-optimize.html] > ) > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37361) Introduce Z-order for efficient data skipping
Cheng Su created SPARK-37361: Summary: Introduce Z-order for efficient data skipping Key: SPARK-37361 URL: https://issues.apache.org/jira/browse/SPARK-37361 Project: Spark Issue Type: Umbrella Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su
This is the umbrella Jira to track the progress of introducing Z-order in Spark. Z-order sorts tuples in a way that allows efficient data skipping for columnar file formats (Parquet and ORC). For a query that filters on a combination of multiple columns, for example:
{code:java}
SELECT * FROM table WHERE x = 0 OR y = 0
{code}
Parquet/ORC cannot skip files/row-groups efficiently when reading, even though the table is sorted (locally or globally) on any single column. However, when the table is Z-order sorted on multiple columns, Parquet/ORC can skip files/row-groups efficiently when reading. We should add this feature to Spark so that OSS Spark users can benefit when running these queries.
Reference: Databricks Delta Lake added similar support with Z-order ([https://databricks.com/blog/2018/07/31/processing-petabytes-of-data-in-seconds-with-databricks-delta.html], [https://docs.databricks.com/spark/latest/spark-sql/language-manual/delta-optimize.html])
-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
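As a rough illustration of the Z-order idea above (a sketch, not Spark's implementation), a Z-order sort key can be built by interleaving the bits of each column value, so that rows close in every dimension stay close in the sorted order and per-file min/max statistics become selective for filters on any of the columns; the `z_order_key` helper name is hypothetical:

```python
def z_order_key(values, bits=8):
    """Interleave the bits of the given values (assumed to be small
    non-negative ints that fit in `bits` bits) into one sort key."""
    key = 0
    for bit in range(bits - 1, -1, -1):   # most- to least-significant bit
        for v in values:
            key = (key << 1) | ((v >> bit) & 1)
    return key

# Sorting by the interleaved key clusters x and y ranges simultaneously,
# so a file covering one contiguous Z-order range has tight min/max
# bounds on both columns.
rows = [(x, y) for x in range(4) for y in range(4)]
rows.sort(key=lambda r: z_order_key(r, bits=2))
```

A real implementation also has to map signed, floating-point, and string column values to order-preserving unsigned bits before interleaving.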
[jira] [Created] (SPARK-37341) Avoid unnecessary buffer and copy in full outer sort merge join
Cheng Su created SPARK-37341: Summary: Avoid unnecessary buffer and copy in full outer sort merge join Key: SPARK-37341 URL: https://issues.apache.org/jira/browse/SPARK-37341 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su FULL OUTER sort merge join (non-code-gen path) copies join keys and buffers input rows, even when the rows from both sides do not have matched join keys ([https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala#L1637-L1641]). This is unnecessary: we can simply output the row with the smaller join keys, and only buffer when both sides have matched keys. This saves unnecessary copying and buffering when both join sides have many rows that do not match each other. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
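The proposed behavior can be sketched with a simplified model of full outer sort merge join (in Python rather than Spark's Scala, with hypothetical names): emit a row immediately when its key has no match on the other side, and buffer only the groups whose keys match:

```python
def full_outer_merge_join(left, right):
    """left/right: lists of (key, value) pairs sorted by key.
    Yields (left_row, right_row) pairs; None marks the unmatched side."""
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:                 # no match possible: emit, no buffering
            yield (left[i], None)
            i += 1
        elif lk > rk:
            yield (None, right[j])
            j += 1
        else:                       # matched keys: buffer only this group
            li, rj = i, j
            while i < len(left) and left[i][0] == lk:
                i += 1
            while j < len(right) and right[j][0] == rk:
                j += 1
            for l in left[li:i]:
                for r in right[rj:j]:
                    yield (l, r)
    for l in left[i:]:              # drain whichever side remains
        yield (l, None)
    for r in right[j:]:
        yield (None, r)
```

When the two sides share few keys, most rows take the first two branches and are emitted without any copy or buffering, which is the saving this ticket describes.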
[jira] [Commented] (SPARK-37316) Add code-gen for existence sort merge join
[ https://issues.apache.org/jira/browse/SPARK-37316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17443227#comment-17443227 ] Cheng Su commented on SPARK-37316: - Will raise a PR soon after https://issues.apache.org/jira/browse/SPARK-35352 is done.
-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37316) Add code-gen for existence sort merge join
Cheng Su created SPARK-37316: Summary: Add code-gen for existence sort merge join Key: SPARK-37316 URL: https://issues.apache.org/jira/browse/SPARK-37316 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su This Jira is to track the progress to add code-gen support for existence sort merge join. See motivation in SPARK-34705. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-35352) Add code-gen for full outer sort merge join
[ https://issues.apache.org/jira/browse/SPARK-35352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-35352: - Affects Version/s: 3.3.0 > Add code-gen for full outer sort merge join > --- > > Key: SPARK-35352 > URL: https://issues.apache.org/jira/browse/SPARK-35352 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0, 3.3.0 >Reporter: Cheng Su >Priority: Minor > > This Jira is to track the progress to add code-gen support for full outer > sort merge join. See motivation in SPARK-34705. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37223) Fix unit test check in JoinHintSuite
[ https://issues.apache.org/jira/browse/SPARK-37223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-37223: - Issue Type: Improvement (was: Bug) > Fix unit test check in JoinHintSuite > > > Key: SPARK-37223 > URL: https://issues.apache.org/jira/browse/SPARK-37223 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Cheng Su >Priority: Trivial > > This is to fix the unit test where we should assert on the content of log in > `JoinHintSuite`. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37223) Fix unit test check in JoinHintSuite
[ https://issues.apache.org/jira/browse/SPARK-37223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-37223: - Issue Type: Test (was: Improvement)
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37223) Fix unit test check in JoinHintSuite
Cheng Su created SPARK-37223: Summary: Fix unit test check in JoinHintSuite Key: SPARK-37223 URL: https://issues.apache.org/jira/browse/SPARK-37223 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su This fixes the unit test in `JoinHintSuite` so that it asserts on the content of the log. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37220) Do not split input file for Parquet reader with aggregate push down
Cheng Su created SPARK-37220: Summary: Do not split input file for Parquet reader with aggregate push down Key: SPARK-37220 URL: https://issues.apache.org/jira/browse/SPARK-37220 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su As a followup of [https://github.com/apache/spark/pull/34298/files#r734795801], and similar to ORC aggregate push down, we can disallow splitting input files for the Parquet reader as well. See the original comment for motivation. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37167) Add benchmark for aggregate push down
[ https://issues.apache.org/jira/browse/SPARK-37167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17439569#comment-17439569 ] Cheng Su commented on SPARK-37167: - Just FYI, I am working on it.
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37167) Add benchmark for aggregate push down
Cheng Su created SPARK-37167: Summary: Add benchmark for aggregate push down Key: SPARK-37167 URL: https://issues.apache.org/jira/browse/SPARK-37167 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su As we added aggregate push down for Parquet and ORC, let's also add a micro benchmark for both file formats, similar to filter push down and nested schema pruning. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37001) Disable two level of map for final hash aggregation by default
Cheng Su created SPARK-37001: Summary: Disable two level of map for final hash aggregation by default Key: SPARK-37001 URL: https://issues.apache.org/jira/browse/SPARK-37001 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.2.0, 3.3.0 Reporter: Cheng Su This JIRA is to disable the two levels of maps for final hash aggregation by default. The feature was introduced in [#32242|https://github.com/apache/spark/pull/32242], and we found it can lead to query performance regression when the final aggregation gets rows with many distinct keys: the first-level hash map fills up, so many rows waste a first-level lookup before being inserted into the second-level hash map. The feature still benefits queries with few distinct keys, so we introduce a config to let such queries enable it when they see a benefit. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
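A rough model of the trade-off described above (illustrative only; Spark's actual aggregation map is an off-heap, code-generated structure): a small fixed-capacity first-level map in front of a larger second-level map. With few distinct keys most lookups hit the fast first level; with many distinct keys the first level fills up and most rows pay a wasted first-level probe before reaching the second level:

```python
class TwoLevelCountMap:
    """Toy two-level aggregation map counting occurrences per key."""

    def __init__(self, first_level_capacity=16):
        self.first = {}                # small "fast" map, bounded size
        self.second = {}               # fallback map, unbounded
        self.capacity = first_level_capacity
        self.wasted_probes = 0         # first-level lookups that didn't help

    def add(self, key):
        if key in self.first:
            self.first[key] += 1
        elif len(self.first) < self.capacity:
            self.first[key] = 1
        else:
            # First level is full and missed: the probe was pure overhead.
            self.wasted_probes += 1
            self.second[key] = self.second.get(key, 0) + 1

    def get(self, key):
        return self.first.get(key, 0) + self.second.get(key, 0)
```

Feeding this map 100 distinct keys with capacity 16 makes 84 of the inserts pay a wasted first-level probe, which mirrors the regression this ticket describes for high-cardinality final aggregation.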
[jira] [Updated] (SPARK-36794) Ignore duplicated join keys when building relation for SEMI/ANTI hash join
[ https://issues.apache.org/jira/browse/SPARK-36794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-36794: - Summary: Ignore duplicated join keys when building relation for SEMI/ANTI hash join (was: Ignore duplicated join keys when building relation for LEFT/ANTI hash join)
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36794) Ignore duplicated join keys when building relation for LEFT/ANTI hash join
Cheng Su created SPARK-36794: Summary: Ignore duplicated join keys when building relation for LEFT/ANTI hash join Key: SPARK-36794 URL: https://issues.apache.org/jira/browse/SPARK-36794 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su For LEFT SEMI and LEFT ANTI hash equi-joins without an extra join condition, we only need to keep one row per unique join key(s) inside the hash table (`HashedRelation`) when building it. This helps reduce the size of the join hash table. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
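The observation can be sketched as follows (a simplified model, not Spark's `HashedRelation` code): for LEFT SEMI / LEFT ANTI equi-joins with no extra condition, the build side only needs to answer "does this key exist?", so a set with one entry per distinct key suffices and duplicated build-side rows add nothing:

```python
def left_semi_join(stream_rows, build_rows, key):
    """Keep stream rows whose key appears on the build side."""
    build_keys = {key(r) for r in build_rows}   # one entry per distinct key
    return [r for r in stream_rows if key(r) in build_keys]

def left_anti_join(stream_rows, build_rows, key):
    """Keep stream rows whose key does NOT appear on the build side."""
    build_keys = {key(r) for r in build_rows}
    return [r for r in stream_rows if key(r) not in build_keys]
```

Note the set is built the same way regardless of how many duplicate-keyed rows the build side holds, which is exactly why the hash table can shrink.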
[jira] [Updated] (SPARK-36652) AQE dynamic join selection should not apply to non-equi join
[ https://issues.apache.org/jira/browse/SPARK-36652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-36652: - Affects Version/s: (was: 3.2.0) > AQE dynamic join selection should not apply to non-equi join > > > Key: SPARK-36652 > URL: https://issues.apache.org/jira/browse/SPARK-36652 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Cheng Su >Priority: Minor > > Currently `DynamicJoinSelection` has two features: 1.demote broadcast hash > join, and 2.promote shuffled hash join. Both are achieved by adding join hint > in query plan, and only works for equi join. However the rule is matching > with `Join` operator now - > [https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/DynamicJoinSelection.scala#L71,] > so it would add hint for non-equi join by mistake. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36652) AQE dynamic join selection should not apply to non-equi join
Cheng Su created SPARK-36652: Summary: AQE dynamic join selection should not apply to non-equi join Key: SPARK-36652 URL: https://issues.apache.org/jira/browse/SPARK-36652 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.2.0, 3.3.0 Reporter: Cheng Su Currently `DynamicJoinSelection` has two features: 1. demote broadcast hash join, and 2. promote shuffled hash join. Both are achieved by adding a join hint to the query plan, and both only work for equi-joins. However, the rule currently matches on the `Join` operator - [https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/DynamicJoinSelection.scala#L71], so it would add the hint for non-equi joins by mistake. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36612) Support left outer join build left or right outer join build right in shuffled hash join
[ https://issues.apache.org/jira/browse/SPARK-36612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17407823#comment-17407823 ] Cheng Su commented on SPARK-36612: -- I agree some queries do fit in this scenario. We can save the sort before join for these queries if we are able to do shuffled hash join on it, instead of sort merge join. I don't think it solves the AQE skew problem though. We still cannot split the skewed partition from the right side of LEFT OUTER join, because across multiple tasks, they don't have common knowledge of which rows are matched or not during runtime. > Support left outer join build left or right outer join build right in > shuffled hash join > > > Key: SPARK-36612 > URL: https://issues.apache.org/jira/browse/SPARK-36612 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: mcdull_zhang >Priority: Major > > Currently spark sql does not support build left side when left outer join (or > build right side when right outer join). > However, in our production environment, there are a large number of scenarios > where small tables are left join large tables, and many times, large tables > have data skew (currently AQE can't handle this kind of skew). > Inspired by SPARK-32399, we can use similar ideas to realize left outer join > build left. > I think this treatment is very meaningful, but I don’t know how members > consider this matter? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
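The matched-row bookkeeping mentioned in the comment above can be sketched in plain Scala: a LEFT OUTER join that builds on the LEFT (small) side must track which build rows found a match and emit the unmatched ones with nulls at the end. All names are illustrative; this is not Spark's implementation, but it shows why a skewed probe partition cannot simply be split across tasks — no single task knows which build rows went unmatched globally.

```scala
// Sketch: LEFT OUTER join building the hash table on the LEFT side and
// streaming the RIGHT side, with matched-key tracking on the build side.
def leftOuterBuildLeft(
    left: Seq[(Int, String)],
    right: Seq[(Int, String)]): Seq[(Int, String, Option[String])] = {
  val built = left.groupBy(_._1)
  val matched = scala.collection.mutable.Set.empty[Int]
  val joined = right.flatMap { case (k, rv) =>
    built.get(k) match {
      case Some(rows) =>
        matched += k // remember that these build rows found a match
        rows.map { case (lk, lv) => (lk, lv, Option(rv)) }
      case None => Nil // right row with no left match is dropped (LEFT outer)
    }
  }
  // Finally emit every build row that never matched, padded with null.
  val unmatchedLeft = left.collect {
    case (k, lv) if !matched.contains(k) => (k, lv, Option.empty[String])
  }
  joined ++ unmatchedLeft
}
```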
[jira] [Created] (SPARK-36594) ORC vectorized reader should properly check maximal number of fields
Cheng Su created SPARK-36594: Summary: ORC vectorized reader should properly check maximal number of fields Key: SPARK-36594 URL: https://issues.apache.org/jira/browse/SPARK-36594 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.2.0, 3.3.0 Reporter: Cheng Su We debugged internally and found a bug: the decision to enable the vectorized reader should be based on the schema checked recursively. Currently we check that `schema.length` is no more than `wholeStageMaxNumFields` to enable vectorization, but `schema.length` does not take the sub-fields of nested columns into account (i.e. it treats a nested column the same as a primitive column). This check becomes wrong when enabling vectorization for nested columns. We should follow the same check as `WholeStageCodegenExec` and count sub-fields recursively. This does not cause a correctness issue, but it causes a performance issue where we may enable vectorization for nested columns by mistake when a nested column has many sub-fields. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
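The gap between the naive and the recursive field count can be sketched with simplified stand-in types. `DType`, `Atomic`, and `Struct` below are hypothetical; Spark's real check walks its `DataType` hierarchy.

```scala
// Simplified stand-ins for Spark's DataType hierarchy.
sealed trait DType
case object Atomic extends DType
final case class Struct(fields: Seq[DType]) extends DType

// Naive check: number of top-level fields only (what `schema.length` gives).
def topLevelFieldCount(schema: Seq[DType]): Int = schema.length

// Recursive check: count every leaf sub-field, in the spirit of
// WholeStageCodegenExec's field counting.
def totalFieldCount(schema: Seq[DType]): Int = schema.map {
  case Atomic => 1
  case Struct(fs) => totalFieldCount(fs)
}.sum
```

A schema with one deeply nested struct looks "small" to the naive count but can exceed the max-fields threshold under the recursive count, which is exactly the case where vectorization was being enabled by mistake.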
[jira] [Created] (SPARK-36404) Support nested columns in ORC vectorized reader for data source v2
Cheng Su created SPARK-36404: Summary: Support nested columns in ORC vectorized reader for data source v2 Key: SPARK-36404 URL: https://issues.apache.org/jira/browse/SPARK-36404 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.2.0, 3.3.0 Reporter: Cheng Su We added support for nested columns in the ORC vectorized reader for data source v1. Data source v1 and v2 use the same underlying vectorized reader implementation (`OrcColumnVector`), so we can support data source v2 as well. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36269) Fix only set data columns to Hive column names config
Cheng Su created SPARK-36269: Summary: Fix only set data columns to Hive column names config Key: SPARK-36269 URL: https://issues.apache.org/jira/browse/SPARK-36269 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.3.0 Reporter: Cheng Su When reading a Hive table, we set the Hive column id and column name configs (`hive.io.file.readcolumn.ids` and `hive.io.file.readcolumn.names`). We should set only non-partition columns (data columns) in both configs, as Spark always appends partition columns in its own reader - [https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala#L240] . Currently the column id config has only non-partition columns, but the column name config has both partition and non-partition columns. We should keep them consistent by including only non-partition columns in both. This does not cause an issue for public OSS Hive file formats, but for a customized internal Hive file format it causes an issue, as the two configs are expected to match. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
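A sketch of the intended behavior: derive both read-column configs from the same filtered list of data columns. The config keys are the real Hive keys named in the issue; the helper function and its inputs are illustrative.

```scala
// Sketch: build the two Hive read-column configs from a table schema,
// keeping only non-partition (data) columns in BOTH, since Spark appends
// partition columns itself in TableReader.
def readColumnConfigs(
    allColumns: Seq[(Int, String)], // (column id, column name)
    partitionColumns: Set[String]): Map[String, String] = {
  val dataCols = allColumns.filterNot { case (_, name) =>
    partitionColumns.contains(name)
  }
  Map(
    "hive.io.file.readcolumn.ids" -> dataCols.map(_._1).mkString(","),
    "hive.io.file.readcolumn.names" -> dataCols.map(_._2).mkString(","))
}
```

The bug was effectively that the `names` config was built from the unfiltered column list while the `ids` config was built from the filtered one, so the two lists disagreed.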
[jira] [Commented] (SPARK-24528) Add support to read multiple sorted bucket files for data source v1
[ https://issues.apache.org/jira/browse/SPARK-24528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17375215#comment-17375215 ] Cheng Su commented on SPARK-24528: -- Hi [~rahij] - glad to hear that the PR is working fine on your fork. Yes, it's still on my plan, but I was busy with other JIRAs in the past months. This week, let me rebase the PR to the latest master and collect reviewers' feedback, thanks. > Add support to read multiple sorted bucket files for data source v1 > --- > > Key: SPARK-24528 > URL: https://issues.apache.org/jira/browse/SPARK-24528 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.0 >Reporter: Ohad Raviv >Priority: Major > > Closely related to > SPARK-24410, we're trying to optimize a very common use case we have of > getting the most updated row by id from a fact table. > We're saving the table bucketed to skip the shuffle stage, but we still > waste time on the Sort operator even though the data is already sorted. 
> here's a good example: > {code:java} > sparkSession.range(N).selectExpr( > "id as key", > "id % 2 as t1", > "id % 3 as t2") > .repartition(col("key")) > .write > .mode(SaveMode.Overwrite) > .bucketBy(3, "key") > .sortBy("key", "t1") > .saveAsTable("a1"){code} > {code:java} > sparkSession.sql("select max(struct(t1, *)) from a1 group by key").explain > == Physical Plan == > SortAggregate(key=[key#24L], functions=[max(named_struct(t1, t1#25L, key, > key#24L, t1, t1#25L, t2, t2#26L))]) > +- SortAggregate(key=[key#24L], functions=[partial_max(named_struct(t1, > t1#25L, key, key#24L, t1, t1#25L, t2, t2#26L))]) > +- *(1) FileScan parquet default.a1[key#24L,t1#25L,t2#26L] Batched: true, > Format: Parquet, Location: ...{code} > > and here's a bad example, but more realistic: > {code:java} > sparkSession.sql("set spark.sql.shuffle.partitions=2") > sparkSession.sql("select max(struct(t1, *)) from a1 group by key").explain > == Physical Plan == > SortAggregate(key=[key#32L], functions=[max(named_struct(t1, t1#33L, key, > key#32L, t1, t1#33L, t2, t2#34L))]) > +- SortAggregate(key=[key#32L], functions=[partial_max(named_struct(t1, > t1#33L, key, key#32L, t1, t1#33L, t2, t2#34L))]) > +- *(1) Sort [key#32L ASC NULLS FIRST], false, 0 > +- *(1) FileScan parquet default.a1[key#32L,t1#33L,t2#34L] Batched: true, > Format: Parquet, Location: ... > {code} > > I've traced the problem to DataSourceScanExec#235: > {code:java} > val sortOrder = if (sortColumns.nonEmpty) { > // In case of bucketing, its possible to have multiple files belonging to > the > // same bucket in a given relation. Each of these files are locally sorted > // but those files combined together are not globally sorted. 
Given that, > // the RDD partition will not be sorted even if the relation has sort > columns set > // Current solution is to check if all the buckets have a single file in it > val files = selectedPartitions.flatMap(partition => partition.files) > val bucketToFilesGrouping = > files.map(_.getPath.getName).groupBy(file => > BucketingUtils.getBucketId(file)) > val singleFilePartitions = bucketToFilesGrouping.forall(p => p._2.length <= > 1){code} > so obviously the code avoids dealing with this situation now.. > could you think of a way to solve this or bypass it? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
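One way to bypass the single-file-per-bucket restriction in the `DataSourceScanExec` code quoted above is to read a bucket's multiple locally-sorted files as one globally-sorted stream via a k-way merge. The sketch below is a standalone illustration of that idea, not Spark's implementation; plain `Int`s stand in for rows.

```scala
// Sketch: k-way merge of several locally-sorted "files" of one bucket into
// a single globally-sorted stream, using a min-heap keyed on each file's
// current head element.
def mergeSortedFiles(files: Seq[Seq[Int]]): Seq[Int] = {
  // (value, file index) pairs, ordered so the smallest value is dequeued first.
  val pq = scala.collection.mutable.PriorityQueue.empty[(Int, Int)](
    Ordering.by[(Int, Int), Int](_._1).reverse)
  val iters = files.map(_.iterator)
  iters.zipWithIndex.foreach { case (it, i) =>
    if (it.hasNext) pq.enqueue((it.next(), i))
  }
  val out = scala.collection.mutable.ArrayBuffer.empty[Int]
  while (pq.nonEmpty) {
    val (v, i) = pq.dequeue()
    out += v
    // Refill from the file we just consumed from.
    if (iters(i).hasNext) pq.enqueue((iters(i).next(), i))
  }
  out.toSeq
}
```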
[jira] [Created] (SPARK-35965) Add documentation for ORC nested column vectorized reader
Cheng Su created SPARK-35965: Summary: Add documentation for ORC nested column vectorized reader Key: SPARK-35965 URL: https://issues.apache.org/jira/browse/SPARK-35965 Project: Spark Issue Type: Documentation Components: docs, SQL Affects Versions: 3.2.0 Reporter: Cheng Su In https://issues.apache.org/jira/browse/SPARK-34862, we added support for the ORC nested column vectorized reader, and it is disabled by default for now. So we would like to add the user-facing documentation for it, and users can opt in to use it if they want. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-19256) Hive bucketing write support
[ https://issues.apache.org/jira/browse/SPARK-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-19256: - Affects Version/s: 3.2.0 > Hive bucketing write support > > > Key: SPARK-19256 > URL: https://issues.apache.org/jira/browse/SPARK-19256 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 2.1.0, 2.2.0, 2.3.0, 2.4.0, 3.0.0, 3.1.0, 3.2.0 >Reporter: Tejas Patil >Priority: Minor > > Update (2020 by Cheng Su): > We use this JIRA to track progress for Hive bucketing write support in Spark. > The goal is for Spark to write Hive bucketed table, to be compatible with > other compute engines (Hive and Presto). > > Current status for Hive bucketed table in Spark: > No support for reading Hive bucketed tables: read bucketed table as > non-bucketed table. > Wrong behavior for writing Hive ORC and Parquet bucketed table: write > orc/parquet bucketed table as non-bucketed table (code path: > InsertIntoHadoopFsRelationCommand -> FileFormatWriter). > Do not allow for writing Hive non-ORC/Parquet bucketed table: throw exception > by default if writing non-orc/parquet bucketed table (code path: > InsertIntoHiveTable), and exception can be disabled by setting config > `hive.enforce.bucketing`=false and `hive.enforce.sorting`=false, which will > write as non-bucketed table. > > Current status for Hive bucketed table in Hive: > Hive 3.0.0 and after: support writing bucketed table with Hive murmur3hash > (https://issues.apache.org/jira/browse/HIVE-18910). > Hive 1.x.y and 2.x.y: support writing bucketed table with Hive hivehash. > Hive on Tez: support zero and multiple files per bucket > (https://issues.apache.org/jira/browse/HIVE-14014). And more code pointers on > the read path - > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java#L183-L212] > . 
> > Current status for Hive bucketed table in Presto (take presto-sql here): > Support writing bucketed table with Hive murmur3hash and hivehash > ([https://github.com/prestosql/presto/pull/1697]). > Support zero and multiple files per bucket > ([https://github.com/prestosql/presto/pull/822]). > > TLDR is to achieve Hive bucketed table compatibility across Spark, Presto and > Hive. Here with this JIRA, we need to add support for writing Hive bucketed tables > with Hive murmur3hash (for Hive 3.x.y) and hivehash (for Hive 1.x.y and > 2.x.y). > > To allow Spark to efficiently read Hive bucketed tables, this needs a more radical > change, and we decided to wait until data source v2 supports bucketing, and do > the read path on data source v2. The read path will not be covered by this JIRA. > > Original description (2017 by Tejas Patil): > JIRA to track design discussions and tasks related to Hive bucketing support > in Spark. > Proposal : > [https://docs.google.com/document/d/1a8IDh23RAkrkg9YYAeO51F4aGO8-xAlupKwdshve2fc/edit?usp=sharing] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-19256) Hive bucketing write support
[ https://issues.apache.org/jira/browse/SPARK-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368522#comment-17368522 ] Cheng Su commented on SPARK-19256: -- [~pushkarcse] - we are currently working on https://issues.apache.org/jira/browse/SPARK-33298 . After that, I will resume the discussion of https://issues.apache.org/jira/browse/SPARK-32709 . > Hive bucketing write support > > > Key: SPARK-19256 > URL: https://issues.apache.org/jira/browse/SPARK-19256 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 2.1.0, 2.2.0, 2.3.0, 2.4.0, 3.0.0, 3.1.0 >Reporter: Tejas Patil >Priority: Minor > > Update (2020 by Cheng Su): > We use this JIRA to track progress for Hive bucketing write support in Spark. > The goal is for Spark to write Hive bucketed table, to be compatible with > other compute engines (Hive and Presto). > > Current status for Hive bucketed table in Spark: > Not support for reading Hive bucketed table: read bucketed table as > non-bucketed table. > Wrong behavior for writing Hive ORC and Parquet bucketed table: write > orc/parquet bucketed table as non-bucketed table (code path: > InsertIntoHadoopFsRelationCommand -> FileFormatWriter). > Do not allow for writing Hive non-ORC/Parquet bucketed table: throw exception > by default if writing non-orc/parquet bucketed table (code path: > InsertIntoHiveTable), and exception can be disabled by setting config > `hive.enforce.bucketing`=false and `hive.enforce.sorting`=false, which will > write as non-bucketed table. > > Current status for Hive bucketed table in Hive: > Hive 3.0.0 and after: support writing bucketed table with Hive murmur3hash > (https://issues.apache.org/jira/browse/HIVE-18910). > Hive 1.x.y and 2.x.y: support writing bucketed table with Hive hivehash. > Hive on Tez: support zero and multiple files per bucket > (https://issues.apache.org/jira/browse/HIVE-14014). 
And more code pointer on > read path - > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java#L183-L212] > . > > Current status for Hive bucketed table in Presto (take presto-sql here): > Support writing bucketed table with Hive murmur3hash and hivehash > ([https://github.com/prestosql/presto/pull/1697]). > Support zero and multiple files per bucket > ([https://github.com/prestosql/presto/pull/822]). > > TLDR is to achieve Hive bucketed table compatibility across Spark, Presto and > Hive. Here with this JIRA, we need to add support writing Hive bucketed table > with Hive murmur3hash (for Hive 3.x.y) and hivehash (for Hive 1.x.y and > 2.x.y). > > To allow Spark efficiently read Hive bucketed table, this needs more radical > change and we decide to wait until data source v2 supports bucketing, and do > the read path on data source v2. Read path will not covered by this JIRA. > > Original description (2017 by Tejas Patil): > JIRA to track design discussions and tasks related to Hive bucketing support > in Spark. > Proposal : > [https://docs.google.com/document/d/1a8IDh23RAkrkg9YYAeO51F4aGO8-xAlupKwdshve2fc/edit?usp=sharing] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35794) Allow custom plugin for AQE cost evaluator
Cheng Su created SPARK-35794: Summary: Allow custom plugin for AQE cost evaluator Key: SPARK-35794 URL: https://issues.apache.org/jira/browse/SPARK-35794 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.2.0 Reporter: Cheng Su Currently AQE has a cost evaluator to decide whether to use the new plan after replanning. The evaluator used now is `SimpleCostEvaluator`, which makes the decision based on the number of shuffles in the query plan. This is not a perfect cost evaluator, and different production environments might want to use different custom evaluators. E.g., sometimes we might want to still do a skew join even though it might introduce an extra shuffle (trading off resource for better latency), and sometimes we might want to take sorts into consideration for cost as well. So we want to make the cost evaluator pluggable: developers can implement their own `CostEvaluator` subclass and plug it in dynamically based on configuration. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
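The plugin idea can be sketched in plain Scala with simplified, illustrative types. `CostEvaluatorSketch` and `SimplePlan` below are hypothetical stand-ins for Spark's real `CostEvaluator` trait and `SparkPlan`; the point is only that the plan-comparison step delegates to an interface that a deployment can swap out.

```scala
// Hypothetical stand-in for a physical plan, reduced to two cost inputs.
final case class SimplePlan(numShuffles: Int, numSorts: Int)

// Hypothetical stand-in for Spark's CostEvaluator trait.
trait CostEvaluatorSketch {
  def cost(plan: SimplePlan): Long
}

// Default behavior sketched: cost = number of shuffles.
object ShuffleCountEvaluator extends CostEvaluatorSketch {
  def cost(plan: SimplePlan): Long = plan.numShuffles
}

// A custom evaluator that also charges for sorts, as the issue suggests
// some environments may want.
object ShuffleAndSortEvaluator extends CostEvaluatorSketch {
  def cost(plan: SimplePlan): Long = plan.numShuffles * 10L + plan.numSorts
}

// AQE's replanning decision, sketched: keep the old plan unless the new
// plan is strictly cheaper under the configured evaluator.
def pickPlan(evaluator: CostEvaluatorSketch, old: SimplePlan, replanned: SimplePlan): SimplePlan =
  if (evaluator.cost(replanned) < evaluator.cost(old)) replanned else old
```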
[jira] [Created] (SPARK-35791) Release on-going map properly for NULL-aware ANTI join
Cheng Su created SPARK-35791: Summary: Release on-going map properly for NULL-aware ANTI join Key: SPARK-35791 URL: https://issues.apache.org/jira/browse/SPARK-35791 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.2.0 Reporter: Cheng Su NULL-aware ANTI join (https://issues.apache.org/jira/browse/SPARK-32290) detects NULL join keys while building the map for `HashedRelation`, and immediately returns `HashedRelationWithAllNullKeys` without taking care of the map built so far. Before returning `HashedRelationWithAllNullKeys`, the map needs to be freed properly to save memory and keep memory accounting correct. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
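The shape of the fix can be sketched as follows; `TrackedMap`, `build`, and the `freed` flag are illustrative (Spark's `HashedRelation` map is memory-managed and released via its own free/close method), but the control flow is the same: release the partially built map before bailing out with the sentinel value.

```scala
// Illustrative result type: either a built relation or the all-null-keys
// sentinel (standing in for HashedRelationWithAllNullKeys).
sealed trait BuildResult
case object RelationWithAllNullKeys extends BuildResult
final case class BuiltRelation(keys: Set[Int]) extends BuildResult

// Stand-in for a memory-managed map; `free()` models releasing its memory.
final class TrackedMap {
  var freed = false
  private val keys = scala.collection.mutable.Set.empty[Int]
  def put(k: Int): Unit = keys += k
  def snapshot: Set[Int] = keys.toSet
  def free(): Unit = freed = true
}

def build(rows: Seq[Option[Int]], map: TrackedMap): BuildResult = {
  val it = rows.iterator
  while (it.hasNext) {
    it.next() match {
      case Some(k) => map.put(k)
      case None =>
        // The fix: release whatever was built so far before bailing out,
        // so memory accounting stays correct.
        map.free()
        return RelationWithAllNullKeys
    }
  }
  BuiltRelation(map.snapshot)
}
```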
[jira] [Created] (SPARK-35760) Fix the max rows check for broadcast exchange
Cheng Su created SPARK-35760: Summary: Fix the max rows check for broadcast exchange Key: SPARK-35760 URL: https://issues.apache.org/jira/browse/SPARK-35760 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.2.0 Reporter: Cheng Su This is to fix the maximal allowed number of rows check in `BroadcastExchangeExec`. After [https://github.com/apache/spark/pull/27828], the max number of rows is calculated based on the max capacity of `BytesToBytesMap` (the value before the PR was 51200). This calculation is not accurate, as only `UnsafeHashedRelation` uses `BytesToBytesMap`. `LongHashedRelation` (used for broadcast join on a key with long data type) has a limit of 51200 ([https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala#L584] ), and `BroadcastNestedLoopJoinExec` does not depend on `HashedRelation` at all. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-32709) Write Hive ORC/Parquet bucketed table with hivehash (for Hive 1,2)
[ https://issues.apache.org/jira/browse/SPARK-32709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17360287#comment-17360287 ] Cheng Su commented on SPARK-32709: -- [~spedamallu] - yes I am still working on it. It's currently depending on https://issues.apache.org/jira/browse/SPARK-33298, let me speed up discussion of that one first. Thanks. > Write Hive ORC/Parquet bucketed table with hivehash (for Hive 1,2) > -- > > Key: SPARK-32709 > URL: https://issues.apache.org/jira/browse/SPARK-32709 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.1.0 >Reporter: Cheng Su >Priority: Minor > > Hive ORC/Parquet write code path is same as data source v1 code path > (FileFormatWriter). This JIRA is to add the support to write Hive ORC/Parquet > bucketed table with hivehash. The change is to custom `bucketIdExpression` to > use hivehash when the table is Hive bucketed table, and the Hive version is > 1.x.y or 2.x.y. > > This will allow us write Hive/Presto-compatible bucketed table for Hive 1 and > 2. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35693) Add plan check for stream-stream join unit test
Cheng Su created SPARK-35693: Summary: Add plan check for stream-stream join unit test Key: SPARK-35693 URL: https://issues.apache.org/jira/browse/SPARK-35693 Project: Spark Issue Type: Test Components: Structured Streaming Affects Versions: 3.2.0 Reporter: Cheng Su The unit test at [https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingJoinSuite.scala#L566] was introduced in [https://github.com/apache/spark/pull/21587], to fix the planner side of things for stream-stream join. Ideally checking the query result should catch the bug, but it would be better to add a plan check to make the purpose of the unit test clearer and to catch future bugs from planner changes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-35690) Stream-stream join keys should be reordered properly
[ https://issues.apache.org/jira/browse/SPARK-35690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-35690: - Issue Type: Improvement (was: Documentation) > Stream-stream join keys should be reordered properly > > > Key: SPARK-35690 > URL: https://issues.apache.org/jira/browse/SPARK-35690 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming >Affects Versions: 3.2.0 >Reporter: Cheng Su >Priority: Minor > > For a query with multiple stream-stream joins, unnecessary shuffles can > happen, as we don't reorder join keys for stream-stream join. We already > reorder join keys for shuffled hash join and sort merge join, so we can also > do it for stream-stream join - > [https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala#L232] > . -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35690) Stream-stream join keys should be reordered properly
Cheng Su created SPARK-35690: Summary: Stream-stream join keys should be reordered properly Key: SPARK-35690 URL: https://issues.apache.org/jira/browse/SPARK-35690 Project: Spark Issue Type: Documentation Components: Structured Streaming Affects Versions: 3.2.0 Reporter: Cheng Su For a query with multiple stream-stream joins, unnecessary shuffles can happen, as we don't reorder join keys for stream-stream join. We already reorder join keys for shuffled hash join and sort merge join, so we can also do it for stream-stream join - [https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala#L232] . -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
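The key-reordering idea can be sketched in plain Scala: given the (leftKey, rightKey) pairs of a join and the key order an existing partitioning expects on one side, reorder both sides' keys together so the required order is satisfied without a new shuffle. This mirrors the spirit of `EnsureRequirements`' key reordering; the function and string keys are illustrative.

```scala
// Sketch: reorder paired join keys to line up with the key order expected
// by an existing output partitioning on the left side. Returns None when
// the expected order is not a permutation of the left keys.
def reorderJoinKeys(
    leftKeys: Seq[String],
    rightKeys: Seq[String],
    expectedLeftOrder: Seq[String]): Option[(Seq[String], Seq[String])] = {
  // Keep each left key paired with its right-side counterpart, then emit
  // the pairs in the expected left-side order.
  val pairs = leftKeys.zip(rightKeys)
  val reordered = expectedLeftOrder.flatMap(k => pairs.find(_._1 == k))
  if (reordered.length == pairs.length) Some(reordered.unzip) else None
}
```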
[jira] [Updated] (SPARK-35604) Fix condition check for FULL OUTER sort merge join
[ https://issues.apache.org/jira/browse/SPARK-35604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-35604: - Issue Type: Improvement (was: Documentation) > Fix condition check for FULL OUTER sort merge join > -- > > Key: SPARK-35604 > URL: https://issues.apache.org/jira/browse/SPARK-35604 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Assignee: Cheng Su >Priority: Minor > Fix For: 3.2.0 > > > The condition check for FULL OUTER sort merge join > ([https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala#L1368] > ) has unnecessary trip when `leftIndex == leftMatches.size` or `rightIndex > == rightMatches.size`. Though this does not affect correctness > (`scanNextInBuffered()` returns false anyway). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35604) Fix condition check for FULL OUTER sort merge join
Cheng Su created SPARK-35604: Summary: Fix condition check for FULL OUTER sort merge join Key: SPARK-35604 URL: https://issues.apache.org/jira/browse/SPARK-35604 Project: Spark Issue Type: Documentation Components: SQL Affects Versions: 3.2.0 Reporter: Cheng Su The condition check for FULL OUTER sort merge join ([https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala#L1368] ) performs an unnecessary loop iteration when `leftIndex == leftMatches.size` or `rightIndex == rightMatches.size`, though this does not affect correctness (`scanNextInBuffered()` returns false anyway). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35529) Add fallback metrics for hash aggregate
Cheng Su created SPARK-35529: Summary: Add fallback metrics for hash aggregate Key: SPARK-35529 URL: https://issues.apache.org/jira/browse/SPARK-35529 Project: Spark Issue Type: Documentation Components: SQL Affects Versions: 3.2.0 Reporter: Cheng Su Add metrics to record how many tasks fall back to sort-based aggregation during hash aggregation. This will help developers and users debug and optimize queries. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35438) Minor documentation fix for window physical operator
Cheng Su created SPARK-35438: Summary: Minor documentation fix for window physical operator Key: SPARK-35438 URL: https://issues.apache.org/jira/browse/SPARK-35438 Project: Spark Issue Type: Documentation Components: SQL Affects Versions: 3.2.0 Reporter: Cheng Su As titled. Fix two places where the documentation has errors, to help people read the code more easily in the future. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35363) Refactor sort merge join code-gen be agnostic to join type
Cheng Su created SPARK-35363: Summary: Refactor sort merge join code-gen be agnostic to join type Key: SPARK-35363 URL: https://issues.apache.org/jira/browse/SPARK-35363 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.2.0 Reporter: Cheng Su This is a pre-requisite of [https://github.com/apache/spark/pull/32476], per the discussion in [https://github.com/apache/spark/pull/32476#issuecomment-836469779] . This refactors sort merge join code-gen to depend on streamed/buffered terminology, which makes the code-gen agnostic to different join types and extensible to join types other than inner join. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35354) Minor cleanup to replace BaseJoinExec with ShuffledJoin in CoalesceBucketsInJoin
Cheng Su created SPARK-35354: Summary: Minor cleanup to replace BaseJoinExec with ShuffledJoin in CoalesceBucketsInJoin Key: SPARK-35354 URL: https://issues.apache.org/jira/browse/SPARK-35354 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.2.0 Reporter: Cheng Su As titled. We should use the more restrictive interface `ShuffledJoin` rather than `BaseJoinExec` in `CoalesceBucketsInJoin`, as the rule only applies to sort merge join and shuffled hash join. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35352) Add code-gen for full outer sort merge join
[ https://issues.apache.org/jira/browse/SPARK-35352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341226#comment-17341226 ] Cheng Su commented on SPARK-35352: -- Will raise a PR soon. > Add code-gen for full outer sort merge join > --- > > Key: SPARK-35352 > URL: https://issues.apache.org/jira/browse/SPARK-35352 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Priority: Minor > > This Jira is to track the progress to add code-gen support for full outer > sort merge join. See motivation in SPARK-34705. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-35351) Add code-gen for left anti sort merge join
[ https://issues.apache.org/jira/browse/SPARK-35351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341225#comment-17341225 ] Cheng Su edited comment on SPARK-35351 at 5/8/21, 7:28 AM: --- Will raise a PR soon. was (Author: chengsu): Will raise a PR soon. > Add code-gen for left anti sort merge join > -- > > Key: SPARK-35351 > URL: https://issues.apache.org/jira/browse/SPARK-35351 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Priority: Minor > > This Jira is to track the progress to add code-gen support for left anti sort > merge join. See motivation in SPARK-34705. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35350) Add code-gen for left semi sort merge join
[ https://issues.apache.org/jira/browse/SPARK-35350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341224#comment-17341224 ] Cheng Su commented on SPARK-35350: -- Will raise a PR soon. > Add code-gen for left semi sort merge join > -- > > Key: SPARK-35350 > URL: https://issues.apache.org/jira/browse/SPARK-35350 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Priority: Minor > > This Jira is to track the progress to add code-gen support for left semi sort > merge join. See motivation in SPARK-34705. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35352) Add code-gen for full outer sort merge join
Cheng Su created SPARK-35352: Summary: Add code-gen for full outer sort merge join Key: SPARK-35352 URL: https://issues.apache.org/jira/browse/SPARK-35352 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.2.0 Reporter: Cheng Su This Jira is to track the progress to add code-gen support for full outer sort merge join. See motivation in SPARK-34705. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35351) Add code-gen for left anti sort merge join
[ https://issues.apache.org/jira/browse/SPARK-35351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341225#comment-17341225 ] Cheng Su commented on SPARK-35351: -- Will raise a PR soon. > Add code-gen for left anti sort merge join > -- > > Key: SPARK-35351 > URL: https://issues.apache.org/jira/browse/SPARK-35351 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Priority: Minor > > This Jira is to track the progress to add code-gen support for left anti sort > merge join. See motivation in SPARK-34705. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35351) Add code-gen for left anti sort merge join
Cheng Su created SPARK-35351: Summary: Add code-gen for left anti sort merge join Key: SPARK-35351 URL: https://issues.apache.org/jira/browse/SPARK-35351 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.2.0 Reporter: Cheng Su This Jira is to track the progress to add code-gen support for left anti sort merge join. See motivation in SPARK-34705. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35350) Add code-gen for left semi sort merge join
Cheng Su created SPARK-35350: Summary: Add code-gen for left semi sort merge join Key: SPARK-35350 URL: https://issues.apache.org/jira/browse/SPARK-35350 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.2.0 Reporter: Cheng Su This Jira is to track the progress to add code-gen support for left semi sort merge join. See motivation in SPARK-34705. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34705) Add code-gen for all join types of sort merge join
[ https://issues.apache.org/jira/browse/SPARK-34705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341219#comment-17341219 ] Cheng Su commented on SPARK-34705: -- Just FYI, we are working on each sub-tasks now. Target date is before Spark 3.2.0 release. Thanks. > Add code-gen for all join types of sort merge join > -- > > Key: SPARK-34705 > URL: https://issues.apache.org/jira/browse/SPARK-34705 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Priority: Minor > > Currently sort merge join only supports inner join type > ([https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala#L374] > ). We added code-gen for other join types internally in our fork and saw > obvious CPU performance improvement. Create this Jira to propose to merge > back to upstream. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35349) Add code-gen for left/right outer sort merge join
Cheng Su created SPARK-35349: Summary: Add code-gen for left/right outer sort merge join Key: SPARK-35349 URL: https://issues.apache.org/jira/browse/SPARK-35349 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.2.0 Reporter: Cheng Su This Jira is to track the progress to add code-gen support for left outer / right outer sort merge join. See motivation in SPARK-34705. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-34705) Add code-gen for all join types of sort merge join
[ https://issues.apache.org/jira/browse/SPARK-34705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-34705: - Issue Type: Umbrella (was: Improvement) > Add code-gen for all join types of sort merge join > -- > > Key: SPARK-34705 > URL: https://issues.apache.org/jira/browse/SPARK-34705 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Priority: Minor > > Currently sort merge join only supports inner join type > ([https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala#L374] > ). We added code-gen for other join types internally in our fork and saw > obvious CPU performance improvement. Create this Jira to propose to merge > back to upstream. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34705) Add code-gen for all join types of sort merge join
[ https://issues.apache.org/jira/browse/SPARK-34705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17335022#comment-17335022 ] Cheng Su commented on SPARK-34705: -- [~advancedxy] - We saw ~10% CPU performance improvement for targeted queries. I think it makes sense to update the benchmark after feature is merged. > Add code-gen for all join types of sort merge join > -- > > Key: SPARK-34705 > URL: https://issues.apache.org/jira/browse/SPARK-34705 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Priority: Minor > > Currently sort merge join only supports inner join type > ([https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala#L374] > ). We added code-gen for other join types internally in our fork and saw > obvious CPU performance improvement. Create this Jira to propose to merge > back to upstream. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35241) Investigate to prefer vectorized hash map in hash aggregate selectively
Cheng Su created SPARK-35241: Summary: Investigate to prefer vectorized hash map in hash aggregate selectively Key: SPARK-35241 URL: https://issues.apache.org/jira/browse/SPARK-35241 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.2.0 Reporter: Cheng Su In hash aggregate, we always use row-based hash map as first level hash map in production, and use vectorized hash map in testing / benchmarking. However we do find in micro-benchmark that vectorized hash map is better than row-based hash map e.g. with single key - [https://github.com/apache/spark/pull/32357#discussion_r620914345] . So we should re-evaluate the decision to always use row-based hash map or not. And maybe come up with a more adaptive decision policy to choose which map to use depending on keys / values. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
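As one illustration of what an "adaptive decision policy" could look like, here is a deliberately simplified sketch (pure Python; the heuristic, type names, and function name are all hypothetical, not proposed Spark behavior): prefer the vectorized map only for a single fixed-width grouping key, the case where the micro-benchmark linked above showed it winning.

```python
# Hypothetical map-selection heuristic, for illustration only.
FIXED_WIDTH_TYPES = {"boolean", "int", "long", "float", "double", "date", "timestamp"}

def choose_fast_hashmap(key_types):
    # Single fixed-width key: the vectorized layout's cheap hashing pays off.
    if len(key_types) == 1 and key_types[0] in FIXED_WIDTH_TYPES:
        return "vectorized"
    # Multiple or variable-width keys: stay with the row-based map.
    return "row_based"
```

A real policy would likely also weigh value widths and expected cardinality, which is exactly what the ticket proposes investigating.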
[jira] [Created] (SPARK-35235) Add row-based fast hash map into aggregate benchmark
Cheng Su created SPARK-35235: Summary: Add row-based fast hash map into aggregate benchmark Key: SPARK-35235 URL: https://issues.apache.org/jira/browse/SPARK-35235 Project: Spark Issue Type: Test Components: SQL Affects Versions: 3.2.0 Reporter: Cheng Su `AggregateBenchmark` only tests the performance of the vectorized fast hash map, not the row-based hash map (which is used by default). We should add the row-based hash map to the benchmark. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35133) EXPLAIN CODEGEN does not work with AQE
[ https://issues.apache.org/jira/browse/SPARK-35133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17331011#comment-17331011 ] Cheng Su commented on SPARK-35133: -- btw just to provide more context, I am running into this in reality when trying to debug code-gen for some queries in unit test. So I guess others can run into this issue as well. I will spend one afternoon or so to figure out if there's a clean fix. Thanks. > EXPLAIN CODEGEN does not work with AQE > -- > > Key: SPARK-35133 > URL: https://issues.apache.org/jira/browse/SPARK-35133 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Priority: Major > > `EXPLAIN CODEGEN ` (and Dataset.explain("codegen")) prints out the > generated code for each stage of plan. The current implementation is to match > `WholeStageCodegenExec` operator in query plan and prints out generated code > ([https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/debug/package.scala#L111-L118] > ). This does not work with AQE as we wrap the whole query plan inside > `AdaptiveSparkPlanExec` and do not run whole stage code-gen physical plan > rule eagerly (`CollapseCodegenStages`). This introduces unexpected behavior > change for EXPLAIN query (and Dataset.explain), as we enable AQE by default > now. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35133) EXPLAIN CODEGEN does not work with AQE
[ https://issues.apache.org/jira/browse/SPARK-35133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17330982#comment-17330982 ] Cheng Su commented on SPARK-35133: -- Whenever developers/users want to debug generated code for a query in the spark-shell or spark-sql command line, they have to disable AQE explicitly. After debugging, they have to re-enable AQE for running queries or doing other work. I feel it's kind of inconvenient for debugging. > EXPLAIN CODEGEN does not work with AQE > -- > > Key: SPARK-35133 > URL: https://issues.apache.org/jira/browse/SPARK-35133 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Priority: Major > > `EXPLAIN CODEGEN ` (and Dataset.explain("codegen")) prints out the > generated code for each stage of plan. The current implementation is to match > `WholeStageCodegenExec` operator in query plan and prints out generated code > ([https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/debug/package.scala#L111-L118] > ). This does not work with AQE as we wrap the whole query plan inside > `AdaptiveSparkPlanExec` and do not run whole stage code-gen physical plan > rule eagerly (`CollapseCodegenStages`). This introduces unexpected behavior > change for EXPLAIN query (and Dataset.explain), as we enable AQE by default > now. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35179) Introduce hybrid join for sort merge join and shuffled hash join in AQE
[ https://issues.apache.org/jira/browse/SPARK-35179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17326991#comment-17326991 ] Cheng Su commented on SPARK-35179: -- Thanks for [~cloud_fan] for the idea. Please comment or edit if this is not captured correctly, thanks. > Introduce hybrid join for sort merge join and shuffled hash join in AQE > --- > > Key: SPARK-35179 > URL: https://issues.apache.org/jira/browse/SPARK-35179 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Priority: Minor > > Per discussion in > [https://github.com/apache/spark/pull/32210#issuecomment-823503243] , we can > introduce some kind of {{HybridJoin}} operator in AQE, and we can choose to > do shuffled hash join vs sort merge join for each task independently, e.g. > based on partition size, task1 can do shuffled hash join, and task2 can do > sort merge join, etc. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-32461) Shuffled hash join improvement
[ https://issues.apache.org/jira/browse/SPARK-32461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-32461: - Affects Version/s: 3.2.0 > Shuffled hash join improvement > -- > > Key: SPARK-32461 > URL: https://issues.apache.org/jira/browse/SPARK-32461 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.0, 3.2.0 >Reporter: Cheng Su >Priority: Major > Labels: release-notes > > Shuffled hash join avoids sort compared to sort merge join. This advantage > shows up obviously when joining large table in terms of saving CPU and IO (in > case of external sort happens). In latest master trunk, shuffled hash join is > disabled by default with config "spark.sql.join.preferSortMergeJoin"=true, > with favor of reducing risk of OOM. However shuffled hash join could be > improved to a better state (validated in our internal fork). Creating this > Jira to track overall progress. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35179) Introduce hybrid join for sort merge join and shuffled hash join in AQE
Cheng Su created SPARK-35179: Summary: Introduce hybrid join for sort merge join and shuffled hash join in AQE Key: SPARK-35179 URL: https://issues.apache.org/jira/browse/SPARK-35179 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.2.0 Reporter: Cheng Su Per discussion in [https://github.com/apache/spark/pull/32210#issuecomment-823503243] , we can introduce some kind of {{HybridJoin}} operator in AQE, and we can choose to do shuffled hash join vs sort merge join for each task independently, e.g. based on partition size, task1 can do shuffled hash join, and task2 can do sort merge join, etc. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
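The per-task choice described above can be sketched as follows (a plain Python model; the function names and the row threshold are hypothetical, not an AQE API): each task inspects only its own build-side partition and picks a strategy independently of its sibling tasks.

```python
# Toy model of a per-task hybrid join (hypothetical; not Spark code).
HASH_BUILD_ROW_LIMIT = 3  # illustrative per-task threshold

def hash_join(build, probe):
    table = {}
    for k, v in build:
        table.setdefault(k, []).append(v)
    return [(k, bv, pv) for k, pv in probe for bv in table.get(k, [])]

def merge_join(build, probe):
    # Naive sort-based stand-in; produces the same rows as hash_join.
    return [(k, bv, pv)
            for k, pv in sorted(probe)
            for bk, bv in sorted(build) if bk == k]

def hybrid_join_task(build_partition, probe_partition):
    # The decision is local to this task: small build partitions are hashed,
    # large ones fall through to the sort-based path.
    if len(build_partition) <= HASH_BUILD_ROW_LIMIT:
        return "shuffled_hash", hash_join(build_partition, probe_partition)
    return "sort_merge", merge_join(build_partition, probe_partition)
```

So with skewed data, task 0 with a huge partition can sort while task 1 with a small partition hashes, which is the behavior the ticket proposes.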
[jira] [Created] (SPARK-35141) Support two level map for final hash aggregation
Cheng Su created SPARK-35141: Summary: Support two level map for final hash aggregation Key: SPARK-35141 URL: https://issues.apache.org/jira/browse/SPARK-35141 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.2.0 Reporter: Cheng Su For partial hash aggregation (code-gen path), we have two levels of hash map for aggregation. The first level comes from `RowBasedHashMapGenerator`, which is computationally faster than the second level from `UnsafeFixedWidthAggregationMap`. Introducing the two-level hash map helps improve the CPU performance of a query, as the first-level hash map normally fits in the hardware cache and has a cheaper hash function for key lookup. For final hash aggregation, we can also support two levels of hash map, to improve query performance further. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
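The two-level layout described above can be modeled with a short sketch (plain Python; the class, capacity, and sum aggregation are hypothetical stand-ins, not Spark's `RowBasedHashMapGenerator` / `UnsafeFixedWidthAggregationMap` APIs): a small, capacity-bounded first-level map takes every lookup first, and keys that do not fit spill to the unbounded second-level map.

```python
# Sketch of a two-level aggregation map; aggregation is a running sum per key.
class TwoLevelAggMap:
    def __init__(self, fast_capacity):
        self.fast = {}       # first level: bounded, stands in for the cheap code-gen'd map
        self.general = {}    # second level: unbounded fallback
        self.fast_capacity = fast_capacity

    def add(self, key, value):
        # Use the first-level map while it has room (or already holds the key);
        # keys that don't fit go to the second-level map.
        if key in self.fast or len(self.fast) < self.fast_capacity:
            self.fast[key] = self.fast.get(key, 0) + value
        else:
            self.general[key] = self.general.get(key, 0) + value

    def result(self):
        # Each key lives in exactly one level, since the fast map is checked first.
        merged = dict(self.general)
        merged.update(self.fast)
        return merged
```

The win comes from the first level staying small enough to remain cache-resident for the hot keys; the ticket proposes applying the same structure to the final aggregation step.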
[jira] [Updated] (SPARK-35133) EXPLAIN CODEGEN does not work with AQE
[ https://issues.apache.org/jira/browse/SPARK-35133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-35133: - Description: `EXPLAIN CODEGEN ` (and Dataset.explain("codegen")) prints out the generated code for each stage of plan. The current implementation is to match `WholeStageCodegenExec` operator in query plan and prints out generated code ([https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/debug/package.scala#L111-L118] ). This does not work with AQE as we wrap the whole query plan inside `AdaptiveSparkPlanExec` and do not run whole stage code-gen physical plan rule eagerly (`CollapseCodegenStages`). This introduces unexpected behavior change for EXPLAIN query (and Dataset.explain), as we enable AQE by default now. (was: `EXPLAIN CODEGEN ` (and Dataset.explain("codegen")) prints out the generated code for each stage of plan. The current implementation is to match `WholeStageCodegenExec` operator in query plan and prints out generated code. This does not work with AQE as we wrap the whole query plan inside `AdaptiveSparkPlanExec` and do not run whole stage code-gen physical plan rule eagerly (`CollapseCodegenStages`). This introduces unexpected behavior change for EXPLAIN query (and Dataset.explain), as we enable AQE by default now.) > EXPLAIN CODEGEN does not work with AQE > -- > > Key: SPARK-35133 > URL: https://issues.apache.org/jira/browse/SPARK-35133 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Priority: Major > > `EXPLAIN CODEGEN ` (and Dataset.explain("codegen")) prints out the > generated code for each stage of plan. The current implementation is to match > `WholeStageCodegenExec` operator in query plan and prints out generated code > ([https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/debug/package.scala#L111-L118] > ). 
This does not work with AQE as we wrap the whole query plan inside > `AdaptiveSparkPlanExec` and do not run whole stage code-gen physical plan > rule eagerly (`CollapseCodegenStages`). This introduces unexpected behavior > change for EXPLAIN query (and Dataset.explain), as we enable AQE by default > now. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35133) EXPLAIN CODEGEN does not work with AQE
[ https://issues.apache.org/jira/browse/SPARK-35133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17324727#comment-17324727 ] Cheng Su commented on SPARK-35133: -- I am trying to come up with a clean solution to fix this. Will raise a PR soon. > EXPLAIN CODEGEN does not work with AQE > -- > > Key: SPARK-35133 > URL: https://issues.apache.org/jira/browse/SPARK-35133 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Priority: Major > > `EXPLAIN CODEGEN ` (and Dataset.explain("codegen")) prints out the > generated code for each stage of plan. The current implementation is to match > `WholeStageCodegenExec` operator in query plan and prints out generated code. > This does not work with AQE as we wrap the whole query plan inside > `AdaptiveSparkPlanExec` and do not run whole stage code-gen physical plan > rule eagerly (`CollapseCodegenStages`). This introduces unexpected behavior > change for EXPLAIN query (and Dataset.explain), as we enable AQE by default > now. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35133) EXPLAIN CODEGEN does not work with AQE
Cheng Su created SPARK-35133: Summary: EXPLAIN CODEGEN does not work with AQE Key: SPARK-35133 URL: https://issues.apache.org/jira/browse/SPARK-35133 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.2.0 Reporter: Cheng Su `EXPLAIN CODEGEN ` (and Dataset.explain("codegen")) prints out the generated code for each stage of plan. The current implementation is to match `WholeStageCodegenExec` operator in query plan and prints out generated code. This does not work with AQE as we wrap the whole query plan inside `AdaptiveSparkPlanExec` and do not run whole stage code-gen physical plan rule eagerly (`CollapseCodegenStages`). This introduces unexpected behavior change for EXPLAIN query (and Dataset.explain), as we enable AQE by default now. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
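The matching problem can be shown with a toy plan tree (an illustrative Python model; `PlanNode`, `AdaptivePlan`, and `collect_codegen` are hypothetical stand-ins for Spark's plan classes and the explain helper): a walk that does not unwrap the adaptive root never reaches the code-gen subtrees.

```python
# Toy model of why EXPLAIN CODEGEN finds nothing under an adaptive root.
class PlanNode:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

class AdaptivePlan(PlanNode):
    """Stand-in for an adaptive wrapper: the real plan is held aside,
    not exposed as an ordinary child."""
    def __init__(self, inner_plan):
        super().__init__("AdaptiveSparkPlan")
        self.inner_plan = inner_plan

def collect_codegen(node, unwrap_adaptive=False):
    found = []
    # The fix amounts to descending into the adaptive wrapper's hidden plan.
    if isinstance(node, AdaptivePlan) and unwrap_adaptive:
        found += collect_codegen(node.inner_plan, unwrap_adaptive)
    if node.name == "WholeStageCodegen":
        found.append(node)
    for child in node.children:
        found += collect_codegen(child, unwrap_adaptive)
    return found
```

With `unwrap_adaptive=False` the walk sees only the wrapper and returns nothing, mirroring the reported behavior when AQE is enabled by default.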
[jira] [Updated] (SPARK-35109) Fix minor exception messages of HashedRelation and HashJoin
[ https://issues.apache.org/jira/browse/SPARK-35109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Su updated SPARK-35109: - Summary: Fix minor exception messages of HashedRelation and HashJoin (was: Fix minor exception messages of HashedRelation and HashedJoin) > Fix minor exception messages of HashedRelation and HashJoin > --- > > Key: SPARK-35109 > URL: https://issues.apache.org/jira/browse/SPARK-35109 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Priority: Trivial > > It seems that we miss classifying one `SparkOutOfMemoryError` in > `HashedRelation`. Add the error classification for it. In addition, clean up > two errors definition of `HashJoin` as they are not used. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35109) Fix minor exception messages of HashedRelation and HashedJoin
Cheng Su created SPARK-35109: Summary: Fix minor exception messages of HashedRelation and HashedJoin Key: SPARK-35109 URL: https://issues.apache.org/jira/browse/SPARK-35109 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.2.0 Reporter: Cheng Su It seems that we missed classifying one `SparkOutOfMemoryError` in `HashedRelation`. Add the error classification for it. In addition, clean up two unused error definitions in `HashJoin`. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-32634) Introduce sort-based fallback mechanism for shuffled hash join
[ https://issues.apache.org/jira/browse/SPARK-32634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17316683#comment-17316683 ] Cheng Su commented on SPARK-32634: -- [~Thomas Liu] - Implement fallback mechanism for whole stage code-gen, we need to override `doProduce()` method as well. Similar case is sort-based fallback for hash aggregate (e.g. `doProduceWithKeys` - [https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala#L679]). For progress, sorry that I was busy with something else, I will work on this soon in next weeks, and target date is Spark 3.2.0 release, thanks. > Introduce sort-based fallback mechanism for shuffled hash join > --- > > Key: SPARK-32634 > URL: https://issues.apache.org/jira/browse/SPARK-32634 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.1.0 >Reporter: Cheng Su >Priority: Minor > > A major pain point for spark users to stay away from using shuffled hash join > is out of memory issue. Shuffled hash join tends to have OOM issue because it > allocates in-memory hashed relation (`UnsafeHashedRelation` or > `LongHashedRelation`) for build side, and there's no recovery (e.g. > fallback/spill) once the size of hashed relation grows and cannot fit in > memory. On the other hand, shuffled hash join is more CPU and IO efficient > than sort merge join when joining one large table and a small table (but > small table is too large to be broadcasted), as SHJ does not sort the large > table, but SMJ needs to do that. > To improve the reliability of shuffled hash join, a fallback mechanism can be > introduced to avoid shuffled hash join OOM issue completely. Similarly we > already have a fallback to sort-based aggregation for hash aggregate. The > idea is: > (1).Build hashed relation as current, but monitor the hashed relation size > when inserting each build side row. 
If size of hashed relation being always > smaller than a configurable threshold, go to (2.1), else go to (2.2). > (2.1).Current shuffled hash join logic: reading stream side rows and probing > hashed relation. > (2.2).Fall back to sort merge join: Sort stream side rows, and sort build > side rows (iterate rows already in hashed relation (e.g. through > `BytesToBytesMap.destructiveIterator`), then iterate rest of un-read build > side rows). Then doing sort merge join for stream + build side rows. > > Note: > (1).the fallback is dynamic and happened per task, which means task 0 can > incur the fallback e.g. if it has a big build side, but task 1,2 don't need > to incur the fallback depending on the size of hashed relation. > (2).there's no major code change for SHJ and SMJ. Major change is around > HashedRelation to introduce some new methods, e.g. > `HashedRelation.destructiveValues()` to return an Iterator of build side rows > in hashed relation and cleaning up hashed relation along the way. > (3).we have run this feature by default in our internal fork more than 2 > years, and we benefit a lot from it with users can choose to use SHJ, and we > don't need to worry about SHJ reliability (see > https://issues.apache.org/jira/browse/SPARK-21505 for the original proposal > from our side, I tweak here to make it less intrusive and more acceptable, > e.g. not introducing a separate join operator, but doing the fallback > automatically inside SHJ operator itself). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
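The mechanism in steps (1)-(2.2) above can be condensed into a sketch (plain Python; the function names and the row-count budget are illustrative stand-ins, not Spark's `HashedRelation` API): the build side is hashed row by row under a budget, and overflowing the budget reroutes the task through a sort merge join over the already-hashed rows plus the unread remainder.

```python
# Condensed model of the sort-based fallback for shuffled hash join.
def shuffled_hash_join_with_fallback(build, probe, max_hash_rows):
    table = {}
    for i, (k, v) in enumerate(build):
        table.setdefault(k, []).append(v)
        if i + 1 > max_hash_rows:
            # Budget exceeded: replay rows already in the map plus the unread
            # remainder of the build side through a sort merge join instead.
            hashed = [(hk, hv) for hk, hvs in table.items() for hv in hvs]
            return "fallback_sort_merge", _sort_merge(hashed + build[i + 1:], probe)
    # Map stayed within budget: normal shuffled hash join probe.
    return "hash", [(k, bv, pv) for k, pv in probe for bv in table.get(k, [])]

def _sort_merge(build_rows, probe):
    # Naive sort-based stand-in for the sort merge join path.
    return [(k, bv, pv)
            for k, pv in sorted(probe)
            for bk, bv in sorted(build_rows) if bk == k]
```

As in the proposal, the decision is per task: a task with a small build side completes on the hash path while a sibling task with a large one falls back, and no work done before the fallback is discarded.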
[jira] [Commented] (SPARK-34960) Aggregate (Min/Max/Count) push down for ORC
[ https://issues.apache.org/jira/browse/SPARK-34960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17315070#comment-17315070 ] Cheng Su commented on SPARK-34960: -- Just FYI we will start sending out code after [https://github.com/apache/spark/pull/32049] is merged. cc [~huaxingao], thanks. > Aggregate (Min/Max/Count) push down for ORC > --- > > Key: SPARK-34960 > URL: https://issues.apache.org/jira/browse/SPARK-34960 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Cheng Su >Priority: Minor > > Similar to Parquet (https://issues.apache.org/jira/browse/SPARK-34952), we > can also push down certain aggregations into ORC. ORC exposes column > statistics in interface `org.apache.orc.Reader` > ([https://github.com/apache/orc/blob/master/java/core/src/java/org/apache/orc/Reader.java#L118] > ), where Spark can utilize for aggregation push down. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
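Conceptually, the push-down works because each ORC stripe already records per-column statistics, so MIN/MAX/COUNT over a file can be answered by folding stripe statistics without reading any rows. A minimal sketch (plain Python over dicts, not the `org.apache.orc.Reader` API):

```python
# Conceptual sketch: answer an aggregate from stripe-level column statistics.
def pushdown_aggregate(stripe_stats, agg):
    """stripe_stats: list of dicts like {"min": ..., "max": ..., "count": ...},
    one per stripe, for a single column."""
    if agg == "count":
        return sum(s["count"] for s in stripe_stats)
    if agg == "min":
        return min(s["min"] for s in stripe_stats)
    if agg == "max":
        return max(s["max"] for s in stripe_stats)
    raise ValueError(f"aggregate {agg!r} cannot be answered from statistics")
```

A real implementation must also verify the statistics are safe to use, e.g. no residual row-level filter applies and null handling matches the query's COUNT semantics; otherwise it has to fall back to a normal scan.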