Ioana Delaney created SPARK-23757:
-------------------------------------

             Summary: [Performance] Star schema detection improvements
                 Key: SPARK-23757
                 URL: https://issues.apache.org/jira/browse/SPARK-23757
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Ioana Delaney

A star schema consists of one or more fact tables referencing a number of dimension tables. Queries against a star schema are expected to run fast because of the established referential integrity (RI) constraints among the tables. In general, star schema joins are detected using the following conditions:

1. RI constraints (reliable detection)
* The dimension table contains a primary key that is joined to the fact table.
* The fact table contains foreign keys referencing multiple dimension tables.

2. Cardinality-based heuristics
* Usually, the table with the highest cardinality is the fact table.

The existing work in SPARK-17791 uses a combination of the above two conditions to detect and optimize star joins. With support for informational RI constraints, the algorithm in SPARK-17791 can be improved with reliable RI detection.
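
For illustration only, the two detection conditions described in this issue could be sketched as follows. This is not the SPARK-17791 algorithm or any Spark API. The `Table` class, its fields, and `detect_fact_table` are hypothetical names invented for this sketch, and the threshold of at least two referenced dimensions is an assumed reading of "multiple dimension tables".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Table:
    """Hypothetical table metadata; not a Spark class."""
    name: str
    row_count: int
    primary_key: Optional[str] = None
    # Maps a foreign-key column to the name of the table it references.
    foreign_keys: Dict[str, str] = field(default_factory=dict)


def detect_fact_table(tables: List[Table]) -> Optional[str]:
    """Return the name of the likely fact table, or None if there are no tables."""
    if not tables:
        return None
    by_name = {t.name: t for t in tables}

    # Condition 1: RI constraints. A fact-table candidate has foreign keys
    # that reference at least two other tables, each with a primary key.
    candidates = []
    for t in tables:
        dims = {
            ref for ref in t.foreign_keys.values()
            if ref in by_name
            and ref != t.name
            and by_name[ref].primary_key is not None
        }
        if len(dims) >= 2:  # assumed meaning of "multiple dimension tables"
            candidates.append((len(dims), t.row_count, t.name))
    if candidates:
        # Prefer the candidate referencing the most dimensions,
        # breaking ties by row count.
        return max(candidates)[2]

    # Condition 2: cardinality heuristic. With no usable constraints,
    # assume the table with the most rows is the fact table.
    return max(tables, key=lambda t: t.row_count).name


tables = [
    Table("sales", 1_000_000,
          foreign_keys={"date_id": "date_dim", "store_id": "store", "item_id": "item"}),
    Table("date_dim", 3_650, primary_key="date_id"),
    Table("store", 200, primary_key="store_id"),
    Table("item", 50_000, primary_key="item_id"),
]
print(detect_fact_table(tables))  # prints "sales"
```

In this sketch the RI check runs first because declared constraints identify the fact table directly, while the row-count fallback is only a guess and can pick the wrong table when a dimension is unusually large.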