Konstantin Orlov created IGNITE-25366:
-----------------------------------------
Summary: Documentation. Describe output of EXPLAIN command
Key: IGNITE-25366
URL: https://issues.apache.org/jira/browse/IGNITE-25366
Project: Ignite
Issue Type: Improvement
Components: documentation ai3
Reporter: Konstantin Orlov
Let's update documentation with page describing general concept of EXPLAIN
output, which is provided below.
Every relational operator is described with a name and set of attributes:
{code}
NameOfTheRelationalOperator
attribute1: value1
attribute2: value2
{code}
Let’s take a look at few examples:
{code}
TableScan // Full table access
table: PUBLIC.EMP1 // Name of the table in question
fields: [NAME, SALARY] // List of columns to return
est: (rows=1) // Estimated number of rows returned
IndexScan // Index scan
table: PUBLIC.TEST_TBL // Name of the table in question
index: IDX_DESC // Name of the index in question
type: SORTED // Type of the index
fields: [C1] // List of columns to return
collation: [C1 DESC] // Collation of the index aka order of sorting
est: (rows=1) // Estimated number of rows returned
Sort
collation: [C1 DESC NULLS LAST] // Collation to sort input rows
est: (rows=1) // Estimated number of rows returned
{code}
Name represents a particular algorithm used to execute relation operators
({{TableScan}} vs {{IndexScan}}, {{HashJoin}} vs {{MergeJoin}} vs
{{NestedLoopJoin}}, etc). The set of attributes depends on the particular
relational operator.
A query plan is represented by a tree-like structure which is the composition
of nodes described above. This tree describes a data flow, where rows are
passed from leaves to a plan root node (root node is the topmost node, it also
has no indentation). Let's take a look at few examples:
{code}
// simple ordered select where desired order matches collation of existing index
EXPLAIN PLAN FOR SELECT c1 FROM test_tbl ORDER BY c1 DESC NULLS FIRST
Exchange
distribution: single
est: (rows=1)
IndexScan
table: PUBLIC.TEST_TBL
index: IDX_DESC
type: SORTED
fields: [C1]
collation: [C1 DESC]
est: (rows=1)
// similar query, but desired order doesn't match index collation. Mind the
additional
// SORT node.
EXPLAIN PLAN FOR SELECT c1 FROM test_tbl ORDER BY c1 DESC NULLS LAST
Exchange
distribution: single
est: (rows=1)
Sort
collation: [C1 DESC NULLS LAST]
est: (rows=1)
TableScan
table: PUBLIC.TEST_TBL
fields: [C1]
est: (rows=1)
// Mind the "fetch" attribute of Sort node. It denotes TopN sort algorithm,
// implying that only N nodes will be kept in memory. This also implies
// that only N rows from every node will be transferred over Exchange.
SELECT * FROM test ORDER BY pk FETCH FIRST ? ROWS ONLY
Limit
fetch: ?0
est: (rows=1)
Exchange
distribution: single
est: (rows=1)
Sort
collation: [PK ASC]
fetch: ?0
est: (rows=1)
TableScan
table: PUBLIC.TEST
fields: [PK, COL0]
est: (rows=1)
// Similar query, but Limit node wasn't pushed down the exchange. This implies
// that the whole dataset will be transferred over an Exchange.
SELECT * FROM test OFFSET ? ROWS FETCH FIRST ? ROWS ONLY
Limit
offset: ?0
fetch: ?1
est: (rows=1)
Exchange
distribution: single
est: (rows=1)
TableScan
table: PUBLIC.TEST
fields: [PK, COL0]
est: (rows=1)
// More complex plan
EXPLAIN PLAN FOR SELECT
U.UserName, P.ProductName, R.ReviewText, R.Rating
FROM Users U, Reviews R, Products P
WHERE U.UserID = R.UserID
AND R.ProductID = P.ProductID
AND P.ProductName = 'Product_' || ?::varchar;
Project
fields: [USERNAME, PRODUCTNAME, REVIEWTEXT, RATING]
exprs: [USERNAME, PRODUCTNAME, REVIEWTEXT, RATING]
est: (rows=16650)
HashJoin
condition: =(USERID0, USERID)
joinType: inner
est: (rows=16650)
HashJoin
condition: =(PRODUCTID, PRODUCTID0)
joinType: inner
est: (rows=16650)
Exchange
distribution: single
est: (rows=50000)
TableScan
table: PUBLIC.REVIEWS
fields: [PRODUCTID, USERID, REVIEWTEXT, RATING]
est: (rows=50000)
Exchange
distribution: single
est: (rows=1)665
TableScan
table: PUBLIC.PRODUCTS
filters: =(PRODUCTNAME, ||(_UTF-8'Product_', CAST(?0):VARCHAR
CHARACTER SET "UTF-8"))
fields: [PRODUCTID, PRODUCTNAME]
est: (rows=1665)
Exchange
distribution: single
est: (rows=10000)
TableScan
table: PUBLIC.USERS
fields: [USERID, USERNAME]
est: (rows=10000)
{code}
We also should provide exhaustive list of relational operators their attributes.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)