This is an automated email from the ASF dual-hosted git repository. dzamo pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/drill-site.git
The following commit(s) were added to refs/heads/master by this push: new 84663aebc Document PIVOT operators. 84663aebc is described below commit 84663aebc5ed2ca680ed93ba2353510dc7165a35 Author: James Turton <ja...@somecomputer.xyz> AuthorDate: Wed Feb 22 16:26:24 2023 +0200 Document PIVOT operators. --- .../040-storage-plugin-auth-modes.md | 10 +- .../sql-commands/087-set-operators.md | 2 +- .../sql-commands/130-pivot-operators.md | 140 +++++++++++++++++++++ 3 files changed, 146 insertions(+), 6 deletions(-) diff --git a/_docs/en/connect-a-data-source/040-storage-plugin-auth-modes.md b/_docs/en/connect-a-data-source/040-storage-plugin-auth-modes.md index 823fa59ba..a89515739 100644 --- a/_docs/en/connect-a-data-source/040-storage-plugin-auth-modes.md +++ b/_docs/en/connect-a-data-source/040-storage-plugin-auth-modes.md @@ -8,10 +8,10 @@ parent: "Storage Plugin Configuration" Drill 1.21 brings with it the ability to configure storage authentication modes on a per-plugin basis. Three authentication modes are provided but note that not every plugin need support every mode. Consult the respective plugin documentation page for information about the authentication modes that it supports. ## SHARED_USER -This is the default authentication mode for storage plugins and matches the authentication behaviour of storage plugins in previous versions of Drill. Drill connects to the storage using a single set of shared credentials stored in some credential provider. If no credentials are present, the plugin may connect with no credentials or make implicit use of the Drillbit's identity (e.g. OS process user). Authentication to the storage is unaffected by the Drill query user's identity. +This is the default authentication mode for storage plugins and matches the authentication behaviour of storage plugins in previous versions of Drill. Drill connects to the storage using a single set of shared credentials stored in a credential provider. If no credentials are present, the plugin may connect with no credentials or make implicit use of the Drillbit's identity (e.g. OS process user). Authentication to the storage is unaffected by the Drill query user's identity. ## USER_TRANSLATION -{% include startnote.html %}At the present time, to use the USER_TRANSLATION authentication mode the global option `drill.exec.impersonation` must be set to true.{% include endnote.html %} +{% include startnote.html %}At the present time, to use the USER_TRANSLATION authentication mode the global option drill.exec.impersonation must be set to true.{% include endnote.html %} Drill connects to the storage using credentials looked up ("translated") for the Drill query user. Authentication to the storage is a function of the Drill query user's identity (and that function may be 1-1 or *-1). @@ -19,7 +19,7 @@ Drill connects to the storage using credentials looked up ("translated") for the This authentication mode is not yet implemented but is planned to replace the global option `drill.exec.impersonation`. ## Syntax -The authentication mode for a storage plugin is specified in its storage configuration usign the `authMode` property. +The authentication mode for a storage plugin is specified in its storage configuration using the `authMode` property. ``` "authMode" : "SHARED_USER" | "USER_TRANSLATION" | "USER_IMPERSONATION" ``` @@ -28,7 +28,7 @@ The authentication mode for a storage plugin is specified in its storage configu Every credential provider continues to support the default SHARED_USER mode in the same way that they did for previous versions of Drill. At the time of writing, the two credential providers that support USER_TRANSLATION are 1. the Plain credentials provider which stores a table of credentials alongside other storage configuration (with credentials configurable in the Drill web UI) -2. the Hashicorp Vault credentials provider which stores credentials at paths that can be looked up dynamically in Vault. +2. the HashiCorp Vault credentials provider which stores credentials at paths that can be addressed dynamically by having Drill substitute in the query user's username. ## SHARED_USER mode examples @@ -83,7 +83,7 @@ Every credential provider continues to support the default SHARED_USER mode in t ## USER_TRANSLATION Example -### Using the Vault crendentials provider +### Using the Vault credentials provider ```json { "type": "jdbc", diff --git a/_docs/en/sql-reference/sql-commands/087-set-operators.md b/_docs/en/sql-reference/sql-commands/087-set-operators.md index 8907101e4..3dd251dca 100644 --- a/_docs/en/sql-reference/sql-commands/087-set-operators.md +++ b/_docs/en/sql-reference/sql-commands/087-set-operators.md @@ -50,7 +50,7 @@ Drill treats the empty directory as a schemaless table and returns results as if # INTERSECT and EXCEPT **Introduced in release: 1.21** -The INTERSECT and EXCEPT operators respectively compute the set intersection and the set difference of their two arguments. As with the UNION operator, duplicate records are auotmatically removed from the result set. +The INTERSECT and EXCEPT operators respectively compute the set intersection and the set difference of their two arguments. As with the UNION operator, duplicate records are automatically removed from the result set. ## Syntax diff --git a/_docs/en/sql-reference/sql-commands/130-pivot-operators.md b/_docs/en/sql-reference/sql-commands/130-pivot-operators.md new file mode 100644 index 000000000..3166963c2 --- /dev/null +++ b/_docs/en/sql-reference/sql-commands/130-pivot-operators.md @@ -0,0 +1,140 @@ +--- +title: "Pivot Operators" +slug: "Pivot Operators" +parent: "SQL Commands" +--- + +**Introduced in release: 1.21** + +{% include startnote.html %} +Bug DRILL-8403 in version 1.21.0 of Drill means that it does not correctly handle some aggregate functions including AVG, STDDEV and VARIANCE when they are used with the PIVOT operator. Such queries will succeed but return incorrect results that ignore the groups that have been pivoted to columns. Please upgrade to Drill 1.21.1. +{% include endnote.html %} + +Place PIVOT and UNPIVOT relational operators beneath the table references in your query to respectively pivot rows up to columns or unpivot columns down to rows. + +# PIVOT + +The PIVOT operator will generate columns based on one more aggregate functions and one or more pivot expressions derived from the values occurring in one or more pivot columns. Groups and filtered aggregates are generated automatically making the PIVOT operator a more concise representation of queries that must otherwise be written using conditional expressions inside aggregate function calls and GROUP BY. + +## Syntax +``` +pivot: + PIVOT '(' + pivotAgg [, pivotAgg ]* + FOR pivotList + IN '(' pivotExpr [, pivotExpr ]* ')' + ')' + +pivotAgg: + agg '(' [ ALL | DISTINCT ] value [, value ]* ')' + [ [ AS ] alias ] + +pivotList: + columnOrList + +pivotExpr: + exprOrList [ [ AS ] alias ]``` +``` + +## Examples + +Use PIVOT to generate columns for two marital statuses × two metrics leaving education_level in rows. + +```sql +SELECT + * +FROM + (SELECT + employee_id, + education_level, + salary, + marital_status + FROM + cp.`employee.json`) +PIVOT ( + count(employee_id) employee_count, sum(salary) total_remun FOR marital_status IN ('M' married, 'S' single) +) +``` +``` ++---------------------+------------------------+---------------------+-----------------------+--------------------+ +| education_level | married_employee_count | married_total_remun | single_employee_count | single_total_remun | ++---------------------+------------------------+---------------------+-----------------------+--------------------+ +| Graduate Degree | 85 | 343270.0 | 85 | 403510.0 | +| Bachelors Degree | 144 | 689640.0 | 143 | 599680.0 | +| Partial College | 152 | 650770.0 | 136 | 514800.0 | +| High School Degree | 139 | 480840.0 | 142 | 507200.0 | +| Partial High School | 62 | 220460.0 | 67 | 232470.0 | ++---------------------+------------------------+---------------------+-----------------------+--------------------+ +5 rows selected (0.445 seconds) +``` + +# UNPIVOT + +The UNPIVOT operator will generate new rows in the place of one or more columns, moving their values to a new "unpivot measure" column. Unlike the PIVOT operator, no cardinality changes like grouping take place so that the number of values in the result is unchanged, only the axes along which they are laid out. + +## Syntax +``` +UNPIVOT [ INCLUDING NULLS | EXCLUDING NULLS ] '(' + unpivotMeasureList + FOR unpivotAxisList + IN '(' unpivotValue [, unpivotValue ]* ')' + ')' + +unpivotMeasureList: + columnOrList + +unpivotAxisList: + columnOrList + +unpivotValue: + column [ AS literal ] + | '(' column [, column ]* ')' [ AS '(' literal [, literal ]* ')' ] +``` + +## Examples +Use UNPIVOT to generate rows for 5 dimensions × 3 observations. + +```sql +WITH wide_form as ( + SELECT + random() dim1, + random() dim2, + random() dim3, + random() dim4, + random() dim5 + FROM cp.`employee.json` + LIMIT 3 +) +SELECT + * +FROM + wide_form +UNPIVOT ( + metric FOR dimension IN (dim1, dim2, dim3, dim4, dim5) +) as long_form +ORDER BY dimension; +``` + +``` ++-----------+---------------------+ +| dimension | metric | ++-----------+---------------------+ +| dim1 | 0.2949968170510818 | +| dim1 | 0.08013928181408925 | +| dim1 | 0.7666829294454385 | +| dim2 | 0.12904903586688676 | +| dim2 | 0.2596097757126131 | +| dim2 | 0.8664893860232098 | +| dim3 | 0.1849098946061125 | +| dim3 | 0.8035574424732861 | +| dim3 | 0.633143190894335 | +| dim4 | 0.29900950161735684 | +| dim4 | 0.6081277181773982 | +| dim4 | 0.10303324887132925 | +| dim5 | 0.39592775091118426 | +| dim5 | 0.15143900714797243 | +| dim5 | 0.6540326371582511 | ++-----------+---------------------+ +15 rows selected (0.333 seconds) +``` +