This is an automated email from the ASF dual-hosted git repository.

cgivre pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/drill-site.git

The following commit(s) were added to refs/heads/master by this push:
     new 1f751423a Create 126-delta-lake-format-plugin.md
1f751423a is described below

commit 1f751423a17c517b4f534106476fdc085b7591a3
Author: Charles S. Givre <cgi...@apache.org>
AuthorDate: Sun Feb 26 11:00:25 2023 -0500

    Create 126-delta-lake-format-plugin.md
---
 .../126-delta-lake-format-plugin.md                | 61 ++++++++++++++++++++++
 1 file changed, 61 insertions(+)

diff --git a/_docs/en/data-sources-and-file-formats/126-delta-lake-format-plugin.md b/_docs/en/data-sources-and-file-formats/126-delta-lake-format-plugin.md
new file mode 100644
index 000000000..171f1f373
--- /dev/null
+++ b/_docs/en/data-sources-and-file-formats/126-delta-lake-format-plugin.md
@@ -0,0 +1,61 @@
+---
+title: "Delta Lake Format Plugin"
+slug: "Delta Lake Format Plugin"
+parent: "Data Sources and File Formats"
+---
+
+**Introduced in release:** 1.21
+
+This format plugin enables Drill to query Delta Lake tables.
+
+## Supported optimizations and features
+
+### Project pushdown
+
+This format plugin supports project and filter pushdown optimizations.
+
+For the case of project pushdown, only columns specified in the query will be read, even when they are nested columns.
+
+### Filter pushdown
+
+For the case of filter pushdown, all expressions supported by the Delta Lake API will be pushed down, so only data that
+matches the filter expression will be read. Additionally, filtering logic for Parquet files is enabled
+to allow pruning of Parquet files that do not match the filter expression.
+
+### Querying specific table versions (snapshots)
+
+Delta Lake has the ability to travel back in time to a specific data version.
+
+The following ways of specifying the data version are supported:
+
+- `version` - the version number of the specific snapshot
+- `timestamp` - the timestamp in milliseconds at or before which the specific snapshot was generated
+
+A table function can be used to specify one of the above options in the following way:
+
+```sql
+SELECT *
+FROM table(dfs.tmp.testAllTypes(type => 'delta', version => 0));
+
+SELECT *
+FROM table(dfs.tmp.testAllTypes(type => 'delta', timestamp => 1636231332000));
+```
+
+## Configuration
+
+The format plugin has the following configuration options:
+
+- `type` - format plugin type, should be `'delta'`
+
+### Format config example:
+
+```json
+{
+  "type": "file",
+  "formats": {
+    "delta": {
+      "type": "delta"
+    }
+  }
+}
+```
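
Note on the `timestamp` option documented in the patch above: it expects a value in milliseconds, which is easy to get wrong when starting from a calendar date. The sketch below is not part of the commit. It assumes the value is milliseconds since the Unix epoch (as the example value `1636231332000` suggests) and that exactly one of `version` or `timestamp` is passed per query. The helper names `to_epoch_millis` and `delta_snapshot_query` are hypothetical and only build the SQL text shown in the docs; they do not call any Drill API.

```python
from datetime import datetime, timedelta, timezone

# Reference point for the conversion (assumption: Unix-epoch milliseconds).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to whole milliseconds since the epoch."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Integer division by a timedelta avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def delta_snapshot_query(table, *, version=None, timestamp=None):
    """Build the table-function query text for one Delta Lake snapshot.

    Hypothetical helper: `table` is inserted as-is, so pass only trusted names.
    """
    # Assumption: exactly one snapshot selector is given per query.
    if (version is None) == (timestamp is None):
        raise ValueError("specify exactly one of version or timestamp")
    if version is not None:
        option = f"version => {int(version)}"
    else:
        option = f"timestamp => {to_epoch_millis(timestamp)}"
    return f"SELECT * FROM table({table}(type => 'delta', {option}))"


if __name__ == "__main__":
    when = datetime(2021, 11, 6, 20, 42, 12, tzinfo=timezone.utc)
    print(delta_snapshot_query("dfs.tmp.testAllTypes", timestamp=when))
    print(delta_snapshot_query("dfs.tmp.testAllTypes", version=0))
```

The resulting strings can be submitted through whichever Drill client is already in use; requiring a timezone-aware datetime keeps the conversion independent of the local timezone of the machine that builds the query.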