vvysotskyi commented on a change in pull request #1986: Additional changes for
Drill Metastore docs
URL: https://github.com/apache/drill/pull/1986#discussion_r387065418
##########
File path:
_docs/performance-tuning/drill-metastore/010-using-drill-metastore.md
##########
@@ -48,42 +83,52 @@ drill.metastore: {
}
```
-Note, that currently out of box Iceberg Metastore is available and is the
default one. Though any custom
- implementation can be added by placing the JAR into classpath which has the
implementation of
- `org.apache.drill.metastore.Metastore` interface and indicating custom class
in the `drill.metastore.implementation.class`.
-
### Metastore Components
-Metastore can store metadata for various components: tables, views, etc.
-Current implementation provides fully functioning support for tables component.
-Views component support is not implemented but contains stub methods to show
-how new Metastore components like UDFs, storage plugins, etc. can be added in
the future.
+The Drill 1.17 version of the Metastore stores metadata about tables: the
table schema and table statistics.
+The Metastore is an active subproject of Drill. See
[DRILL-6552](https://issues.apache.org/jira/browse/DRILL-6552) for more
information.
+
+### Table Metadata
+
+Table Metadata includes the following info:
+
+ - Table schema: column name, type, nullability, scale and precision if
available, and other info. For details please
+ refer to [Schema
provisioning]({{site.baseurl}}/docs/create-or-replace-schema/#usage-notes).
+ - Table statistics. This itself has two categories:
+ - Summary statistics: `MIN`, `MAX`, `NULL count`, etc.
+ - Detail statistics: histograms, `NDV`, etc.
-### Metastore Tables
+Schema information and summary statistics are also computed and stored for table
segments, files, row groups, and partitions.
-Metastore Tables component contains metadata about Drill tables, including
general information, as well as
-information about table segments, files, row groups, partitions.
+The detailed metadata schema is described
[here](https://github.com/apache/drill/tree/master/metastore/metastore-api#metastore-tables).
+You can try out the metadata, and get a sense of what is available, by using
the
+ [Inspect the Metastore using `INFORMATION_SCHEMA`
tables]({{site.baseurl}}/docs/using-drill-metastore/#inspect-the-metastore-using-information_schema-tables)
tutorial.
-Full table metadata consists of two major concepts: general information and
top-level segments metadata.
-Table general information contains basic table information and corresponds to
the `BaseTableMetadata` class.
+Every table described by the Metastore may be a bare file or one or more files
that reside in one or more directories.
-A table can be non-partitioned and partitioned. Non-partitioned tables have
only one top-level segment
-which is called default (`MetadataInfo#DEFAULT_SEGMENT_KEY`). Partitioned
tables may have several top-level segments.
-Each top-level segment can include metadata about inner segments, files, row
groups, and partitions.
+If a table consists of a single directory or file, then it is non-partitioned.
The single directory can contain any number of files.
+Larger tables tend to have subdirectories. Each subdirectory is a partition
and such a table is called "partitioned".
+Please refer to [Exposing Drill Metastore metadata through
`INFORMATION_SCHEMA`
tables]({{site.baseurl}}/docs/using-drill-metastore/#exposing-drill-metastore-metadata-through-information_schema-tables)
+ for information on how to query partition and segment metadata.
-A unique table identifier in Metastore Tables is a combination of storage
plugin, workspace, and table name.
-Table metadata inside is grouped by top-level segments, unique identifier of
the top-level segment and its metadata
-is storage plugin, workspace, table name, and metadata key.
+A traditional database organizes tables into schemas.
+Drill can connect to any number of data sources, each of which may have its
own schema.
+As a result, the Metastore labels tables with a combination of (plugin
configuration name, workspace name, table name).
+Note that before renaming any of these items, you must delete the table's
Metadata entry and recreate it after renaming.
### Related Session/System Options
-The following options are set via `ALTER SYSTEM SET`, or `ALTER SESSION SET`
or via the Drill Web console.
+The Metastore provides a number of options to fit your environment. The
default options are fine in most cases.
+The options are set via `ALTER SYSTEM SET`, `ALTER SESSION SET` or the Drill
Web console.
Review comment:
Thanks, added clarification.