writer-jill commented on code in PR #19403:
URL: https://github.com/apache/druid/pull/19403#discussion_r3228533793

##########
docs/ingestion/schema-design.md:
##########
@@ -249,12 +249,13 @@ Druid can infer the schema for your data in one of two ways:

 #### Type-aware schema discovery

-:::info
- Note that using type-aware schema discovery can impact downstream BI tools depending on how they handle ARRAY typed columns.
-:::
-
 You can have Druid infer the schema and types for your data partially or fully by setting `dimensionsSpec.useSchemaDiscovery` to `true` and defining some or no dimensions in the dimensions list.

+Before you use type-aware schema discovery, keep the following in mind:
+
+- There maybe impact on downstream BI tools depending on how they handle ARRAY-typed columns.

Review Comment:
```suggestion
- There may be an impact on downstream BI tools depending on how they handle ARRAY-typed columns.
```

##########
docs/ingestion/ingestion-spec.md:
##########
@@ -188,9 +188,13 @@ Treat `__time` as a millisecond timestamp: the number of milliseconds since Jan

 The `dimensionsSpec` is located in `dataSchema` → `dimensionsSpec` and is responsible for configuring [dimensions](./schema-model.md#dimensions).

-You can either manually specify the dimensions or take advantage of schema auto-discovery where you allow Druid to infer all or some of the schema for your data. This means that you don't have to explicitly specify your dimensions and their type.
+You can either manually specify the dimensions or take advantage of type-aware schema auto-discovery where you allow Druid to infer all or some of the schema for your data. This means that you don't have to explicitly specify your dimensions and their type.

-To use schema auto-discovery, set `useSchemaDiscovery` to `true`.
+:::caution
+When using type-aware schema auto-discovery, Druid discovers the type for all dimensions unless you explicitly specify dimensions for Druid to ignore by using the `dimensionExclusions` field. This helps you control storage costs by preventing Druid from ingesting dimensions unintentionally.

Review Comment:
```suggestion
When using type-aware schema auto-discovery, Druid discovers the type for all dimensions unless you use the `dimensionExclusions` field to explicitly specify dimensions to ignore. This helps you control storage costs by preventing Druid from unintentionally ingesting dimensions.
```
