techdocsmith commented on code in PR #19403:
URL: https://github.com/apache/druid/pull/19403#discussion_r3237642540
##########
docs/ingestion/schema-design.md:
##########
@@ -249,12 +249,13 @@ Druid can infer the schema for your data in one of two ways:

 #### Type-aware schema discovery

-:::info
- Note that using type-aware schema discovery can impact downstream BI tools depending on how they handle ARRAY typed columns.
-:::
-
 You can have Druid infer the schema and types for your data partially or fully by setting `dimensionsSpec.useSchemaDiscovery` to `true` and defining some or no dimensions in the dimensions list.

+Before you use type-aware schema discovery, keep the following in mind:
+
+- There may be an impact on downstream BI tools depending on how they handle ARRAY-typed columns.
+- Be aware of all the potential dimensions. Druid discovers all the dimensions unless you specify an exclusion list. Without the list, you may ingest more columns than you intend. For example, if you use type-aware schema discovery and the Kafka input format, you'll ingest optional dimensions like the Kafka offset and partition unless you add them to the exclusion list.

Review Comment:
   ```suggestion
   - Be aware of all the potential dimensions. Druid discovers all available dimensions unless you specify an exclusion list. Without an exclusion list, you may ingest more columns than you intend. For example, if you use type-aware schema discovery and the Kafka input format, Druid discovers dimensions like the Kafka offset and partition unless you add them to the exclusion list.
   ```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
