Re: [PR] docs: 32.0.0 release notes (druid)

via GitHub Wed, 29 Jan 2025 21:11:07 -0800


kfaraz commented on code in PR #17641:
URL: https://github.com/apache/druid/pull/17641#discussion_r1935009145



##########
docs/release-info/release-notes.md:
##########
@@ -57,40 +57,284 @@ For tips about how to write a good release note, see 
[Release notes](https://git
 
 This section contains important information about new and existing features.
 
+### ANSI-SQL compatibility and query results
+
+Support for the configs that let you maintain older behavior that wasn't 
ANSI-SQL compliant have been removed:
+
+- `druid.generic.useDefaultValueForNull=true`
+- `druid.expressions.useStrictBooleans=false`
+- `druid.generic.useThreeValueLogicForNativeFilters=false` 
+
+They no longer affect your query results. Only SQL-compliant non-legacy 
behavior is supported now. 
+
+If the configs are set to the legacy behavior, Druid services will fail to 
start. 
+
+If you want to continue to get the same results without these settings, you 
must update your queries or your results will be incorrect after you upgrade.
+
+For more information about how to update your queries, see the [migration 
guide](https://druid.apache.org/docs/latest/release-info/migr-ansi-sql-null).
+
+[#17568](https://github.com/apache/druid/pull/17568) 
[#17609](https://github.com/apache/druid/pull/17609)
+
+### Java support
+
+Java support in Druid has been updated:
+
+- Java 8 support has been removed
+- Java 11 support is deprecated
+
+We recommend that you upgrade to Java 17.
+
+[#17466](https://github.com/apache/druid/pull/17466)
+
+### Hadoop-based ingestion
+
+Hadoop-based ingestion is now deprecated. We recommend that you migrate to 
SQL-based ingestion. 
+
+#### Join hints in MSQ task engine queries
+
+Druid now supports hints for SQL JOIN queries that use the MSQ task engine. 
This allows queries to provide hints for the JOIN type that should be used at a 
per join level. Join hints recursively affect sub queries. 
+
+```sql
+select /*+ sort_merge */ w1.cityName, w2.countryName
+from
+(
+  select /*+ broadcast */ w3.cityName AS cityName, w4.countryName AS 
countryName from wikipedia w3 LEFT JOIN wikipedia-set2 w4 ON w3.regionName = 
w4.regionName
+) w1
+JOIN wikipedia-set1 w2 ON w1.cityName = w2.cityName
+where w1.cityName='New York';
+```
+
+(#17406)
+
+### New Overlord APIs
+
+APIs for marking segments as used or unused have been moved from the 
Coordinator to the Overlord service:

Review Comment:
   We should also call out that the corresponding Coordinator APIs are now deprecated and will be removed in a future release, and that the Coordinator now calls the Overlord to serve these requests.
   
   The original PR has a list of the deprecated APIs.
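
To illustrate the migration that this comment asks the notes to call out, here is a minimal sketch of how a client script might rewrite a deprecated Coordinator path to the Overlord path. It assumes the two sets of endpoints differ only in their prefix (`/druid/coordinator/v1/` versus `/druid/indexer/v1/`), which is what the paths quoted in this thread suggest but which should be checked against the list in the original PR. The helper name `to_overlord_path` and the `wikipedia` datasource are hypothetical.

```python
# Assumption: the Overlord endpoints mirror the Coordinator ones and differ
# only in the path prefix. Verify against the deprecated-API list in the PR.
COORDINATOR_PREFIX = "/druid/coordinator/v1/datasources/"
OVERLORD_PREFIX = "/druid/indexer/v1/datasources/"


def to_overlord_path(path: str) -> str:
    """Rewrite a Coordinator segment-management path to its Overlord form."""
    if not path.startswith(COORDINATOR_PREFIX):
        raise ValueError(f"not a Coordinator datasource path: {path}")
    # Keep everything after the prefix (datasource name and action) unchanged.
    return OVERLORD_PREFIX + path[len(COORDINATOR_PREFIX):]


# Hypothetical usage: move a markUnused call from the Coordinator to the Overlord.
print(to_overlord_path("/druid/coordinator/v1/datasources/wikipedia/markUnused"))
```

Rewriting only the prefix keeps the HTTP method and request body untouched, so existing callers would need to change nothing else under this assumption.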



##########
docs/release-info/release-notes.md:
##########
@@ -114,6 +358,15 @@ If you're already using this feature, you don't need to 
take any action.
 
 ### Developer notes
 
+- Improved dependency support between extensions. When an extension has a 
dependency on another extension, it now tries to use the dependency's class 
loader to find classes required classes 
[#16973](https://github.com/apache/druid/pull/16973)

Review Comment:
   We should also add a point for the deprecated Coordinator APIs.



##########
docs/release-info/release-notes.md:
##########
@@ -57,40 +57,284 @@ For tips about how to write a good release note, see 
[Release notes](https://git
 
 This section contains important information about new and existing features.
 
+### ANSI-SQL compatibility and query results
+
+Support for the configs that let you maintain older behavior that wasn't 
ANSI-SQL compliant have been removed:
+
+- `druid.generic.useDefaultValueForNull=true`
+- `druid.expressions.useStrictBooleans=false`
+- `druid.generic.useThreeValueLogicForNativeFilters=false` 
+
+They no longer affect your query results. Only SQL-compliant non-legacy 
behavior is supported now. 
+
+If the configs are set to the legacy behavior, Druid services will fail to 
start. 
+
+If you want to continue to get the same results without these settings, you 
must update your queries or your results will be incorrect after you upgrade.
+
+For more information about how to update your queries, see the [migration 
guide](https://druid.apache.org/docs/latest/release-info/migr-ansi-sql-null).
+
+[#17568](https://github.com/apache/druid/pull/17568) 
[#17609](https://github.com/apache/druid/pull/17609)
+
+### Java support
+
+Java support in Druid has been updated:
+
+- Java 8 support has been removed
+- Java 11 support is deprecated
+
+We recommend that you upgrade to Java 17.
+
+[#17466](https://github.com/apache/druid/pull/17466)
+
+### Hadoop-based ingestion
+
+Hadoop-based ingestion is now deprecated. We recommend that you migrate to 
SQL-based ingestion. 
+
+#### Join hints in MSQ task engine queries
+
+Druid now supports hints for SQL JOIN queries that use the MSQ task engine. 
This allows queries to provide hints for the JOIN type that should be used at a 
per join level. Join hints recursively affect sub queries. 
+
+```sql
+select /*+ sort_merge */ w1.cityName, w2.countryName
+from
+(
+  select /*+ broadcast */ w3.cityName AS cityName, w4.countryName AS 
countryName from wikipedia w3 LEFT JOIN wikipedia-set2 w4 ON w3.regionName = 
w4.regionName
+) w1
+JOIN wikipedia-set1 w2 ON w1.cityName = w2.cityName
+where w1.cityName='New York';
+```
+
+(#17406)
+
+### New Overlord APIs
+
+APIs for marking segments as used or unused have been moved from the 
Coordinator to the Overlord service:
+
+- Mark all segments of a datasource as unused:
+`POST /druid/indexer/v1/datasources/{dataSourceName}`
+
+- Mark all (non-overshadowed) segments of a datasource as used:
+`DELETE /druid/indexer/v1/datasources/{dataSourceName}`
+
+- Mark multiple segments as used
+`POST /druid/indexer/v1/datasources/{dataSourceName}/markUsed`
+- Mark multiple (non-overshadowed) segments as unused
+`POST /druid/indexer/v1/datasources/{dataSourceName}/markUnused`
+
+- Mark a single segment as used:
+`POST /druid/indexer/v1/datasources/{dataSourceName}/segments/{segmentId}`
+
+- Mark a single segment as unused:
+`DELETE /druid/indexer/v1/datasources/{dataSourceName}/segments/{segmentId}`
+
+[#17545](https://github.com/apache/druid/pull/17545)
+
+
+### 17386
+
+https://github.com/apache/druid/pull/17386
+
 ## Functional area and related changes
 
 This section contains detailed release notes separated by areas.
 
 ### Web console
 
+#### Explore view (experimental)
+
+Several improvements have been made to the Explore view in the web console:
+
+The time chart visualization now supports zooming, dragging, and is smarter 
about granularity detection:
+![](./explore_timechart.png)
+
+
+Filters been improved with helper tables and additional context:
+![](./explore_filtering.png)
+
+Tiles can now be shown side-by-side:
+![](./explore_tiles.png)
+
+[#17627](https://github.com/apache/druid/pull/17627)
+
+#### Segment timeline view
+
+The segment timeline is now more interactive and no longer forces day 
granularity.
+
+**New view**
+![](./webconsole_segment_timeline2.png)
+
+
+**Old view:**
+![](./webconsole_segmenttimeline1.png)
+
+[#17521](https://github.com/apache/druid/pull/17521)
+
 #### Other web console improvements
 
+- The timezoner picker now always shows your timezone 
[#17521](https://github.com/apache/druid/pull/17521)
+- UNNEST is now supported for autocomplete suggestions 
[#17521](https://github.com/apache/druid/pull/17521)
+- Tables now support less than and greater than filters 
[#17521](https://github.com/apache/druid/pull/17521)
+- You can now resize the side panels in the Query view 
[#17387](https://github.com/apache/druid/pull/17387)
+- Added the `expectedLoadTimeMillis` segment loading metric to the web console 
[#17359](https://github.com/apache/druid/pull/17359)
+
 ### Ingestion
 
+#### Numbers for CSV and TSV input formats
+
+Use the new optional config `tryParseNumbers` for CSV and TSV input formats to 
control how numbers are treated. If enabled, any numbers present in the input 
will be parsed in the following manner:
+
+- long data type for integer types and 
+- double for floating-point numbers
+ 
+By default, this configuration is set to false, so numeric strings will be 
treated as strings.
+
+[#17082](https://github.com/apache/druid/pull/17082)
+
+#### Other ingestion improvements
+
+- Reduce the direct memory requirement on non-query processing tasks by not 
reserving query buffers for them 
[#16887](https://github.com/apache/druid/pull/16887)
+- JSON-based and SQL-based ingestion now support request headers when using an 
HTTP input source [#16974](https://github.com/apache/druid/pull/16974)
+ 
 #### SQL-based ingestion
 
 ##### Other SQL-based ingestion improvements
 
+- SQL-based ingestion now supports dynamic parameters for queries besides 
SELECT queries, such as REPLACE 
[#17126](https://github.com/apache/druid/pull/17126)
+- Improved thread names to include the stage ID and worker number to help with 
troubleshooting [#17324](https://github.com/apache/druid/pull/17324)
+
 #### Streaming ingestion
 
+##### Control how many segments get merged for publishing
+
+You can now use the `maxColumsnToMerge` property in your supervisor spec to 
specify the number of segments to merge in a single phase when merging segments 
for publishing. This limit affects the total number of columns present in a set 
of segments to merge. If the limit is exceeded, segment merging occurs in 
multiple phases. Druid merges at least 2 segments each phase, regardless of 
this setting.

Review Comment:
   ```suggestion
   You can now use the `maxColumnsToMerge` property in your supervisor spec to specify the number of segments to merge in a single phase when merging segments for publishing. This limit affects the total number of columns present in a set of segments to merge. If the limit is exceeded, segment merging occurs in multiple phases. Druid merges at least 2 segments each phase, regardless of this setting.
   ```



##########
docs/release-info/release-notes.md:
##########
@@ -57,40 +57,284 @@ For tips about how to write a good release note, see 
[Release notes](https://git
 
 This section contains important information about new and existing features.
 
+### ANSI-SQL compatibility and query results
+
+Support for the configs that let you maintain older behavior that wasn't 
ANSI-SQL compliant have been removed:
+
+- `druid.generic.useDefaultValueForNull=true`
+- `druid.expressions.useStrictBooleans=false`
+- `druid.generic.useThreeValueLogicForNativeFilters=false` 
+
+They no longer affect your query results. Only SQL-compliant non-legacy 
behavior is supported now. 
+
+If the configs are set to the legacy behavior, Druid services will fail to 
start. 
+
+If you want to continue to get the same results without these settings, you 
must update your queries or your results will be incorrect after you upgrade.
+
+For more information about how to update your queries, see the [migration 
guide](https://druid.apache.org/docs/latest/release-info/migr-ansi-sql-null).
+
+[#17568](https://github.com/apache/druid/pull/17568) 
[#17609](https://github.com/apache/druid/pull/17609)
+
+### Java support
+
+Java support in Druid has been updated:
+
+- Java 8 support has been removed
+- Java 11 support is deprecated
+
+We recommend that you upgrade to Java 17.
+
+[#17466](https://github.com/apache/druid/pull/17466)
+
+### Hadoop-based ingestion
+
+Hadoop-based ingestion is now deprecated. We recommend that you migrate to 
SQL-based ingestion. 
+
+#### Join hints in MSQ task engine queries
+
+Druid now supports hints for SQL JOIN queries that use the MSQ task engine. 
This allows queries to provide hints for the JOIN type that should be used at a 
per join level. Join hints recursively affect sub queries. 
+
+```sql
+select /*+ sort_merge */ w1.cityName, w2.countryName
+from
+(
+  select /*+ broadcast */ w3.cityName AS cityName, w4.countryName AS 
countryName from wikipedia w3 LEFT JOIN wikipedia-set2 w4 ON w3.regionName = 
w4.regionName
+) w1
+JOIN wikipedia-set1 w2 ON w1.cityName = w2.cityName
+where w1.cityName='New York';
+```
+
+(#17406)
+
+### New Overlord APIs
+
+APIs for marking segments as used or unused have been moved from the 
Coordinator to the Overlord service:
+
+- Mark all segments of a datasource as unused:
+`POST /druid/indexer/v1/datasources/{dataSourceName}`
+
+- Mark all (non-overshadowed) segments of a datasource as used:
+`DELETE /druid/indexer/v1/datasources/{dataSourceName}`
+
+- Mark multiple segments as used
+`POST /druid/indexer/v1/datasources/{dataSourceName}/markUsed`
+- Mark multiple (non-overshadowed) segments as unused
+`POST /druid/indexer/v1/datasources/{dataSourceName}/markUnused`
+
+- Mark a single segment as used:
+`POST /druid/indexer/v1/datasources/{dataSourceName}/segments/{segmentId}`
+
+- Mark a single segment as unused:
+`DELETE /druid/indexer/v1/datasources/{dataSourceName}/segments/{segmentId}`
+
+[#17545](https://github.com/apache/druid/pull/17545)
+
+
+### 17386
+
+https://github.com/apache/druid/pull/17386
+
 ## Functional area and related changes
 
 This section contains detailed release notes separated by areas.
 
 ### Web console
 
+#### Explore view (experimental)
+
+Several improvements have been made to the Explore view in the web console:
+
+The time chart visualization now supports zooming, dragging, and is smarter 
about granularity detection:
+![](./explore_timechart.png)
+
+
+Filters been improved with helper tables and additional context:
+![](./explore_filtering.png)
+
+Tiles can now be shown side-by-side:
+![](./explore_tiles.png)
+
+[#17627](https://github.com/apache/druid/pull/17627)
+
+#### Segment timeline view
+
+The segment timeline is now more interactive and no longer forces day 
granularity.
+
+**New view**
+![](./webconsole_segment_timeline2.png)
+
+
+**Old view:**
+![](./webconsole_segmenttimeline1.png)
+
+[#17521](https://github.com/apache/druid/pull/17521)
+
 #### Other web console improvements
 
+- The timezoner picker now always shows your timezone 
[#17521](https://github.com/apache/druid/pull/17521)
+- UNNEST is now supported for autocomplete suggestions 
[#17521](https://github.com/apache/druid/pull/17521)
+- Tables now support less than and greater than filters 
[#17521](https://github.com/apache/druid/pull/17521)
+- You can now resize the side panels in the Query view 
[#17387](https://github.com/apache/druid/pull/17387)
+- Added the `expectedLoadTimeMillis` segment loading metric to the web console 
[#17359](https://github.com/apache/druid/pull/17359)
+
 ### Ingestion
 
+#### Numbers for CSV and TSV input formats
+
+Use the new optional config `tryParseNumbers` for CSV and TSV input formats to 
control how numbers are treated. If enabled, any numbers present in the input 
will be parsed in the following manner:
+
+- long data type for integer types and 
+- double for floating-point numbers
+ 
+By default, this configuration is set to false, so numeric strings will be 
treated as strings.
+
+[#17082](https://github.com/apache/druid/pull/17082)
+
+#### Other ingestion improvements
+
+- Reduce the direct memory requirement on non-query processing tasks by not 
reserving query buffers for them 
[#16887](https://github.com/apache/druid/pull/16887)
+- JSON-based and SQL-based ingestion now support request headers when using an 
HTTP input source [#16974](https://github.com/apache/druid/pull/16974)
+ 
 #### SQL-based ingestion
 
 ##### Other SQL-based ingestion improvements
 
+- SQL-based ingestion now supports dynamic parameters for queries besides 
SELECT queries, such as REPLACE 
[#17126](https://github.com/apache/druid/pull/17126)
+- Improved thread names to include the stage ID and worker number to help with 
troubleshooting [#17324](https://github.com/apache/druid/pull/17324)
+
 #### Streaming ingestion
 
+##### Control how many segments get merged for publishing
+
+You can now use the `maxColumsnToMerge` property in your supervisor spec to 
specify the number of segments to merge in a single phase when merging segments 
for publishing. This limit affects the total number of columns present in a set 
of segments to merge. If the limit is exceeded, segment merging occurs in 
multiple phases. Druid merges at least 2 segments each phase, regardless of 
this setting.
+
+[#17030](https://github.com/apache/druid/pull/17030)
+
 ##### Other streaming ingestion improvements
 
+- Druid now properly supports early/late rejection periods when 
`stopTasksCount` is configured and streaming tasks run longer than the 
configured task duration [#17442](https://github.com/apache/druid/pull/17442)
+- Improved segment publishing when resubmitting supervisors or when task 
publishing takes a long time 
[#17509](https://github.com/apache/druid/pull/17509)
+
 ### Querying
 
+#### Window queries
+
+The following fields are deprecated for window queries that use the MSQ task 
engine: `maxRowsMaterializedInWindow` and `partitionColumnNames`. They will be 
removed in a future release.
+
+[#17433](https://github.com/apache/druid/pull/17433)
+
+
+

Review Comment:
   Missing content?



##########
docs/release-info/release-notes.md:
##########
@@ -57,40 +57,284 @@ For tips about how to write a good release note, see 
[Release notes](https://git
 
 This section contains important information about new and existing features.
 
+### ANSI-SQL compatibility and query results
+
+Support for the configs that let you maintain older behavior that wasn't 
ANSI-SQL compliant have been removed:
+
+- `druid.generic.useDefaultValueForNull=true`
+- `druid.expressions.useStrictBooleans=false`
+- `druid.generic.useThreeValueLogicForNativeFilters=false` 
+
+They no longer affect your query results. Only SQL-compliant non-legacy 
behavior is supported now. 
+
+If the configs are set to the legacy behavior, Druid services will fail to 
start. 
+
+If you want to continue to get the same results without these settings, you 
must update your queries or your results will be incorrect after you upgrade.
+
+For more information about how to update your queries, see the [migration 
guide](https://druid.apache.org/docs/latest/release-info/migr-ansi-sql-null).
+
+[#17568](https://github.com/apache/druid/pull/17568) 
[#17609](https://github.com/apache/druid/pull/17609)
+
+### Java support
+
+Java support in Druid has been updated:
+
+- Java 8 support has been removed
+- Java 11 support is deprecated
+
+We recommend that you upgrade to Java 17.
+
+[#17466](https://github.com/apache/druid/pull/17466)
+
+### Hadoop-based ingestion
+
+Hadoop-based ingestion is now deprecated. We recommend that you migrate to 
SQL-based ingestion. 
+
+#### Join hints in MSQ task engine queries
+
+Druid now supports hints for SQL JOIN queries that use the MSQ task engine. 
This allows queries to provide hints for the JOIN type that should be used at a 
per join level. Join hints recursively affect sub queries. 
+
+```sql
+select /*+ sort_merge */ w1.cityName, w2.countryName
+from
+(
+  select /*+ broadcast */ w3.cityName AS cityName, w4.countryName AS 
countryName from wikipedia w3 LEFT JOIN wikipedia-set2 w4 ON w3.regionName = 
w4.regionName
+) w1
+JOIN wikipedia-set1 w2 ON w1.cityName = w2.cityName
+where w1.cityName='New York';
+```
+
+(#17406)
+
+### New Overlord APIs
+
+APIs for marking segments as used or unused have been moved from the 
Coordinator to the Overlord service:
+
+- Mark all segments of a datasource as unused:
+`POST /druid/indexer/v1/datasources/{dataSourceName}`
+
+- Mark all (non-overshadowed) segments of a datasource as used:
+`DELETE /druid/indexer/v1/datasources/{dataSourceName}`
+
+- Mark multiple segments as used
+`POST /druid/indexer/v1/datasources/{dataSourceName}/markUsed`
+- Mark multiple (non-overshadowed) segments as unused
+`POST /druid/indexer/v1/datasources/{dataSourceName}/markUnused`
+
+- Mark a single segment as used:
+`POST /druid/indexer/v1/datasources/{dataSourceName}/segments/{segmentId}`
+
+- Mark a single segment as unused:
+`DELETE /druid/indexer/v1/datasources/{dataSourceName}/segments/{segmentId}`
+
+[#17545](https://github.com/apache/druid/pull/17545)
+
+
+### 17386
+
+https://github.com/apache/druid/pull/17386
+
 ## Functional area and related changes
 
 This section contains detailed release notes separated by areas.
 
 ### Web console
 
+#### Explore view (experimental)
+
+Several improvements have been made to the Explore view in the web console:
+
+The time chart visualization now supports zooming, dragging, and is smarter 
about granularity detection:
+![](./explore_timechart.png)
+
+
+Filters been improved with helper tables and additional context:
+![](./explore_filtering.png)
+
+Tiles can now be shown side-by-side:
+![](./explore_tiles.png)
+
+[#17627](https://github.com/apache/druid/pull/17627)
+
+#### Segment timeline view
+
+The segment timeline is now more interactive and no longer forces day 
granularity.
+
+**New view**
+![](./webconsole_segment_timeline2.png)
+
+
+**Old view:**
+![](./webconsole_segmenttimeline1.png)
+
+[#17521](https://github.com/apache/druid/pull/17521)
+
 #### Other web console improvements
 
+- The timezoner picker now always shows your timezone 
[#17521](https://github.com/apache/druid/pull/17521)
+- UNNEST is now supported for autocomplete suggestions 
[#17521](https://github.com/apache/druid/pull/17521)
+- Tables now support less than and greater than filters 
[#17521](https://github.com/apache/druid/pull/17521)
+- You can now resize the side panels in the Query view 
[#17387](https://github.com/apache/druid/pull/17387)
+- Added the `expectedLoadTimeMillis` segment loading metric to the web console 
[#17359](https://github.com/apache/druid/pull/17359)
+
 ### Ingestion
 
+#### Numbers for CSV and TSV input formats
+
+Use the new optional config `tryParseNumbers` for CSV and TSV input formats to 
control how numbers are treated. If enabled, any numbers present in the input 
will be parsed in the following manner:
+
+- long data type for integer types and 
+- double for floating-point numbers
+ 
+By default, this configuration is set to false, so numeric strings will be 
treated as strings.
+
+[#17082](https://github.com/apache/druid/pull/17082)
+
+#### Other ingestion improvements
+
+- Reduce the direct memory requirement on non-query processing tasks by not 
reserving query buffers for them 
[#16887](https://github.com/apache/druid/pull/16887)
+- JSON-based and SQL-based ingestion now support request headers when using an 
HTTP input source [#16974](https://github.com/apache/druid/pull/16974)
+ 
 #### SQL-based ingestion
 
 ##### Other SQL-based ingestion improvements
 
+- SQL-based ingestion now supports dynamic parameters for queries besides 
SELECT queries, such as REPLACE 
[#17126](https://github.com/apache/druid/pull/17126)
+- Improved thread names to include the stage ID and worker number to help with 
troubleshooting [#17324](https://github.com/apache/druid/pull/17324)
+
 #### Streaming ingestion
 
+##### Control how many segments get merged for publishing
+
+You can now use the `maxColumsnToMerge` property in your supervisor spec to 
specify the number of segments to merge in a single phase when merging segments 
for publishing. This limit affects the total number of columns present in a set 
of segments to merge. If the limit is exceeded, segment merging occurs in 
multiple phases. Druid merges at least 2 segments each phase, regardless of 
this setting.
+
+[#17030](https://github.com/apache/druid/pull/17030)
+
 ##### Other streaming ingestion improvements
 
+- Druid now properly supports early/late rejection periods when 
`stopTasksCount` is configured and streaming tasks run longer than the 
configured task duration [#17442](https://github.com/apache/druid/pull/17442)

Review Comment:
   ```suggestion
   - Druid now fully supports early/late rejection periods when `stopTasksCount` is configured and streaming tasks run longer than the configured task duration [#17442](https://github.com/apache/druid/pull/17442)
   ```



##########
docs/release-info/release-notes.md:
##########
@@ -57,40 +57,284 @@ For tips about how to write a good release note, see 
[Release notes](https://git
 
 This section contains important information about new and existing features.
 
+### ANSI-SQL compatibility and query results
+
+Support for the configs that let you maintain older behavior that wasn't 
ANSI-SQL compliant have been removed:
+
+- `druid.generic.useDefaultValueForNull=true`
+- `druid.expressions.useStrictBooleans=false`
+- `druid.generic.useThreeValueLogicForNativeFilters=false` 
+
+They no longer affect your query results. Only SQL-compliant non-legacy 
behavior is supported now. 
+
+If the configs are set to the legacy behavior, Druid services will fail to 
start. 
+
+If you want to continue to get the same results without these settings, you 
must update your queries or your results will be incorrect after you upgrade.
+
+For more information about how to update your queries, see the [migration 
guide](https://druid.apache.org/docs/latest/release-info/migr-ansi-sql-null).
+
+[#17568](https://github.com/apache/druid/pull/17568) 
[#17609](https://github.com/apache/druid/pull/17609)
+
+### Java support
+
+Java support in Druid has been updated:
+
+- Java 8 support has been removed
+- Java 11 support is deprecated
+
+We recommend that you upgrade to Java 17.
+
+[#17466](https://github.com/apache/druid/pull/17466)
+
+### Hadoop-based ingestion
+
+Hadoop-based ingestion is now deprecated. We recommend that you migrate to 
SQL-based ingestion. 
+
+#### Join hints in MSQ task engine queries
+
+Druid now supports hints for SQL JOIN queries that use the MSQ task engine. 
This allows queries to provide hints for the JOIN type that should be used at a 
per join level. Join hints recursively affect sub queries. 
+
+```sql
+select /*+ sort_merge */ w1.cityName, w2.countryName
+from
+(
+  select /*+ broadcast */ w3.cityName AS cityName, w4.countryName AS 
countryName from wikipedia w3 LEFT JOIN wikipedia-set2 w4 ON w3.regionName = 
w4.regionName
+) w1
+JOIN wikipedia-set1 w2 ON w1.cityName = w2.cityName
+where w1.cityName='New York';
+```
+
+(#17406)
+
+### New Overlord APIs
+
+APIs for marking segments as used or unused have been moved from the 
Coordinator to the Overlord service:
+
+- Mark all segments of a datasource as unused:
+`POST /druid/indexer/v1/datasources/{dataSourceName}`
+
+- Mark all (non-overshadowed) segments of a datasource as used:
+`DELETE /druid/indexer/v1/datasources/{dataSourceName}`
+
+- Mark multiple segments as used
+`POST /druid/indexer/v1/datasources/{dataSourceName}/markUsed`
+- Mark multiple (non-overshadowed) segments as unused
+`POST /druid/indexer/v1/datasources/{dataSourceName}/markUnused`
+
+- Mark a single segment as used:
+`POST /druid/indexer/v1/datasources/{dataSourceName}/segments/{segmentId}`
+
+- Mark a single segment as unused:
+`DELETE /druid/indexer/v1/datasources/{dataSourceName}/segments/{segmentId}`
+
+[#17545](https://github.com/apache/druid/pull/17545)
+
+
+### 17386
+
+https://github.com/apache/druid/pull/17386
+
 ## Functional area and related changes
 
 This section contains detailed release notes separated by areas.
 
 ### Web console
 
+#### Explore view (experimental)
+
+Several improvements have been made to the Explore view in the web console:
+
+The time chart visualization now supports zooming, dragging, and is smarter 
about granularity detection:
+![](./explore_timechart.png)
+
+
+Filters been improved with helper tables and additional context:
+![](./explore_filtering.png)
+
+Tiles can now be shown side-by-side:
+![](./explore_tiles.png)
+
+[#17627](https://github.com/apache/druid/pull/17627)
+
+#### Segment timeline view
+
+The segment timeline is now more interactive and no longer forces day 
granularity.
+
+**New view**
+![](./webconsole_segment_timeline2.png)
+
+
+**Old view:**
+![](./webconsole_segmenttimeline1.png)
+
+[#17521](https://github.com/apache/druid/pull/17521)
+
 #### Other web console improvements
 
+- The timezoner picker now always shows your timezone 
[#17521](https://github.com/apache/druid/pull/17521)
+- UNNEST is now supported for autocomplete suggestions 
[#17521](https://github.com/apache/druid/pull/17521)
+- Tables now support less than and greater than filters 
[#17521](https://github.com/apache/druid/pull/17521)
+- You can now resize the side panels in the Query view 
[#17387](https://github.com/apache/druid/pull/17387)
+- Added the `expectedLoadTimeMillis` segment loading metric to the web console 
[#17359](https://github.com/apache/druid/pull/17359)
+
 ### Ingestion
 
+#### Numbers for CSV and TSV input formats
+
+Use the new optional config `tryParseNumbers` for CSV and TSV input formats to 
control how numbers are treated. If enabled, any numbers present in the input 
will be parsed in the following manner:
+
+- long data type for integer types and 
+- double for floating-point numbers
+ 
+By default, this configuration is set to false, so numeric strings will be 
treated as strings.
+
+[#17082](https://github.com/apache/druid/pull/17082)
+
+#### Other ingestion improvements
+
+- Reduce the direct memory requirement on non-query processing tasks by not 
reserving query buffers for them 
[#16887](https://github.com/apache/druid/pull/16887)
+- JSON-based and SQL-based ingestion now support request headers when using an 
HTTP input source [#16974](https://github.com/apache/druid/pull/16974)
+ 
 #### SQL-based ingestion
 
 ##### Other SQL-based ingestion improvements
 
+- SQL-based ingestion now supports dynamic parameters for queries besides 
SELECT queries, such as REPLACE 
[#17126](https://github.com/apache/druid/pull/17126)
+- Improved thread names to include the stage ID and worker number to help with 
troubleshooting [#17324](https://github.com/apache/druid/pull/17324)
+
 #### Streaming ingestion
 
+##### Control how many segments get merged for publishing
+
+You can now use the `maxColumsnToMerge` property in your supervisor spec to 
specify the number of segments to merge in a single phase when merging segments 
for publishing. This limit affects the total number of columns present in a set 
of segments to merge. If the limit is exceeded, segment merging occurs in 
multiple phases. Druid merges at least 2 segments each phase, regardless of 
this setting.
+
+[#17030](https://github.com/apache/druid/pull/17030)
+
 ##### Other streaming ingestion improvements
 
+- Druid now properly supports early/late rejection periods when 
`stopTasksCount` is configured and streaming tasks run longer than the 
configured task duration [#17442](https://github.com/apache/druid/pull/17442)
+- Improved segment publishing when resubmitting supervisors or when task 
publishing takes a long time 
[#17509](https://github.com/apache/druid/pull/17509)
+
 ### Querying
 
+#### Window queries
+
+The following fields are deprecated for window queries that use the MSQ task 
engine: `maxRowsMaterializedInWindow` and `partitionColumnNames`. They will be 
removed in a future release.
+
+[#17433](https://github.com/apache/druid/pull/17433)
+
+
+
+
+[#17541](https://github.com/apache/druid/pull/17541)
+
 #### Other querying improvements
 
+- Added automatic query prioritization based on the period of the segments 
scanned in a query. You can set the duration threshold in ISO format using 
`druid.query.scheduler.prioritization.segmentRangeThreshold` 
[#17009](https://github.com/apache/druid/pull/17009)
+- Improved error handling for incomplete queries. A trailer header to indicate 
an error is returned now [#16672](https://github.com/apache/druid/pull/16672)
+- Improved scan queries to account for column types in more situations 
[#17463](https://github.com/apache/druid/pull/17463)
+- Improved lookups so that they can now iterate over fetched data 
[#17212](https://github.com/apache/druid/pull/17212)
+- Improved projections so that they can contain only aggregators and no 
grouping columns [#17484](https://github.com/apache/druid/pull/17484)
+- Removed microseconds as a supported unit for EXTRACT 
[#17247](https://github.com/apache/druid/pull/17247)
+
+
 ### Cluster management
 
+#### Reduced metadata IO

Review Comment:
   ```suggestion
   #### Reduce metadata IO in batch segment allocation
   ```



##########
docs/release-info/release-notes.md:
##########
@@ -57,40 +57,284 @@ For tips about how to write a good release note, see 
[Release notes](https://git
 
 This section contains important information about new and existing features.
 
+### ANSI-SQL compatibility and query results
+
+Support for the configs that let you maintain older behavior that wasn't 
ANSI-SQL compliant has been removed:
+
+- `druid.generic.useDefaultValueForNull=true`
+- `druid.expressions.useStrictBooleans=false`
+- `druid.generic.useThreeValueLogicForNativeFilters=false` 
+
+They no longer affect your query results. Only SQL-compliant non-legacy 
behavior is supported now. 
+
+If the configs are set to the legacy behavior, Druid services will fail to 
start. 
+
+If you want to continue to get the same results without these settings, you 
must update your queries or your results will be incorrect after you upgrade.
+
+For more information about how to update your queries, see the [migration 
guide](https://druid.apache.org/docs/latest/release-info/migr-ansi-sql-null).
+
+[#17568](https://github.com/apache/druid/pull/17568) 
[#17609](https://github.com/apache/druid/pull/17609)
+
+### Java support
+
+Java support in Druid has been updated:
+
+- Java 8 support has been removed
+- Java 11 support is deprecated
+
+We recommend that you upgrade to Java 17.
+
+[#17466](https://github.com/apache/druid/pull/17466)
+
+### Hadoop-based ingestion
+
+Hadoop-based ingestion is now deprecated. We recommend that you migrate to 
SQL-based ingestion. 
+
+### Join hints in MSQ task engine queries
+
+Druid now supports hints for SQL JOIN queries that use the MSQ task engine. 
A query can hint which JOIN type to use for each individual join. Join hints 
recursively affect subqueries. 
+
+```sql
+SELECT /*+ sort_merge */ w1.cityName, w2.countryName
+FROM
+(
+  SELECT /*+ broadcast */ w3.cityName AS cityName, w4.countryName AS countryName
+  FROM wikipedia w3
+  LEFT JOIN "wikipedia-set2" w4 ON w3.regionName = w4.regionName
+) w1
+JOIN "wikipedia-set1" w2 ON w1.cityName = w2.cityName
+WHERE w1.cityName = 'New York';
+```
+
+[#17406](https://github.com/apache/druid/pull/17406)
+
+### New Overlord APIs
+
+APIs for marking segments as used or unused have been moved from the 
Coordinator to the Overlord service:
+
+- Mark all (non-overshadowed) segments of a datasource as used:
+`POST /druid/indexer/v1/datasources/{dataSourceName}`
+
+- Mark all segments of a datasource as unused:
+`DELETE /druid/indexer/v1/datasources/{dataSourceName}`
+
+- Mark multiple segments as used:
+`POST /druid/indexer/v1/datasources/{dataSourceName}/markUsed`
+
+- Mark multiple (non-overshadowed) segments as unused:
+`POST /druid/indexer/v1/datasources/{dataSourceName}/markUnused`
+
+- Mark a single segment as used:
+`POST /druid/indexer/v1/datasources/{dataSourceName}/segments/{segmentId}`
+
+- Mark a single segment as unused:
+`DELETE /druid/indexer/v1/datasources/{dataSourceName}/segments/{segmentId}`
+
+[#17545](https://github.com/apache/druid/pull/17545)
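
A minimal sketch of how a client might pick the HTTP method and path for the single-segment and `markUsed`/`markUnused` endpoints listed above. The helper names are hypothetical, and request bodies, host, and authentication are deliberately left out because the note does not specify them.

```python
from urllib.parse import quote

# Path prefix taken from the endpoints listed in the release note.
BASE = "/druid/indexer/v1/datasources"


def single_segment_request(datasource, segment_id, used):
    """Return (method, path) for marking one segment as used or unused.

    Hypothetical helper: POST marks the segment as used, DELETE as unused.
    """
    method = "POST" if used else "DELETE"
    path = (
        f"{BASE}/{quote(datasource, safe='')}"
        f"/segments/{quote(segment_id, safe='')}"
    )
    return method, path


def bulk_segment_request(datasource, used):
    """Return (method, path) for marking multiple segments at once.

    Hypothetical helper: both bulk endpoints use POST and differ only in
    the trailing path element.
    """
    action = "markUsed" if used else "markUnused"
    return "POST", f"{BASE}/{quote(datasource, safe='')}/{action}"
```

Percent-encoding the datasource name and segment ID is an assumption made here so that characters such as `/` cannot change the path structure.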
+
+
+### 17386
+
+https://github.com/apache/druid/pull/17386
+
 ## Functional area and related changes
 
 This section contains detailed release notes separated by areas.
 
 ### Web console
 
+#### Explore view (experimental)
+
+Several improvements have been made to the Explore view in the web console:
+
+The time chart visualization now supports zooming and dragging and is smarter 
about granularity detection:
+![](./explore_timechart.png)
+
+
+Filters have been improved with helper tables and additional context:
+![](./explore_filtering.png)
+
+Tiles can now be shown side-by-side:
+![](./explore_tiles.png)
+
+[#17627](https://github.com/apache/druid/pull/17627)
+
+#### Segment timeline view
+
+The segment timeline is now more interactive and no longer forces day 
granularity.
+
+**New view:**
+![](./webconsole_segment_timeline2.png)
+
+
+**Old view:**
+![](./webconsole_segmenttimeline1.png)
+
+[#17521](https://github.com/apache/druid/pull/17521)
+
 #### Other web console improvements
 
+- The timezone picker now always shows your timezone 
[#17521](https://github.com/apache/druid/pull/17521)
+- UNNEST is now supported for autocomplete suggestions 
[#17521](https://github.com/apache/druid/pull/17521)
+- Tables now support less than and greater than filters 
[#17521](https://github.com/apache/druid/pull/17521)
+- You can now resize the side panels in the Query view 
[#17387](https://github.com/apache/druid/pull/17387)
+- Added the `expectedLoadTimeMillis` segment loading metric to the web console 
[#17359](https://github.com/apache/druid/pull/17359)
+
 ### Ingestion
 
+#### Numbers for CSV and TSV input formats
+
+Use the new optional config `tryParseNumbers` for CSV and TSV input formats to 
control how numbers are treated. If enabled, numbers present in the input are 
parsed as follows:
+
+- Integers are parsed as the long data type
+- Floating-point numbers are parsed as the double data type
+ 
+By default, this configuration is set to false, so numeric strings will be 
treated as strings.
+
+[#17082](https://github.com/apache/druid/pull/17082)
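
As a rough illustration of the behavior described above, the sketch below builds a CSV input format with `tryParseNumbers` enabled and mimics the described parsing rule. The `parse_value` helper is hypothetical and is not Druid's parser; Python's `int()` and `float()` accept some strings that Druid may treat differently, so the edge cases are assumptions.

```python
import json

# Input format fragment for an ingestion spec. Only `tryParseNumbers`
# comes from the release note; the other fields are illustrative.
input_format = {
    "type": "csv",
    "findColumnsFromHeader": True,
    "tryParseNumbers": True,
}


def parse_value(raw, try_parse_numbers=False):
    """Mimic the described rule: integers -> long, decimals -> double.

    Hypothetical helper. With the flag off (the default), every value
    stays a string.
    """
    if not try_parse_numbers:
        return raw
    try:
        return int(raw)       # stands in for the long data type
    except ValueError:
        pass
    try:
        return float(raw)     # stands in for the double data type
    except ValueError:
        return raw            # not numeric, keep the string


print(json.dumps(input_format, indent=2))
```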
+
+#### Other ingestion improvements
+
+- Reduced the direct memory requirement on non-query processing tasks by not 
reserving query buffers for them 
[#16887](https://github.com/apache/druid/pull/16887)
+- JSON-based and SQL-based ingestion now support request headers when using an 
HTTP input source [#16974](https://github.com/apache/druid/pull/16974)
+ 
 #### SQL-based ingestion
 
 ##### Other SQL-based ingestion improvements
 
+- SQL-based ingestion now supports dynamic parameters for queries besides 
SELECT queries, such as REPLACE 
[#17126](https://github.com/apache/druid/pull/17126)
+- Improved thread names to include the stage ID and worker number to help with 
troubleshooting [#17324](https://github.com/apache/druid/pull/17324)
+
 #### Streaming ingestion
 
+##### Control how many segments get merged for publishing
+
+You can now use the `maxColumnsToMerge` property in your supervisor spec to 
limit how many segments are merged in a single phase when merging segments 
for publishing. The limit applies to the total number of columns present in a 
set of segments to merge. If the limit is exceeded, segment merging occurs in 
multiple phases. Druid merges at least 2 segments in each phase, regardless of 
this setting.
+
+[#17030](https://github.com/apache/druid/pull/17030)
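
The phased merging described above can be pictured with a small sketch. This is an illustration under stated assumptions, not Druid's implementation: `plan_merge_batches` is a hypothetical function that groups segments (represented only by their column counts) into batches for one merge phase, keeping each batch under the column limit while never closing a batch that holds fewer than 2 segments.

```python
def plan_merge_batches(column_counts, max_columns):
    """Group segments into merge batches for a single phase.

    column_counts: number of columns in each segment, in merge order.
    max_columns:   limit on the total columns in one batch.
    A batch is only closed once it holds at least 2 segments, so a batch
    may exceed the limit when individual segments are very wide.
    """
    batches, current, total = [], [], 0
    for count in column_counts:
        if len(current) >= 2 and total + count > max_columns:
            batches.append(current)
            current, total = [], 0
        current.append(count)
        total += count
    if current:
        batches.append(current)
    return batches


# Four 10-column segments with a limit of 25 columns need two batches,
# whose outputs would then be merged in a later phase.
print(plan_merge_batches([10, 10, 10, 10], 25))
```

In this sketch, more than one batch in the result means another merge phase is needed to combine the intermediate outputs.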
+
 ##### Other streaming ingestion improvements
 
+- Druid now properly supports early/late rejection periods when 
`stopTasksCount` is configured and streaming tasks run longer than the 
configured task duration [#17442](https://github.com/apache/druid/pull/17442)
+- Improved segment publishing when resubmitting supervisors or when task 
publishing takes a long time 
[#17509](https://github.com/apache/druid/pull/17509)
+
 ### Querying
 
+#### Window queries
+
+The following fields are deprecated for window queries that use the MSQ task 
engine: `maxRowsMaterializedInWindow` and `partitionColumnNames`. They will be 
removed in a future release.
+
+[#17433](https://github.com/apache/druid/pull/17433)
+
+
+
+
+[#17541](https://github.com/apache/druid/pull/17541)
+
 #### Other querying improvements
 
+- Added automatic query prioritization based on the period of the segments 
scanned in a query. You can set the duration threshold in ISO format using 
`druid.query.scheduler.prioritization.segmentRangeThreshold` 
[#17009](https://github.com/apache/druid/pull/17009)
+- Improved error handling for incomplete queries. Druid now returns a trailer 
header to indicate an error [#16672](https://github.com/apache/druid/pull/16672)
+- Improved scan queries to account for column types in more situations 
[#17463](https://github.com/apache/druid/pull/17463)
+- Improved lookups so that they can now iterate over fetched data 
[#17212](https://github.com/apache/druid/pull/17212)
+- Improved projections so that they can contain only aggregators and no 
grouping columns [#17484](https://github.com/apache/druid/pull/17484)
+- Removed microseconds as a supported unit for EXTRACT 
[#17247](https://github.com/apache/druid/pull/17247)
+
+
 ### Cluster management
 
+#### Reduced metadata IO
+
+The Overlord runtime property 
`druid.indexer.tasklock.batchAllocationReduceMetadataIO` can help reduce IO 
during segment allocation. Setting this flag to true (default value) allows the 
Overlord to fetch only necessary segment payloads during segment allocation.

Review Comment:
   ```suggestion
   The Overlord runtime property 
`druid.indexer.tasklock.batchAllocationReduceMetadataIO` can help reduce IO 
during batch segment allocation. Setting this flag to true (default value) 
allows the Overlord to fetch only necessary segment payloads during segment 
allocation.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

