(druid) branch master updated: docs: update future development blurbs (#16939)

brile Tue, 01 Oct 2024 15:02:41 -0700

This is an automated email from the ASF dual-hosted git repository.

brile pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git



The following commit(s) were added to refs/heads/master by this push:
     new 1fc82a96bd1 docs: update future development blurbs (#16939)
1fc82a96bd1 is described below

commit 1fc82a96bd19cbf0a02a06284a849fd5d7c725e6
Author: 317brian <53799971+317br...@users.noreply.github.com>
AuthorDate: Tue Oct 1 15:02:05 2024 -0700

    docs: update future development blurbs (#16939)
    
    Co-authored-by: Victoria Lim <vt...@users.noreply.github.com>
---
 docs/design/architecture.md                    |  5 ++---
 docs/design/indexer.md                         |  3 +--
 docs/development/extensions-core/kubernetes.md |  2 +-
 docs/querying/datasource.md                    | 18 +++++++-----------
 4 files changed, 11 insertions(+), 17 deletions(-)

diff --git a/docs/design/architecture.md b/docs/design/architecture.md
index 8887308acd0..ec81477c002 100644
--- a/docs/design/architecture.md
+++ b/docs/design/architecture.md
@@ -105,10 +105,9 @@ for reading from external data sources and publishing new Druid segments.
 [**Indexer**](../design/indexer.md) services are an alternative to Middle Managers and Peons. Instead of
 forking separate JVM processes per-task, the Indexer runs tasks as individual threads within a single JVM process.
 
-The Indexer is designed to be easier to configure and deploy compared to the Middle Manager + Peon system and to better enable resource sharing across tasks. The Indexer is a newer feature and is currently designated [experimental](../development/experimental.md) due to the fact that its memory management system is still under
-development. It will continue to mature in future versions of Druid.
+The Indexer is designed to be easier to configure and deploy compared to the MiddleManager + Peon system and to better enable resource sharing across tasks, which can help streaming ingestion. The Indexer is currently designated [experimental](../development/experimental.md).
 
-Typically, you would deploy either Middle Managers or Indexers, but not both.
+Typically, you would deploy one of the following: MiddleManagers, [MiddleManager-less ingestion using Kubernetes](../development/extensions-contrib/k8s-jobs.md), or Indexers. You wouldn't deploy more than one of these options.
 
 ## Colocation of services
 
diff --git a/docs/design/indexer.md b/docs/design/indexer.md
index 9d606d1c9a4..b18408ce389 100644
--- a/docs/design/indexer.md
+++ b/docs/design/indexer.md
@@ -24,8 +24,7 @@ sidebar_label: "Indexer"
   -->
 
 :::info
- The Indexer is an optional and [experimental](../development/experimental.md) feature.
- Its memory management system is still under development and will be significantly enhanced in later releases.
+ The Indexer is an optional and experimental feature. If you're primarily performing batch ingestion, we recommend you use either the MiddleManager and Peon task execution system or [MiddleManager-less ingestion using Kubernetes](../development/extensions-contrib/k8s-jobs.md). If you're primarily doing streaming ingestion, you may want to try either [MiddleManager-less ingestion using Kubernetes](../development/extensions-contrib/k8s-jobs.md) or the Indexer service.
 :::
 
 The Apache Druid Indexer service is an alternative to the Middle Manager + Peon task execution system. Instead of forking a separate JVM process per-task, the Indexer runs tasks as separate threads within a single JVM process.
diff --git a/docs/development/extensions-core/kubernetes.md b/docs/development/extensions-core/kubernetes.md
index ac66cdda740..25696546dfe 100644
--- a/docs/development/extensions-core/kubernetes.md
+++ b/docs/development/extensions-core/kubernetes.md
@@ -54,7 +54,7 @@ Additionally, this extension has following configuration.
 
 ### Gotchas
 
-- Label/Annotation path in each pod spec MUST EXIST, which is easily satisfied if there is at least one label/annotation in the pod spec already. This limitation may be removed in future.
+- Label/Annotation path in each pod spec MUST EXIST, which is easily satisfied if there is at least one label/annotation in the pod spec already. 
 - All Druid Pods belonging to one Druid cluster must be inside same kubernetes namespace.
 - All Druid Pods need permissions to be able to add labels to self-pod, List and Watch other Pods, create and read ConfigMap for leader election. Assuming, "default" service account is used by Druid pods, you might need to add following or something similar Kubernetes Role and Role Binding.
 
diff --git a/docs/querying/datasource.md b/docs/querying/datasource.md
index 0f033824e10..3cc6265bfb7 100644
--- a/docs/querying/datasource.md
+++ b/docs/querying/datasource.md
@@ -431,25 +431,21 @@ and how to detect it.
 3. One common reason for implicit subquery generation is if the types of the two halves of an equality do not match.
 For example, since lookup keys are always strings, the condition `druid.d JOIN lookup.l ON d.field = l.field` will
 perform best if `d.field` is a string.
-4. The join operator must evaluate the condition for each row. In the future, we expect
-to implement both early and deferred condition evaluation, which we expect to improve performance considerably for
-common use cases.
+4. The join operator must evaluate the condition for each row. 
 5. Currently, Druid does not support pushing down predicates (condition and filter) past a Join (i.e. into
 Join's children). Druid only supports pushing predicates into the join if they originated from
 above the join. Hence, the location of predicates and filters in your Druid SQL is very important.
 Also, as a result of this, comma joins should be avoided.
 
-#### Future work for joins
+#### Limitations for joins
 
-Joins are an area of active development in Druid. The following features are missing today but may appear in
-future versions:
+Joins in Druid have the following limitations:
 
-- Reordering of join operations to get the most performant plan.
-- Preloaded dimension tables that are wider than lookups (i.e. supporting more than a single key and single value).
-- RIGHT OUTER and FULL OUTER joins in the native query engine. Currently, they are partially implemented. Queries run
+- The order of joins is not entirely optimized. Join operations are not reordered to get the most performant plan.
+- Preloaded dimension tables that are wider than lookups (i.e. supporting more than a single key and single value) are not supported.
+- RIGHT OUTER and FULL OUTER joins in the native query engine are not fully implemented. Queries run
   but results are not always correct.
-- Performance-related optimizations as mentioned in the [previous section](#join-performance).
-- Join conditions on a column containing a multi-value dimension.
+- Join conditions on a column can't contain a multi-value dimension.
 
 ### `unnest`
 


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org
