This is an automated email from the ASF dual-hosted git repository.

mcvsubbu pushed a commit to branch 0.2.0
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit 981c48041c6a1d764d98b5eb7d68e67ab529c7e4
Author: Neha Pawar <[email protected]>
AuthorDate: Wed Sep 25 12:57:27 2019 -0700

    Documentation for completionConfig (#4636)
---
 docs/architecture.rst                |  7 +++++++
 docs/tableconfig_schema.rst          | 23 +++++++++++++++++++++++
 docs/tuning_realtime_performance.rst |  9 ++++++++-
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/docs/architecture.rst b/docs/architecture.rst
index c7fc486..03ca092 100644
--- a/docs/architecture.rst
+++ b/docs/architecture.rst
@@ -178,6 +178,13 @@ easy and automated mechanisms for replacing pinot servers, 
or expanding capacity
 `special mechanisms 
<https://cwiki.apache.org/confluence/display/PINOT/Consuming+and+Indexing+rows+in+Realtime#ConsumingandIndexingrowsinRealtime-Segmentcompletionprotocol>`_
 that ensure that the completed segment is equivalent across all replicas.
 
+During segment completion, one winner is chosen by the controller from all the 
replicas as the ``committer server``. The ``committer server`` builds the 
segment and uploads it to the controller. All the other ``non-committer 
servers`` follow one of these two paths:
+
+1. If the in-memory segment is equivalent to the committed segment, the 
``non-committer`` server also builds the segment locally and replaces the 
in-memory segment
+2. If the in-memory segment is non equivalent to the committed segment, the 
``non-committer`` server downloads the segment from the controller.
+
+For more details on this protocol, please refer to `this doc 
<https://cwiki.apache.org/confluence/display/PINOT/Consuming+and+Indexing+rows+in+Realtime#ConsumingandIndexingrowsinRealtime-Segmentcompletionprotocol>`_.
+
 In ``HighLevel`` mode, the servers persist the consumed rows into local store 
(and **not** the segment store). Since consumption of rows
 can be from any partition, it is not possible to guarantee equivalence of 
segments across replicas.
 
diff --git a/docs/tableconfig_schema.rst b/docs/tableconfig_schema.rst
index 9f0c1ce..2a7810e 100644
--- a/docs/tableconfig_schema.rst
+++ b/docs/tableconfig_schema.rst
@@ -113,6 +113,29 @@ The ``segmentsConfig`` section has information about 
configuring the following:
       "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy"
     },
 
+* Completion Config
+
+  You can also add a ``completionConfig`` section under the ``segmentsConfig`` 
section. Completion config holds information related to realtime segment 
completion. There is just one field in this config as of now, which is the 
``completionMode``. The value of the ``completioMode`` decides how 
non-committers servers should replace the in-memory segment during realtime 
segment completion. Refer to the `Architecture 
<architecture.html#ingesting-realtime-data>`_ for description about committ 
[...]
+
+  By default, if the in memory segment in the non-winner server is equivalent 
to the committed segment, then the non-committer server builds and replaces the 
segment, else it download the segment from the controller.
+
+  Currently, the supported value for ``completionMode`` is
+
+  * ``DOWNLOAD``: In certain scenarios, segment build can get very memory 
intensive. It might become desirable to enforce the non-committer servers to 
just download the segment from the controller, instead of building it again. 
Setting this completionMode ensures that the non-committer servers always 
download the segment.
+
+
+For example:
+
+.. code-block:: none
+
+    "segmentsConfig": {
+      ..
+      ..
+      "completionConfig": {
+        "completionMode": "DOWNLOAD"
+      }
+    },
+
 Table Index Config Section
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/docs/tuning_realtime_performance.rst 
b/docs/tuning_realtime_performance.rst
index fb930f1..fc52a84 100644
--- a/docs/tuning_realtime_performance.rst
+++ b/docs/tuning_realtime_performance.rst
@@ -68,10 +68,17 @@ that will help you to come up with an optimal setting for 
the segment size.
 Moving completed segments to different hosts
 --------------------------------------------
 
-The strutcture of the consuming segments and the completed segments are very 
different. The memory, CPU, I/O
+The structure of the consuming segments and the completed segments are very 
different. The memory, CPU, I/O
 and GC characteristics could be very different while processing queries on 
these segments. Therefore it may be
 useful to move the completed segments onto differnt set of hosts in some use 
cases.
 
 You can host completed segments on a different set of hosts using the 
``tagOverrideConfig`` as described in 
 :ref:`table-config-section`. Pinot will automatically move them once the 
consuming segments are completed.
 
+Completion config
+-----------------
+
+When a realtime segment completes, a winner server is chosen amongst all 
replicas by the controller. That committer server builds the segment and 
uploads to the controller. The non-committer servers are asked to catchup to 
the winning offset. If the non-committer servers are able to catch up, they are 
asked to build the segment and replace the in-memory segment. If they are 
unable to catchup, they are asked to download the segment from the controller.
+
+In certain scenarios, segment build can get very memory intensive. It might 
become desirable to enforce the non-committer servers to just download the 
segment from the controller, instead of building it again. The 
``completionConfig`` as described in :ref:`table-config-section` can be used to 
configure this.
+


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to