[
https://issues.apache.org/jira/browse/TAJO-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14993062#comment-14993062
]
ASF GitHub Bot commented on TAJO-1962:
--------------------------------------
Github user hyunsik commented on a diff in the pull request:
https://github.com/apache/tajo/pull/848#discussion_r44102768
--- Diff: tajo-docs/src/main/sphinx/tsql/variables.rst ---
@@ -28,35 +30,456 @@ Each client connection to TajoMaster creates a unique
session, and the client an
Also, ``\unset key`` will unset the session variable named *key*.
-Now, tajo provides the following session variables.
-
-* ``DIST_QUERY_BROADCAST_JOIN_THRESHOLD``
-* ``DIST_QUERY_JOIN_TASK_VOLUME``
-* ``DIST_QUERY_SORT_TASK_VOLUME``
-* ``DIST_QUERY_GROUPBY_TASK_VOLUME``
-* ``DIST_QUERY_JOIN_PARTITION_VOLUME``
-* ``DIST_QUERY_GROUPBY_PARTITION_VOLUME``
-* ``DIST_QUERY_TABLE_PARTITION_VOLUME``
-* ``EXECUTOR_EXTERNAL_SORT_BUFFER_SIZE``
-* ``EXECUTOR_HASH_JOIN_SIZE_THRESHOLD``
-* ``EXECUTOR_INNER_HASH_JOIN_SIZE_THRESHOLD``
-* ``EXECUTOR_OUTER_HASH_JOIN_SIZE_THRESHOLD``
-* ``EXECUTOR_GROUPBY_INMEMORY_HASH_THRESHOLD``
-* ``MAX_OUTPUT_FILE_SIZE``
-* ``CODEGEN``
-* ``CLIENT_SESSION_EXPIRY_TIME``
-* ``CLI_MAX_COLUMN``
-* ``CLI_NULL_CHAR``
-* ``CLI_PRINT_PAUSE_NUM_RECORDS``
-* ``CLI_PRINT_PAUSE``
-* ``CLI_PRINT_ERROR_TRACE``
-* ``CLI_OUTPUT_FORMATTER_CLASS``
-* ``CLI_ERROR_STOP``
-* ``TIMEZONE``
-* ``DATE_ORDER``
-* ``TEXT_NULL``
-* ``DEBUG_ENABLED``
-* ``BEHAVIOR_ARITHMETIC_ABORT``
-* ``RESULT_SET_FETCH_ROWNUM``
+Currently, tajo provides the following session variables.
+
+.. describe:: BROADCAST_NON_CROSS_JOIN_THRESHOLD
+
+A threshold for non-cross joins. When a non-cross join query is executed
with the broadcast join, the whole size of broadcasted tables won't exceed this
threshold.
+
+ * Property value: Integer
+ * Unit: KB
+ * Default value: 5120
+ * Example
+
+.. code-block:: sh
+
+ \set BROADCAST_NON_CROSS_JOIN_THRESHOLD 5120
+
+.. describe:: BROADCAST_CROSS_JOIN_THRESHOLD
+
+A threshold for cross joins. When a cross join query is executed, the
whole size of broadcasted tables won't exceed this threshold.
+
+ * Property value: Integer
+ * Unit: KB
+ * Default value: 1024
+ * Example
+
+.. code-block:: sh
+
+ \set BROADCAST_CROSS_JOIN_THRESHOLD 1024
+
+.. warning::
+ In Tajo, the broadcast join is only the way to perform cross joins.
Since the cross join is a very expensive operation, this value need to be tuned
carefully.
+
+.. describe:: JOIN_TASK_INPUT_SIZE
+
+The repartition join is executed in two stages. When a join query is
executed with the repartition join, this value indicates the amount of input
data processed by each task at the second stage.
+As a result, it determines the degree of the parallel processing of the
join query.
+
+ * Property value: Integer
+ * Unit: MB
+ * Default value: 64
+ * Example
+
+.. code-block:: sh
+
+ \set JOIN_TASK_INPUT_SIZE 64
+
+.. describe:: JOIN_PER_SHUFFLE_SIZE
--- End diff --
Actually, join is processed in the second stage. The first stage is for
shuffle. This parameters only determines the input size of the second stages.
So, this name is proper in my opinion.
In addition, this parameter has a similar context to map (reduce) task
size. So, it would be familiar to those who have Hadoop experiences.
> Add description for session variables
> -------------------------------------
>
> Key: TAJO-1962
> URL: https://issues.apache.org/jira/browse/TAJO-1962
> Project: Tajo
> Issue Type: Task
> Components: Documentation
> Reporter: Jihoon Son
> Assignee: Jihoon Son
> Fix For: 0.12.0, 0.11.1
>
>
> Our document (http://tajo.apache.org/docs/devel/tsql/variables.html) only
> shows the list of session variables. It would be much helpful if we add some
> description.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)