Github user eminency commented on a diff in the pull request:
https://github.com/apache/tajo/pull/848#discussion_r44101925
--- Diff: tajo-docs/src/main/sphinx/tsql/variables.rst ---
@@ -28,35 +30,456 @@ Each client connection to TajoMaster creates a unique session, and the client an
Also, ``\unset key`` will unset the session variable named *key*.
-Now, tajo provides the following session variables.
-
-* ``DIST_QUERY_BROADCAST_JOIN_THRESHOLD``
-* ``DIST_QUERY_JOIN_TASK_VOLUME``
-* ``DIST_QUERY_SORT_TASK_VOLUME``
-* ``DIST_QUERY_GROUPBY_TASK_VOLUME``
-* ``DIST_QUERY_JOIN_PARTITION_VOLUME``
-* ``DIST_QUERY_GROUPBY_PARTITION_VOLUME``
-* ``DIST_QUERY_TABLE_PARTITION_VOLUME``
-* ``EXECUTOR_EXTERNAL_SORT_BUFFER_SIZE``
-* ``EXECUTOR_HASH_JOIN_SIZE_THRESHOLD``
-* ``EXECUTOR_INNER_HASH_JOIN_SIZE_THRESHOLD``
-* ``EXECUTOR_OUTER_HASH_JOIN_SIZE_THRESHOLD``
-* ``EXECUTOR_GROUPBY_INMEMORY_HASH_THRESHOLD``
-* ``MAX_OUTPUT_FILE_SIZE``
-* ``CODEGEN``
-* ``CLIENT_SESSION_EXPIRY_TIME``
-* ``CLI_MAX_COLUMN``
-* ``CLI_NULL_CHAR``
-* ``CLI_PRINT_PAUSE_NUM_RECORDS``
-* ``CLI_PRINT_PAUSE``
-* ``CLI_PRINT_ERROR_TRACE``
-* ``CLI_OUTPUT_FORMATTER_CLASS``
-* ``CLI_ERROR_STOP``
-* ``TIMEZONE``
-* ``DATE_ORDER``
-* ``TEXT_NULL``
-* ``DEBUG_ENABLED``
-* ``BEHAVIOR_ARITHMETIC_ABORT``
-* ``RESULT_SET_FETCH_ROWNUM``
+Currently, tajo provides the following session variables.
+
+.. describe:: BROADCAST_NON_CROSS_JOIN_THRESHOLD
+
+A threshold for non-cross joins. When a non-cross join query is executed with the broadcast join, the whole size of broadcasted tables won't exceed this threshold.
+
+ * Property value: Integer
+ * Unit: KB
+ * Default value: 5120
+ * Example
+
+.. code-block:: sh
+
+ \set BROADCAST_NON_CROSS_JOIN_THRESHOLD 5120
+
+.. describe:: BROADCAST_CROSS_JOIN_THRESHOLD
+
+A threshold for cross joins. When a cross join query is executed, the whole size of broadcasted tables won't exceed this threshold.
+
+ * Property value: Integer
+ * Unit: KB
+ * Default value: 1024
+ * Example
+
+.. code-block:: sh
+
+ \set BROADCAST_CROSS_JOIN_THRESHOLD 1024
+
+.. warning::
+ In Tajo, the broadcast join is only the way to perform cross joins. Since the cross join is a very expensive operation, this value need to be tuned carefully.
+
+.. describe:: JOIN_TASK_INPUT_SIZE
+
+The repartition join is executed in two stages. When a join query is executed with the repartition join, this value indicates the amount of input data processed by each task at the second stage.
+As a result, it determines the degree of the parallel processing of the join query.
+
+ * Property value: Integer
+ * Unit: MB
+ * Default value: 64
+ * Example
+
+.. code-block:: sh
+
+ \set JOIN_TASK_INPUT_SIZE 64
+
+.. describe:: JOIN_PER_SHUFFLE_SIZE
--- End diff --
I think this name looks odd. It makes me read it as something like 'number of
join tasks for each shuffle (size)'.
But I can't think of a good alternative.