Github user eminency commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/848#discussion_r44101976
  
    --- Diff: tajo-docs/src/main/sphinx/tsql/variables.rst ---
    @@ -28,35 +30,456 @@ Each client connection to TajoMaster creates a unique 
session, and the client an
     Also, ``\unset key`` will unset the session variable named *key*.
     
     
    -Now, tajo provides the following session variables.
    -
    -* ``DIST_QUERY_BROADCAST_JOIN_THRESHOLD``
    -* ``DIST_QUERY_JOIN_TASK_VOLUME``
    -* ``DIST_QUERY_SORT_TASK_VOLUME``
    -* ``DIST_QUERY_GROUPBY_TASK_VOLUME``
    -* ``DIST_QUERY_JOIN_PARTITION_VOLUME``
    -* ``DIST_QUERY_GROUPBY_PARTITION_VOLUME``
    -* ``DIST_QUERY_TABLE_PARTITION_VOLUME``
    -* ``EXECUTOR_EXTERNAL_SORT_BUFFER_SIZE``
    -* ``EXECUTOR_HASH_JOIN_SIZE_THRESHOLD``
    -* ``EXECUTOR_INNER_HASH_JOIN_SIZE_THRESHOLD``
    -* ``EXECUTOR_OUTER_HASH_JOIN_SIZE_THRESHOLD``
    -* ``EXECUTOR_GROUPBY_INMEMORY_HASH_THRESHOLD``
    -* ``MAX_OUTPUT_FILE_SIZE``
    -* ``CODEGEN``
    -* ``CLIENT_SESSION_EXPIRY_TIME``
    -* ``CLI_MAX_COLUMN``
    -* ``CLI_NULL_CHAR``
    -* ``CLI_PRINT_PAUSE_NUM_RECORDS``
    -* ``CLI_PRINT_PAUSE``
    -* ``CLI_PRINT_ERROR_TRACE``
    -* ``CLI_OUTPUT_FORMATTER_CLASS``
    -* ``CLI_ERROR_STOP``
    -* ``TIMEZONE``
    -* ``DATE_ORDER``
    -* ``TEXT_NULL``
    -* ``DEBUG_ENABLED``
    -* ``BEHAVIOR_ARITHMETIC_ABORT``
    -* ``RESULT_SET_FETCH_ROWNUM``
    +Currently, tajo provides the following session variables.
    +
    +.. describe:: BROADCAST_NON_CROSS_JOIN_THRESHOLD
    +
    +A threshold for non-cross joins. When a non-cross join query is executed 
with the broadcast join, the whole size of broadcasted tables won't exceed this 
threshold.
    +
    +  * Property value: Integer
    +  * Unit: KB
    +  * Default value: 5120
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set BROADCAST_NON_CROSS_JOIN_THRESHOLD 5120
    +
    +.. describe:: BROADCAST_CROSS_JOIN_THRESHOLD
    +
    +A threshold for cross joins. When a cross join query is executed, the 
whole size of broadcasted tables won't exceed this threshold.
    +
    +  * Property value: Integer
    +  * Unit: KB
    +  * Default value: 1024
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set BROADCAST_CROSS_JOIN_THRESHOLD 1024
    +
    +.. warning::
    +  In Tajo, the broadcast join is only the way to perform cross joins. 
Since the cross join is a very expensive operation, this value need to be tuned 
carefully.
    +
    +.. describe:: JOIN_TASK_INPUT_SIZE
    +
    +The repartition join is executed in two stages. When a join query is 
executed with the repartition join, this value indicates the amount of input 
data processed by each task at the second stage.
    +As a result, it determines the degree of the parallel processing of the 
join query.
    +
    +  * Property value: Integer
    +  * Unit: MB
    +  * Default value: 64
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set JOIN_TASK_INPUT_SIZE 64
    +
    +.. describe:: JOIN_PER_SHUFFLE_SIZE
    +
    +The repartition join is executed in two stages. When a join query is 
executed with the repartition join,
    +this value indicates the output size of each task at the first stage, 
which determines the number of partitions to be shuffled between two stages.
    +
    +  * Property value: Integer
    +  * Unit: MB
    +  * Default value: 128
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set JOIN_PER_SHUFFLE_SIZE 128
    +
    +.. describe:: HASH_JOIN_SIZE_LIMIT
    --- End diff --
    
    Corresponding config property name uses 'threshold', but this one is 
'limit'.
    I think it needs some kind of consistency.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

Reply via email to