[ 
https://issues.apache.org/jira/browse/TAJO-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14993143#comment-14993143
 ] 

ASF GitHub Bot commented on TAJO-1962:
--------------------------------------

Github user jihoonson commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/848#discussion_r44104763
  
    --- Diff: tajo-docs/src/main/sphinx/tsql/variables.rst ---
    @@ -28,35 +30,456 @@ Each client connection to TajoMaster creates a unique 
session, and the client an
     Also, ``\unset key`` will unset the session variable named *key*.
     
     
    -Now, tajo provides the following session variables.
    -
    -* ``DIST_QUERY_BROADCAST_JOIN_THRESHOLD``
    -* ``DIST_QUERY_JOIN_TASK_VOLUME``
    -* ``DIST_QUERY_SORT_TASK_VOLUME``
    -* ``DIST_QUERY_GROUPBY_TASK_VOLUME``
    -* ``DIST_QUERY_JOIN_PARTITION_VOLUME``
    -* ``DIST_QUERY_GROUPBY_PARTITION_VOLUME``
    -* ``DIST_QUERY_TABLE_PARTITION_VOLUME``
    -* ``EXECUTOR_EXTERNAL_SORT_BUFFER_SIZE``
    -* ``EXECUTOR_HASH_JOIN_SIZE_THRESHOLD``
    -* ``EXECUTOR_INNER_HASH_JOIN_SIZE_THRESHOLD``
    -* ``EXECUTOR_OUTER_HASH_JOIN_SIZE_THRESHOLD``
    -* ``EXECUTOR_GROUPBY_INMEMORY_HASH_THRESHOLD``
    -* ``MAX_OUTPUT_FILE_SIZE``
    -* ``CODEGEN``
    -* ``CLIENT_SESSION_EXPIRY_TIME``
    -* ``CLI_MAX_COLUMN``
    -* ``CLI_NULL_CHAR``
    -* ``CLI_PRINT_PAUSE_NUM_RECORDS``
    -* ``CLI_PRINT_PAUSE``
    -* ``CLI_PRINT_ERROR_TRACE``
    -* ``CLI_OUTPUT_FORMATTER_CLASS``
    -* ``CLI_ERROR_STOP``
    -* ``TIMEZONE``
    -* ``DATE_ORDER``
    -* ``TEXT_NULL``
    -* ``DEBUG_ENABLED``
    -* ``BEHAVIOR_ARITHMETIC_ABORT``
    -* ``RESULT_SET_FETCH_ROWNUM``
    +Currently, tajo provides the following session variables.
    +
    +.. describe:: BROADCAST_NON_CROSS_JOIN_THRESHOLD
    +
    +A threshold for non-cross joins. When a non-cross join query is executed 
with the broadcast join, the whole size of broadcasted tables won't exceed this 
threshold.
    +
    +  * Property value: Integer
    +  * Unit: KB
    +  * Default value: 5120
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set BROADCAST_NON_CROSS_JOIN_THRESHOLD 5120
    +
    +.. describe:: BROADCAST_CROSS_JOIN_THRESHOLD
    +
    +A threshold for cross joins. When a cross join query is executed, the 
whole size of broadcasted tables won't exceed this threshold.
    +
    +  * Property value: Integer
    +  * Unit: KB
    +  * Default value: 1024
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set BROADCAST_CROSS_JOIN_THRESHOLD 1024
    +
    +.. warning::
    +  In Tajo, the broadcast join is only the way to perform cross joins. 
Since the cross join is a very expensive operation, this value need to be tuned 
carefully.
    +
    +.. describe:: JOIN_TASK_INPUT_SIZE
    +
    +The repartition join is executed in two stages. When a join query is 
executed with the repartition join, this value indicates the amount of input 
data processed by each task at the second stage.
    +As a result, it determines the degree of the parallel processing of the 
join query.
    +
    +  * Property value: Integer
    +  * Unit: MB
    +  * Default value: 64
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set JOIN_TASK_INPUT_SIZE 64
    +
    +.. describe:: JOIN_PER_SHUFFLE_SIZE
    +
    +The repartition join is executed in two stages. When a join query is 
executed with the repartition join,
    +this value indicates the output size of each task at the first stage, 
which determines the number of partitions to be shuffled between two stages.
    +
    +  * Property value: Integer
    +  * Unit: MB
    +  * Default value: 128
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set JOIN_PER_SHUFFLE_SIZE 128
    +
    +.. describe:: HASH_JOIN_SIZE_LIMIT
    +
    +This value provides the criterion to decide the algorithm to perform a 
join in a task.
    +If the input data is smaller than this value, join is performed with the 
in-memory hash join.
    +Otherwise, the sort-merge join is used.
    +
    +  * Property value: Integer
    +  * Unit: MB
    +  * Default value: 64
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set HASH_JOIN_SIZE_LIMIT 64
    +
    +.. warning::
    +  This value is the size of the input stored on file systems. So, when the 
input data is loaded into JVM heap,
    +  its actual size is usually much larger than the configured value, which 
means that too large threshold can cause unexpected OutOfMemory errors.
    +  This value should be tuned carefully.
    +
    +.. describe:: INNER_HASH_JOIN_SIZE_LIMIT
    +
    +This value provides the criterion to decide the algorithm to perform an 
inner join in a task.
    +If the input data is smaller than this value, the inner join is performed 
with the in-memory hash join.
    +Otherwise, the sort-merge join is used.
    +
    +  * Property value: Integer
    +  * Unit: MB
    +  * Default value: 64
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set INNER_HASH_JOIN_SIZE_LIMIT 64
    +
    +.. warning::
    +  This value is the size of the input stored on file systems. So, when the 
input data is loaded into JVM heap,
    +  its actual size is usually much larger than the configured value, which 
means that too large threshold can cause unexpected OutOfMemory errors.
    +  This value should be tuned carefully.
    +
    +.. describe:: OUTER_HASH_JOIN_SIZE_LIMIT
    +
    +This value provides the criterion to decide the algorithm to perform an 
outer join in a task.
    +If the input data is smaller than this value, the outer join is performed 
with the in-memory hash join.
    +Otherwise, the sort-merge join is used.
    +
    +  * Property value: Integer
    +  * Unit: MB
    +  * Default value: 64
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set OUTER_HASH_JOIN_SIZE_LIMIT 64
    +
    +.. warning::
    +  This value is the size of the input stored on file systems. So, when the 
input data is loaded into JVM heap,
    +  its actual size is usually much larger than the configured value, which 
means that too large threshold can cause unexpected OutOfMemory errors.
    +  This value should be tuned carefully.
    +
    +.. describe:: JOIN_HASH_TABLE_SIZE
    +
    +The initial size of hash table for in-memory hash join.
    +
    +  * Property value: Integer
    +  * Default value: 100000
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set JOIN_HASH_TABLE_SIZE 100000
    +
    +.. describe:: SORT_TASK_INPUT_SIZE
    +
    +The sort operation is executed in two stages. When a sort query is 
executed, this value indicates the amount of input data processed by each task 
at the second stage.
    +As a result, it determines the degree of the parallel processing of the 
sort query.
    +
    +  * Property value: Integer
    +  * Unit: MB
    +  * Default value: 64
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set SORT_TASK_INPUT_SIZE 64
    +
    +.. describe:: EXTSORT_BUFFER_SIZE
    +
    +A threshold to choose the sort algorithm. If the input data is larger than 
this threshold, the external sort algorithm is used.
    +
    +  * Property value: Integer
    +  * Unit: MB
    +  * Default value: 200
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set EXTSORT_BUFFER_SIZE 200
    +
    +.. describe:: SORT_LIST_SIZE
    +
    +The initial size of list for in-memory sort.
    +
    +  * Property value: Integer
    +  * Default value: 100000
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set SORT_LIST_SIZE 100000
    +
    +.. describe:: GROUPBY_MULTI_LEVEL_ENABLED
    +
    +A flag to enable the multi-level algorithm for distinct aggregation. If 
this value is set, 3-phase aggregation algorithm is used.
    +Otherwise, 2-phase aggregation algorithm is used.
    +
    +  * Property value: Boolean
    +  * Default value: true
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set GROUPBY_MULTI_LEVEL_ENABLED true
    +
    +.. describe:: GROUPBY_PER_SHUFFLE_SIZE
    +
    +The aggregation is executed in two stages. When an aggregation query is 
executed,
    +this value indicates the output size of each task at the first stage, 
which determines the number of partitions to be shuffled between two stages.
    +
    +  * Property value: Integer
    +  * Unit: MB
    +  * Default value: 256
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set GROUPBY_PER_SHUFFLE_SIZE 256
    +
    +.. describe:: GROUPBY_TASK_INPUT_SIZE
    +
    +The aggregation operation is executed in two stages. When an aggregation 
query is executed, this value indicates the amount of input data processed by 
each task at the second stage.
    +As a result, it determines the degree of the parallel processing of the 
aggregation query.
    +
    +  * Property value: Integer
    +  * Unit: MB
    +  * Default value: 64
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set GROUPBY_TASK_INPUT_SIZE 64
    +
    +.. describe:: HASH_GROUPBY_SIZE_LIMIT
    +
    +This value provides the criterion to decide the algorithm to perform an 
aggregation in a task.
    +If the input data is smaller than this value, the aggregation is performed 
with the in-memory hash aggregation.
    +Otherwise, the sort-based aggregation is used.
    +
    +  * Property value: Integer
    +  * Unit: MB
    +  * Default value: 64
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set HASH_GROUPBY_SIZE_LIMIT 64
    +
    +.. warning::
    +  This value is the size of the input stored on file systems. So, when the 
input data is loaded into JVM heap,
    +  its actual size is usually much larger than the configured value, which 
means that too large threshold can cause unexpected OutOfMemory errors.
    +  This value should be tuned carefully.
    +
    +.. describe:: AGG_HASH_TABLE_SIZE
    +
    +The initial size of list for in-memory sort.
    +
    +  * Property value: Integer
    +  * Default value: 10000
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set AGG_HASH_TABLE_SIZE 10000
    +
    +.. describe:: TIMEZONE
    +
    +Refer to :doc:`/time_zone`.
    +
    +  * Property value: Time zone id
    +  * Default value: Default time zone of JVM
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set TIMEZONE GMT+9
    +
    +.. describe:: DATE_ORDER
    +
    +Date order specification.
    +
    +  * Property value: One of YMD, DMY, MDY.
    +  * Default value: YMD
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set DATE_ORDER YMD
    +
    +.. describe:: PARTITION_NO_RESULT_OVERWRITE_ENABLED
    +
    +If this value is true, a partitioned table is overwritten even if a 
subquery leads to no result. Otherwise, the table data will be kept if there is 
no result.
    +
    +  * Property value: Boolean
    +  * Default value: false
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set PARTITION_NO_RESULT_OVERWRITE_ENABLED false
    +
    +.. describe:: TABLE_PARTITION_PER_SHUFFLE_SIZE
    +
    +In Tajo, storing a partition table is executed in two stages.
    +This value indicates the output size of a task of the former stage, which 
determines the number of partitions to be shuffled between two stages.
    +
    +  * Property value: Integer
    +  * Unit: MB
    +  * Default value: 256
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set TABLE_PARTITION_PER_SHUFFLE_SIZE 256
    +
    +.. describe:: ARITHABORT
    +
    +A flag to indicate how to handle the errors caused by invalid arithmetic 
operations. If true, a running query will be terminated with an overflow or a 
divide-by-zero.
    +
    +  * Property value: Boolean
    +  * Default value: false
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set ARITHABORT false
    +
    +.. describe:: MAX_OUTPUT_FILE_SIZE
    +
    +Maximum per-output file size. 0 means infinite.
    +
    +  * Property value: Integer
    +  * Unit: MB
    +  * Default value: 0
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set MAX_OUTPUT_FILE_SIZE 0
    +
    +.. describe:: SESSION_EXPIRY_TIME
    +
    +Session expiry time.
    +
    +  * Property value: Integer
    +  * Unit: seconds
    +  * Default value: 3600
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set SESSION_EXPIRY_TIME 3600
    +
    +.. describe:: CLI_COLUMNS
    +
    +Sets the width for the wrapped format.
    +
    +  * Property value: Integer
    +  * Default value: 120
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set CLI_COLUMNS 120
    +
    +.. describe:: CLI_NULL_CHAR
    +
    +Sets the string to be printed in place of a null value.
    +
    +  * Property value: String
    +  * Default value: ''
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set CLI_NULL_CHAR ''
    +
    +.. describe:: CLI_PAGE_ROWS
    +
    +Sets the number of rows for paging.
    +
    +  * Property value: Integer
    +  * Default value: 100
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set CLI_PAGE_ROWS 100
    +
    +.. describe:: CLI_PAGING_ENABLED
    +
    +Enable paging of result display.
    +
    +  * Property value: Boolean
    +  * Default value: true
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set CLI_PAGING_ENABLED true
    +
    +.. describe:: CLI_DISPLAY_ERROR_TRACE
    +
    +Enable display of error trace.
    +
    +  * Property value: Boolean
    +  * Default value: true
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set CLI_DISPLAY_ERROR_TRACE true
    +
    +.. describe:: CLI_FORMATTER_CLASS
    +
    +Sets the output format class to display results.
    +
    +  * Property value: Class name
    +  * Default value: org.apache.tajo.cli.tsql.DefaultTajoCliOutputFormatter
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set CLI_FORMATTER_CLASS 
org.apache.tajo.cli.tsql.DefaultTajoCliOutputFormatter
    +
    +.. describe:: ON_ERROR_STOP
    +
    +tsql will exist if an error occurs.
    +
    +  * Property value: Boolean
    +  * Default value: false
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set ON_ERROR_STOP false
    +
    +.. describe:: NULL_CHAR
    +
    +Null char of text file output.
    +
    +  * Property value: String
    +  * Default value: '\\N'
    +  * Example
    +
    +.. code-block:: sh
    +
    +  \set NULL_CHAR '\\N'
    +
    +.. describe:: DEBUG_ENABLED
    +
    +Debug mode enabled.
    --- End diff --
    
    Thanks for comment. I fixed.


> Add description for session variables
> -------------------------------------
>
>                 Key: TAJO-1962
>                 URL: https://issues.apache.org/jira/browse/TAJO-1962
>             Project: Tajo
>          Issue Type: Task
>          Components: Documentation
>            Reporter: Jihoon Son
>            Assignee: Jihoon Son
>             Fix For: 0.12.0, 0.11.1
>
>
> Our document (http://tajo.apache.org/docs/devel/tsql/variables.html) only 
> shows the list of session variables. It would be much helpful if we add some 
> description.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to