[ https://issues.apache.org/jira/browse/HAWQ-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665475#comment-15665475 ]

ASF GitHub Bot commented on HAWQ-1151:
--------------------------------------

Github user dyozie commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq-docs/pull/60#discussion_r87922081
  
    --- Diff: ddl/ddl-table.html.md.erb ---
    @@ -93,14 +93,14 @@ For any specific query, the first four factors are fixed values, while the confi
     
     The `bucketnum` for a hash table specifies the number of hash buckets to be used in creating virtual segments. A HASH distributed table is created with `default_hash_table_bucket_number` buckets. The default bucket value can be changed in session level or in the `CREATE TABLE` DDL by using the `bucketnum` storage parameter.
     
    -When initializing a cluster, you can use the `hawq init --bucket_number` parameter to explcitly set the default bucket number \(`default_hash_table_bucket_number`\).
    +In an Ambari-managed HAWQ cluster, the default bucket number \(`default_hash_table_bucket_number`\) is derived from the number of segment nodes. In command-line-managed HAWQ environments, you can use the `--bucket_number` option of `hawq init` to explicitly set `default_hash_table_bucket_number` during cluster initialization.
     
    -**Note:** For best performance with large tables, the number of buckets should not exceed the value of the `default_hash_table_bucket_number` parameter. Small tables can use one segment node, `with bucketnum=1`. For larger tables, the bucketnum is set to a multiple of the number of segment nodes, for the best load balancing on different segment nodes. The elastic runtime will attempt to find the optimal number of buckets for the number of nodes being processed. Larger tables need more virtual segments , and hence use larger numbers of buckets.
    +**Note:** For best performance with large tables, the number of buckets should not exceed the value of the `default_hash_table_bucket_number` parameter. Small tables can use one segment node, `WITH bucketnum=1`. For larger tables, the `bucketnum` is set to a multiple of the number of segment nodes, for the best load balancing on different segment nodes. The elastic runtime will attempt to find the optimal number of buckets for the number of nodes being processed. Larger tables need more virtual segments , and hence use larger numbers of buckets.
    --- End diff --
    
    Might as well fix (remove) the space before the comma here.


> docs - identify procedures for both ambari and CLI-managed clusters
> -------------------------------------------------------------------
>
>                 Key: HAWQ-1151
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1151
>             Project: Apache HAWQ
>          Issue Type: Improvement
>          Components: Documentation
>            Reporter: Lisa Owen
>            Assignee: David Yozie
>             Fix For: 2.0.1.0-incubating
>
>
> Different areas in the docs specify running `hawq config` to change server 
> configuration parameters. The procedure is different in Ambari-managed 
> clusters, and in fact you do NOT want to run `hawq config` in Ambari-managed 
> clusters.
> Add specific notices and Ambari procedures in the docs where appropriate.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
