[ https://issues.apache.org/jira/browse/IMPALA-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16765657#comment-16765657 ]

ASF subversion and git services commented on IMPALA-7214:
---------------------------------------------------------

Commit 697a15b341186046d8fae3a2139f1ad13d304734 in impala's branch 
refs/heads/master from Alex Rodoni
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=697a15b ]

IMPALA-7214: [DOCS] Update Impala docs to decouple Impala and DataNodes

- Take 1: Let's review these docs before we go clean up many more.

Change-Id: I1c91f7975c09dae9908591eeeac0d55e5355b2d4
Reviewed-on: http://gerrit.cloudera.org:8080/12400
Reviewed-by: Alex Rodoni <arod...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>


> Update Impala docs to reflect coordinator/executor separation and decoupling 
> from DataNodes.
> --------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-7214
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7214
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Docs
>    Affects Versions: Impala 2.12.0
>            Reporter: Tim Armstrong
>            Assignee: Alex Rodoni
>            Priority: Major
>
> The docs tend to conflate DataNodes (an HDFS service) and Impala daemons. I
> think this stems from the original deployment practice of always colocating
> Impala daemons with HDFS DataNodes so that HDFS data could always be read
> from a local DataNode.
> I'm a bit pedantic so the conflation feels wrong to me regardless, but I 
> think this will become increasingly confusing as alternative deployments 
> without colocated HDFS DataNodes become more common (e.g. running against S3, 
> running with a separate HDFS service).
> E.g. picking an example at random:
> {noformat}
>         In Impala 1.4.0 and higher, the <codeph>LIMIT</codeph> clause is now optional (rather than required) for
>         queries that use the <codeph>ORDER BY</codeph> clause. Impala automatically uses a temporary disk work area
>         to perform the sort if the sort operation would otherwise exceed the Impala memory limit for a particular
>         DataNode.
> {noformat}
> This is wrong because the memory limit is for an Impala daemon, which is the 
> process that does the actual sorting. So here I think it should be "Impala 
> daemon" instead of "DataNode".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

