[ 
https://issues.apache.org/jira/browse/IMPALA-10316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17556513#comment-17556513
 ] 

ASF subversion and git services commented on IMPALA-10316:
----------------------------------------------------------

Commit 70568c80b3bb19e1945896d0a9492b8bc8f37164 in impala's branch 
refs/heads/master from Laszlo Gaal
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=70568c80b ]

IMPALA-10316: Increase Yarn minimum container size for dataload

This is an attempt to get rid of IMPALA-10669 and related failures, in
which Tez containers crash while loading nested ORC data.

The usual error message logged for these failures is:

Container [pid=11530,containerID=container_1618776748992_0039_01_000003]
is running 2785280B beyond the 'PHYSICAL' memory limit.
Current usage: 1.0 GB of 1 GB physical memory used; 2.6 GB of 2.1 GB
virtual memory used. Killing container.
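
When triaging logs, a message of this shape can be picked out with a small
script. The following is only a sketch: the function name, the regular
expression, and the returned field names are illustrative assumptions and
are not part of this patch or of any Yarn tooling.

```python
import re

# Illustrative pattern for the container-kill message quoted above.
# Whitespace is normalized first because the message is often wrapped.
KILL_RE = re.compile(
    r"containerID=(?P<container>container_\w+)\]\s+is running\s+"
    r"(?P<over>\d+)B beyond the 'PHYSICAL' memory limit\.\s+"
    r"Current usage:\s+(?P<used>[\d.]+) GB of (?P<limit>[\d.]+) GB "
    r"physical memory used"
)


def parse_kill_message(text):
    """Return a dict describing the kill message, or None if absent."""
    match = KILL_RE.search(" ".join(text.split()))
    if match is None:
        return None
    return {
        "container": match.group("container"),
        "over_bytes": int(match.group("over")),
        "used_gb": float(match.group("used")),
        "limit_gb": float(match.group("limit")),
    }


SAMPLE = """
Container [pid=11530,containerID=container_1618776748992_0039_01_000003]
is running 2785280B beyond the 'PHYSICAL' memory limit.
Current usage: 1.0 GB of 1 GB physical memory used; 2.6 GB of 2.1 GB
virtual memory used. Killing container.
"""

info = parse_kill_message(SAMPLE)
print(info)
```

In the sample, the container exceeded its 1 GB physical limit by 2785280
bytes, that is, by under 3 MiB.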

https://stackoverflow.com/a/43827548/143681 explains that the tunable
setting 'yarn.scheduler.minimum-allocation-mb' in yarn-site.xml sets
both the minimum memory size and the memory size increment for Yarn
containers.
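
The effect of that setting can be illustrated with a short sketch. It
assumes, as the linked answer describes, that a memory request is rounded
up to the next multiple of 'yarn.scheduler.minimum-allocation-mb' and
capped at 'yarn.scheduler.maximum-allocation-mb'; the exact rounding may
depend on the scheduler in use. The function name is hypothetical, and the
2048 MB minimum below is an example value only, not the value chosen by
this patch.

```python
def normalize_request_mb(requested_mb, min_alloc_mb, max_alloc_mb):
    """Round a memory request up to a multiple of the minimum allocation.

    Sketch of the behaviour described in the linked answer: the minimum
    allocation acts as both the floor and the increment of container sizes.
    """
    if min_alloc_mb <= 0:
        raise ValueError("min_alloc_mb must be positive")
    # Ceiling division using integers only.
    increments = max(1, -(-requested_mb // min_alloc_mb))
    return min(increments * min_alloc_mb, max_alloc_mb)


# Stock Yarn defaults: 1024 MB minimum, 8192 MB maximum.
print(normalize_request_mb(1000, 1024, 8192))
# Hypothetical larger minimum: the same request gets a bigger container.
print(normalize_request_mb(1000, 2048, 8192))
print(normalize_request_mb(2500, 2048, 8192))
```

With the default minimum, a request of about 1 GB yields the 1 GB container
seen in the failure message; raising the minimum gives every such request
more headroom without touching the individual requests.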

This patch is an attempt to work around the failure by forcibly setting
a minimum size for the Yarn containers used in dataload that is
significantly larger than the 1 GB size reported in the failure messages.

Tested by running the dataload phase successfully on the following
platform combinations:
- Ubuntu 16.04, m6i.8xlarge (128 GB RAM, Docker)
- Ubuntu 16.04, m5.12xlarge (192 GB RAM, Docker)
- Centos 7.4, m5.4xlarge (64 GB RAM)
- Centos 7.4, r5.4xlarge (128 GB RAM)
- Ubuntu 16.04, m6i.4xlarge (64 GB RAM)

Change-Id: I77e7c9e9fa3491c6e5652351869d3a4410bbb7b8
Reviewed-on: http://gerrit.cloudera.org:8080/18630
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Reviewed-by: Michael Smith <michael.sm...@cloudera.com>
Reviewed-by: Laszlo Gaal (Cloudera) <laszlo.g...@cloudera.com>


> load_nested.py failed due to out of memory during Jenkins GVO
> -------------------------------------------------------------
>
>                 Key: IMPALA-10316
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10316
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Michael Smith
>            Priority: Critical
>              Labels: broken-build, flaky
>
> The following job failed due to out of memory:
> [https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/12588] (please click 
> on "Don't keep this build forever" once this issue is resolved)
> Relevant log lines:
> {noformat}
> 02:33:42 Loading nested orc data (logging to 
> /home/ubuntu/Impala/logs/data_loading/load-nested.log)... 
> 02:35:39     FAILED (Took: 1 min 57 sec)
> 02:35:39     '/home/ubuntu/Impala/testdata/bin/load_nested.py -t 
> tpch_nested_orc_def -f orc/def' failed. Tail of log:
> 02:35:39 2020-11-11 02:35:06,225 INFO:load_nested[348]:Executing: 
> 02:35:39 
> 02:35:39       CREATE EXTERNAL TABLE supplier
> 02:35:39       STORED AS orc
> 02:35:39       TBLPROPERTIES('orc.compress' = 
> 'ZLIB','external.table.purge'='TRUE')
> 02:35:39       AS SELECT * FROM tmp_supplier
> 02:35:39 Traceback (most recent call last):
> 02:35:39   File "/home/ubuntu/Impala/testdata/bin/load_nested.py", line 415, 
> in <module>
> 02:35:39     load()
> 02:35:39   File "/home/ubuntu/Impala/testdata/bin/load_nested.py", line 349, 
> in load
> 02:35:39     hive.execute(stmt)
> 02:35:39   File "/home/ubuntu/Impala/tests/comparison/db_connection.py", line 
> 206, in execute
> 02:35:39     return self._cursor.execute(sql, *args, **kwargs)
> 02:35:39   File 
> "/home/ubuntu/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 331, in execute
> 02:35:39     self._wait_to_finish()  # make execute synchronous
> 02:35:39   File 
> "/home/ubuntu/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py",
>  line 413, in _wait_to_finish
> 02:35:39     raise OperationalError(resp.errorMessage)
> 02:35:39 impala.error.OperationalError: Error while compiling statement: 
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, 
> vertexId=vertex_1605060173780_0039_2_00, diagnostics=[Task failed, 
> taskId=task_1605060173780_0039_2_00_000000, diagnostics=[TaskAttempt 0 
> failed, info=[Container container_1605060173780_0039_01_000002 finished with 
> diagnostics set to [Container failed, exitCode=-104. [2020-11-11 
> 02:35:11.768]Container 
> [pid=16810,containerID=container_1605060173780_0039_01_000002] is running 
> 7729152B beyond the 'PHYSICAL' memory limit. Current usage: 1.0 GB of 1 GB 
> physical memory used; 2.5 GB of 2.1 GB virtual memory used. Killing 
> container.{noformat}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
