[Impala-ASF-CR] IMPALA-10669: Limit memory during dataload in Docker-based tests

2021-04-23 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17331 )

Change subject: IMPALA-10669: Limit memory during dataload in Docker-based tests
..


Patch Set 1:

I believe applications can find out how much memory is available to their 
container by looking at cgroup structures. Impala does that, so maybe Hive/Tez 
do the same. If they do, they would know they are running in a container and 
use the smaller memory limit. If they don't, then this won't work.
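
For illustration, a minimal sketch of the kind of cgroup lookup described above. It assumes the usual mount points (memory.limit_in_bytes for cgroup v1, memory.max for cgroup v2); the helper names are hypothetical, and this is not the code Impala, Hive or Tez actually use.

```python
# Usual locations of the memory limit files (assumed; mount points can differ).
CGROUP_V2_LIMIT = "/sys/fs/cgroup/memory.max"
CGROUP_V1_LIMIT = "/sys/fs/cgroup/memory/memory.limit_in_bytes"


def parse_cgroup_limit(text):
    """Return the limit in bytes, or None if the cgroup is unlimited."""
    value = text.strip()
    if value == "max":  # cgroup v2 spelling of "no limit"
        return None
    limit = int(value)
    # cgroup v1 reports a huge page-aligned number when no limit is set.
    if limit >= 1 << 60:
        return None
    return limit


def container_memory_limit():
    """Hypothetical helper: limit from the first readable cgroup file, else None."""
    for path in (CGROUP_V2_LIMIT, CGROUP_V1_LIMIT):
        try:
            with open(path) as f:
                return parse_cgroup_limit(f.read())
        except (IOError, OSError, ValueError):
            continue
    return None
```

A program that never performs a lookup of this kind only sees the host-wide figures, which is the "this won't work" case.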


--
To view, visit http://gerrit.cloudera.org:8080/17331
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5100589ffc54184b6ca13db139cb983175c934eb
Gerrit-Change-Number: 17331
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Fri, 23 Apr 2021 16:16:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10669: Limit memory during dataload in Docker-based tests

2021-04-23 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has abandoned this change. ( http://gerrit.cloudera.org:8080/17331 )

Change subject: IMPALA-10669: Limit memory during dataload in Docker-based tests
..


Abandoned

Additional tests proved that this is not sufficient on its own. Memory cgroup 
documentation also suggests that limiting container memory will not be 
reflected in the APIs userland programs routinely call for querying system/free 
memory (e.g. free(1)), so this is unlikely to influence how Hive & Tez manage 
Tez container memory.
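
A small sketch of the mismatch, under stated assumptions: free(1) reads /proc/meminfo, which normally keeps reporting the host total inside a memory-limited container, so a program has to combine it with the cgroup limit itself. The function names and the sample figures are hypothetical (a 192 GiB host, a 128 GiB container limit).

```python
def meminfo_total_bytes(meminfo_text):
    """Extract MemTotal (reported in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) * 1024
    raise ValueError("MemTotal not found")


def effective_memory_bytes(meminfo_total, cgroup_limit):
    """Memory a container-aware program should assume it has.

    cgroup_limit is None when no limit is set.
    """
    if cgroup_limit is None:
        return meminfo_total
    return min(meminfo_total, cgroup_limit)


# Hypothetical sample: what /proc/meminfo might show on a 192 GiB host.
sample = "MemTotal:       201326592 kB\nMemFree:          1000000 kB\n"
host_total = meminfo_total_bytes(sample)
print(host_total // 1024 ** 3)  # 192: the container limit is not visible here
print(effective_memory_bytes(host_total, 128 * 1024 ** 3) // 1024 ** 3)  # 128
```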
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I5100589ffc54184b6ca13db139cb983175c934eb
Gerrit-Change-Number: 17331
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 


[Impala-ASF-CR] IMPALA-10669: Limit memory during dataload in Docker-based tests

2021-04-22 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17331 )

Change subject: IMPALA-10669: Limit memory during dataload in Docker-based tests
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17331/1/docker/test-with-docker.py
File docker/test-with-docker.py:

http://gerrit.cloudera.org:8080/#/c/17331/1/docker/test-with-docker.py@648
PS1, Line 648: # Tez can get confused when creating ORC files with nested structures
             : # on machines with lots of RAM, e.g. m5.12xlarge. Limiting the available memory
             : # seems to avoid this problem, so limit container memory just for the build
             : # container. 128GB is still much more than dataload needs, but does the trick.
             : extras=["-m", "128g"],
> Should we make this conditional on the machine having more than 128g of mem
My (admittedly untested) assumption was that on a machine with less memory this 
option would be a no-op. I'll check/research.



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5100589ffc54184b6ca13db139cb983175c934eb
Gerrit-Change-Number: 17331
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Thu, 22 Apr 2021 09:56:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10669: Limit memory during dataload in Docker-based tests

2021-04-21 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17331 )

Change subject: IMPALA-10669: Limit memory during dataload in Docker-based tests
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17331/1/docker/test-with-docker.py
File docker/test-with-docker.py:

http://gerrit.cloudera.org:8080/#/c/17331/1/docker/test-with-docker.py@648
PS1, Line 648: # Tez can get confused when creating ORC files with nested structures
             : # on machines with lots of RAM, e.g. m5.12xlarge. Limiting the available memory
             : # seems to avoid this problem, so limit container memory just for the build
             : # container. 128GB is still much more than dataload needs, but does the trick.
             : extras=["-m", "128g"],
Should we make this conditional on the machine having more than 128g of memory?
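
One way the conditional version could look, as a sketch only: the helper names and the `cap_gib` parameter are hypothetical and not part of the patch. It assumes a Linux host where `os.sysconf` exposes SC_PAGE_SIZE and SC_PHYS_PAGES.

```python
import os

GIB = 1024 ** 3


def host_memory_bytes():
    """Physical memory of the host, via sysconf (Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def memory_limit_extras(host_mem_bytes, cap_gib=128):
    """Return extra `docker run` arguments capping container memory.

    Hypothetical helper: only emits "-m" when the host has more RAM than
    the cap, so smaller machines are left untouched.
    """
    if host_mem_bytes > cap_gib * GIB:
        return ["-m", "{0}g".format(cap_gib)]
    return []
```

The call site would then pass `extras=memory_limit_extras(host_memory_bytes())` instead of the hard-coded list.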



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5100589ffc54184b6ca13db139cb983175c934eb
Gerrit-Change-Number: 17331
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Wed, 21 Apr 2021 23:30:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10669: Limit memory during dataload in Docker-based tests

2021-04-21 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17331 )

Change subject: IMPALA-10669: Limit memory during dataload in Docker-based tests
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8618/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5100589ffc54184b6ca13db139cb983175c934eb
Gerrit-Change-Number: 17331
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Wed, 21 Apr 2021 19:31:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10669: Limit memory during dataload in Docker-based tests

2021-04-21 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17331 )

Change subject: IMPALA-10669: Limit memory during dataload in Docker-based tests
..


Patch Set 1:

Ran this on an m5.12xlarge with three different OSs: Ubuntu 16.04, Ubuntu 18.04 
and Centos 7, running just FE_TESTs and JDBC_TEST. All passed, so the patch 
does not trigger the planner test discrepancy seen earlier.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5100589ffc54184b6ca13db139cb983175c934eb
Gerrit-Change-Number: 17331
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Wed, 21 Apr 2021 19:13:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10669: Limit memory during dataload in Docker-based tests

2021-04-21 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17331 )


Change subject: IMPALA-10669: Limit memory during dataload in Docker-based tests
..

IMPALA-10669: Limit memory during dataload in Docker-based tests

When Docker-based tests are run on machines with lots of RAM (e.g.
AWS m5.12xlarge with 192GB of RAM), creating ORC files with complex
types often caused Tez containers to attempt to exceed their memory
limits. This resulted in Yarn terminating such containers, making the
dataload phase and the whole build fail.

Since this type of failure was only observed on hosts with very high
amounts of memory, the workaround is to put an artificial constraint on
the memory available for the build container (this is the Docker
container in which Impala is built and the initial dataload is
performed). Memory for this container is limited to 128GB, which is
still much more than dataload requires. It also matches the native
memory size of an AWS r5.4xlarge instance, which routinely runs Impala
builds in our downstream environment.

Change-Id: I5100589ffc54184b6ca13db139cb983175c934eb
---
M docker/test-with-docker.py
1 file changed, 5 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/17331/1
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5100589ffc54184b6ca13db139cb983175c934eb
Gerrit-Change-Number: 17331
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal