[GitHub] [airflow] potiuk commented on a change in pull request #9162: Improve production image iteration speed

2020-06-09 Thread GitBox


potiuk commented on a change in pull request #9162:
URL: https://github.com/apache/airflow/pull/9162#discussion_r437550994



##
File path: scripts/ci/in_container/entrypoint_ci.sh
##
@@ -134,10 +123,6 @@ if [[ ${INTEGRATION_KERBEROS:="false"} == "true" ]]; then
 fi
 
 
-# Start MiniCluster
-java -cp "/opt/minicluster-1.1-SNAPSHOT/*" com.ing.minicluster.MiniCluster \

Review comment:
   Exactly.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #9162: Improve production image iteration speed

2020-06-09 Thread GitBox


potiuk commented on a change in pull request #9162:
URL: https://github.com/apache/airflow/pull/9162#discussion_r436891270



##
File path: Dockerfile
##
@@ -153,6 +153,23 @@ ENV PIP_VERSION=${PIP_VERSION}
 
 RUN pip install --upgrade pip==${PIP_VERSION}
 
+ARG AIRFLOW_REPO=apache/airflow
+ENV AIRFLOW_REPO=${AIRFLOW_REPO}
+
+ARG AIRFLOW_BRANCH=master
+ENV AIRFLOW_BRANCH=${AIRFLOW_BRANCH}
+
+ARG AIRFLOW_EXTRAS
+ARG ADDITIONAL_AIRFLOW_EXTRAS=""
+ENV 
AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS}${ADDITIONAL_AIRFLOW_EXTRAS:+,}${ADDITIONAL_AIRFLOW_EXTRAS}
+
+# In case of Production uild image segment we want to pre-install master 
version of airflow

Review comment:
   Nope. It installis it in two steps (all happens in the build segment):
   1} It first pre-installs it from v1-10-test HEAD. This gives the "optimise 
build time" improvement - because it will only re-install what has changed
   2) then it installs airflow from PyPI including requirements.txt taken from 
GitHub  - same version as being installed from PyPi (so when we install 1.10.11 
in the future it will install airflow with the requirements that were 
"snapshot" at 1.10.11 tagging time.
   
   This is all installed with --user flag in the build segment. And then the 
.local dir is copied to main segment. This way we  do not gave double layers 
from installing it first from Github v1-10-test and secondly from PyPI - we 
always copy the "final" set of install files as single layer.
   
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #9162: Improve production image iteration speed

2020-06-08 Thread GitBox


potiuk commented on a change in pull request #9162:
URL: https://github.com/apache/airflow/pull/9162#discussion_r436526224



##
File path: scripts/ci/in_container/entrypoint_ci.sh
##
@@ -163,22 +148,19 @@ ssh-keyscan -H localhost >> ~/.ssh/known_hosts 2>/dev/null
 if [[ ${CI:=} == "true" && ${RUN_TESTS} == "true" ]] ; then
 echo
 echo " 
!"
-echo "  Setting default parallellism to 2 because we can run out of memory 
during tests on CI"
+echo "  Setting default parallelism to 2 because we can run out of memory 
during tests on CI"
 echo " 
!"
 echo
-export AIRFLOW__CORE__PARALELLISM=2
+export AIRFLOW__CORE__PARALLELISM=2
 fi
 
+cd "${AIRFLOW_SOURCES}"
+
 set +u
 # If we do not want to run tests, we simply drop into bash
-if [[ "${RUN_TESTS}" == "false" ]]; then
-if [[ ${#ARGS} == 0 ]]; then
-exec /bin/bash
-else
-exec /bin/bash -c "$(printf "%q " "${ARGS[@]}")"
-fi
+if [[ "${RUN_TESTS:=false}" != "true" ]]; then
+exec /bin/bash "${@}"

Review comment:
   Right. I think it's not needed after recent HIVE mocking. Let's see what 
our tests have to say about it.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #9162: Improve production image iteration speed

2020-06-08 Thread GitBox


potiuk commented on a change in pull request #9162:
URL: https://github.com/apache/airflow/pull/9162#discussion_r436516895



##
File path: scripts/ci/in_container/entrypoint_ci.sh
##
@@ -163,22 +148,19 @@ ssh-keyscan -H localhost >> ~/.ssh/known_hosts 2>/dev/null
 if [[ ${CI:=} == "true" && ${RUN_TESTS} == "true" ]] ; then
 echo
 echo " 
!"
-echo "  Setting default parallellism to 2 because we can run out of memory 
during tests on CI"
+echo "  Setting default parallelism to 2 because we can run out of memory 
during tests on CI"
 echo " 
!"
 echo
-export AIRFLOW__CORE__PARALELLISM=2
+export AIRFLOW__CORE__PARALLELISM=2
 fi
 
+cd "${AIRFLOW_SOURCES}"
+
 set +u
 # If we do not want to run tests, we simply drop into bash
-if [[ "${RUN_TESTS}" == "false" ]]; then
-if [[ ${#ARGS} == 0 ]]; then
-exec /bin/bash
-else
-exec /bin/bash -c "$(printf "%q " "${ARGS[@]}")"
-fi
+if [[ "${RUN_TESTS:=false}" != "true" ]]; then
+exec /bin/bash "${@}"

Review comment:
   This way you can do "./breeze shell -- -c "ls" or "./docker run IMAGE 
bash -c "ls" which is exactly the way how Bash's docker is used.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #9162: Improve production image iteration speed

2020-06-08 Thread GitBox


potiuk commented on a change in pull request #9162:
URL: https://github.com/apache/airflow/pull/9162#discussion_r436516086



##
File path: scripts/ci/in_container/entrypoint_ci.sh
##
@@ -163,22 +148,19 @@ ssh-keyscan -H localhost >> ~/.ssh/known_hosts 2>/dev/null
 if [[ ${CI:=} == "true" && ${RUN_TESTS} == "true" ]] ; then
 echo
 echo " 
!"
-echo "  Setting default parallellism to 2 because we can run out of memory 
during tests on CI"
+echo "  Setting default parallelism to 2 because we can run out of memory 
during tests on CI"
 echo " 
!"
 echo
-export AIRFLOW__CORE__PARALELLISM=2
+export AIRFLOW__CORE__PARALLELISM=2
 fi
 
+cd "${AIRFLOW_SOURCES}"
+
 set +u
 # If we do not want to run tests, we simply drop into bash
-if [[ "${RUN_TESTS}" == "false" ]]; then
-if [[ ${#ARGS} == 0 ]]; then
-exec /bin/bash
-else
-exec /bin/bash -c "$(printf "%q " "${ARGS[@]}")"
-fi
+if [[ "${RUN_TESTS:=false}" != "true" ]]; then
+exec /bin/bash "${@}"

Review comment:
   Yep. I am sure. We want to add -c manually if needed. If you do not add 
anything you are simply dropped into shell. This way you can also pass 
additional options to bash - not only -c.
   
   See the examples in BREEZE.rst 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org