keith-turner opened a new pull request, #20:
URL: https://github.com/apache/accumulo-docker/pull/20
This PR is intentionally incomplete, as I am seeking to improve a problem I
see but I am not sure this is the best approach.
In the past, when testing compactors and scan servers running in Kubernetes, I
would go through the following process.
1. clone accumulo docker
2. manually download hadoop and zookeeper
3. build a snapshot version of accumulo
4. build an accumulo docker image
5. push the docker image to a container repository that the Kubernetes
cluster can pull from
6. restart the accumulo processes running in Kubernetes
7. run some experiments and make some changes to accumulo and then go to
step 3
Step 4 above takes multiple minutes and creates a 2GB image. Because the
image is so large, steps 5 and 6 take a while as the image is uploaded to
and then downloaded from the repo. This PR works around these problems by
doing the following.
* Move the code to download needed deps outside of the docker build file.
This saves me the time of manually downloading in step 2 above.
* Split the docker build file into two build files. The first one builds a
base image with java, hadoop, and zookeeper. The second extends the first and
only has to include Accumulo.
With the above changes I can have the following workflow.
1. clone accumulo docker
2. run download script to get hadoop and zookeeper files
3. build the accumulo-base docker image that includes java, hadoop, and zk
4. build a snapshot version of accumulo
5. build the accumulo docker image that extends accumulo-base and includes
accumulo
6. push the docker image to a container repository that the Kubernetes
cluster can pull from
7. restart the accumulo processes running in Kubernetes
8. run some experiments and make some changes to accumulo and then go to
step 4
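The workflow above can be sketched as a small dry-run script. This is only an illustration, not the PR's actual tooling: the file names `download.sh`, `Dockerfile.base`, and `Dockerfile`, the registry tag, and the Kubernetes deployment name are all hypothetical placeholders. Only the `accumulo-base` image name comes from the description. Step 4 (building the Accumulo snapshot) is left out because it happens in the Accumulo checkout.

```python
import shlex
import subprocess

BASE_IMAGE = "accumulo-base"  # image name taken from the description above
APP_IMAGE = "registry.example.com/accumulo:snapshot"  # hypothetical tag


def base_steps():
    """Steps 2-3: run once, or whenever java, hadoop, or zookeeper change."""
    return [
        ["./download.sh"],  # hypothetical name for the download script
        # Hypothetical file name for the base build file.
        ["docker", "build", "-t", BASE_IMAGE, "-f", "Dockerfile.base", "."],
    ]


def iteration_steps():
    """Steps 5-7: repeated after each new Accumulo snapshot build (step 4)."""
    return [
        # Hypothetical file name for the build file that extends the base image.
        ["docker", "build", "-t", APP_IMAGE, "-f", "Dockerfile", "."],
        ["docker", "push", APP_IMAGE],
        # Hypothetical deployment name; substitute the real workload.
        ["kubectl", "rollout", "restart", "deployment/accumulo-compactor"],
    ]


def run(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in steps:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(base_steps())       # one-time setup
    run(iteration_steps())  # the fast inner loop
```

Keeping the base steps separate from the iteration steps mirrors the split into two build files: only the second group needs to run on each pass through the loop.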
Step 5 above takes a few seconds (vs a few minutes) and produces a new image
where the layers on top of accumulo-base are only ~30MB (this can be seen with
the docker history command). The first time steps 6 and 7 run, the large
accumulo-base image will have to be uploaded and downloaded. However, on
subsequent runs of steps 6 and 7 only ~30MB needs to be uploaded and
downloaded, making those steps much faster.
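As a rough worked example of the savings, the arithmetic can be sketched as below. It rests on simplifying assumptions that are not stated in the PR: the single image is re-transferred in full on every iteration, 2GB is rounded to 2000 MB, and the base image is taken to be the full image minus the ~30MB top layers. The figures are for one direction only (upload, or download by one node).

```python
IMAGE_MB = 2000      # ~2GB single image (rounded; an assumption)
TOP_LAYERS_MB = 30   # ~30MB of layers on top of accumulo-base


def transfer_mb(iterations, layered):
    """Approximate MB moved in one direction over a number of iterations."""
    if iterations <= 0:
        return 0
    if not layered:
        # Assumed worst case: the whole image moves every time.
        return iterations * IMAGE_MB
    # Base image moves once; only the small top layers move each iteration.
    base_mb = IMAGE_MB - TOP_LAYERS_MB
    return base_mb + iterations * TOP_LAYERS_MB


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(n, transfer_mb(n, layered=False), transfer_mb(n, layered=True))
```

Under these assumptions, ten iterations move about 20,000 MB with the single image versus about 2,270 MB with the split images.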
This is a huge improvement for what I am trying to do. I did just enough
work to get this functioning. Before updating the readme and improving the
docker files and download script, I would like to see if anyone has feedback.