[ 
https://issues.apache.org/jira/browse/HDDS-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16863495#comment-16863495
 ] 

Eric Yang commented on HDDS-1495:
---------------------------------

{quote}For the records: it's possible since HDDS-1629{quote}

Sort of, but the shortcut is hackish.  Every build still pays a constant cost 
to assemble the ozone directory structure, and that process isn't free.  If the 
content of ozone-${project.version} doesn't need to change, why spend the 35 
seconds?

Here is the time spent by the monolithic dist project to build the docker image 
(without the tarball):

{code}
[INFO] --- docker-maven-plugin:0.29.0:build (default) @ hadoop-ozone-dist ---
[INFO] Building tar: /home/eyang/hadoop/hadoop-ozone/dist/target/docker/eyang/ozone/0.5.0-SNAPSHOT/tmp/docker-build.tar
[INFO] DOCKER> [eyang/ozone:0.5.0-SNAPSHOT]: Created docker-build.tar in 1 second
[INFO] DOCKER> [eyang/ozone:0.5.0-SNAPSHOT]: Built image sha256:0196e
[INFO] DOCKER> [eyang/ozone:0.5.0-SNAPSHOT]: Removed old image sha256:a20fa
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 35.808 s
[INFO] Finished at: 2019-06-13T17:12:11-04:00
[INFO] Final Memory: 44M/748M
[INFO] ------------------------------------------------------------------------
{code}

Here is the time for the docker build with the dependency cache:

{code}
[INFO] Detected build of image with id 7b3ff8fff705
[INFO] Successfully built apache/ozone:0.5.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 20.298 s
[INFO] Finished at: 2019-06-13T17:30:53-04:00
[INFO] Final Memory: 32M/446M
[INFO] ------------------------------------------------------------------------
{code}

That is a 43% time saving in my environment.  The dependency copy takes 
constant time and is cacheable, whereas the time to rebuild the entire ozone 
directory structure depends on how complex that procedure becomes in the future.
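
As a quick check of the 43% figure, here is a minimal sketch that pulls the "Total time" value out of the two Maven summaries quoted above and computes the relative saving.  The helper names (total_seconds, percent_saved) are hypothetical, made up for illustration, and the pattern only handles the plain-seconds form that appears in these logs; longer builds are reported by Maven in a different form that this sketch does not parse.

```python
import re

# Matches the Maven summary line as it appears in the logs above,
# e.g. "[INFO] Total time: 35.808 s". Other time formats are not handled.
TOTAL_TIME = re.compile(r"Total time:\s*([0-9]+(?:\.[0-9]+)?)\s*s\b")


def total_seconds(log: str) -> float:
    """Return the build time in seconds from a Maven log excerpt (hypothetical helper)."""
    match = TOTAL_TIME.search(log)
    if match is None:
        raise ValueError("no 'Total time' line found")
    return float(match.group(1))


def percent_saved(baseline: float, improved: float) -> float:
    """Relative time saving of `improved` over `baseline`, in percent."""
    return (baseline - improved) / baseline * 100.0


# Values copied from the two build summaries quoted in this comment.
monolithic = total_seconds("[INFO] Total time: 35.808 s")
with_cache = total_seconds("[INFO] Total time: 20.298 s")

print(f"{percent_saved(monolithic, with_cache):.1f}% less build time")
```

Feeding it the two numbers from this comment gives a saving of roughly 43%, matching the figure quoted here.  It is a single run on one machine, so treat it as indicative rather than a benchmark.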


{code}
REPOSITORY             TAG                   IMAGE ID            CREATED             SIZE
apache/ozone           0.5.0-SNAPSHOT        7b3ff8fff705        14 minutes ago      707MB
eyang/ozone            0.5.0-SNAPSHOT        0196eb46a574        31 minutes ago      1.28GB
{code}

The images generated by the two processes are also significantly different.  
The apache/ozone image was generated using the docker submodule.  The 
eyang/ozone image was generated by the monolithic build.  The test was done 
without any Dockerfile optimization.  There is still plenty of room to reduce 
space and time once the Docker layers are further optimized.  The monolithic 
build process will only get more complicated as more stuff is dumped into the 
dist project.  Please consider the request to make the build environment more 
workable for docker developers.

> Create hadoop/ozone docker images with inline build process
> -----------------------------------------------------------
>
>                 Key: HDDS-1495
>                 URL: https://issues.apache.org/jira/browse/HDDS-1495
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>            Reporter: Elek, Marton
>            Assignee: Eric Yang
>            Priority: Major
>         Attachments: HADOOP-16091.001.patch, HADOOP-16091.002.patch, 
> HDDS-1495.003.patch, HDDS-1495.004.patch, HDDS-1495.005.patch, 
> HDDS-1495.006.patch, HDDS-1495.007.patch, Hadoop Docker Image inline build 
> process.pdf
>
>
> This is proposed by [~eyang] in 
> [this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
>  mailing thread.
> {quote}1, 3. There are 38 Apache projects hosting docker images on Docker hub 
> using Apache Organization. By browsing Apache github mirror. There are only 7 
> projects using a separate repository for docker image build. Popular projects 
> official images are not from Apache organization, such as zookeeper, tomcat, 
> httpd. We may not disrupt what other Apache projects are doing, but it looks 
> like inline build process is widely employed by majority of projects such as 
> Nifi, Brooklyn, thrift, karaf, syncope and others. The situation seems a bit 
> chaotic for Apache as a whole. However, Hadoop community can decide what is 
> best for Hadoop. My preference is to remove ozone from source tree naming, if 
> Ozone is intended to be subproject of Hadoop for long period of time. This 
> enables Hadoop community to host docker images for various subproject without 
> having to check out several source tree to trigger a grand build. However, 
> inline build process seems more popular than separated process. Hence, I 
> highly recommend making docker build inline if possible.
> {quote}
> The main challenges are also discussed in the thread:
> {code:java}
> 3. Technically it would be possible to add the Dockerfile to the source
> tree and publish the docker image together with the release by the
> release manager but it's also problematic:
> {code}
> a) there is no easy way to stage the images for the vote
>  c) it couldn't be flagged as automated on dockerhub
>  d) It couldn't support the critical updates.
>  * Updating existing images (for example in case of an ssl bug, rebuild
>  all the existing images with exactly the same payload but updated base
>  image/os environment)
>  * Creating image for older releases (We would like to provide images,
>  for hadoop 2.6/2.7/2.7/2.8/2.9. Especially for doing automatic testing
>  with different versions).
> Item a) can be solved (as [~eyang] suggested) by using a personal docker 
> image during the vote and publishing it to dockerhub after the vote (in case 
> the permission can be set by INFRA).
> Note: based on LEGAL-270 and the linked discussion, both approaches (inline 
> build process / external build process) are compatible with the apache release.
> Note: HDDS-851 and HADOOP-14898 contain more information about these 
> problems.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
