Thanks Eric for the answers.

If I understood correctly, these are two proposals (use the same
repository; use an inline build). I created separate jiras for both of
them where we can discuss the technical details:

https://issues.apache.org/jira/browse/HADOOP-16092

https://issues.apache.org/jira/browse/HADOOP-16091


Until the jiras are implemented we can use the existing approach, but
(again) I am fine with switching to any newer approach at any time. The
only thing we need is the availability of the images during any
transition.


I started to document the current state in the wiki to make the
discussion easier.

https://cwiki.apache.org/confluence/display/HADOOP/Container+support

https://cwiki.apache.org/confluence/display/HADOOP/Ozone+Container+support

Marton




On 1/31/19 8:00 PM, Eric Yang wrote:
> 1, 3. There are 38 Apache projects hosting docker images on Docker Hub under 
> the Apache organization.  Browsing the Apache github mirror, there are only 7 
> projects using a separate repository for the docker image build.  Popular 
> projects' official images are not from the Apache organization, such as 
> zookeeper, tomcat, and httpd.  We may not disrupt what other Apache projects 
> are doing, but it looks like the inline build process is widely employed by 
> the majority of projects, such as Nifi, Brooklyn, thrift, karaf, syncope and 
> others.  The situation seems a bit chaotic for Apache as a whole.  However, 
> the Hadoop community can decide what is best for Hadoop.  My preference is to 
> remove ozone from the source tree naming, if Ozone is intended to be a 
> subproject of Hadoop for a long period of time.  This enables the Hadoop 
> community to host docker images for various subprojects without having to 
> check out several source trees to trigger a grand build.  However, the inline 
> build process seems more popular than the separated process.  Hence, I highly 
> recommend making the docker build inline if possible.
> 
> 2. I think we can open an INFRA ticket, and there are Jenkins users who can 
> configure the job to run on nodes that have the Apache repo credential.
> 
> 4. The docker image name maps to the maven project name.  Hence, if the 
> project name is hadoop-ozone, the convention automatically follows the maven 
> artifact name, with the option to customize.  I think it is reasonable, and 
> the image is automatically tagged with the same maven project version, which 
> minimizes version number management between maven and docker.
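As a sketch of the convention described above (the artifact id and version
below are illustrative, not taken from an actual pom):

```shell
# Assumed convention: the image name follows the maven artifactId and the
# image tag follows the maven project version, so one set of maven
# coordinates drives both (values are illustrative):
ARTIFACT_ID=hadoop-ozone   # maven artifactId
VERSION=0.4.0              # maven project version
IMAGE="apache/${ARTIFACT_ID}:${VERSION}"
echo "$IMAGE"              # -> apache/hadoop-ozone:0.4.0
```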
> 
> Regards,
> Eric
> 
> On 1/31/19, 8:59 AM, "Elek, Marton" <e...@apache.org> wrote:
> 
>     
>     Hi Eric,
>     
>     Thanks for the answers
>     
>     1.
>     
>     > Hadoop-docker-ozone.git source tree naming seems to create a unique
>     > process for Ozone.
>     
>     Not at all. We would like to follow the existing practice which was
>     established in HADOOP-14898. In HDDS-851 we discussed why we need two
>     separate repositories for hadoop/ozone: because of the limitation of
>     the dockerhub branch/tag mapping.
>     
>     I am 100% open to switching to another approach. I would suggest
>     creating a JIRA for that, as it requires code modification in the
>     docker-hadoop-* branches.
>     
>     
>     2.
>     
>     > Flagging automated builds on dockerhub seems to conflict with
>     > Apache release policy.
>     
>     Honestly I don't know. It was discussed in HADOOP-14898 and the
>     connected INFRA ticket, and there were no arguments against it,
>     especially as we just followed the practice started by other projects.
>     
>     Now that I have checked the docker related INFRA tickets again, it
>     seems that we have had two other practices since then:
>     
>      1) build the docker image on jenkins (is it compliant?)
>      2) get permission to push to apache/... from a local machine.
>     
>     You suggested the second one. Do you have more information on how it
>     is possible? How and who can request permission to push to
>     apache/hadoop, for example?
>     
>     
>     3.
>     
>     From one point of view, publishing existing, voted releases as docker
>     images is something like repackaging them. But you may be right that
>     this is wrong, because it should be handled as a separate release.
>     
>     Do you know of any official ASF wiki/doc/mail discussion about
>     managing docker images? If not, I would suggest creating a new
>     wiki/doc, as it seems that we have no clear answer on which is the
>     most compliant way to do it.
>     
>     4.
>     
>     Thank you for the suggestion to use dockerhub/own namespace to stage
>     docker images during the build. Sounds good to me. But I also wrote
>     about some other problems in my previous mail (3 b,c,d); this is just
>     one of them (3/a). Do you have any suggestions to solve the other
>     problems?
>     
>      * Updating existing images (for example, in case of an ssl bug,
>     rebuilding all the existing images with exactly the same payload but
>     an updated base image/os environment)
>     
>      * Creating images for older releases (we would like to provide
>     images for hadoop 2.6/2.7/2.8/2.9, especially for doing automatic
>     testing with different versions).
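The first bullet could be sketched as follows; the version is illustrative,
and the docker commands are commented out because they need a running
daemon:

```shell
# The Dockerfile pins the already-voted release tarball, so rebuilding
# with --pull/--no-cache refreshes the base OS layer while the hadoop
# payload stays identical (version is illustrative):
VERSION=2.9.2
# docker build --pull --no-cache -t "apache/hadoop:${VERSION}" .
# docker push "apache/hadoop:${VERSION}"
echo "would rebuild apache/hadoop:${VERSION} on a patched base image"
```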
>     
>     Thanks a lot,
>     Marton
>     
>     
>     On 1/30/19 6:50 PM, Eric Yang wrote:
>     > Hi Marton,
>     > 
>     > Flagging automated builds on dockerhub seems to conflict with the
>     > Apache release policy.  The vote and release process are manual
>     > processes of the Apache Way.  Therefore, the 3 b)-3 d) improvements
>     > will be out of reach unless the policy changes.
>     > 
>     > YARN-7129 is straightforward: it uses dockerfile-maven-plugin to
>     > build the docker image locally.  It also checks for the existence of
>     > /var/run/docker.sock to ensure docker is running.  This allows the
>     > docker image to be built in a developer sandbox, if the sandbox
>     > mounts the host /var/run/docker.sock.  Maven deploy can configure
>     > the repository location and authentication credential using
>     > ~/.docker/config.json and maven settings.xml.  This can upload a
>     > release candidate image to the release manager's dockerhub account
>     > for the release vote.  Once the vote passes, the image can be pushed
>     > to the official Apache dockerhub repository by the release manager
>     > or an Apache Jenkins job that tags the image and pushes it to the
>     > Apache account.
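A sketch of the staging flow Eric outlines; the 'rmuser' namespace and the
version are illustrative, and the docker commands are commented out since
they need a daemon and credentials:

```shell
# Stage the release candidate under the release manager's personal
# namespace, then promote it to the apache namespace after the vote:
RC="rmuser/hadoop-ozone:0.4.0-RC0"       # staged for the vote
FINAL="apache/hadoop-ozone:0.4.0"        # published after the vote
# docker push "$RC"
# docker tag "$RC" "$FINAL"
# docker push "$FINAL"
echo "promote $RC -> $FINAL"
```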
>     > 
>     > The Ozone image and the application catalog image are in a similar
>     > situation in that a test image can be built and tested locally.  The
>     > official voted artifacts can be uploaded to the Apache dockerhub
>     > account.  Hence, fewer variants of the same procedure would be
>     > great.  The Hadoop-docker-ozone.git source tree naming seems to
>     > create a unique process for Ozone.  I think it would be preferable
>     > to have a Hadoop-docker.git that comprises all docker image builds,
>     > or the dockerfile-maven-plugin approach.
>     > 
>     > Regards,
>     > Eric
>     > 
>     > On 1/30/19, 12:56 AM, "Elek, Marton" <e...@apache.org> wrote:
>     > 
>     >     Thanks Eric for the suggestions.
>     >     
>     >     Unfortunately (as Anu wrote it) our use-case is slightly different.
>     >     
>     >     It was discussed in HADOOP-14898 and HDDS-851, but let me
>     >     summarize the motivation:
>     >     
>     >     We would like to upload containers to the dockerhub for each
>     >     release (eg: apache/hadoop:3.2.0).
>     >     
>     >     According to the Apache release policy, it's not allowed to
>     >     publish snapshot builds (=not voted on by the PMC) outside of
>     >     the developer community.
>     >     
>     >     1. We started to follow the pattern used by other Apache
>     >     projects: docker containers are just a different packaging of
>     >     the already voted binary releases. Therefore we create the
>     >     containers from the voted releases. (See [1] as an example.)
>     >     
>     >     2. By separating the build of the source code from the build of
>     >     the docker image, we get additional benefits: for example, we
>     >     can rebuild the images in case of a security problem in the
>     >     underlying container OS. This is just a new empty commit on the
>     >     branch, and the original release will be repackaged.
>     >     
>     >     3. Technically it would be possible to add the Dockerfile to
>     >     the source tree and have the release manager publish the docker
>     >     image together with the release, but that is also problematic:
>     >     
>     >       a) there is no easy way to stage the images for the vote
>     >       b) we have no access to the apache dockerhub credentials
>     >       c) it couldn't be flagged as automated on dockerhub
>     >       d) it couldn't support the critical updates described in (2.)
>     >     
>     >     So the easy way that we found is to ask INFRA to register a
>     >     branch on dockerhub to use for the image creation. The
>     >     build/packaging will be done by dockerhub, but only released
>     >     artifacts will be included. Because of the limitation that
>     >     dockerhub can only map full branch names to tags, we need a new
>     >     repository instead of a branch (see the comments in HDDS-851
>     >     for more details).
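The constraint can be sketched like this: a Docker Hub automated build maps
a full branch name of one source repository to a tag of one image, so a
second image name requires a second repository (the branch and image names
below are the ones used in the thread):

```shell
# One repo's branches can only become tags of a single image name:
branch=docker-hadoop-3
tag=3
echo "hadoop.git branch ${branch} -> image apache/hadoop:${tag}"
# apache/hadoop-ozone therefore needs its own source repository
# (hadoop-docker-ozone.git); its tags cannot come from hadoop.git branches.
```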
>     >     
>     >     We also have a different use case: building developer images to
>     >     create a test cluster. These images will never be uploaded to
>     >     the hub. We have a Dockerfile in the source tree for this use
>     >     case (see HDDS-872). And thank you very much for the hint; I
>     >     will definitely check how YARN-7129 does it and will try to
>     >     learn from it.
>     >     
>     >     Thanks,
>     >     Marton
>     >     
>     >     
>     >     [1]: https://github.com/apache/hadoop/tree/docker-hadoop-3
>     >     
>     >     
>     >     
>     >     On 1/30/19 2:50 AM, Anu Engineer wrote:
>     >     > Marton, please correct me if I am wrong, but I believe that
>     >     > without this branch it is hard for us to push to the Apache
>     >     > DockerHub. This allows for Apache account integration with
>     >     > dockerHub.
>     >     > Does YARN publish to Docker Hub via the Apache account?
>     >     > 
>     >     > 
>     >     > Thanks
>     >     > Anu
>     >     > 
>     >     > 
>     >     > On 1/29/19, 4:54 PM, "Eric Yang" <ey...@hortonworks.com> wrote:
>     >     > 
>     >     >     Separating the Hadoop docker related build into a
>     >     >     separate git repository is a bit of a slippery slope.  It
>     >     >     is harder to synchronize changes between two separate
>     >     >     source trees, and there is a multi-step process to build
>     >     >     the jar, tarball, and docker images.  This might be
>     >     >     problematic to reproduce.
>     >     >     
>     >     >     It would be best to arrange the code such that the docker
>     >     >     image build process can be invoked as part of the maven
>     >     >     build.  The profile is activated only if docker is
>     >     >     installed and running in the environment.  This allows
>     >     >     producing the jar, tarball, and docker images all at once
>     >     >     without hindering the existing build procedure.
>     >     >     
>     >     >     YARN-7129 is one example of making a subproject in YARN
>     >     >     to build a docker image that can run in YARN.  It
>     >     >     automatically detects the presence of docker and builds
>     >     >     the docker image when docker is available.  If docker is
>     >     >     not running, the subproject is skipped and the build
>     >     >     proceeds to the next subproject.  Please try out the
>     >     >     YARN-7129 style of build process, and see whether it is a
>     >     >     possible solution to the docker image generation issue.
>     >     >     Thanks
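A sketch of the detection described above, assuming the conventional docker
socket path; the -Pdocker profile name is illustrative and the mvn call is
commented out:

```shell
# Activate the docker build only when a docker daemon socket exists,
# otherwise fall back to the normal build:
DOCKER_SOCK=/var/run/docker.sock
if [ -S "$DOCKER_SOCK" ]; then
  MVN_ARGS="package -Pdocker"   # docker present: build the image too
else
  MVN_ARGS="package"            # docker absent: skip the image build
fi
echo "mvn $MVN_ARGS"
# mvn $MVN_ARGS
```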
>     >     >     
>     >     >     Regards,
>     >     >     Eric
>     >     >     
>     >     >     On 1/29/19, 3:44 PM, "Arpit Agarwal" 
> <aagar...@cloudera.com.INVALID> wrote:
>     >     >     
>     >     >         I’ve requested a new repo hadoop-docker-ozone.git in
>     >     >         gitbox.
>     >     >         
>     >     >         
>     >     >         > On Jan 22, 2019, at 4:59 AM, Elek, Marton 
> <e...@apache.org> wrote:
>     >     >         > 
>     >     >         > 
>     >     >         > 
>     >     >         > TLDR;
>     >     >         > 
>     >     >         > I proposed in HDDS-851 to create a separate git
>     >     >         > repository for the ozone docker images
>     >     >         > (hadoop-docker-ozone.git).
>     >     >         > 
>     >     >         > If there are no objections in the next 3 days, I
>     >     >         > will ask an Apache Member to create the repository.
>     >     >         > 
>     >     >         > 
>     >     >         > 
>     >     >         > 
>     >     >         > LONG VERSION:
>     >     >         > 
>     >     >         > In HADOOP-14898, multiple docker containers and
>     >     >         > helper scripts were created for Hadoop.
>     >     >         > 
>     >     >         > The main goal was to:
>     >     >         > 
>     >     >         > 1.) help the development with easy-to-use docker images
>     >     >         > 2.) provide official hadoop images to make it easy to 
> test new features
>     >     >         > 
>     >     >         > As of now we have:
>     >     >         > 
>     >     >         > - the apache/hadoop-runner image (which contains
>     >     >         > the required dependencies but no hadoop)
>     >     >         > - the apache/hadoop:2 and apache/hadoop:3 images
>     >     >         > (to try out the latest hadoop from the 2/3 lines)
>     >     >         > 
>     >     >         > The base image to run hadoop (apache/hadoop-runner) is 
> also heavily used
>     >     >         > for Ozone distribution/development.
>     >     >         > 
>     >     >         > The Ozone distribution contains docker-compose
>     >     >         > based cluster definitions to start various types of
>     >     >         > clusters, and scripts to do smoketesting. (See
>     >     >         > HADOOP-16063 for more details.)
>     >     >         > 
>     >     >         > Note: I personally believe that these definitions
>     >     >         > help a lot to start different types of clusters. For
>     >     >         > example, it could be tricky to try out router based
>     >     >         > federation, as it requires multiple HA clusters. But
>     >     >         > with a simple docker-compose definition [1] it can
>     >     >         > be started in under 3 minutes. (HADOOP-16063 is
>     >     >         > about creating these definitions for various
>     >     >         > hdfs/yarn use cases.)
>     >     >         > 
>     >     >         > As of now we have dedicated branches in the hadoop
>     >     >         > git repository for the docker images
>     >     >         > (docker-hadoop-runner, docker-hadoop-2,
>     >     >         > docker-hadoop-3). It turns out that a separate
>     >     >         > repository would be more effective, as dockerhub
>     >     >         > can use only full branch names as tags.
>     >     >         > 
>     >     >         > We would like to provide ozone docker images to
>     >     >         > make the evaluation as easy as 'docker run -d
>     >     >         > apache/hadoop-ozone:0.3.0'; therefore in HDDS-851
>     >     >         > we agreed to create a separate repository for the
>     >     >         > hadoop-ozone docker images.
>     >     >         > 
>     >     >         > If this approach works well, we can also move the
>     >     >         > existing docker-hadoop-2/docker-hadoop-3/
>     >     >         > docker-hadoop-runner branches out of hadoop.git
>     >     >         > into another separate hadoop-docker.git repository.
>     >     >         > 
>     >     >         > Please let me know if you have any comments,
>     >     >         > 
>     >     >         > Thanks,
>     >     >         > Marton
>     >     >         > 
>     >     >         > 1: see
>     >     >         > 
> https://github.com/flokkr/runtime-compose/tree/master/hdfs/routerfeder
>     >     >         > as an example
>     >     >         > 
>     >     >         > 
> ---------------------------------------------------------------------
>     >     >         > To unsubscribe, e-mail: 
> hdfs-dev-unsubscr...@hadoop.apache.org
>     >     >         > For additional commands, e-mail: 
> hdfs-dev-h...@hadoop.apache.org
>     >     >         > 
>     >     >         
>     >     >         
>     >     >         
>     >     >         
>     >     >         
>     >     >     
>     >     >     
>     >     >     
>     >     >     
>     >     > 
>     >     
>     >     
>     > 
>     > 
>     > 
>     
>     
> 
> 
> 

