[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15388200#comment-15388200
 ] 

Allen Wittenauer commented on HADOOP-13397:
-------------------------------------------

A couple of things:

a) I, and I know others as well, have some rather large licensing questions 
around Docker images.  They effectively act as a binary distribution and it is 
very much against ASF rules to distribute GPL and other Category X components.  
 It makes me extremely uncomfortable to move forward without some clarification 
from legal.  (Yes, I know other ASF projects are publishing images on docker 
hub.  Hopefully that means that there is a JIRA issue in the LEGAL project to 
point to.)  This is a blocking issue that really needs to get clarified before 
further time investment.

b) I'm going to change the description in this issue from "Official image from 
Cloudera" to "Cloudera's image".  Cloudera can't make an "official image" for 
Apache Hadoop, so let's clear up any potential confusion before it starts.

c) Is this actually useful in reality?  The vast vast vast majority of Apache 
Hadoop deployments add a wide variety of additional components on top of Apache 
Hadoop to the point that even making a base image still seems like it wouldn't 
be particularly usable without downstream conflict resolution. It may be useful 
to make Dockerfile templates, but full-blown images? Hmm... I'm going to need 
some convincing.

d) Having worked with the existing Dockerfile and ported it over to support the 
ASF PowerPC build machines (HADOOP-13329), we need to be aware that we're going 
to need more than one Dockerfile: one per hardware platform.  We made the mistake 
of assuming a single one with start-build-env.sh (which we'll fix as part of 
HADOOP-13329), but we should avoid it here.  (We've gotten some poking from the 
ARM64 folks as well.)
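
As a rough illustration of the per-platform point, a build wrapper could pick 
the Dockerfile from the machine architecture.  This is only a sketch: the 
function name and the Dockerfile.<arch> file names are hypothetical and do not 
correspond to anything in the Hadoop source tree.

```shell
# Map a machine architecture (as reported by `uname -m`) to a Dockerfile name.
# The file names below are hypothetical; they are not paths in the Hadoop tree.
dockerfile_for_arch() {
  case "$1" in
    x86_64|amd64)  echo "Dockerfile.x86_64" ;;
    ppc64le)       echo "Dockerfile.ppc64le" ;;
    aarch64|arm64) echo "Dockerfile.aarch64" ;;
    *)
      # Fail loudly instead of silently falling back to the x86 file.
      echo "no Dockerfile for architecture: $1" >&2
      return 1
      ;;
  esac
}

# Example: build with the Dockerfile that matches the current host.
# docker build -f "$(dockerfile_for_arch "$(uname -m)")" .
```

Failing on an unknown architecture, rather than defaulting to one platform, is 
the part that matters: it keeps a new platform from quietly inheriting another 
platform's image definition.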

e) This is going to hit upon the larger issue of distributed configuration 
management, which is going to be extremely tricky to make consumable, never 
mind what types of configurations are actually supported: security? persistent 
storage? Then there are client configs--which, it's worthwhile pointing out, 
not even the vendor tools handle particularly well.

f) I think a much more attainable goal to start is making a single Dockerfile 
that runs all of the Apache Hadoop daemons as a single node configuration. 
That's a highly desirable thing to have for a variety of reasons.  If there is 
still heavy interest in breaking it apart, it gives a base working example 
before proceeding further to tease out the various daemons.
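
To make the single-node idea concrete, the startup script for such an image 
might look roughly like the sketch below.  It assumes the Hadoop 3 style 
"--daemon" launcher syntax and that hdfs and yarn are on the PATH; the function 
names and the DRY_RUN variable are made up for illustration.  It leaves out 
formatting the NameNode before first start and keeping a foreground process 
alive for the container.

```shell
# Echo a command, then run it unless DRY_RUN=1 is set.
run() {
  echo "+ $*"
  [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Start all four daemons of a single-node cluster, stopping at the first failure.
# Assumes the Hadoop 3 "--daemon" syntax and that hdfs/yarn are on PATH.
start_single_node() {
  run hdfs --daemon start namenode &&
  run hdfs --daemon start datanode &&
  run yarn --daemon start resourcemanager &&
  run yarn --daemon start nodemanager
}

# Example: print the commands without executing them.
# DRY_RUN=1 start_single_node
```

Splitting this into one image per daemon later would mostly mean moving each 
line of start_single_node into its own entry point.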

> Add dockerfile for Hadoop
> -------------------------
>
>                 Key: HADOOP-13397
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13397
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Klaus Ma
>
> For now, there is no community Dockerfile in Hadoop; most Docker 
> images are provided by vendors, e.g. 
> 1. Official image from Cloudera is the quickstart image: 
> https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> This JIRA proposes providing a community Dockerfile in 
> Hadoop, with the following requirements:
> 1. Separate Docker images for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configuring them 
> manually
> 3. Start Hadoop processes in the foreground, not as daemons
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
