[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017427#comment-16017427 ]

Elek, Marton commented on HADOOP-13397:
---

I am interested in the Docker images, as I plan to create additional getting-started tutorials based on them. I tested mkhdf and it worked well.

I also have experience with my own Docker images: I am running hadoop/spark/hbase and other clusters with Docker images where every service is in a separate container (see http://github.com/elek/bigdata-docker/ if you are interested).

I suggest splitting this jira, as (as I see it) there are two parts:

1. One side is the role of mkhdf, which could create a self-contained, customized Dockerfile according to the parameters.
2. I think it's a separate task to create (or generate with mkhdf) one exact Dockerfile, commit it to a new branch in the hadoop git repository, and ask INFRA to register the new branch with Docker Hub.

My proposal for the second one is here: https://github.com/elek/hadoop/tree/docker-2.8.0

An example of its use is here: https://github.com/elek/hadoop/blob/docker-2.8.0/example/docker-compose.yaml

As you can see, everything can be configured with environment variables, thanks to a simple script which converts the environment variables to Hadoop XML (and other property) formats.

I would be happy to contribute this type of configuration loading to mkhdf as a separate module. But as I wrote, I think these are two things, and by creating two separate jiras, I think we can create apache/hadoop images without blocking on the mkhdf script.

> Add dockerfile for Hadoop
> -
>
>                 Key: HADOOP-13397
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13397
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Klaus Ma
>            Assignee: Allen Wittenauer
>         Attachments: HADOOP-13397.DNC001.patch
>
>
> For now, there's no community version Dockerfile in Hadoop; most Docker images are provided by vendors, e.g.
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2. From HortonWorks sequenceiq: https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in Hadoop, and here are some requirements:
> 1. Separate Docker images for master & agents, e.g. resource manager & node manager
> 2. Default configuration to start master & agent instead of configuring manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
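Elek's comment in the message above mentions a simple script that converts environment variables to Hadoop XML. The following is only a minimal sketch of that idea, not the script from the linked repository. The "<PREFIX>_<property>=<value>" naming convention, the CORE_SITE prefix, and the function names are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: turn NAME=value pairs into a Hadoop-style XML configuration.
# The "<PREFIX>_<property>=<value>" convention is hypothetical.

# xml_escape TEXT: escape the characters that are special in XML text.
xml_escape() {
  printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# env_to_hadoop_xml PREFIX: read NAME=value lines on stdin and print a
# <configuration> document for the entries whose name starts with "PREFIX_".
env_to_hadoop_xml() {
  local prefix=$1 line name value
  printf '<configuration>\n'
  while IFS= read -r line; do
    case $line in
      "${prefix}_"*=*)
        name=${line%%=*}            # text before the first "="
        name=${name#"${prefix}_"}   # drop the file prefix
        value=${line#*=}            # text after the first "="
        printf '  <property><name>%s</name><value>%s</value></property>\n' \
          "$(xml_escape "$name")" "$(xml_escape "$value")"
        ;;
    esac
  done
  printf '</configuration>\n'
}

# Possible use inside a container entrypoint (paths are up to the image):
#   env | env_to_hadoop_xml CORE_SITE > core-site.xml
```

Reading from stdin rather than from the live environment keeps the function easy to try out by hand. Values that contain newlines would need different handling, since the sketch treats each input line as one setting.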
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15742863#comment-15742863 ]

Allen Wittenauer commented on HADOOP-13397:
---

Just an update. HADOOP-13673 is nearing completion. After it gets committed, it'll be trivial to run multiple daemons as multiple users in a single Docker image for those releases that have this patch. This greatly simplifies any start-up code needed under Docker, as the su'ing is handled by Apache Hadoop itself.
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565125#comment-15565125 ]

Tsuyoshi Ozawa commented on HADOOP-13397:
-

> The OS image contains minimally glibc and various other GPL components.

As I understand it, glibc and the other GPL components are distributed via a separate tarball (the base image). If we create a Docker image for Hadoop, the base image containing the GPL files does not seem to be included in it. I played with the golang:1.6-onbuild image to confirm this, using this script: https://raw.githubusercontent.com/docker/docker/master/contrib/download-frozen-image-v2.sh :

{quote}
$ ./download-frozen-image-v2.sh /tmp/b golang:1.6-onbuild
Downloading 'library/golang:1.6-onbuild@1.6-onbuild' (20 layers)...
\\ 100.0%
\\ 100.0%
\\ 100.0%
\\ 100.0%
\\ 100.0%
\\ 100.0%
\\ 100.0%
\\ 100.0%
\\ 100.0%
\\ 100.0%
\\ 100.0%
\\ 100.0%
\\ 100.0%
\\ 100.0%
\\ 100.0%
\\ 100.0%
\\ 100.0%
\\ 100.0%
\\ 100.0%
\\ 100.0%
Download of images into '/tmp/b' complete.
$ cd /tmp/b/870716423ca9b2b721debfc28c19e54b50245839fef2f364d38d3452f77515ed
$ tar xvf layer.tar
x usr/
x usr/local/
x usr/local/bin/
x usr/local/bin/go-wrapper
{quote}

If we push a Docker image, it includes only the diff against the base image; this is what I meant.
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1470#comment-1470 ]

Allen Wittenauer commented on HADOOP-13397:
---

bq. It means our docker image only includes ASF-licensed binaries

That's not true. The OS image contains minimally glibc and various other GPL components.
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15554268#comment-15554268 ]

Tsuyoshi Ozawa commented on HADOOP-13397:
-

[~aw] I got a comment in LEGAL-270:

{quote}
All official Apache releases are in source form; Docker images can be built from our official source releases, but are derived distributions analogous to, say, Debian packages.
{quote}

Binary distribution itself seems to be fine, as you know. In addition, I checked Docker's behaviour: it seems that the docker command downloads the binary images named in the "FROM" statement separately. That means our Docker image includes only ASF-licensed binaries, provided we include only compiled binaries that have passed a release vote. Do you have any concerns?
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15523478#comment-15523478 ]

Allen Wittenauer commented on HADOOP-13397:
---

Just an update...

I've been working out how to start individual daemons in individual containers, as well as the difficulties of what it means to actually use a single-node Docker container cluster. It finally hit me the other day that I was effectively rewriting the sbin/start-all.sh script. So rather than build Yet Another version of that, I've been hacking on it to make it smarter. For example, if the script is being run as root, check to see if yarn and hdfs users are defined. If so, then launch start-dfs and start-yarn as those users.
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15514586#comment-15514586 ]

Dima Spivak commented on HADOOP-13397:
--

This is an important point, [~ozawa]. On the HBase side, we should be okay with legal because we don't distribute any images we create; they are used purely in our automation and never pushed to a user-accessible Registry. Definitely something to watch out for, though.
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513860#comment-15513860 ]

Tsuyoshi Ozawa commented on HADOOP-13397:
-

{quote}
a) I, and I know others as well, have some rather large licensing questions around Docker images. They effectively act as a binary distribution and it is very much against ASF rules to distribute GPL and other Category X components. It makes me extremely uncomfortable to move forward without some clarification from legal. (Yes, I know other ASF projects are publishing images on docker hub. Hopefully that means that there is a JIRA issue in the LEGAL project to point to.) This is a blocking issue that really needs to get clarified before further time investment.
{quote}

[~aw] [~dimaspivak] FYI, I opened LEGAL-270.
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15462556#comment-15462556 ]

Klaus Ma commented on HADOOP-13397:
---

Any progress?
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413987#comment-15413987 ]

Allen Wittenauer commented on HADOOP-13397:
---

bq. I guess my concern then is creep of what belongs under Hadoop vs. Bigtop or some other project if we start talking about packaging.

As a separate project, Bigtop can do what Bigtop wants. They could tie into this infrastructure to build Dockerfiles that contain all of their re-packaging if they want, but I doubt that will happen.

bq. I think this might end up not satisfying the kind of use cases people who see "Ooh, Hadoop in Docker is up on GitHub!" might be looking for.

That might just be the case. But my hunch is that it will: the folks who are going to respond that way are likely already building their own Hadoop rather than getting whatever vendors or Bigtop feed them.
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413896#comment-15413896 ]

Dima Spivak commented on HADOOP-13397:
--

I guess my concern then is creep of what belongs under Hadoop vs. Bigtop or some other project if we start talking about packaging. I totally appreciate where this idea comes from, since it has been so vendor-centric to this point, but I think this might end up not satisfying the kind of use cases people who see "Ooh, Hadoop in Docker is up on GitHub!" might be looking for. Just my two cents.
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413802#comment-15413802 ]

Allen Wittenauer commented on HADOOP-13397:
---

Everything built here is replaceable though. For example, one group I talked to wants to configure with RPM and Chef. So they would just replace the distro stub with RPMs and the config stub with something Chef-based. The daemon can also get replaced to load the appropriate daemon in the container.

Of course, that begs the question: what do they get out of this? Well, two things:

* the community can set the default OS, Java, etc requirements in the OS chunk
* a standard command that provides that information in a consumable form for Docker
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413736#comment-15413736 ]

Dima Spivak commented on HADOOP-13397:
--

Gotcha. I guess I just question that last use case because, unlike the micro-services that find themselves at home with something like Kubernetes or Docker Swarm, an all-in-one Hadoop image requires in-container configuration that seems to lend itself to more orchestration than those tools provide. I spent weeks playing with Kubernetes, Swarm, and Compose before recognizing that they just don't work well with how Hadoop handles configuration, but I trust your judgement. :)
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15412801#comment-15412801 ]

Allen Wittenauer commented on HADOOP-13397:
---

A few things:

* need to be able to replace how Hadoop gets installed: tar ball vs rpm vs deb vs ...
* need to be able to replace how Hadoop gets configured: single node vs multi-node vs. config management vs ...
* need a raw dockerfile because they want their pre-existing frameworks to manage the installation (e.g., Kubernetes mentioned above)

That last one is key: there are plenty of other tools already in existence that can manage containers (even Jenkins can do it!). There's no real value in re-inventing that for a large segment of users. HBASE-12721 looks great if one doesn't have anything else to manage or just wants a quick up 'n' down for something like testing. But that's not the reality for a large chunk of docker-ized environments. They want to plug the file in and go.
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15412778#comment-15412778 ]

Dima Spivak commented on HADOOP-13397:
--------------------------------------

Seems like a lot of work, [~aw], compared to something that plugs into what we have over in HBASE-12721. What exactly is the concern with using it? There are no weird dependencies; a single user could independently build and deploy a cluster on any machine.
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396691#comment-15396691 ]

Klaus Ma commented on HADOOP-13397:
-----------------------------------

[~aw], thanks for your feedback. It would be great if we had such an image :).
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396023#comment-15396023 ]

Allen Wittenauer commented on HADOOP-13397:
-------------------------------------------

I had a long discussion yesterday with some folks about this topic, especially around how to make it something consumable for a wide variety of installation types.

One of the big asks was to make it work as a self-contained Dockerfile (so no COPY/ADD, RUNs can only reference things already inside the image, etc.) as much as possible, so that the Dockerfile can be used by some other service and/or as the basis for adding more content. This means that if I'm using a configuration service such as bcfg2 or puppet, it would be able to put down the necessary components at docker build or docker run time. If I'm using something that takes a supplied tar ball, then COPY is unavoidable. [1]

It also means it's not a "do everything imaginable" feature like HBASE-12721. One would still need something to actually launch the containers and give some control information.

I've been playing around a bit and have a simple prototype built based upon those discussions and what I've seen in Klaus' github repo. I'm going to clean it up and flesh it out a bit and probably post a patch in a week or so, time permitting. (i.e., remove all the hard codes and start making it take options... haha.)

[1] I mean, we *could* do something like base64 encode the tar ball and extract, but that seems a little extreme. ;)
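The "self-contained" constraint above (no COPY/ADD) can be checked mechanically. The following is only a rough sketch, not part of any patch on this issue: the function name `find_external_inputs` and the sample Dockerfile (including the `ubuntu:16.04` base image and the `hadoop.tar.gz` file name) are made up for illustration, and the check is a plain text scan rather than a real Dockerfile parser.

```python
# Instructions that pull files from outside the image into it.
EXTERNAL_INSTRUCTIONS = ("COPY", "ADD")


def find_external_inputs(dockerfile_text):
    """Return (line_number, instruction) pairs for every COPY/ADD instruction.

    Hypothetical helper: a simple line scan, not a full Dockerfile parser.
    """
    hits = []
    continued = False  # True while inside a backslash-continued instruction
    for number, line in enumerate(dockerfile_text.splitlines(), start=1):
        stripped = line.strip()
        # Blank lines and comments never start or end an instruction.
        if not stripped or stripped.startswith("#"):
            continue
        is_continuation = continued
        continued = stripped.endswith("\\")
        if is_continuation:
            # Arguments of the previous instruction, not a new instruction.
            continue
        instruction = stripped.split(None, 1)[0].upper()
        if instruction in EXTERNAL_INSTRUCTIONS:
            hits.append((number, instruction))
    return hits


# Illustrative input only; the image and file names are assumptions.
SAMPLE = """\
FROM ubuntu:16.04
RUN apt-get update && \\
    apt-get install -y curl
COPY hadoop.tar.gz /opt/
RUN tar -xzf /opt/hadoop.tar.gz -C /opt
"""

print(find_external_inputs(SAMPLE))  # prints [(4, 'COPY')]
```

A Dockerfile that passes this check with an empty list is self-contained in the sense described above; the supplied-tar-ball case would be reported, which matches the point that COPY is unavoidable there.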
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15393459#comment-15393459 ]

Klaus Ma commented on HADOOP-13397:
-----------------------------------

Hi all, any suggestions on how to move forward?
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388824#comment-15388824 ]

Klaus Ma commented on HADOOP-13397:
-----------------------------------

I think a Dockerfile template is OK. We should also give users guidance on how to run YARN in Docker.

As I mentioned on the dev list, there's also an issue I'd like to confirm with you: in hdfs-site.xml, "dfs.namenode.datanode.registration.ip-hostname-check" is false, but it seems the master still uses the host IP, rather than the container IP, when it connects back to the datanode.
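For readers who have not seen the setting mentioned above, here is a small sketch of how it is expressed in hdfs-site.xml. The helper name `build_hadoop_site_xml` is hypothetical. The sketch only shows the configuration format; it says nothing about the host-IP vs. container-IP behaviour asked about in the comment.

```python
import xml.etree.ElementTree as ET


def build_hadoop_site_xml(properties):
    """Render a dict of settings as a Hadoop *-site.xml document string.

    Hypothetical helper for illustration only.
    """
    configuration = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(configuration, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(configuration, encoding="unicode")


# The property name and value are taken from the comment above.
hdfs_site = build_hadoop_site_xml(
    {"dfs.namenode.datanode.registration.ip-hostname-check": "false"}
)
print(hdfs_site)
```

The result is a single `<configuration>` element holding one `<property>` with a `<name>` and a `<value>` child, which is the layout Hadoop's XML configuration files use.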
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388406#comment-15388406 ]

Dima Spivak commented on HADOOP-13397:
--------------------------------------

Just as an FYI, we've got some work that's nearly done in HBASE-12721 that supports starting distributed HBase clusters on a single host using multiple Docker containers. We also get Hadoop working in this as a prerequisite to getting HBase up, so it might be worth taking a look to see whether we can link up somewhere.
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388249#comment-15388249 ]

Timothy St. Clair commented on HADOOP-13397:
--------------------------------------------

I'm good with just "Dockerfile templates" in upstream: start simple and expand. From there, downstream providers can work out their own logistics.
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388200#comment-15388200 ]

Allen Wittenauer commented on HADOOP-13397:
-------------------------------------------

A couple of things:

a) I, and I know others as well, have some rather large licensing questions around Docker images. They effectively act as a binary distribution, and it is very much against ASF rules to distribute GPL and other Category X components. It makes me extremely uncomfortable to move forward without some clarification from legal. (Yes, I know other ASF projects are publishing images on docker hub. Hopefully that means that there is a JIRA issue in the LEGAL project to point to.) This is a blocking issue that really needs to get clarified before further time investment.

b) I'm going to change the description in this issue from "Official image from Cloudera" to "Cloudera's image". Cloudera can't make an "official image" for Apache Hadoop, so let's clear up any potential confusion before it starts.

c) Is this actually useful in reality? The vast vast vast majority of Apache Hadoop deployments add a wide variety of additional components on top of Apache Hadoop, to the point that even making a base image still seems like it wouldn't be particularly usable without downstream conflict resolution. It may be useful to make Dockerfile templates, but full blown images? Hmm.. I'm going to need some convincing.

d) Upon working with the existing Dockerfile and porting it over to support the ASF PowerPC build machines (HADOOP-13329), we need to be aware that we're going to need more than one Dockerfile per hardware platform. We made that mistake with start-build-env.sh (which we'll fix as part of 13329), but we should avoid it here. (We've gotten some poking from the ARM64 folks as well.)

e) This is going to hit upon the larger issue of distributed configuration management, which is going to be extremely tricky to make consumable, never mind what types of configurations are actually supported: security? persistent storage? Then there are client configs--which, it's worthwhile pointing out, not even the vendor tools handle particularly well.

f) I think a much more attainable goal to start with is making a single Dockerfile that runs all of the Apache Hadoop daemons as a single node configuration. That's a highly desirable thing to have for a variety of reasons. If there is still heavy interest in breaking it apart, it gives a base working example before proceeding further to tease out the various daemons.
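Points (c) and (d) above suggest Dockerfile templates that vary by hardware platform. One way that idea could look is sketched below, purely as an illustration: the base image names, the `/opt/hadoop` path, and the `render_dockerfile` helper are all assumptions and placeholders, not anything proposed on this issue.

```python
from string import Template

# Placeholder base images per hardware platform (assumed names, not verified).
BASE_IMAGES = {
    "x86_64": "ubuntu:16.04",
    "ppc64le": "ppc64le/ubuntu:16.04",
    "aarch64": "arm64v8/ubuntu:16.04",
}

# A deliberately tiny template; install and configuration steps are left out.
DOCKERFILE_TEMPLATE = Template(
    "FROM $base_image\n"
    "ENV HADOOP_HOME=$hadoop_home\n"
    "# Installation and configuration steps are left to the downstream user.\n"
)


def render_dockerfile(platform, hadoop_home="/opt/hadoop"):
    """Fill the template for one hardware platform (hypothetical helper)."""
    if platform not in BASE_IMAGES:
        raise ValueError("no base image known for platform: " + platform)
    return DOCKERFILE_TEMPLATE.substitute(
        base_image=BASE_IMAGES[platform],
        hadoop_home=hadoop_home,
    )


print(render_dockerfile("ppc64le"))
```

Keeping the platform table separate from the template text means that supporting another architecture is a one-line change, which is roughly the mistake with start-build-env.sh that point (d) wants to avoid repeating.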
[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387855#comment-15387855 ]

Timothy St. Clair commented on HADOOP-13397:
--------------------------------------------

It would be ideal if apache/projects could publish artifacts, including Docker images, to hub as part of their release process.