[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2017-05-19 Thread Elek, Marton (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017427#comment-16017427
 ] 

Elek, Marton commented on HADOOP-13397:
---

I am interested about the docker images as I plan to create additional getting 
started tutorials based on docker images.

I tested mkhdf and it worked well. I also have experiences woth my own docker 
images: I am running hadoop/spark/hbase and other clusters with docker images 
where every service is in a separated container. (see 
http://github.com/elek/bigdata-docker/ if you interested)

I suggest to split this jira as (as I see) there are two parts:

1. one side is the role of mkhdf: which could create a selfcontained customized 
Dockerfile according to the parameters

2. I think, it's a separated task to create (or generate with mkhdf) one exact 
Dockerfile, commit it to a new branch in the hadoop git repository and ask 
INFRA to register new branch to the dockerhub.

My proposal to the second one is here: 
https://github.com/elek/hadoop/tree/docker-2.8.0

The example to use is here:

https://github.com/elek/hadoop/blob/docker-2.8.0/example/docker-compose.yaml

As you can see everything could be configured with environment variables, thx 
to a simple script which converts the environment variables to hadoop xml (and 
other property) format.

I would be happy to contribute this type of configuration loading to the mkhdf 
as a separated module. But as I wrote, I think it two things and with creating 
two separated jira, I think we can create apache/hadoop images even without 
blocking on the mkhdf script.

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>Assignee: Allen Wittenauer
> Attachments: HADOOP-13397.DNC001.patch
>
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-12-12 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15742863#comment-15742863
 ] 

Allen Wittenauer commented on HADOOP-13397:
---

Just an update.  HADOOP-13673 is nearly completion. After it gets committed, 
it'll be trivial to run multiple daemons as multiple users in a single docker 
image for those releases that have this patch.  This greatly simplifies any 
start up code needed under docker, as the su'ing is handled by Apache Hadoop 
itself.

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>Assignee: Allen Wittenauer
> Attachments: HADOOP-13397.DNC001.patch
>
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-10-11 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565125#comment-15565125
 ] 

Tsuyoshi Ozawa commented on HADOOP-13397:
-

> The OS image contains minimally glibc and various other GPL components.

In my understanding, glibc and GPL components are distributed via another tar 
ball(base image). If we create docker image for Hadoop, base image including 
GPL files seems to be not included.
I played with golang:1.6-onbuild image to confirm with this script: 
https://raw.githubusercontent.com/docker/docker/master/contrib/download-frozen-image-v2.sh
 :
{quote}
$ ./download-frozen-image-v2.sh /tmp/b golang:1.6-onbuild 
Downloading 'library/golang:1.6-onbuild@1.6-onbuild' (20 layers)...
\\ 
100.0%
\\ 
100.0%
\\ 
100.0%
\\ 
100.0%
\\ 
100.0%
\\ 
100.0%
\\ 
100.0%
\\ 
100.0%
\\ 
100.0%
\\ 
100.0%
\\ 
100.0%
\\ 
100.0%
\\ 
100.0%
\\ 
100.0%
\\ 
100.0%
\\ 
100.0%
\\ 
100.0%
\\ 
100.0%
\\ 
100.0%
\\ 
100.0%

Download of images into '/tmp/b' complete.

$ cd /tmp/b/870716423ca9b2b721debfc28c19e54b50245839fef2f364d38d3452f77515ed
$ tar xvf layer.tar 
x usr/
x usr/local/
x usr/local/bin/
x usr/local/bin/go-wrapper
{quote}

If we push docker image, it only includes diff: this is what I mentioned.



> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>Assignee: Allen Wittenauer
> Attachments: HADOOP-13397.DNC001.patch
>
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-10-07 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1470#comment-1470
 ] 

Allen Wittenauer commented on HADOOP-13397:
---

bq. It means our docker image only includes ASF-licensed binaries

That's not true.  The OS image contains minimally the Linux kernel which is GPL.

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>Assignee: Allen Wittenauer
> Attachments: HADOOP-13397.DNC001.patch
>
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-10-07 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15554268#comment-15554268
 ] 

Tsuyoshi Ozawa commented on HADOOP-13397:
-

[~aw] I got a comment in LEGAL-270:
{quote}
All official Apache releases are in source form; Docker images can be built 
from our official source releases, but are derived distributions analogous to, 
say, Debian packages.
{quote}

Binary distribution itself seems to be fine as you know. In addition to that, I 
checked the behaviour of Docker. it seems that Docker command try to download 
binary images described in "FROM" state in parallel. It means our docker image 
only includes ASF-licensed binaries, if we only includes compiled binary which 
passed release voting. Do you have any concern?

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>Assignee: Allen Wittenauer
> Attachments: HADOOP-13397.DNC001.patch
>
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-09-26 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15523478#comment-15523478
 ] 

Allen Wittenauer commented on HADOOP-13397:
---

Just an update...

I've been working out starting individual daemons in individual containers as 
well as the difficulties of what it means to actually use a single node docker 
container cluster.  It finally hit me the other day that I was effectively 
rewriting the sbin/start-all.sh script. So rather than build Yet Another 
version of that, I've been hacking on it to make it smarter.  For example, if 
the script is being run as root, check to see if yarn and hdfs users are 
defined.  If so, then launch start-hdfs and start-yarn as those users.

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>Assignee: Allen Wittenauer
> Attachments: HADOOP-13397.DNC001.patch
>
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-09-22 Thread Dima Spivak (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15514586#comment-15514586
 ] 

Dima Spivak commented on HADOOP-13397:
--

This is an important point, [~ozawa]. On the HBase side, we should be okay with 
legal because we don't distribute any images we create; they are used purely in 
our automation and never pushed to a user-accessible Registry. Definitely 
something to watch out for, though.

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>Assignee: Allen Wittenauer
> Attachments: HADOOP-13397.DNC001.patch
>
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-09-22 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513860#comment-15513860
 ] 

Tsuyoshi Ozawa commented on HADOOP-13397:
-

{quote}
a) I and I know others as well have some rather large licensing questions 
around Docker images. They effectively act as a binary distribution and it is 
very much against ASF rules to distribute GPL and other Category X components. 
It makes me extremely uncomfortable to move forward without some clarification 
from legal. (Yes, I know other ASF projects are publishing images on docker 
hub. Hopefully that means that there is a JIRA issue in the LEGAL project to 
point to.) This is a blocking issue that really needs to get clarified before 
further time investment.
{quote}

[~aw] [~dimaspivak] FYI, opened LEGAL-270.

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>Assignee: Allen Wittenauer
> Attachments: HADOOP-13397.DNC001.patch
>
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-09-04 Thread Klaus Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15462556#comment-15462556
 ] 

Klaus Ma commented on HADOOP-13397:
---

Any progress?

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>Assignee: Allen Wittenauer
> Attachments: HADOOP-13397.DNC001.patch
>
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-08-09 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413987#comment-15413987
 ] 

Allen Wittenauer commented on HADOOP-13397:
---

bq. I guess my concern then is creep of what belongs under Hadoop vs. Bigtop or 
some other project if we start talking about packaging. 

As a separate project, Bigtop can do what Bigtop wants. They could tie into 
this infrastructure to build Dockerfiles that contain all of their re-packaging 
if they want, but I doubt that will happen.

bq. I think this might end up not satisfying the kind of use cases people who 
see "Ooh, Hadoop in Docker is up on GitHub!" might be looking for. 

That might just be the case.  But my hunch is that it will:  the folks who are 
going to respond that way are likely already building their own Hadoop rather 
than getting whatever vendors or Bigtop feed them.

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>Assignee: Allen Wittenauer
> Attachments: HADOOP-13397.DNC001.patch
>
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-08-09 Thread Dima Spivak (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413896#comment-15413896
 ] 

Dima Spivak commented on HADOOP-13397:
--

I guess my concern then is creep of what belongs under Hadoop vs. Bigtop or 
some other project if we start talking about packaging. I totally appreciate 
where this idea comes from, since it has been so vendor-centric to this point, 
but I think this might end up not satisfying the kind of use cases people who 
see "Ooh, Hadoop in Docker is up on GitHub!" might be looking for. Just my two 
cents.

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>Assignee: Allen Wittenauer
> Attachments: HADOOP-13397.DNC001.patch
>
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-08-09 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413802#comment-15413802
 ] 

Allen Wittenauer commented on HADOOP-13397:
---

Everything built here is replaceable though.  For example, one group I talked 
to wants to configure with RPM and Chef.  So they would just replace the distro 
stub with RPMs and the config stub with something Chef-based.  The daemon can 
also get replaced to load the appropriate daemon in the container.  

Of course, that begs the question: what do they get out of this?  Well, two 
things:

* the community can set the default OS, Java, etc requirements in the OS chunk
* a standard command that provides that information in a consumable form for 
Docker

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>Assignee: Allen Wittenauer
> Attachments: HADOOP-13397.DNC001.patch
>
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-08-09 Thread Dima Spivak (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413736#comment-15413736
 ] 

Dima Spivak commented on HADOOP-13397:
--

Gotcha. I guess I just question that last use case because, unlike the 
micro-services that find themselves at home with something like Kubernetes or 
Docker Swarm, an all-in-one Hadoop image requires in-container configuration 
that seems to lends itself to more orchestration than those tools provide. I 
spent weeks playing with Kubernetes, Swarm, and Compose before recognizing that 
it just doesn't work well with how Hadoop handles configuration, but I trust 
your judgement. :)

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>Assignee: Allen Wittenauer
> Attachments: HADOOP-13397.DNC001.patch
>
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-08-08 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15412801#comment-15412801
 ] 

Allen Wittenauer commented on HADOOP-13397:
---

A few things:
* need to be able to replace how Hadoop gets installed:  tar ball vs rpm vs deb 
vs 
* need to be able to replace how Hadoop gets configured: single node vs 
multi-node vs. config management vs ...
* need a raw dockerfile because they want their pre-existing frameworks to 
manage the installation (e.g., Kubernetes mentioned above)

That last one is key: there are plenty of other tools already in existence that 
can manage containers (even Jenkins can do it!).  There's no real value in 
re-inventing that for a large segment of users.

HBASE-12721 looks great if one doesn't have anything else to manage or just 
wants a quick up 'n' down for something like testing.  But that's not the 
reality for a large chunk of docker-ized environments.  They want to plug the 
file in and go.

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>Assignee: Allen Wittenauer
> Attachments: HADOOP-13397.DNC001.patch
>
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-08-08 Thread Dima Spivak (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15412778#comment-15412778
 ] 

Dima Spivak commented on HADOOP-13397:
--

Seems like a lot of work, [~aw], compared to something that plugs into what we 
have over in HBASE-12721. What exactly is the concern with using it? There are 
no weird dependencies; a single user could independently build and deploy a 
cluster on any machine.

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>Assignee: Allen Wittenauer
> Attachments: HADOOP-13397.DNC001.patch
>
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-07-27 Thread Klaus Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396691#comment-15396691
 ] 

Klaus Ma commented on HADOOP-13397:
---

[~aw], thanks for your feedback. That's great if we have such image :).

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>Assignee: Allen Wittenauer
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-07-27 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396023#comment-15396023
 ] 

Allen Wittenauer commented on HADOOP-13397:
---

I had a long discussion yesterday with some folks about this topic, especially 
around how to make it something consumable for a wide variety of 
installation-types.  One of the big asks was to make it work as a 
self-contained Dockerfile (so no COPY/ADD, RUNs can only reference things 
already inside the image, etc, etc.) as much as possible to allow the 
Dockerfile to be used by some other service and/or the basis of adding more 
content.  This means if I'm using a configuration service such as bcfg2 or 
puppet, it would be able to put down the necessary components at docker build 
or docker run time.  If I'm using something that takes a supplied tar ball, 
then COPY is unavoidable. [1]  It also means it's not a "do everything 
imaginable" feature like HBASE-12721 .  One would still need something to 
actually launch the containers and give some control information.

I've been playing around a bit and have a simple prototype built based upon 
those discussions and what I've seen in Klaus' github repo. I'm going to clean 
it up and flesh it out a bit and probably post a patch in week or so, time 
permitting. (i.e., remove all the hard codes and start making it take 
options... haha.)

[1] I mean, we *could* do something like base64 encode the tar ball and 
extract, but that seems a little extreme. ;)

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>Assignee: Allen Wittenauer
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-07-26 Thread Klaus Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15393459#comment-15393459
 ] 

Klaus Ma commented on HADOOP-13397:
---

Hi all, any suggestion to move forward ?

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-07-21 Thread Klaus Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388824#comment-15388824
 ] 

Klaus Ma commented on HADOOP-13397:
---

I think  a dockerfile template is OK. We'd better to provide guidance to user 
how to run YARN in Docker.

And as I mentioned in dev list, there's also some issue I'd like to confirm 
with you: in hdfs-site.xml, 
"dfs.namenode.datanode.registration.ip-hostname-checkā€ is false; but it seems 
the master will
use host IP when connect back to datanode instead of container IP.

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-07-21 Thread Dima Spivak (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388406#comment-15388406
 ] 

Dima Spivak commented on HADOOP-13397:
--

Just as an FYI, we've got some work that's nearly done in HBASE-12721 that 
supports starting distributed HBase clusters on a single host using multiple 
Docker containers. We also get Hadoop working in this as a prerequisite to 
getting HBase up, so might be worth taking a look if we might be able to link 
up somewhere.

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-07-21 Thread Timothy St. Clair (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388249#comment-15388249
 ] 

Timothy St. Clair commented on HADOOP-13397:


I'm good with just "Dockerfile templates" in upstream, start simple and expand, 
from there downstream providers can work out their own logistics.  

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-07-21 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388200#comment-15388200
 ] 

Allen Wittenauer commented on HADOOP-13397:
---

A couple of things:

a) I and I know others as well have some rather large licensing questions 
around Docker images.  They effectively act as a binary distribution and it is 
very much against ASF rules to distribute GPL and other Category X components.  
 It makes me extremely uncomfortable to move forward without some clarification 
from legal.  (Yes, I know other ASF projects are publishing images on docker 
hub.  Hopefully that means that there is a JIRA issue in the LEGAL project to 
point to.)  This is a blocking issue that really needs to get clarified before 
further time investment.

b) I'm going to change the description in this issue from "Official image from 
Cloudera" to "Cloudera's image".  Cloudera can't make an "official image" for 
Apache Hadoop, so let's clear up any potential confusion before it starts.

c) Is this actually useful in reality?  The vast vast vast majority of Apache 
Hadoop deployments add a wide variety of additional components on top of Apache 
Hadoop to the point that even making a base image still seems like it wouldn't 
be particularly usable without downstream conflict resolution. It may be useful 
to make Dockerfile templates, but full blown images? Hmm.. I'm going to need 
some convincing.

d) Upon working with the existing Dockerfile and porting it over to support the 
ASF PowerPC build machines (HADOOP-13329) we need to be aware that we're going 
to need more than one Dockerfile per hardware platform.  We made that mistake 
with start-build-env.sh (which we'll fix as part of 13329), but we should avoid 
it here.   (We've gotten some poking from the ARM64 folks as well.)

e) This is going to hit upon the larger issue of distributed configuration 
management, which is going to be extremely tricky to make consumable, never 
mind what types of configurations are actually supported: security? persistent 
storage? Then there are client configs--which, it's worthwhile pointing out, 
not even the vendor tools handle particularly well.

f) I think a much more attainable goal to start is making a single Dockerfile 
that runs all of the Apache Hadoop daemons as a single node configuration. 
That's a highly desirable thing to have for a variety of reasons.  If there is 
still heavy interest in breaking it apart, it gives a base working example 
before proceeding further to tease out the various daemons.

> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Official image from Cloudera is the quickstart image: 
> https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13397) Add dockerfile for Hadoop

2016-07-21 Thread Timothy St. Clair (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387855#comment-15387855
 ] 

Timothy St. Clair commented on HADOOP-13397:


It would be ideal if apache/projects could publish artifacts, including Docker 
images, to hub as part of their release process. 



> Add dockerfile for Hadoop
> -
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Klaus Ma
>
> For now, there's no community version Dockerfile in Hadoop; most of docker 
> images are provided by vendor, e.g. 
> 1. Official image from Cloudera is the quickstart image: 
> https://hub.docker.com/r/cloudera/quickstart/
> 2.  From HortonWorks sequenceiq: 
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base: 
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version Dockerfile in 
> Hadoop, and here's some requirement:
> 1. Seperated docker image for master & agents, e.g. resource manager & node 
> manager
> 2. Default configuration to start master & agent instead of configurating 
> manually
> 3. Start Hadoop process as no-daemon
> Here's my dockerfile to start master/agent: 
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread : 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org