[jira] [Created] (HADOOP-16146) Make start-build-env.sh safe in case of misusage of DOCKER_INTERACTIVE_RUN

2019-02-25 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16146:
-

 Summary: Make start-build-env.sh safe in case of misusage of 
DOCKER_INTERACTIVE_RUN
 Key: HADOOP-16146
 URL: https://issues.apache.org/jira/browse/HADOOP-16146
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Elek, Marton
Assignee: Elek, Marton


[~aw] reported the problem in HDDS-891:
{quote}DOCKER_INTERACTIVE_RUN opens the door for users to set command line 
options to docker. Most notably, -c and -v and a few others that share one 
particular characteristic: they reference the file system. As soon as shell 
code hits the file system, it is no longer safe to assume space delimited 
options. In other words, -c /My Cool Filesystem/Docker Files/config.json or -v 
/c_drive/Program Files/Data:/data may be something a user wants to do, but the 
script now breaks because of the IFS assumptions.
{quote}
DOCKER_INTERACTIVE_RUN is used in jenkins to run the normal build process in 
docker. In case DOCKER_INTERACTIVE_RUN is set to empty, the docker container 
is started without the "-i -t" flags.

This can be improved by checking the value of the environment variable and 
allowing only a fixed set of values.
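A minimal sketch of such a whitelist check (illustrative only; the exact set of values accepted by the real start-build-env.sh may differ):

```shell
# Sketch (not the actual start-build-env.sh change): accept only a fixed
# set of values for DOCKER_INTERACTIVE_RUN instead of passing anything through.
DOCKER_INTERACTIVE_RUN="${DOCKER_INTERACTIVE_RUN-"-i -t"}"
case "$DOCKER_INTERACTIVE_RUN" in
  "-i -t" | "")
    # the only two supported modes: interactive terminal or fully detached
    echo "accepted: '$DOCKER_INTERACTIVE_RUN'"
    ;;
  *)
    echo "ERROR: unsupported DOCKER_INTERACTIVE_RUN value" >&2
    exit 1
    ;;
esac
```

Because arbitrary strings are rejected instead of being word-split into docker arguments, the IFS problem described in the quote cannot occur.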



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-16183) Use latest Yetus to support ozone specific build process

2019-03-13 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16183:
-

 Summary: Use latest Yetus to support ozone specific build process
 Key: HADOOP-16183
 URL: https://issues.apache.org/jira/browse/HADOOP-16183
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton
Assignee: Elek, Marton


In YETUS-816 the hadoop personality is improved to better support ozone 
specific changes.

Unfortunately the hadoop personality is part of the Yetus project and not the 
Hadoop project: we need a new yetus release or switch to an unreleased version.

In this patch I propose to use the latest commit from yetus (pinned to that 
fixed commit instead of updating all the time). 






[jira] [Resolved] (HADOOP-16183) Use latest Yetus to support ozone specific build process

2019-05-02 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton resolved HADOOP-16183.
---
   Resolution: Fixed
Fix Version/s: 0.5.0

> Use latest Yetus to support ozone specific build process
> 
>
> Key: HADOOP-16183
> URL: https://issues.apache.org/jira/browse/HADOOP-16183
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.5.0
>
>
> In YETUS-816 the hadoop personality is improved to better support ozone 
> specific changes.
> Unfortunately the hadoop personality is part of the Yetus project and not the 
> Hadoop project: we need a new yetus release or switch to an unreleased 
> version.
> In this patch I propose to use the latest commit from yetus (pinned to that 
> fixed commit instead of updating all the time). 






[jira] [Created] (HADOOP-16312) Remove dumb-init from hadoop-runner image

2019-05-13 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16312:
-

 Summary: Remove dumb-init from hadoop-runner image
 Key: HADOOP-16312
 URL: https://issues.apache.org/jira/browse/HADOOP-16312
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton


This issue is reported by [~eyang] in HDDS-1495.

I think it's better to discuss it under a separate issue as it's unrelated to 
HDDS-1495.

The original problem description from [~eyang]
{quote}Dumb-init is one way to always run a containerized program in the background 
and respawn the program when it fails. This is a poor man’s solution for 
keeping a program alive.


Cluster management software like Kubernetes or YARN has additional policy and 
logic to start the same docker container on a different node. Therefore, 
Dumb-init is not recommended for future Hadoop daemons; instead, allow cluster 
management software to decide where to start the container. Dumb-init 
for daemonizing the docker container will be removed, and changed to use entrypoint.sh. 
Docker provides the -d flag to daemonize a foreground process. Most of the management 
systems built on top of Docker (ie. Kitematic, Apache YARN, and Kubernetes) 
integrate with the Docker container in the foreground to aggregate the stdout and 
stderr output of the containerized program.
{quote}






[jira] [Created] (HADOOP-16338) Use exec in hadoop-runner base image instead of fork

2019-05-30 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16338:
-

 Summary: Use exec in hadoop-runner base image instead of fork
 Key: HADOOP-16338
 URL: https://issues.apache.org/jira/browse/HADOOP-16338
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton


[~eyang] suggested in HADOOP-16312 to use exec instead of the default fork in 
the starter.sh of the hadoop-runner image (docker-hadoop-runner branch).

Instead of
{code}
"$@"
{code}

use

{code}
exec "$@"
{code}
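The difference matters for containers: a plain "$@" forks a child process with a new PID, while exec replaces the shell, so the command keeps the shell's PID (PID 1 in a container) and receives signals directly. A quick demonstration, not part of the patch:

```shell
# Plain "$@": the command runs as a forked child with a new PID.
# exec "$@": the command replaces the shell and reuses its PID.
fork_pids=$(sh -c 'echo $$; sh -c "echo \$\$"')        # prints two different PIDs
exec_pids=$(sh -c 'echo $$; exec sh -c "echo \$\$"')   # prints the same PID twice
echo "fork: $fork_pids"
echo "exec: $exec_pids"
```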






[jira] [Resolved] (HADOOP-16312) Remove dumb-init from hadoop-runner image

2019-05-30 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton resolved HADOOP-16312.
---
Resolution: Won't Fix

Thanks [~eyang]. As we concluded that we don't need to remove dumb-init, I am 
closing this issue and have created HADOOP-16338 to use exec instead of the 
default fork.

> Remove dumb-init from hadoop-runner image
> -
>
> Key: HADOOP-16312
> URL: https://issues.apache.org/jira/browse/HADOOP-16312
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Priority: Major
>
> This issue is reported by [~eyang] in HDDS-1495.
> I think it's better to discuss it under a separate issue as it's unrelated to 
> HDDS-1495.
> The original problem description from [~eyang]
> {quote}Dumb-init is one way to always run a containerized program in the 
> background and respawn the program when it fails. This is a poor man’s 
> solution for keeping a program alive.
> Cluster management software like Kubernetes or YARN has additional policy 
> and logic to start the same docker container on a different node. Therefore, 
> Dumb-init is not recommended for future Hadoop daemons; instead, allow cluster 
> management software to decide where to start the container. Dumb-init 
> for daemonizing the docker container will be removed, and changed to use 
> entrypoint.sh. Docker provides the -d flag to daemonize a foreground process. Most of 
> the management systems built on top of Docker (ie. Kitematic, Apache YARN, 
> and Kubernetes) integrate with the Docker container in the foreground to aggregate 
> the stdout and stderr output of the containerized program.
> {quote}






[jira] [Resolved] (HADOOP-16092) Move the source of hadoop/ozone containers to the same repository

2019-06-03 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton resolved HADOOP-16092.
---
Resolution: Won't Fix

> Move the source of hadoop/ozone containers to the same repository
> -
>
> Key: HADOOP-16092
> URL: https://issues.apache.org/jira/browse/HADOOP-16092
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Priority: Major
>
> This is proposed by [~eyang] in 
> [this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
>  mailing thread.
> bq. Hadoop community can decide what is best for Hadoop.  My preference is to 
> remove ozone from source tree naming, if Ozone is intended to be subproject 
> of Hadoop for long period of time.  This enables Hadoop community to host 
> docker images for various subproject without having to check out several 
> source tree to trigger a grand build
> As of now the source of the hadoop docker images is stored in the hadoop git 
> repository (docker-* branches) for hadoop and in the hadoop-docker-ozone git 
> repository for ozone (all branches).
> As discussed in HDDS-851, the biggest challenge to solve here is the 
> mapping between git branches and dockerhub tags. It's not possible to use the 
> captured part of a github branch name.
> For example it's not possible to define a rule to build all the ozone-(.*) 
> branches and use a tag $1 for them. Without this support we need to create a 
> new mapping for all the releases manually (with the help of INFRA).
> Note: HADOOP-16091 can solve this problem as it doesn't require branch 
> mapping any more.






[jira] [Created] (HADOOP-14850) Read HttpServer2 resources directly from the source tree (if exists)

2017-09-08 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-14850:
-

 Summary: Read HttpServer2 resources directly from the source tree 
(if exists)
 Key: HADOOP-14850
 URL: https://issues.apache.org/jira/browse/HADOOP-14850
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 3.0.0-alpha4
Reporter: Elek, Marton
Assignee: Elek, Marton


Currently the Hadoop server components can't be started from an IDE during 
development. There are two reasons for that:

1. Some artifacts are in provided scope which are definitely needed to run the 
server (see HDFS-12197).

2. The src/main/webapp dir should be on the classpath (but it is not).

In this issue I suggest to fix the second problem by reading the web resources 
(html and css files) directly from the source tree and not from the classpath, 
but ONLY if the src/main/webapp dir exists. A similar approach exists in 
different projects (eg. in Spark).

With this patch the development of the web interfaces is significantly 
easier as the result can be checked immediately with a running server 
(without rebuild/restart). I used this patch during the development of the 
Ozone web interfaces.

As the original behaviour of the resource location has not been changed if 
"src/main/webapp" doesn't exist, I think it's quite safe. And as the method is 
called only once during the creation of the HttpServer2, there is also no change 
in performance.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14898) Create official Docker images for development and testing features

2017-09-22 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-14898:
-

 Summary: Create official Docker images for development and testing 
features 
 Key: HADOOP-14898
 URL: https://issues.apache.org/jira/browse/HADOOP-14898
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton
Assignee: Elek, Marton


This is the original mail from the mailing list:

{code}
TL;DR: I propose to create official hadoop images and upload them to the 
dockerhub.

GOAL/SCOPE: I would like to improve the existing documentation with easy-to-use 
docker based recipes to start hadoop clusters with various configurations.

The images could also be used to test experimental features. For example ozone 
could be tested easily with this compose file and configuration:

https://gist.github.com/elek/1676a97b98f4ba561c9f51fce2ab2ea6

Or even the configuration could be included in the compose file:

https://github.com/elek/hadoop/blob/docker-2.8.0/example/docker-compose.yaml

I would like to create separate example compose files for federation, ha, 
metrics usage, etc. to make it easier to try out and understand the features.

CONTEXT: There is an existing Jira 
https://issues.apache.org/jira/browse/HADOOP-13397
But it’s about a tool to generate production quality docker images (multiple 
types, in a flexible way). If there are no objections, I will create a separate 
issue to create simplified docker images for rapid prototyping and investigating 
new features, and register the branch on the dockerhub to create the images 
automatically.

MY BACKGROUND: I have been working with docker based hadoop/spark clusters for 
quite a while and have run them successfully in different environments 
(kubernetes, docker-swarm, nomad-based scheduling, etc.). My work is available 
from here: https://github.com/flokkr but those images handle more complex use 
cases (eg. instrumenting java processes with btrace, or reading/reloading 
configuration from consul).
 And IMHO in the official hadoop documentation it’s better to suggest using the 
official apache docker images and not external ones (which could change).
{code}

The next list enumerates the key decision points regarding docker image 
creation.

A. automated dockerhub build  / jenkins build

Docker images could be built on the dockerhub (a branch pattern should be 
defined for a github repository and the location of the Docker files) or could 
be built on a CI server and pushed.

The second one is more flexible (it's easier to create a matrix build, for 
example).
The first one has the advantage that we can get an additional flag on the 
dockerhub that the build is automated (and built from the source by the 
dockerhub).

The decision is easy as ASF supports the first approach: (see 
https://issues.apache.org/jira/browse/INFRA-12781?focusedCommentId=15824096&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15824096)

B. source: binary distribution or source build

The second question is about creating the docker image. One option is to build 
the software on the fly during the creation of the docker image; the other one 
is to use the binary releases.

I suggest to use the second approach as:

1. In that case the hadoop:2.7.3 image could contain exactly the same hadoop 
distribution as the downloadable one

2. We don't need to add development tools to the image, so the image can be 
smaller (which is important as the goal for this image is getting started as 
fast as possible)

3. The docker definition will be simpler (and easier to maintain)

This approach is usually used in other projects as well (I checked Apache 
Zeppelin and Apache Nutch)

C. branch usage

The other question is the location of the Dockerfile. It could be on the official 
source-code branches (branch-2, trunk, etc.) or we can create separate 
branches for the dockerhub (eg. docker/2.7 docker/2.8 docker/3.0)

With the first approach it's easier to find the docker images, but it's less 
flexible. For example if we had a Dockerfile on the source-code branch it should 
be used for every release (for example the Dockerfile from the tag 
release-3.0.0 should be used for the 3.0 hadoop docker image). In that case the 
release process is much harder: in case of a Dockerfile error (which could 
be tested on dockerhub only after the tagging), a new release would be needed 
after fixing the Dockerfile.

Another problem is that with tags it's not possible to improve the 
Dockerfiles. I can imagine that we would like to improve for example the 
hadoop:2.7 images (for example adding smarter startup scripts) while using 
exactly the same hadoop 2.7 distribution. 

Finally, with the tag based approach we can't create images for the older 
releases (2.8.1 for example)

So I suggest to create separate branches for the Dockerfiles.

D. Versions

We can create a separate branch for every version (2.7.1/2.7.2/2.7.3) or just 
for the main ver

[jira] [Resolved] (HADOOP-14162) Improve release scripts to automate missing steps

2017-11-09 Thread Elek, Marton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton resolved HADOOP-14162.
---
Resolution: Won't Fix

> Improve release scripts to automate missing steps
> -
>
> Key: HADOOP-14162
> URL: https://issues.apache.org/jira/browse/HADOOP-14162
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>
> According to the conversation on the dev mailing list one pain point of 
> release making is that even with the latest create-release script a lot of 
> steps are not automated.
> This Jira is about creating a script which guides the release manager through 
> the process:
> Goals:
>   * It would work even without the apache infrastructure: with custom 
> configuration (forked repositories/alternative nexus), it would be possible 
> to test the scripts even by a non-committer.  
>   * Every step which could be automated should be scripted (create git 
> branches, build, ...). If something can not be automated, an 
> explanation could be printed out, waiting for confirmation.
>   * Before dangerous steps (eg. bulk jira update) we can ask for confirmation 
> and explain the 
>   * The run should be idempotent (and there should be an option to continue 
> the release from any step).  
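The "explain and wait for confirmation" pattern described above could be sketched like this (illustrative only, not the actual release script; confirm and do_bulk_update are hypothetical names):

```shell
# Sketch: print an explanation before a dangerous step and wait for an
# explicit "y" before proceeding.
confirm() {
  printf '%s [y/N] ' "$1"
  read -r answer
  [ "$answer" = "y" ]
}

# Usage: confirm "Run the bulk JIRA update now?" && do_bulk_update
```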






[jira] [Resolved] (HADOOP-14160) Create dev-support scripts to do the bulk jira update required by the release process

2017-11-09 Thread Elek, Marton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton resolved HADOOP-14160.
---
Resolution: Won't Fix

> Create dev-support scripts to do the bulk jira update required by the release 
> process
> -
>
> Key: HADOOP-14160
> URL: https://issues.apache.org/jira/browse/HADOOP-14160
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>
> According to the conversation on the dev mailing list one pain point of 
> release making is the Jira administration.
> This issue is about creating new scripts to 
>  
>  * query apache jira about a possible release (remaining blockers, issues, 
> etc.)
>  * and do bulk changes (eg. bump fixVersions)






[jira] [Created] (HADOOP-15065) Make mapreduce specific GenericOptionsParser arguments optional

2017-11-22 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15065:
-

 Summary: Make mapreduce specific GenericOptionsParser arguments 
optional
 Key: HADOOP-15065
 URL: https://issues.apache.org/jira/browse/HADOOP-15065
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton
Priority: Minor


org.apache.hadoop.util.GenericOptionsParser is widely used to handle common 
arguments in all the command line applications.

Some of the common arguments are really generic:

{code}
-D <property=value>                      define a value for a given property
-fs <file:///|hdfs://namenode:port>      specify default filesystem URL to use, 
                                         overrides 'fs.defaultFS' property from 
                                         configurations.
-libjars <comma separated list of jars>  specify a comma-separated list of jar 
                                         files to be included in the classpath
{code}

But some are mapreduce specific:

{code}
-jt <local|resourcemanager:port>              specify a ResourceManager
-archives <comma separated list of archives>  specify a comma-separated list of 
                                              archives to be unarchived on the 
                                              compute machines
{code}

In the review of HDFS-12588 it was suggested to remove/turn off the mapreduce 
specific arguments if they are not required (for example when starting the 
namenode or datanode). 






[jira] [Created] (HADOOP-15084) Create docker images for latest stable hadoop2 build

2017-12-01 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15084:
-

 Summary: Create docker images for latest stable hadoop2 build
 Key: HADOOP-15084
 URL: https://issues.apache.org/jira/browse/HADOOP-15084
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Elek, Marton
Assignee: Elek, Marton









[jira] [Created] (HADOOP-15083) Create base image for running hadoop in docker containers

2017-12-01 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15083:
-

 Summary: Create base image for running hadoop in docker containers
 Key: HADOOP-15083
 URL: https://issues.apache.org/jira/browse/HADOOP-15083
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Elek, Marton
Assignee: Elek, Marton









[jira] [Created] (HADOOP-15122) Lock down version of doxia-module-markdown plugin

2017-12-15 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15122:
-

 Summary: Lock down version of doxia-module-markdown plugin
 Key: HADOOP-15122
 URL: https://issues.apache.org/jira/browse/HADOOP-15122
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Elek, Marton
Assignee: Elek, Marton


Since HADOOP-14364 we have a SNAPSHOT dependency in the main pom.xml:

{code}
+<dependency>
+  <groupId>org.apache.maven.doxia</groupId>
+  <artifactId>doxia-module-markdown</artifactId>
+  <version>1.8-SNAPSHOT</version>
+</dependency>
{code}

Most probably this was because some feature was missing from the doxia markdown 
module.

I propose to lock down the version and use a fixed instance instead of the 
moving snapshot version. 






[jira] [Created] (HADOOP-15258) Create example docker-compose file for documentation

2018-02-23 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15258:
-

 Summary: Create example docker-compose file for documentation
 Key: HADOOP-15258
 URL: https://issues.apache.org/jira/browse/HADOOP-15258
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Elek, Marton
Assignee: Elek, Marton


Another use case for docker is to use it in the documentation. For example in 
the HA documentation we can provide an example docker-compose file and 
configuration with all the required settings to get started easily with an 
HA cluster.

1. I would add an example to a documentation page
2. It will use the hadoop3 image (which contains the latest hadoop3) as the user 
of the documentation may not build hadoop themselves






[jira] [Created] (HADOOP-15259) Provide docker file for the development builds

2018-02-23 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15259:
-

 Summary: Provide docker file for the development builds
 Key: HADOOP-15259
 URL: https://issues.apache.org/jira/browse/HADOOP-15259
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Elek, Marton
Assignee: Elek, Marton


Another use case for using a docker image is creating a custom docker image 
(base image + custom hadoop build). The custom image could be used to easily 
test the hadoop build on an external dockerized cluster (eg. Kubernetes)






[jira] [Created] (HADOOP-15256) Create docker images for latest stable hadoop3 build

2018-02-23 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15256:
-

 Summary: Create docker images for latest stable hadoop3 build
 Key: HADOOP-15256
 URL: https://issues.apache.org/jira/browse/HADOOP-15256
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Elek, Marton
Assignee: Elek, Marton


Similar to the hadoop2 image we can provide a developer hadoop image which 
contains the latest hadoop from the binary release.






[jira] [Created] (HADOOP-15257) Provide example docker compose file for developer builds

2018-02-23 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15257:
-

 Summary: Provide example docker compose file for developer builds
 Key: HADOOP-15257
 URL: https://issues.apache.org/jira/browse/HADOOP-15257
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Elek, Marton
Assignee: Elek, Marton


This issue is about creating example docker-compose files which use the latest 
build from the hadoop-dist directory.

These docker-compose files would help to run a specific hadoop cluster based on 
the latest custom build without the need to build a customized docker image (by 
mounting hadoop from hadoop-dist into the container).






[jira] [Created] (HADOOP-15302) Enable DataNode/NameNode service plugins with Service Provider interface

2018-03-09 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15302:
-

 Summary: Enable DataNode/NameNode service plugins with Service 
Provider interface
 Key: HADOOP-15302
 URL: https://issues.apache.org/jira/browse/HADOOP-15302
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton
Assignee: Elek, Marton


HADOOP-5257 introduced ServicePlugin capabilities for the NameNode/DataNode. As 
of now they can be activated by configuration values. 

I propose to activate plugins with the Service Provider Interface. If a 
special service file is added to a jar, it would be enough to add the plugin jar 
to the classpath. It would help to add optional components to NameNode/DataNode 
by setting the classpath.

This is the same API which can be used in Java 9 to consume defined services.
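For illustration, a Service Provider Interface registration is just a provider-configuration file on the classpath. The plugin class name below is hypothetical, and the interface name assumes org.apache.hadoop.util.ServicePlugin:

```shell
# Sketch of what an SPI registration looks like on disk: a file named after
# the interface, containing the implementation class name, under
# META-INF/services in the plugin jar.
dir=$(mktemp -d)
mkdir -p "$dir/META-INF/services"
printf 'org.example.MyNameNodePlugin\n' \
  > "$dir/META-INF/services/org.apache.hadoop.util.ServicePlugin"
cat "$dir/META-INF/services/org.apache.hadoop.util.ServicePlugin"
```

Packaging this directory into the plugin jar is all that java.util.ServiceLoader needs to discover the implementation at runtime.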






[jira] [Created] (HADOOP-15339) Support additional key/value properties in JMX bean registration

2018-03-23 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15339:
-

 Summary: Support additional key/value properties in JMX bean 
registration
 Key: HADOOP-15339
 URL: https://issues.apache.org/jira/browse/HADOOP-15339
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Elek, Marton
Assignee: Elek, Marton


org.apache.hadoop.metrics2.util.MBeans.register is a utility function to 
register objects in the JMX registry with a given name prefix and name.

JMX supports additional key value pairs which can be part of the address 
of the jmx bean. For example: 
_java.lang:type=MemoryManager,name=CodeCacheManager_

Using this mechanism we can query a group of mbeans; for example we can add the 
same tag to similar mbeans from the namenode and datanode.

This patch adds a small modification to support custom key value pairs and also 
introduces a new unit test for the MBeans utility, which was missing until now.






[jira] [Created] (HADOOP-15340) Fix the RPC server name usage to provide information about the metrics

2018-03-23 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15340:
-

 Summary: Fix the RPC server name usage to provide information 
about the metrics
 Key: HADOOP-15340
 URL: https://issues.apache.org/jira/browse/HADOOP-15340
 Project: Hadoop Common
  Issue Type: Bug
  Components: common
Affects Versions: 3.2.0
Reporter: Elek, Marton
Assignee: Elek, Marton


In case of multiple RPC servers in the same JVM it's hard to identify the 
metric data. The only available information as of now is the port number.

A server name is also accepted in the constructor of Server.java but it's not 
used at all.

This patch fixes this behaviour:

 1. The server name is saved to a field in Server.java (the constructor 
signature is not changed)
 2. The server name is added as a tag to the metrics in RpcMetrics
 3. The naming convention for the servers is fixed.

About 3: if the server name is not defined, the current code tries to derive 
the name from the class name, which is not always an easy task as in some cases 
the server has a protobuf-generated dirty name which could also be an inner 
class.

The patch also improves the detection of the name (if it's not defined). It's a 
compatible change as the current name is not used at all.






[jira] [Created] (HADOOP-15352) Fix default local maven repository path in create-release script

2018-03-29 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15352:
-

 Summary: Fix default local maven repository path in create-release 
script 
 Key: HADOOP-15352
 URL: https://issues.apache.org/jira/browse/HADOOP-15352
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 3.1.0
Reporter: Elek, Marton
Assignee: Elek, Marton


I am testing the create-release script locally. In case MVNCACHE is not set, 
the local ~/.m2 is used, which is not good as the packages are downloaded to 
~/.m2/org/.../... instead of ~/.m2/repository/org/.../.../...
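A sketch of the intended behaviour (an assumed fix, not the actual patch): always point Maven's local repository at the repository subdirectory of the cache dir, so artifacts land under ~/.m2/repository/... even when MVNCACHE is unset.

```shell
# Derive the local repository path: fall back to ~/.m2 when MVNCACHE is
# unset, then always append /repository.
MVNCACHE="${MVNCACHE:-$HOME/.m2}"
MVN_REPO_LOCAL="$MVNCACHE/repository"
echo "would run: mvn -Dmaven.repo.local=$MVN_REPO_LOCAL ..."
```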








[jira] [Created] (HADOOP-15353) Bump default yetus version in the yetus-wrapper

2018-03-29 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15353:
-

 Summary: Bump default yetus version in the yetus-wrapper
 Key: HADOOP-15353
 URL: https://issues.apache.org/jira/browse/HADOOP-15353
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 3.1.0
Reporter: Elek, Marton
Assignee: Elek, Marton
 Attachments: HADOOP-15353.001.patch

The current precommit hook uses yetus 0.8.0-SNAPSHOT. The default version in 
the yetus-wrapper script is 0.4.0. It can be adjusted with HADOOP_YETUS_VERSION, 
but I suggest to set the default version to 0.7.0 to get results similar to the 
jenkins results locally without adjustments.






[jira] [Created] (HADOOP-15367) Update the initialization code in the docker hadoop-runner baseimage

2018-04-05 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15367:
-

 Summary: Update the initialization code in the docker 
hadoop-runner baseimage 
 Key: HADOOP-15367
 URL: https://issues.apache.org/jira/browse/HADOOP-15367
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Elek, Marton
Assignee: Elek, Marton


The hadoop-runner baseimage contains initialization code for both the HDFS 
namenode/datanode and the Ozone/Hdds scm/ksm.

The script name for the latter has changed (from oz to ozone), therefore we 
need to update the base image.

This commit would also be a test for the dockerhub automated build.

Please apply the patch on top of the _docker-hadoop-runner_ branch. 






[jira] [Created] (HADOOP-15369) Avoid usage of ${project.version} in parent pom

2018-04-06 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15369:
-

 Summary: Avoid usage of ${project.version} in parent pom
 Key: HADOOP-15369
 URL: https://issues.apache.org/jira/browse/HADOOP-15369
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 3.2.0
Reporter: Elek, Marton
Assignee: Elek, Marton


hadoop-project/pom.xml and hadoop-project-dist/pom.xml use the _${project.version}_ 
variable in dependencyManagement and plugin dependencies.

Unfortunately this does not work if we use a different version in a child 
project, as the ${project.version} variable is resolved *after* the inheritance.

From the [maven 
doc|https://maven.apache.org/guides/introduction/introduction-to-the-pom.html#Project_Inheritance]:

{quote}
For example, to access the project.version variable, you would reference it 
like so:

  ${project.version}

One factor to note is that these variables are processed after inheritance as 
outlined above. This means that if a parent project uses a variable, then its 
definition in the child, not the parent, will be the one eventually used.
{quote}

The community voted to keep ozone in-tree but use a different release cycle. To 
achieve this we need a different version for selected subprojects, therefore we 
can't use ${project.version} any more. 
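For illustration only (the artifact below is an example, not necessarily one of the affected entries), this is the pattern that breaks:

```xml
<!-- hadoop-project/pom.xml (illustrative sketch): ${project.version} here is
     resolved against the CHILD project's version after inheritance, so a
     subproject released with its own version would reference the wrong
     artifact version. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>${project.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```

The fix is to replace such occurrences with an explicit version property defined in the parent (for example a `hadoop.version` property), which interpolates to the same value in every child.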

 






[jira] [Created] (HADOOP-15539) Make start-build-env.sh usable in non-interactive mode

2018-06-14 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15539:
-

 Summary: Make start-build-env.sh usable in non-interactive mode
 Key: HADOOP-15539
 URL: https://issues.apache.org/jira/browse/HADOOP-15539
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Reporter: Elek, Marton
Assignee: Elek, Marton


The current start-build-env.sh in the project root is useful to start a new 
build environment, but it's not possible to start the build environment and run 
a command in one step.

We use the dockerized build environment on jenkins 
(https://builds.apache.org/job/Hadoop-trunk-ozone-acceptance/), which requires a 
small modification to optionally run start-build-env.sh in non-interactive mode 
and execute any command in the container.
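A minimal sketch of such a dispatch (the function and image name are placeholders, not the script's actual values; the docker invocation is echoed so the branching logic is visible): allocate a TTY only when no command is given.

```shell
# Sketch: interactive shell when invoked with no arguments, otherwise run
# the given command non-interactively. "hadoop-build-env" is an assumed
# image name, not the one the real script builds.
run_build_env() {
  if [ "$#" -gt 0 ]; then
    echo docker run --rm "hadoop-build-env" "$@"
  else
    echo docker run --rm -i -t "hadoop-build-env"
  fi
}

run_build_env mvn clean install
```

With this shape, invoking `./start-build-env.sh mvn clean install` would build inside the container and exit, which is exactly what a CI job needs.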






[jira] [Created] (HADOOP-15656) Support byteman in hadoop-runner baseimage

2018-08-09 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15656:
-

 Summary: Support byteman in hadoop-runner baseimage
 Key: HADOOP-15656
 URL: https://issues.apache.org/jira/browse/HADOOP-15656
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton
Assignee: Elek, Marton


[Byteman|http://byteman.jboss.org/] is an easy-to-use tool to instrument a java 
process with an agent string.

For example [this 
script|https://gist.githubusercontent.com/elek/0589a91b4d55afb228279f6c4f04a525/raw/8bb4e03de7397c8a9d9bb74a5ec80028b42575c4/hadoop.btm]
 defines a rule to print out all the hadoop rpc traffic to the standard output 
(which is extremely useful for testing/development).

This patch adds the byteman jar to the baseimage and defines a simple logic to 
add the agent instrumentation string to HADOOP_OPTS (optionally it can also 
download the byteman script from an url)
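The agent string follows the standard byteman -javaagent syntax; a sketch of the logic (the helper name and the jar location inside the image are assumptions):

```shell
# Sketch: build the byteman agent string to append to HADOOP_OPTS.
# /opt/byteman.jar is an assumed in-image location, overridable via BYTEMAN_JAR.
byteman_opts() {
  # $1: path of the .btm rule script (possibly a downloaded copy)
  jar="${BYTEMAN_JAR:-/opt/byteman.jar}"
  echo "-javaagent:${jar}=script:$1"
}

HADOOP_OPTS="${HADOOP_OPTS:-} $(byteman_opts /opt/hadoop.btm)"
echo "${HADOOP_OPTS}"
```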








[jira] [Created] (HADOOP-15673) Hadoop:3 image is missing from dockerhub

2018-08-15 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15673:
-

 Summary: Hadoop:3 image is missing from dockerhub
 Key: HADOOP-15673
 URL: https://issues.apache.org/jira/browse/HADOOP-15673
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Elek, Marton


Currently the apache/hadoop:3 image is missing from the dockerhub as the 
Dockerfile in docker-hadoop-3 branch contains the outdated 3.0.0 download url. 
It should be updated to the latest 3.1.1 url.






[jira] [Created] (HADOOP-15730) Add Ozone submodule to the hadoop.apache.org

2018-09-07 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15730:
-

 Summary: Add Ozone submodule to the hadoop.apache.org
 Key: HADOOP-15730
 URL: https://issues.apache.org/jira/browse/HADOOP-15730
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton
Assignee: Elek, Marton


The current hadoop.apache.org doesn't mention Ozone in the "Modules" section.

We can add something like this (or better):

{quote}
Hadoop Ozone is an object store for Hadoop, on top of Hadoop HDDS, which 
provides a low-level binary storage layer.
{quote}

We can also link to http://ozone.hadoop.apache.org







[jira] [Created] (HADOOP-15791) Remove Ozone related sources from the 3.2 branch

2018-09-25 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15791:
-

 Summary: Remove Ozone related sources from the 3.2 branch
 Key: HADOOP-15791
 URL: https://issues.apache.org/jira/browse/HADOOP-15791
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Elek, Marton
Assignee: Elek, Marton


As it is discussed at HDDS-341 and written in the original proposal of Ozone 
merge, we can remove all the ozone/hdds projects from the 3.2 release branch.

{quote}
 * On trunk (as opposed to release branches) HDSL will be a separate module in 
Hadoop's source tree. This will enable the HDSL to work on their trunk and the 
Hadoop trunk without making releases for every change.
  * Hadoop's trunk will only build HDSL if a non-default profile is enabled.
  * When Hadoop creates a release branch, the RM will delete the HDSL module 
from the branch.
{quote}







[jira] [Created] (HADOOP-15857) Remove ozonefs class name definition from core-default.xml

2018-10-16 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15857:
-

 Summary: Remove ozonefs class name definition from core-default.xml
 Key: HADOOP-15857
 URL: https://issues.apache.org/jira/browse/HADOOP-15857
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Reporter: Elek, Marton
Assignee: Elek, Marton


The Ozone file system is being renamed in HDDS-651 from o3:// to o3fs://, but 
branch-3.2 still contains a reference to o3://.

The easiest way to fix it is just to remove the fs.o3.impl definition from 
core-default.xml in branch-3.2, as since HDDS-654 the file system can be 
registered with the Service Provider Interface (META-INF/services...)
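With the SPI mechanism, the registration lives in a services file shipped inside the ozonefs jar instead of core-default.xml; a sketch (the implementation class name is an assumption):

```
# META-INF/services/org.apache.hadoop.fs.FileSystem (sketch)
org.apache.hadoop.fs.ozone.OzoneFileSystem
```

Hadoop's FileSystem loader discovers the scheme-to-class mapping from such files at runtime, so no core-default.xml entry is needed.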






[jira] [Reopened] (HADOOP-15339) Support additional key/value propereties in JMX bean registration

2018-10-29 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton reopened HADOOP-15339:
---

Since the commit we use this change from ozone/hdds and it worked well.

This change is required to have a working ozone/hdds webui as the shared code 
path tags the common jmx beans with generic key/value tags.

I reopen this issue and propose to backport it to branch-3.1 to make it easier 
to use hdds/ozone with older hadoop versions.
 # It's a small change
 # Backward compatible
 # Safe to use (no issue during the last 6 months)
 # No conflicts for cherry-pick.

 

> Support additional key/value propereties in JMX bean registration
> -
>
> Key: HADOOP-15339
> URL: https://issues.apache.org/jira/browse/HADOOP-15339
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HADOOP-15339.001.patch, HADOOP-15339.002.patch, 
> HADOOP-15339.003.patch
>
>
> org.apache.hadoop.metrics2.util.MBeans.register is a utility function to 
> register objects to the JMX registry with a given name prefix and name.
> JMX supports any additional key value pairs which could be part the the 
> address of the jmx bean. For example: 
> _java.lang:type=MemoryManager,name=CodeCacheManager_
> Using this method we can query a group of mbeans, for example we can add the 
> same tag to similar mbeans from namenode and datanode.
> This patch adds a small modification to support custom key value pairs and 
> also introduce a new unit test for MBeans utility which was missing until now.






[jira] [Created] (HADOOP-16003) Migrate the Hadoop jenkins jobs to use new gitbox urls

2018-12-13 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16003:
-

 Summary: Migrate the Hadoop jenkins jobs to use new gitbox urls
 Key: HADOOP-16003
 URL: https://issues.apache.org/jira/browse/HADOOP-16003
 Project: Hadoop Common
  Issue Type: Task
Reporter: Elek, Marton


As announced by the INFRA team, all the apache git repositories will be 
migrated to use gitbox. I created this jira to sync on the required steps to 
update the jenkins jobs, and to record the changes.

By default it could be as simple as changing the git url for all the jenkins 
jobs under the Hadoop view:

https://builds.apache.org/view/H-L/view/Hadoop/







[jira] [Created] (HADOOP-16063) Docker based pseudo-cluster definitions and test scripts for Hdfs/Yarn

2019-01-22 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16063:
-

 Summary: Docker based pseudo-cluster definitions and test scripts 
for Hdfs/Yarn
 Key: HADOOP-16063
 URL: https://issues.apache.org/jira/browse/HADOOP-16063
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Elek, Marton


During the recent releases of Apache Hadoop Ozone we had multiple experiments 
using docker/docker-compose to support the development of ozone.

As of now the hadoop-ozone distribution contains two directories in addition to 
the regular hadoop directories (bin, share/lib, etc.):
h3. compose

The ./compose directory of the distribution contains different types of 
pseudo-cluster definitions. Starting an ozone cluster is as easy as "cd 
compose/ozone && docker-compose up -d".

The clusters can also be scaled up and down (docker-compose scale datanode=3).

There are multiple cluster definitions for different use cases (for example 
ozone+s3 or hdfs+ozone).

The docker-compose files are based on the apache/hadoop-runner image, which is 
an "empty" image: it doesn't contain any hadoop distribution. Instead the 
current hadoop is used (the ../.. is mapped as a volume at /opt/hadoop).

With this approach it's very easy to 1) start a cluster from the distribution 
2) test any patch from the dev tree, as after any build a new cluster can be 
started easily (with multiple nodes and datanodes).
h3. smoketest

We also started to use a simple robotframework-based test suite (see the 
./smoketest directory). It's a high-level test definition, very similar to the 
smoketests which are executed manually by the contributors during a release 
vote.

But it's a formal definition to start clusters from different docker-compose 
definitions and execute simple shell scripts (and compare the output).

I believe that both approaches helped a lot during the development of ozone, 
and I propose to do the same improvements on the main hadoop distribution.

I propose to provide docker-compose based example cluster definitions for 
yarn/hdfs and for different use cases (simple hdfs, router based federation, 
etc.)

It can help to understand the different configurations and try out new features 
with a predefined config set.

Long term we can also add robot tests to help the release votes (basic 
wordcount/mr tests could be scripted)






[jira] [Created] (HADOOP-16064) Load configuration values from external sources

2019-01-22 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16064:
-

 Summary: Load configuration values from external sources
 Key: HADOOP-16064
 URL: https://issues.apache.org/jira/browse/HADOOP-16064
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton


This is a proposal to improve Configuration.java to load configuration from 
external sources (kubernetes config map, external http request, any cluster 
manager like ambari, etc.)

I will attach a patch to illustrate the proposed solution, but please comment 
on the concept first; the patch is just a PoC and not fully implemented.

*Goals:*
 * Load the configuration files (core-site.xml/hdfs-site.xml/...) from 
external locations instead of the classpath (the classpath remains the default)
 * Make the configuration loading extensible
 * Make it backward-compatible, with minimal change in the existing 
Configuration.java

*Use-cases:*

 1.) Load configuration from the namenode ([http://namenode:9878/conf]). With 
this approach only the namenode has to be configured; other components require 
only the url of the namenode.

 2.) Read configuration directly from a kubernetes config-map (or mesos)

 3.) Read configuration from any external cluster management (such as Apache 
Ambari or any equivalent)

 4.) As of now in the hadoop docker images we transform environment variables 
(such as HDFS-SITE.XML_fs.defaultFs) to configuration xml files with the help 
of a python script. With the proposed implementation it would be possible to 
read the configuration directly from the system environment variables.

*Problem:*

The existing Configuration.java can read configuration from multiple sources, 
but most of the time it's used to load predefined config names ("core-site.xml" 
and "hdfs-site.xml") without a configuration location. In this case the files 
are loaded from the classpath.

I propose to add an additional option to define the default location of 
core-site.xml and hdfs-site.xml (any configuration which is defined by string 
name), so that external sources can be used instead of the classpath.

The configuration loading requires an implementation + configuration (where the 
external configs are). We can't use the regular configuration to configure the 
config loader (chicken/egg).

I propose to use a new environment variable, HADOOP_CONF_SOURCE.

The environment variable could contain a URL, where the schema of the url 
selects the config source and all the other parts configure the access to 
the resource.

Examples:

HADOOP_CONF_SOURCE=hadoop-http://namenode:9878/conf

HADOOP_CONF_SOURCE=env://prefix

HADOOP_CONF_SOURCE=k8s://config-map-name

The ConfigurationSource interface can be as easy as:
{code:java}
import java.io.IOException;
import java.net.URI;
import java.util.List;
import java.util.Map;

/**
 * Interface to load hadoop configuration from a custom location.
 */
public interface ConfigurationSource {

  /**
   * Method will be called once with the defined configuration url.
   *
   * @param uri location of the external configuration source
   */
  void initialize(URI uri) throws IOException;

  /**
   * Method will be called to load a specific configuration resource.
   *
   * @param name of the configuration resource (eg. hdfs-site.xml)
   * @return List of the loaded configuration keys and values.
   */
  List<Map.Entry<String, String>> readConfiguration(String name);

}{code}
We can choose the right implementation based on the schema of the uri, with the 
Java Service Provider Interface mechanism 
(META-INF/services/org.apache.hadoop.conf.ConfigurationSource).

It can be done with minimal modification in Configuration.java (see the 
attached patch as an example).

The patch contains two example implementations:

*hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/location/Env.java*

This can load configuration from environment variables based on a naming 
convention (eg. HDFS-SITE.XML_hdfs.dfs.key=value)

*hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/location/HadoopWeb.java*

This implementation can load the configuration from the /conf servlet of any 
Hadoop component.

 






[jira] [Created] (HADOOP-16091) Create hadoop/ozone docker images with inline build process

2019-02-05 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16091:
-

 Summary: Create hadoop/ozone docker images with inline build 
process
 Key: HADOOP-16091
 URL: https://issues.apache.org/jira/browse/HADOOP-16091
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Elek, Marton


This is proposed by [~eyang] in 
[this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
 mailing thread.

bq. 1, 3. There are 38 Apache projects hosting docker images on Docker hub 
using Apache Organization.  By browsing Apache github mirror.  There are only 7 
projects using a separate repository for docker image build.  Popular projects 
official images are not from Apache organization, such as zookeeper, tomcat, 
httpd.  We may not disrupt what other Apache projects are doing, but it looks 
like inline build process is widely employed by majority of projects such as 
Nifi, Brooklyn, thrift, karaf, syncope and others.  The situation seems a bit 
chaotic for Apache as a whole.  However, Hadoop community can decide what is 
best for Hadoop.  My preference is to remove ozone from source tree naming, if 
Ozone is intended to be subproject of Hadoop for long period of time.  This 
enables Hadoop community to host docker images for various subproject without 
having to check out several source tree to trigger a grand build.  However, 
inline build process seems more popular than separated process.  Hence, I 
highly recommend making docker build inline if possible.

The main challenges are also discussed in the thread:

{code}
3. Technically it would be possible to add the Dockerfile to the source
tree and publish the docker image together with the release by the
release manager but it's also problematic:

  a) there is no easy way to stage the images for the vote
  c) it couldn't be flagged as automated on dockerhub
  d) It couldn't support the critical updates.

 * Updating existing images (for example in case of an ssl bug, rebuild
all the existing images with exactly the same payload but updated base
image/os environment)

 * Creating image for older releases (We would like to provide images,
for hadoop 2.6/2.7/2.7/2.8/2.9. Especially for doing automatic testing
with different versions).
{code}

The a) can be solved (as [~eyang] suggested) by using a personal docker image 
during the vote and publishing it to the dockerhub after the vote (in case the 
permission can be set by the INFRA).

Note: based on LEGAL-270 and the linked discussion, both approaches (inline 
build process / external build process) are compatible with the apache release.

Note: HDDS-851 and HADOOP-14898 contain more information about these problems.






[jira] [Created] (HADOOP-16092) Move the source of hadoop/ozone containers to the same repository

2019-02-05 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16092:
-

 Summary: Move the source of hadoop/ozone containers to the same 
repository
 Key: HADOOP-16092
 URL: https://issues.apache.org/jira/browse/HADOOP-16092
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton


This is proposed by [~eyang] in 
[this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
 mailing thread.

bq. Hadoop community can decide what is best for Hadoop.  My preference is to 
remove ozone from source tree naming, if Ozone is intended to be subproject of 
Hadoop for long period of time.  This enables Hadoop community to host docker 
images for various subproject without having to check out several source tree 
to trigger a grand build

As of now the source of the hadoop docker images is stored in the hadoop git 
repository (docker-* branches) for hadoop, and in the hadoop-docker-ozone git 
repository for ozone (all branches).

As discussed in HDDS-851, the biggest challenge to solve here is the mapping 
between git branches and dockerhub tags. It's not possible to use a captured 
part of a github branch name.

For example it's not possible to define a rule to build all the ozone-(.*) 
branches and use a tag $1 for them. Without this support we need to create a 
new mapping for all the releases manually (with the help of the INFRA).

Note: HADOOP-16091 can solve this problem as it doesn't require branch mapping 
any more.






[jira] [Created] (HADOOP-14160) Create dev-support scripts to do the bulk jira update required by the release process

2017-03-08 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-14160:
-

 Summary: Create dev-support scripts to do the bulk jira update 
required by the release process
 Key: HADOOP-14160
 URL: https://issues.apache.org/jira/browse/HADOOP-14160
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Reporter: Elek, Marton
Assignee: Elek, Marton


According to the conversation on the dev mailing list, one pain point of the 
release making is the Jira administration.

This issue is about creating new scripts to
 
 * query apache jira about a possible release (remaining blockers, issues, 
etc.)
 * and do bulk changes (eg. bump fixVersions)
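As an illustration only (the helper name and the JQL are assumed examples, not the actual scripts), the blocker query could be built against the standard JIRA REST search endpoint:

```shell
# Sketch: build the JIRA REST search url that lists the unresolved blockers
# of a given fixVersion. The JQL is an assumed example query.
blocker_query_url() {
  # $1: fix version, eg. 3.1.0
  jql="project = HADOOP AND fixVersion = $1 AND priority = Blocker AND resolution = Unresolved"
  # minimal url-encoding of spaces and '=' for the query parameter
  encoded=$(printf '%s' "$jql" | sed -e 's/ /%20/g' -e 's/=/%3D/g')
  echo "https://issues.apache.org/jira/rest/api/2/search?jql=${encoded}"
}

blocker_query_url 3.1.0
```

The resulting url can be fed to curl, and the JSON response drives both the report and the bulk-update step.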






[jira] [Created] (HADOOP-14162) Improve release scripts to automate missing steps

2017-03-08 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-14162:
-

 Summary: Improve release scripts to automate missing steps
 Key: HADOOP-14162
 URL: https://issues.apache.org/jira/browse/HADOOP-14162
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Reporter: Elek, Marton
Assignee: Elek, Marton


According to the conversation on the dev mailing list, one pain point of the 
release making is that even with the latest create-release script a lot of 
steps are not automated.

This Jira is about creating a script which guides the release manager through 
the process:

Goals:
  * It should work even without the apache infrastructure: with custom 
configuration (forked repositories/alternative nexus) it should be possible to 
test the scripts even as a non-committer.
  * Every step which can be automated should be scripted (create git 
branches, build, ...). If something can not be automated, an explanation should 
be printed out and the script should wait for confirmation.
  * Before dangerous steps (eg. bulk jira update) we can ask for confirmation 
and explain the step.
  * The run should be idempotent (and there should be an option to continue 
the release from any step).






[jira] [Created] (HADOOP-14163) Refactor existing hadoop site to use more usable static website generator

2017-03-08 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-14163:
-

 Summary: Refactor existing hadoop site to use more usable static 
website generator
 Key: HADOOP-14163
 URL: https://issues.apache.org/jira/browse/HADOOP-14163
 Project: Hadoop Common
  Issue Type: Improvement
  Components: site
Reporter: Elek, Marton
Assignee: Elek, Marton


From the dev mailing list:

"Publishing can be attacked via a mix of scripting and revamping the darned 
website. Forrest is pretty bad compared to the newer static site generators out 
there (e.g. need to write XML instead of markdown, it's hard to review a 
staging site because of all the absolute links, hard to customize, did I 
mention XML?), and the look and feel of the site is from the 00s. We don't 
actually have that much site content, so it should be possible to migrate to a 
new system."

This issue is about finding a solution to migrate the old site to a new, modern 
static site generator using a more contemporary theme.

Goals: 
 * existing links should work (or at least be redirected)
 * it should be easy to add more content required by a release automatically 
(most probably by creating separate markdown files)







[jira] [Created] (HADOOP-14164) Update the skin of maven-site during doc generation

2017-03-08 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-14164:
-

 Summary: Update the skin of maven-site during doc generation
 Key: HADOOP-14164
 URL: https://issues.apache.org/jira/browse/HADOOP-14164
 Project: Hadoop Common
  Issue Type: Improvement
  Components: documentation
Reporter: Elek, Marton
Assignee: Elek, Marton


Together with the improvements of the hadoop site (HADOOP-14163), I suggest 
improving the theme used by the maven-site plugin for all the hadoop 
documentation.

One possible option is using the reflow skin:

http://andriusvelykis.github.io/reflow-maven-skin/


