[jira] [Updated] (HADOOP-16372) Fix typo in DFSUtil getHttpPolicy method

2019-06-14 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-16372:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Fix typo in DFSUtil getHttpPolicy method
> 
>
> Key: HADOOP-16372
> URL: https://issues.apache.org/jira/browse/HADOOP-16372
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Dinesh Chitlangia
>Priority: Trivial
>
> [https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java#L1479]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16372) Fix typo in DFSUtil getHttpPolicy method

2019-06-14 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-16372:
--
Status: Patch Available  (was: Open)

> Fix typo in DFSUtil getHttpPolicy method
> 
>
> Key: HADOOP-16372
> URL: https://issues.apache.org/jira/browse/HADOOP-16372
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Dinesh Chitlangia
>Priority: Trivial
>
> [https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java#L1479]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16092) Move the source of hadoop/ozone containers to the same repository

2019-06-04 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1681#comment-1681
 ] 

Elek, Marton commented on HADOOP-16092:
---

Thanks @eric for the answer. I am open to any suggestion. 

For the record: hadoop-runner was created to support all of the subprojects 
(hdfs, mapreduce, yarn, ozone). The idea was discussed on the dev mailing 
lists (lazy consensus) and received positive feedback, the design doc (in a 
presentation format) was uploaded to the related JIRA (HADOOP-14898), and a youtube 
video was created to explain the design doc (!) and the key decisions.

(And we also asked a few developers for feedback, offline.) 

I wouldn't say that it was developed outside of the community, but I am open to 
any suggestion to improve a similar process in the future.

> Move the source of hadoop/ozone containers to the same repository
> -
>
> Key: HADOOP-16092
> URL: https://issues.apache.org/jira/browse/HADOOP-16092
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Priority: Major
>
> This is proposed by [~eyang] in 
> [this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
>  mailing thread.
> bq. Hadoop community can decide what is best for Hadoop.  My preference is to 
> remove ozone from source tree naming, if Ozone is intended to be subproject 
> of Hadoop for long period of time.  This enables Hadoop community to host 
> docker images for various subproject without having to check out several 
> source tree to trigger a grand build
> As of now, the sources of the hadoop docker images are stored in the hadoop git 
> repository (docker-* branches) for hadoop and in the hadoop-docker-ozone git 
> repository for ozone (all branches).
> As discussed in HDDS-851, the biggest challenge to solve here is the 
> mapping between git branches and dockerhub tags. It's not possible to use the 
> captured part of a github branch name.
> For example, it's not possible to define a rule to build all the ozone-(.*) 
> branches and use a tag $1 for them. Without this support we need to create a 
> new mapping for all the releases manually (with the help of INFRA).
> Note: HADOOP-16091 can solve this problem as it doesn't require branch 
> mapping any more.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16092) Move the source of hadoop/ozone containers to the same repository

2019-06-03 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16854446#comment-16854446
 ] 

Elek, Marton commented on HADOOP-16092:
---

BTW, based on your feedback, I will create an ozone-specific runner image 
(ozone-runner) to separate all the ozone-specific fragments from the common 
utilities. 

> Move the source of hadoop/ozone containers to the same repository
> -
>
> Key: HADOOP-16092
> URL: https://issues.apache.org/jira/browse/HADOOP-16092
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Priority: Major
>
> This is proposed by [~eyang] in 
> [this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
>  mailing thread.
> bq. Hadoop community can decide what is best for Hadoop.  My preference is to 
> remove ozone from source tree naming, if Ozone is intended to be subproject 
> of Hadoop for long period of time.  This enables Hadoop community to host 
> docker images for various subproject without having to check out several 
> source tree to trigger a grand build
> As of now, the sources of the hadoop docker images are stored in the hadoop git 
> repository (docker-* branches) for hadoop and in the hadoop-docker-ozone git 
> repository for ozone (all branches).
> As discussed in HDDS-851, the biggest challenge to solve here is the 
> mapping between git branches and dockerhub tags. It's not possible to use the 
> captured part of a github branch name.
> For example, it's not possible to define a rule to build all the ozone-(.*) 
> branches and use a tag $1 for them. Without this support we need to create a 
> new mapping for all the releases manually (with the help of INFRA).
> Note: HADOOP-16091 can solve this problem as it doesn't require branch 
> mapping any more.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-16092) Move the source of hadoop/ozone containers to the same repository

2019-06-03 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton resolved HADOOP-16092.
---
Resolution: Won't Fix

> Move the source of hadoop/ozone containers to the same repository
> -
>
> Key: HADOOP-16092
> URL: https://issues.apache.org/jira/browse/HADOOP-16092
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Priority: Major
>
> This is proposed by [~eyang] in 
> [this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
>  mailing thread.
> bq. Hadoop community can decide what is best for Hadoop.  My preference is to 
> remove ozone from source tree naming, if Ozone is intended to be subproject 
> of Hadoop for long period of time.  This enables Hadoop community to host 
> docker images for various subproject without having to check out several 
> source tree to trigger a grand build
> As of now, the sources of the hadoop docker images are stored in the hadoop git 
> repository (docker-* branches) for hadoop and in the hadoop-docker-ozone git 
> repository for ozone (all branches).
> As discussed in HDDS-851, the biggest challenge to solve here is the 
> mapping between git branches and dockerhub tags. It's not possible to use the 
> captured part of a github branch name.
> For example, it's not possible to define a rule to build all the ozone-(.*) 
> branches and use a tag $1 for them. Without this support we need to create a 
> new mapping for all the releases manually (with the help of INFRA).
> Note: HADOOP-16091 can solve this problem as it doesn't require branch 
> mapping any more.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16092) Move the source of hadoop/ozone containers to the same repository

2019-06-03 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16854441#comment-16854441
 ] 

Elek, Marton commented on HADOOP-16092:
---

Thank you very much for the answer, [~eyang]. This issue was created to move some 
sources from one branch to another git repository, but the discussion has 
changed, and if I understood well, it's not about this jira any more but about 
the release process of ozone. 

I am closing this issue because it seems we can't find any consensus. I am very 
sorry, but I don't see how it is possible. It seems that we can't agree even 
on the fundamental question of whether Ozone is a Hadoop subproject or not. (I 
think it is.)

bq. Convenience binary for hadoop-runner:latest does not have the same version 
number for Ozone 0.4.0 which is not compliant to Apache release policy. 

Our binary image is apache/ozone:0.4.0. It has the same version number, and in 
fact this is just a downstream distribution of the already created convenience 
binary.

It seems that you know about multiple problems related to the release process 
but not related to this jira. Would you be so kind as to create separate issues 
for them to get a more focused discussion? (I suggest using the HDDS project.) 
Thank you in advance. 

> Move the source of hadoop/ozone containers to the same repository
> -
>
> Key: HADOOP-16092
> URL: https://issues.apache.org/jira/browse/HADOOP-16092
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Priority: Major
>
> This is proposed by [~eyang] in 
> [this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
>  mailing thread.
> bq. Hadoop community can decide what is best for Hadoop.  My preference is to 
> remove ozone from source tree naming, if Ozone is intended to be subproject 
> of Hadoop for long period of time.  This enables Hadoop community to host 
> docker images for various subproject without having to check out several 
> source tree to trigger a grand build
> As of now, the sources of the hadoop docker images are stored in the hadoop git 
> repository (docker-* branches) for hadoop and in the hadoop-docker-ozone git 
> repository for ozone (all branches).
> As discussed in HDDS-851, the biggest challenge to solve here is the 
> mapping between git branches and dockerhub tags. It's not possible to use the 
> captured part of a github branch name.
> For example, it's not possible to define a rule to build all the ozone-(.*) 
> branches and use a tag $1 for them. Without this support we need to create a 
> new mapping for all the releases manually (with the help of INFRA).
> Note: HADOOP-16091 can solve this problem as it doesn't require branch 
> mapping any more.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16092) Move the source of hadoop/ozone containers to the same repository

2019-05-30 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851614#comment-16851614
 ] 

Elek, Marton commented on HADOOP-16092:
---

Thanks for explaining your opinion, [~eyang]. I am sorry to say but I have a 
slightly different view.

bq.  Hadoop-runner image is a distraction that serves no purpose to Hadoop 
development.

I am afraid that is not quite true. It's heavily used by Ozone, which is a 
subproject of Hadoop.

bq. Fictional use case of having ability to run Hadoop 2.7.3 with latest 
hadoop-runner image can not be supported in reality.

Sorry, but I wouldn't say fictional. I think it's a very valid use case to test 
ozonefs with older released versions (e.g. for HDDS-1525).

bq.  A new user may come in and said that he likes the hard coded uid, and 
current hadoop-runner:latest breaks his backward compatibility

I am not sure if I understand this well. No uid has been changed in 
hadoop-runner recently. Only the directories of the configs/logs have changed, 
and I think it's still backward compatible with 0.3.0/0.4.0.

But I agree with you that it would be better to create more fixed tags from 
hadoop-runner. This issue is about moving hadoop-runner to a separate 
repository where we can easily create multiple tags with different 
branches/tags if the right mapping is registered on dockerhub.

bq. By bringing hadoop-runner docker build process into regular Hadoop 
maintenance branches

We can't do this without refactoring the current way docker images are used in 
ozone development. This is part of a separate, open discussion.

> Move the source of hadoop/ozone containers to the same repository
> -
>
> Key: HADOOP-16092
> URL: https://issues.apache.org/jira/browse/HADOOP-16092
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Assignee: Gabor Bota
>Priority: Major
>
> This is proposed by [~eyang] in 
> [this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
>  mailing thread.
> bq. Hadoop community can decide what is best for Hadoop.  My preference is to 
> remove ozone from source tree naming, if Ozone is intended to be subproject 
> of Hadoop for long period of time.  This enables Hadoop community to host 
> docker images for various subproject without having to check out several 
> source tree to trigger a grand build
> As of now, the sources of the hadoop docker images are stored in the hadoop git 
> repository (docker-* branches) for hadoop and in the hadoop-docker-ozone git 
> repository for ozone (all branches).
> As discussed in HDDS-851, the biggest challenge to solve here is the 
> mapping between git branches and dockerhub tags. It's not possible to use the 
> captured part of a github branch name.
> For example, it's not possible to define a rule to build all the ozone-(.*) 
> branches and use a tag $1 for them. Without this support we need to create a 
> new mapping for all the releases manually (with the help of INFRA).
> Note: HADOOP-16091 can solve this problem as it doesn't require branch 
> mapping any more.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-16312) Remove dumb-init from hadoop-runner image

2019-05-30 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton resolved HADOOP-16312.
---
Resolution: Won't Fix

Thanks [~eyang]. As we concluded that we don't need to remove dumb-init, I am 
closing this issue and have created HADOOP-16338 to use exec instead of the 
default fork.

> Remove dumb-init from hadoop-runner image
> -
>
> Key: HADOOP-16312
> URL: https://issues.apache.org/jira/browse/HADOOP-16312
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Priority: Major
>
> This issue was reported by [~eyang] in HDDS-1495.
> I think it's better to discuss it under a separate issue as it's unrelated to 
> HDDS-1495.
> The original problem description from [~eyang]:
> {quote}Dumb-init is one way to always run a containerized program in the 
> background and respawn the program when the program fails. This is a poor man’s 
> solution for keeping a program alive.
> Cluster management software like Kubernetes or YARN has additional policy 
> and logic to start the same docker container on a different node. Therefore, 
> dumb-init is not recommended for future Hadoop daemons; instead, allow the 
> cluster management software to decide where to start the container. Dumb-init 
> for daemonizing the docker container will be removed, and changed to use 
> entrypoint.sh. Docker provides the -d flag to daemonize a foreground process. 
> Most of the management systems built on top of Docker (i.e. Kitematic, Apache 
> YARN, and Kubernetes) integrate with the Docker container in the foreground to 
> aggregate the stdout and stderr output of the containerized program.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-16338) Use exec in hadoop-runner base image instead of fork

2019-05-30 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16338:
-

 Summary: Use exec in hadoop-runner base image instead of fork
 Key: HADOOP-16338
 URL: https://issues.apache.org/jira/browse/HADOOP-16338
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton


[~eyang] suggested in HADOOP-16312 to use exec instead of the default fork in 
the starter.sh of the hadoop-runner image (docker-hadoop-runner branch).

Instead of
{code}
"$@"
{code}

use
{code}
exec "$@"
{code}
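
For illustration, a minimal sketch of why the exec form matters (the starter.sh 
content below is hypothetical, not the actual script): with exec, the started 
command replaces the shell and receives signals (e.g. SIGTERM from docker stop) 
directly, while a plain fork leaves bash sitting between the init process and 
the java process.

{code}
#!/usr/bin/env bash
# ... environment setup, config generation, etc. ...

# fork: bash stays the parent, so signals may stop at the shell
# "$@"

# exec: the command (e.g. java) replaces bash and receives signals directly
exec "$@"
{code}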



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16092) Move the source of hadoop/ozone containers to the same repository

2019-05-28 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16849387#comment-16849387
 ] 

Elek, Marton commented on HADOOP-16092:
---

bq. The current user experience has been a very poor one. There are a lot of 
people using Hadoop images on Docker Hub. Some images has over 5 million use, 
but very few people use Hadoop-runner image

Please don't mix the two things. Usability and popularity are two different 
things. 

bq.  This should be enough to point out the current model is a non-starter for 
most people.

No. This shows the *popularity* of our *developer* image. 

bq.  Binary must be in docker image for production usage, which has been 
documented in docker-compose website.

100% agree:

*  apache/hadoop-runner --> this is for development and POC. This is just a 
*base* image. No ozone/hadoop inside.
* apache/hadoop --> this is the real product image. Yes, it contains the hadoop 
binary.

> Move the source of hadoop/ozone containers to the same repository
> -
>
> Key: HADOOP-16092
> URL: https://issues.apache.org/jira/browse/HADOOP-16092
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Assignee: Gabor Bota
>Priority: Major
>
> This is proposed by [~eyang] in 
> [this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
>  mailing thread.
> bq. Hadoop community can decide what is best for Hadoop.  My preference is to 
> remove ozone from source tree naming, if Ozone is intended to be subproject 
> of Hadoop for long period of time.  This enables Hadoop community to host 
> docker images for various subproject without having to check out several 
> source tree to trigger a grand build
> As of now, the sources of the hadoop docker images are stored in the hadoop git 
> repository (docker-* branches) for hadoop and in the hadoop-docker-ozone git 
> repository for ozone (all branches).
> As discussed in HDDS-851, the biggest challenge to solve here is the 
> mapping between git branches and dockerhub tags. It's not possible to use the 
> captured part of a github branch name.
> For example, it's not possible to define a rule to build all the ozone-(.*) 
> branches and use a tag $1 for them. Without this support we need to create a 
> new mapping for all the releases manually (with the help of INFRA).
> Note: HADOOP-16091 can solve this problem as it doesn't require branch 
> mapping any more.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16312) Remove dumb-init from hadoop-runner image

2019-05-27 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848669#comment-16848669
 ] 

Elek, Marton commented on HADOOP-16312:
---

Thank you for explaining it, [~eyang].

bq. Dumb-init only works one way to push container into background execution

Can you please explain it in more detail? I think dumb-init executes 
subprocesses in the foreground, but I may be wrong.

bq. Bash between dumb-init and java will absorb all signal communications 
between dumb-init and java.

Are you sure? Do you have any method to prove it? According to my tests, 
dumb-init signals all the child processes in the hierarchy.

bq. it is better to start the execution in foreground. 

Correct me if I am wrong, but this is exactly what we are doing.
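
For what it's worth, one way to check the signal forwarding claim is a local 
test along these lines (a sketch, assuming dumb-init is installed; the pgrep 
filter is illustrative):

{code}
# terminal 1: run a bash child under dumb-init that traps SIGTERM
dumb-init bash -c 'trap "echo child got TERM" TERM; sleep 1000 & wait'

# terminal 2: send SIGTERM to the dumb-init process
kill -TERM $(pgrep -f 'dumb-init bash')
# if dumb-init forwards signals to its child's process group, the trap in
# the bash child fires and "child got TERM" is printed
{code}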

> Remove dumb-init from hadoop-runner image
> -
>
> Key: HADOOP-16312
> URL: https://issues.apache.org/jira/browse/HADOOP-16312
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Priority: Major
>
> This issue was reported by [~eyang] in HDDS-1495.
> I think it's better to discuss it under a separate issue as it's unrelated to 
> HDDS-1495.
> The original problem description from [~eyang]:
> {quote}Dumb-init is one way to always run a containerized program in the 
> background and respawn the program when the program fails. This is a poor man’s 
> solution for keeping a program alive.
> Cluster management software like Kubernetes or YARN has additional policy 
> and logic to start the same docker container on a different node. Therefore, 
> dumb-init is not recommended for future Hadoop daemons; instead, allow the 
> cluster management software to decide where to start the container. Dumb-init 
> for daemonizing the docker container will be removed, and changed to use 
> entrypoint.sh. Docker provides the -d flag to daemonize a foreground process. 
> Most of the management systems built on top of Docker (i.e. Kitematic, Apache 
> YARN, and Kubernetes) integrate with the Docker container in the foreground to 
> aggregate the stdout and stderr output of the containerized program.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16092) Move the source of hadoop/ozone containers to the same repository

2019-05-27 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848668#comment-16848668
 ] 

Elek, Marton commented on HADOOP-16092:
---

Thank you very much for the answer, [~eyang]. 

1. Yes, I would like to create a convenience packaging of the already released 
hadoop 2.7.3. It doesn't involve any build step; this is just about packaging 
the existing voted and released artifacts. There is no custom build here, just 
packaging (this is somewhat similar to creating a deb package with bigtop).

2. I think you may have misunderstood my lines. Imagine that we create 
ozone:0.3.0 based on the centos image. The centos base image may be updated in 
case of a security issue. In this case I suggest updating the centos layer but 
not the Ozone in the ozone:0.3.0 image.

3. AFAIK the official release artifact is the source distribution. I can't see 
any problem if the convenience packaging is created after the announcement. You 
can also check 0.3.0: it also worked well (without your complaint). Or is it a 
problem to publish the staged maven repository after the announcement? 
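
To make point 1 concrete, a minimal "packaging only" sketch (the Dockerfile 
below is hypothetical; the base image and paths are assumptions): it downloads 
the already voted and released artifact and involves no build step at all.

{code}
FROM centos:7
# package the released artifact as-is; nothing is compiled here
RUN curl -LSs https://archive.apache.org/dist/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz \
      | tar -xz -C /opt \
 && ln -s /opt/hadoop-2.7.3 /opt/hadoop
{code}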

> Move the source of hadoop/ozone containers to the same repository
> -
>
> Key: HADOOP-16092
> URL: https://issues.apache.org/jira/browse/HADOOP-16092
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Assignee: Gabor Bota
>Priority: Major
>
> This is proposed by [~eyang] in 
> [this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
>  mailing thread.
> bq. Hadoop community can decide what is best for Hadoop.  My preference is to 
> remove ozone from source tree naming, if Ozone is intended to be subproject 
> of Hadoop for long period of time.  This enables Hadoop community to host 
> docker images for various subproject without having to check out several 
> source tree to trigger a grand build
> As of now, the sources of the hadoop docker images are stored in the hadoop git 
> repository (docker-* branches) for hadoop and in the hadoop-docker-ozone git 
> repository for ozone (all branches).
> As discussed in HDDS-851, the biggest challenge to solve here is the 
> mapping between git branches and dockerhub tags. It's not possible to use the 
> captured part of a github branch name.
> For example, it's not possible to define a rule to build all the ozone-(.*) 
> branches and use a tag $1 for them. Without this support we need to create a 
> new mapping for all the releases manually (with the help of INFRA).
> Note: HADOOP-16091 can solve this problem as it doesn't require branch 
> mapping any more.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16312) Remove dumb-init from hadoop-runner image

2019-05-17 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16842393#comment-16842393
 ] 

Elek, Marton commented on HADOOP-16312:
---

Thanks to explain it [~eyang]

bq. The process tree will look like /dumb-init  java

Do you suggest to keep dumb-init and use exec?

> Remove dumb-init from hadoop-runner image
> -
>
> Key: HADOOP-16312
> URL: https://issues.apache.org/jira/browse/HADOOP-16312
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Priority: Major
>
> This issue is reported by [~eyang] in HDDS-1495.
> I think it's better to discuss under a separated issue as it's unrelated to 
> HDDS-1495.
> The original problem description from [~eyang]
> {quote}Dumb-init  is one way to always run contaized program in the 
> background and respawn the program when program fails. This is poor man’s 
> solution for keeping program alive.
> Cluster management software like Kubernetes or YARN have additional policy 
> and logic to start the same docker container on a different node. Therefore, 
> Dumb-init is not recommended for future Hadoop daemons instead allow cluster 
> management software to make decision where to start the container. Dumb-init 
> for demonize docker container will be removed, and change to use 
> entrypoint.sh Docker provides -d flag to demonize foreground process. Most of 
> the management system built on top of Docker, (ie. Kitematic, Apache YARN, 
> and Kubernetes) integrates with Docker container at foreground to  aggregate 
> stdout and stderr output of the containerized program.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16092) Move the source of hadoop/ozone containers to the same repository

2019-05-17 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16842385#comment-16842385
 ] 

Elek, Marton commented on HADOOP-16092:
---

bq. Elek, Marton does this address your concerns?

Thank you for the question. It doesn't address my concerns. 

 * The release process of older versions (e.g. creating a hadoop 2.7.3 image) is 
not addressed.
 * The option to update the underlying operating system of the containers is 
not addressed.

bq. As I predicted, the separate source code repository release model does not 
work. We did not build and release hadoop-docker-ozone for Ozone 0.4.0 release

It worked well. You can test it:

{code}
docker run -p 9878:9878 apache/ozone:0.4.0
{code}

> Move the source of hadoop/ozone containers to the same repository
> -
>
> Key: HADOOP-16092
> URL: https://issues.apache.org/jira/browse/HADOOP-16092
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Assignee: Gabor Bota
>Priority: Major
>
> This is proposed by [~eyang] in 
> [this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
>  mailing thread.
> bq. Hadoop community can decide what is best for Hadoop.  My preference is to 
> remove ozone from source tree naming, if Ozone is intended to be subproject 
> of Hadoop for long period of time.  This enables Hadoop community to host 
> docker images for various subproject without having to check out several 
> source tree to trigger a grand build
> As of now, the sources of the hadoop docker images are stored in the hadoop git 
> repository (docker-* branches) for hadoop and in the hadoop-docker-ozone git 
> repository for ozone (all branches).
> As discussed in HDDS-851, the biggest challenge to solve here is the 
> mapping between git branches and dockerhub tags. It's not possible to use the 
> captured part of a github branch name.
> For example, it's not possible to define a rule to build all the ozone-(.*) 
> branches and use a tag $1 for them. Without this support we need to create a 
> new mapping for all the releases manually (with the help of INFRA).
> Note: HADOOP-16091 can solve this problem as it doesn't require branch 
> mapping any more.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16092) Move the source of hadoop/ozone containers to the same repository

2019-05-14 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839240#comment-16839240
 ] 

Elek, Marton commented on HADOOP-16092:
---

I am a little confused, as this issue was originally suggested by you, but if I 
understood well, you don't like it any more.

bq. Any concerns [~elek]?

Thank you very much for your question, [~eyang]. Yes, I have concerns and wrote 
down the concerns on the mailing list 

https://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201903.mbox/%3CF26192BF-45F3-4A64-9758-F4BD3EA6FDB4%40hortonworks.com%3E

and I mentioned them multiple times in HDDS-1495 and HDDS-1458.

If they are not clear, I am happy to repeat them and will try to make them 
clearer, but I prefer to continue the discussion there to make it easier to 
follow.

> Move the source of hadoop/ozone containers to the same repository
> -
>
> Key: HADOOP-16092
> URL: https://issues.apache.org/jira/browse/HADOOP-16092
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Assignee: Gabor Bota
>Priority: Major
>
> This is proposed by [~eyang] in 
> [this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
>  mailing thread.
> bq. Hadoop community can decide what is best for Hadoop.  My preference is to 
> remove ozone from source tree naming, if Ozone is intended to be subproject 
> of Hadoop for long period of time.  This enables Hadoop community to host 
> docker images for various subproject without having to check out several 
> source tree to trigger a grand build
> As of now, the sources of the hadoop docker images are stored in the hadoop git 
> repository (docker-* branches) for hadoop and in the hadoop-docker-ozone git 
> repository for ozone (all branches).
> As discussed in HDDS-851, the biggest challenge to solve here is the 
> mapping between git branches and dockerhub tags. It's not possible to use the 
> captured part of a github branch name.
> For example, it's not possible to define a rule to build all the ozone-(.*) 
> branches and use a tag $1 for them. Without this support we need to create a 
> new mapping for all the releases manually (with the help of INFRA).
> Note: HADOOP-16091 can solve this problem as it doesn't require branch 
> mapping any more.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16312) Remove dumb-init from hadoop-runner image

2019-05-13 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838462#comment-16838462
 ] 

Elek, Marton commented on HADOOP-16312:
---

First of all, let me clarify that we are talking about this project: 
[https://github.com/Yelp/dumb-init]
{quote}Dumb-init is one way to always run a containerized program in the 
background and respawn the program when the program fails
{quote}
Would you be so kind, [~eyang], as to explain it in more detail? Can you please 
give me an example of how the scm process is _respawned_ in the ozone compose 
clusters (where hadoop-runner is used together with dumb-init)?

I ask only because I have a slightly different view of dumb-init (which may be 
wrong).

AFAIK dumb-init solves the signal handling problem, not the respawn problem. 
Without dumb-init the containers can't be stopped gracefully: processes will be 
killed by the docker daemon after 10 seconds with "kill -9".
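
For context, this is the docker behavior being referenced (the container name 
below is illustrative): docker stop sends SIGTERM to PID 1 and, if the container 
has not exited when the grace period ends, kills it with SIGKILL.

{code}
# SIGTERM goes to PID 1; without an init (or exec) forwarding it to java,
# the process is killed with SIGKILL after the 10 second grace period
docker stop --time 10 ozone-scm
{code}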

> Remove dumb-init from hadoop-runner image
> -
>
> Key: HADOOP-16312
> URL: https://issues.apache.org/jira/browse/HADOOP-16312
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Priority: Major
>
> This issue is reported by [~eyang] in HDDS-1495.
> I think it's better to discuss under a separated issue as it's unrelated to 
> HDDS-1495.
> The original problem description from [~eyang]
> {quote}Dumb-init  is one way to always run contaized program in the 
> background and respawn the program when program fails. This is poor man’s 
> solution for keeping program alive.
> Cluster management software like Kubernetes or YARN have additional policy 
> and logic to start the same docker container on a different node. Therefore, 
> Dumb-init is not recommended for future Hadoop daemons instead allow cluster 
> management software to make decision where to start the container. Dumb-init 
> for demonize docker container will be removed, and change to use 
> entrypoint.sh Docker provides -d flag to demonize foreground process. Most of 
> the management system built on top of Docker, (ie. Kitematic, Apache YARN, 
> and Kubernetes) integrates with Docker container at foreground to  aggregate 
> stdout and stderr output of the containerized program.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-16312) Remove dumb-init from hadoop-runner image

2019-05-13 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16312:
-

 Summary: Remove dumb-init from hadoop-runner image
 Key: HADOOP-16312
 URL: https://issues.apache.org/jira/browse/HADOOP-16312
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton


This issue was reported by [~eyang] in HDDS-1495.

I think it's better to discuss it under a separate issue as it's unrelated to 
HDDS-1495.

The original problem description from [~eyang]:
{quote}Dumb-init is one way to always run a containerized program in the 
background and respawn the program when the program fails. This is a poor man’s 
solution for keeping a program alive.


Cluster management software like Kubernetes or YARN has additional policy and 
logic to start the same docker container on a different node. Therefore, 
dumb-init is not recommended for future Hadoop daemons; instead, allow the 
cluster management software to decide where to start the container. Dumb-init 
for daemonizing the docker container will be removed, and changed to use 
entrypoint.sh. Docker provides the -d flag to daemonize a foreground process. 
Most of the management systems built on top of Docker (i.e. Kitematic, Apache 
YARN, and Kubernetes) integrate with the Docker container in the foreground to 
aggregate the stdout and stderr output of the containerized program.
{quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16302) Fix typo on Hadoop Site Help dropdown menu

2019-05-13 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838456#comment-16838456
 ] 

Elek, Marton commented on HADOOP-16302:
---

Oh, I just realized that Ajay regenerated the site during the build of Ozone. 
Everything is fine without doing anything more.

Anyway, thanks for the fix, [~dineshchitlangia] (I am just wondering what a 
Sponsorshop looks like ;) )

> Fix typo on Hadoop Site Help dropdown menu
> --
>
> Key: HADOOP-16302
> URL: https://issues.apache.org/jira/browse/HADOOP-16302
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: site
>Affects Versions: asf-site
>Reporter: Dinesh Chitlangia
>Assignee: Dinesh Chitlangia
>Priority: Minor
> Attachments: Screen Shot 2019-05-07 at 11.57.01 PM.png
>
>
> On hadoop.apache.org the Help tab on the top menu bar has Sponsorship spelt as 
> Sponsorshop.
> This jira aims to fix this typo.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16302) Fix typo on Hadoop Site Help dropdown menu

2019-05-13 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838414#comment-16838414
 ] 

Elek, Marton commented on HADOOP-16302:
---

Yes, you should commit both the source change and the rendered (created by 
hugo) version. There is no automatic build as of now.

I will take care of this.
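
Since there is no automatic build, publishing a change means committing both the 
hugo sources and the rendered output, roughly like this (a sketch; the paths are 
assumptions about the site repository layout):

{code}
hugo                   # regenerate the rendered site from the sources
git add content/       # the edited sources (path illustrative)
git add public/        # the rendered output (path illustrative)
git commit -m "HADOOP-16302. Fix typo on Hadoop Site Help dropdown menu"
{code}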

> Fix typo on Hadoop Site Help dropdown menu
> --
>
> Key: HADOOP-16302
> URL: https://issues.apache.org/jira/browse/HADOOP-16302
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: site
>Affects Versions: asf-site
>Reporter: Dinesh Chitlangia
>Assignee: Dinesh Chitlangia
>Priority: Minor
> Attachments: Screen Shot 2019-05-07 at 11.57.01 PM.png
>
>
> On hadoop.apache.org the Help tab on the top menu bar has Sponsorship spelt as 
> Sponsorshop.
> This jira aims to fix this typo.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-16091) Create hadoop/ozone docker images with inline build process

2019-05-06 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16833654#comment-16833654
 ] 

Elek, Marton edited comment on HADOOP-16091 at 5/6/19 9:41 AM:
---

bq. This is how maven is designed to allow each sub-module to build 
independently. This allows reducing iteration time on each component instead of 
doing the full build each time. The k8s-dev solution has a conflict of 
interests in maven design. Part of Maven design is to release one binary per 
project using maven release:release plugin.

I wouldn't like to create an additional thread here, but I think the 
release:release goal is not part of the fundamental design of maven. This is 
just a maven plugin which can be replaced by a better release plugin or other 
processes. (I would say that life-cycle/goal bindings or profiles are part of 
the design.)

BTW I think the "release:release" plugin has some design problems, but that's 
another story (I prefer not to use it, for example because the created tags 
are pushed too early).

bq. By using the tar layout stitching temp space, it saves space during the 
build. However, it creates a inseparable process for building tarball and 
docker in maven because the temp directory is not in maven cache. This means 
the tarball and docker image must be built together, and only one of them can 
be deposited to maven repository. Hence, it takes more time to reiterate just 
the docker part. It is not good for developer that only work on docker and not 
the tarball. 

I am not sure if I understood your concern, but I think tar file creation and 
docker file creation can be easily separated by moving the tar file creation to 
the dist profile and keeping the k8s-dev profile as is. I am +1 for this suggestion.

Do you have any other problem with the k8s-dev approach?

bq. Symlink can be used to make /opt/ozone > /opt/ozone-${project.version}. 
This is the practice that Hadoop use to avoid versioned directory while 
maintain ability to swap binaries. We should keep symlink practice for config 
files to reference version neutral location. I think we have agreement on the 
base image. This also allow us to use RUN directive to make any post tarball 
process required in docker build.

Let me be more precise: Hadoop doesn't use symlinks AFAIK. Ambari, bigtop and the 
Hortonworks/Cloudera distributions use symlinks to manage multiple versions of 
hadoop.

Sorry if it seems pedantic. I learned from [Eugenia 
Cheng|http://eugeniacheng.com/math/books/] that the difference between pedantry 
and precision is illumination. I wrote it just because I think it's very 
important that the symlinks were introduced to manage versions in *on-prem* 
clusters.

I think the containerized world is different. For multiple versions we need to 
use different containers; therefore we don't need to add the version *inside* 
the containers any more.

Usually I don't think examples are good arguments (I think it's more 
important to find the right solution than to follow existing practices), 
but I checked the spark images (which can be created by bin/docker-image-tool.sh 
from the spark distribution) and they also use /opt/spark. (But I am fine with 
using /opt/apache/ozone if you prefer it. I like the apache subdir.)


was (Author: elek):

bq. This is how maven is designed to allow each sub-module to build 
independently. This allows reducing iteration time on each component instead of 
doing the full build each time. The k8s-dev solution has a conflict of 
interests in maven design. Part of Maven design is to release one binary per 
project using maven release:release plugin.

I wouldn't like to create an additional thread here, but I think the 
release:release goal is not part of the fundamental design of maven. This is 
just a maven plugin which can be replaced by a better release plugin or other 
processes. (I would say that life-cycle/goal bindings or profiles are part of 
the design.)

BTW I think the "release:release" plugin has some design problems, but that's 
another story (I prefer not to use it, for example because the created tags 
are pushed too early).

bq. By using the tar layout stitching temp space, it saves space during the 
build. However, it creates a inseparable process for building tarball and 
docker in maven because the temp directory is not in maven cache. This means 
the tarball and docker image must be built together, and only one of them can 
be deposited to maven repository. Hence, it takes more time to reiterate just 
the docker part. It is not good for developer that only work on docker and not 
the tarball. 

I am not sure if I understood your concern, but I think tar file creation and 
docker file creation can be easily separated by moving the tar file creation to 
the dist profile and keeping the k8s-dev profile as is. I am +1 for this suggestion.

Do you have any other problem with the k8s-dev approach?

bq. Symlink 

[jira] [Commented] (HADOOP-16091) Create hadoop/ozone docker images with inline build process

2019-05-06 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16833654#comment-16833654
 ] 

Elek, Marton commented on HADOOP-16091:
---


bq. This is how maven is designed to allow each sub-module to build 
independently. This allows reducing iteration time on each component instead of 
doing the full build each time. The k8s-dev solution has a conflict of 
interests in maven design. Part of Maven design is to release one binary per 
project using maven release:release plugin.

I wouldn't like to create an additional thread here, but I think the 
release:release goal is not part of the fundamental design of maven. This is 
just a maven plugin which can be replaced by a better release plugin or other 
processes. (I would say that life-cycle/goal bindings or profiles are part of 
the design.)

BTW I think the "release:release" plugin has some design problems, but that's 
another story (I prefer not to use it, for example because the created tags 
are pushed too early).

bq. By using the tar layout stitching temp space, it saves space during the 
build. However, it creates a inseparable process for building tarball and 
docker in maven because the temp directory is not in maven cache. This means 
the tarball and docker image must be built together, and only one of them can 
be deposited to maven repository. Hence, it takes more time to reiterate just 
the docker part. It is not good for developer that only work on docker and not 
the tarball. 

I am not sure if I understood your concern, but I think tar file creation and 
docker file creation can be easily separated by moving the tar file creation to 
the dist profile and keeping the k8s-dev profile as is. I am +1 for this suggestion.

Do you have any other problem with the k8s-dev approach?

bq. Symlink can be used to make /opt/ozone > /opt/ozone-${project.version}. 
This is the practice that Hadoop use to avoid versioned directory while 
maintain ability to swap binaries. We should keep symlink practice for config 
files to reference version neutral location. I think we have agreement on the 
base image. This also allow us to use RUN directive to make any post tarball 
process required in docker build.

Let me be more precise: Hadoop doesn't use symlinks AFAIK. Ambari, bigtop and the 
Hortonworks/Cloudera distributions use symlinks to manage multiple versions of 
hadoop.

Sorry if it seems pedantic. I learned from Eugenia Cheng that the 
difference between pedantry and precision is illumination. I wrote it just 
because I think it's very important that the symlinks were introduced to manage 
versions in *on-prem* clusters.

I think the containerized world is different. For multiple versions we need to 
use different containers; therefore we don't need to add the version *inside* 
the containers any more.

Usually I don't think examples are good arguments (I think it's more 
important to find the right solution than to follow existing practices), 
but I checked the spark images (which can be created by bin/docker-image-tool.sh 
from the spark distribution) and they also use /opt/spark. (But I am fine with 
using /opt/apache/ozone if you prefer it. I like the apache subdir.)

> Create hadoop/ozone docker images with inline build process
> ---
>
> Key: HADOOP-16091
> URL: https://issues.apache.org/jira/browse/HADOOP-16091
> Project: Hadoop Common
>  Issue Type: New Feature
>Reporter: Elek, Marton
>Assignee: Eric Yang
>Priority: Major
> Attachments: HADOOP-16091.001.patch, HADOOP-16091.002.patch
>
>
> This is proposed by [~eyang] in 
> [this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
>  mailing thread.
> {quote}1, 3. There are 38 Apache projects hosting docker images on Docker hub 
> using Apache Organization. By browsing Apache github mirror. There are only 7 
> projects using a separate repository for docker image build. Popular projects 
> official images are not from Apache organization, such as zookeeper, tomcat, 
> httpd. We may not disrupt what other Apache projects are doing, but it looks 
> like inline build process is widely employed by majority of projects such as 
> Nifi, Brooklyn, thrift, karaf, syncope and others. The situation seems a bit 
> chaotic for Apache as a whole. However, Hadoop community can decide what is 
> best for Hadoop. My preference is to remove ozone from source tree naming, if 
> Ozone is intended to be subproject of Hadoop for long period of time. This 
> enables Hadoop community to host docker images for various subproject without 
> having to check out several source tree to trigger a grand build. However, 
> inline build process seems more popular than separated process. Hence, I 
> highly recommend making 

[jira] [Commented] (HADOOP-16091) Create hadoop/ozone docker images with inline build process

2019-05-04 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16833011#comment-16833011
 ] 

Elek, Marton commented on HADOOP-16091:
---

Thank you very much for uploading this patch, [~eyang]. It's always easier to 
discuss real code.

1. First of all, please move this to an HDDS jira. It seems that the issue has 
changed to modifying HDDS, and I think it's better to handle it under the HDDS 
project.

2. I am +1 to use the assembly plugin instead of the shell-based tar.

3. I have some problems with the inline docker image creation. 

# The discussion was started on the mailing list (and I don't think it's 
finished; my latest concerns are written here: 
https://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201903.mbox/%3C5bfeb864-3f26-1ccc-3300-2680e1b94f34%40apache.org%3E)
# It's also under discussion in HDDS-1458 and I don't think that we have 
consensus. 
 
I would prefer to agree on the approach first and discuss it at one location to 
make it easier to follow for everybody.

4. It seems that we have exactly the same functionality in the hadoop-ozone/dist 
k8s-dev and k8s-dev-push profiles. Do you have any suggestion for how the 
duplication can be handled?

AFAIK in this approach (please fix me if I am wrong):

 * We create a final ozone folder
 * We create a tar file (~360MB)
 * We copy it to the local maven repository (~360MB) 
 * We copy it to the docker/target directory (~360MB)
 * We create the docker image (~360MB)

In the k8s-dev profile:

 * We create a final ozone folder
 * We create the docker image (~360MB)

But the results are the same. The latter seems more efficient to me, and the tar 
step can be optional. For a normal build we don't need it at all, just for the 
kubernetes deployment.
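
For illustration, the difference in practice would be a single profile-driven 
build (a sketch; the profile name comes from the discussion above, the exact 
goals and flags are assumptions):

{code}
# build the ozone distribution folder and the docker image in one pass,
# skipping the intermediate ~360MB tarball (k8s-dev profile)
mvn clean package -DskipTests -Pk8s-dev
{code}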

bq. /opt/ozone-${project.version}

Did you execute the smoketest? Did you test it from hive/spark? I am afraid 
that we will have some problems when we do testing with ozonefs/spark/hive, as 
we need to know the exact location of the jar file. I think it's better to have 
the location version-independent, but it's not a strong preference.

bq. Another notable problem is the hadoop-runner image is built with Squash and 
symlinks are not supported, and move of directory location is also not 
supported during build process. It is probably better to pick centos as base 
image to avoid those limitations with squashfs based image.

Can you please give me more information, as I don't understand what the 
problem is exactly? Where do we need symlinks?

Independently of the answer, I have no problem with using centos. I believe that 
we use centos even now:

{code}
docker run apache/hadoop-runner cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core) 
{code}


> Create hadoop/ozone docker images with inline build process
> ---
>
> Key: HADOOP-16091
> URL: https://issues.apache.org/jira/browse/HADOOP-16091
> Project: Hadoop Common
>  Issue Type: New Feature
>Reporter: Elek, Marton
>Assignee: Eric Yang
>Priority: Major
> Attachments: HADOOP-16091.001.patch, HADOOP-16091.002.patch
>
>
> This is proposed by [~eyang] in 
> [this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
>  mailing thread.
> {quote}1, 3. There are 38 Apache projects hosting docker images on Docker hub 
> using Apache Organization. By browsing Apache github mirror. There are only 7 
> projects using a separate repository for docker image build. Popular projects 
> official images are not from Apache organization, such as zookeeper, tomcat, 
> httpd. We may not disrupt what other Apache projects are doing, but it looks 
> like inline build process is widely employed by majority of projects such as 
> Nifi, Brooklyn, thrift, karaf, syncope and others. The situation seems a bit 
> chaotic for Apache as a whole. However, Hadoop community can decide what is 
> best for Hadoop. My preference is to remove ozone from source tree naming, if 
> Ozone is intended to be subproject of Hadoop for long period of time. This 
> enables Hadoop community to host docker images for various subproject without 
> having to check out several source tree to trigger a grand build. However, 
> inline build process seems more popular than separated process. Hence, I 
> highly recommend making docker build inline if possible.
> {quote}
> The main challenges are also discussed in the thread:
> {code:java}
> 3. Technically it would be possible to add the Dockerfile to the source
> tree and publish the docker image together with the release by the
> release manager but it's also problematic:
> {code}
> a) there is no easy way to stage the images for the vote
>  c) it couldn't be flagged as automated on dockerhub
>  d) It couldn't support the critical 

[jira] [Resolved] (HADOOP-16183) Use latest Yetus to support ozone specific build process

2019-05-02 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton resolved HADOOP-16183.
---
   Resolution: Fixed
Fix Version/s: 0.5.0

> Use latest Yetus to support ozone specific build process
> 
>
> Key: HADOOP-16183
> URL: https://issues.apache.org/jira/browse/HADOOP-16183
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.5.0
>
>
> In YETUS-816 the hadoop personality was improved to better support 
> ozone-specific changes.
> Unfortunately the hadoop personality is part of the Yetus project and not the 
> Hadoop project: we need a new yetus release or a switch to an unreleased 
> version.
> In this patch I propose to use the latest commit from yetus (but use that 
> fixed commit instead of updating all the time). 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16183) Use latest Yetus to support ozone specific build process

2019-04-26 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16826800#comment-16826800
 ] 

Elek, Marton commented on HADOOP-16183:
---

Patch is updated to use the latest 0.10.0 release.

> Use latest Yetus to support ozone specific build process
> 
>
> Key: HADOOP-16183
> URL: https://issues.apache.org/jira/browse/HADOOP-16183
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>
> In YETUS-816 the hadoop personality is improved to better support ozone 
> specific changes.
> Unfortunately the hadoop personality is part of the Yetus project and not the 
> Hadoop project: we need a new yetus release or switch to an unreleased 
> version.
> In this patch I propose to use the latest commit from yetus (but use that 
> fixed commit instead of updating all the time).






[jira] [Commented] (HADOOP-16183) Use latest Yetus to support ozone specific build process

2019-04-09 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813551#comment-16813551
 ] 

Elek, Marton commented on HADOOP-16183:
---

Thanks for the clarification, [~busbey]. It's much clearer to me now. I will 
update this patch to use the latest stable yetus after the next yetus release.

> Use latest Yetus to support ozone specific build process
> 
>
> Key: HADOOP-16183
> URL: https://issues.apache.org/jira/browse/HADOOP-16183
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>
> In YETUS-816 the hadoop personality is improved to better support ozone 
> specific changes.
> Unfortunately the hadoop personality is part of the Yetus project and not the 
> Hadoop project: we need a new yetus release or switch to an unreleased 
> version.
> In this patch I propose to use the latest commit from yetus (but use that 
> fixed commit instead of updating all the time).






[jira] [Commented] (HADOOP-16183) Use latest Yetus to support ozone specific build process

2019-04-02 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807634#comment-16807634
 ] 

Elek, Marton commented on HADOOP-16183:
---

Thanks for the response, [~busbey].

I have a slightly different understanding of this policy:

{quote}
Projects SHALL publish official releases and SHALL NOT publish unreleased 
materials outside the development community.

During the process of developing software and preparing a release, various 
packages are made available to the development community for testing purposes. 
Projects MUST direct outsiders towards official releases rather than raw source 
repositories, nightly builds, snapshots, release candidates, or any other 
similar packages. The only people who are supposed to know about such developer 
resources are individuals actively participating in development or following 
the dev list and thus aware of the conditions placed on unreleased materials.
{quote}

(source: https://apache.org/legal/release-policy.html)

As far as I understand, it's not forbidden to use a snapshot version, but it 
shouldn't be published to a wider audience.

In fact, Hadoop has used an unreleased (or even forked) yetus multiple times. 
(You can check the config history: 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/jobConfigHistory/showDiffFiles?timestamp1=2018-07-10_14-47-51=2018-09-01_21-39-22)

 But I respect your opinion; I will ask for a release on the yetus-dev list.

BTW, wouldn't it be better to store the hadoop personality in the hadoop 
repository? According to your comment, we need a release from another project 
(yetus) to change anything in the build definition/personality. (cc [~aw])

> Use latest Yetus to support ozone specific build process
> 
>
> Key: HADOOP-16183
> URL: https://issues.apache.org/jira/browse/HADOOP-16183
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>
> In YETUS-816 the hadoop personality is improved to better support ozone 
> specific changes.
> Unfortunately the hadoop personality is part of the Yetus project and not the 
> Hadoop project: we need a new yetus release or switch to an unreleased 
> version.
> In this patch I propose to use the latest commit from yetus (but use that 
> fixed commit instead of updating all the time).






[jira] [Comment Edited] (HADOOP-15566) Remove HTrace support

2019-04-02 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807619#comment-16807619
 ] 

Elek, Marton edited comment on HADOOP-15566 at 4/2/19 10:16 AM:


Thanks for the questions, [~bogdandrutu]; sorry I missed this comment earlier.

bq. What implementation will be shipped with the official HBase binary?

I don't know; it depends on HBase, IMHO. Using OT we can support multiple 
implementations, and HBase can provide any implementation.

bq. How can somebody use a different implementation?

It should be configurable. AFAIK the only vendor-specific part is the 
initialization code. It's easy to create an interface to initialize different 
implementations (e.g. a class name which should be called to initialize the 
implementation).

bq. How do you ensure that a different implementation (that is not tested with 
your entire test suite) may not corrupt user data? I think it is very important 
that all the tests are running with the implementation that user uses in 
production.

I don't think that we need to test all the implementations. We should prove 
that the OT API is used correctly and use one implementation as an example. And 
we should clearly document what is tested and what is not. Untested 
implementations provided by other vendors can be used, but should be tested by 
the users first.

bq. use one implementation (pick something that you like the most) and export 
these data to an configurable endpoint

Interesting. Can you please give me more details? What is the configurable 
endpoint? How would the tracing information be stored?
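
For illustration, a minimal sketch of such a configurable initialization (all 
names here are assumptions, not an actual Hadoop or HBase API):

{code:java}
import io.opentracing.Tracer;
import io.opentracing.util.GlobalTracer;

/** Hypothetical helper: initializes a vendor-specific tracer from a configured class name. */
public final class TracerInit {

  /** The configured class is expected to implement this and have a no-arg constructor. */
  public interface TracerFactory {
    Tracer create();
  }

  public static void initialize(String factoryClassName) throws Exception {
    TracerFactory factory = (TracerFactory)
        Class.forName(factoryClassName).getDeclaredConstructor().newInstance();
    // Only this step is vendor specific; everything else can stay on the
    // vendor-neutral OpenTracing API.
    GlobalTracer.register(factory.create());
  }
}
{code}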


was (Author: elek):
Thanks for the questions, [~bogdandrutu]

q. What implementation will be shipped with the official HBase binary?

I don't know; it depends on HBase, IMHO. Using OT we can support multiple 
implementations, and HBase can provide any implementation.

q. How can somebody use a different implementation?

It should be configurable. AFAIK the only vendor-specific part is the 
initialization code. It's easy to create an interface to initialize different 
implementations (e.g. a class name which should be called to initialize the 
implementation).

q. How do you ensure that a different implementation (that is not tested with 
your entire test suite) may not corrupt user data? I think it is very important 
that all the tests are running with the implementation that user uses in 
production.

I don't think that we need to test all the implementations. We should prove 
that the OT API is used correctly and use one implementation as an example. And 
we should clearly document what is tested and what is not. Untested 
implementations provided by other vendors can be used, but should be tested by 
the users first.

q. use one implementation (pick something that you like the most) and export 
these data to an configurable endpoint

Interesting. Can you please give me more details? What is the configurable 
endpoint? How would the tracing information be stored?

> Remove HTrace support
> -
>
> Key: HADOOP-15566
> URL: https://issues.apache.org/jira/browse/HADOOP-15566
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 3.1.0
>Reporter: Todd Lipcon
>Priority: Major
>  Labels: security
> Attachments: Screen Shot 2018-06-29 at 11.59.16 AM.png, 
> ss-trace-s3a.png
>
>
> The HTrace incubator project has voted to retire itself and won't be making 
> further releases. The Hadoop project currently has various hooks with HTrace. 
> It seems in some cases (eg HDFS-13702) these hooks have had measurable 
> performance overhead. Given these two factors, I think we should consider 
> removing the HTrace integration. If there is someone willing to do the work, 
> replacing it with OpenTracing might be a better choice since there is an 
> active community.






[jira] [Commented] (HADOOP-15566) Remove HTrace support

2019-04-02 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807619#comment-16807619
 ] 

Elek, Marton commented on HADOOP-15566:
---

Thanks for the questions, [~bogdandrutu]

q. What implementation will be shipped with the official HBase binary?

I don't know; it depends on HBase, IMHO. Using OT we can support multiple 
implementations, and HBase can provide any implementation.

q. How can somebody use a different implementation?

It should be configurable. AFAIK the only vendor-specific part is the 
initialization code. It's easy to create an interface to initialize different 
implementations (e.g. a class name which should be called to initialize the 
implementation).

q. How do you ensure that a different implementation (that is not tested with 
your entire test suite) may not corrupt user data? I think it is very important 
that all the tests are running with the implementation that user uses in 
production.

I don't think that we need to test all the implementations. We should prove 
that the OT API is used correctly and use one implementation as an example. And 
we should clearly document what is tested and what is not. Untested 
implementations provided by other vendors can be used, but should be tested by 
the users first.

q. use one implementation (pick something that you like the most) and export 
these data to an configurable endpoint

Interesting. Can you please give me more details? What is the configurable 
endpoint? How would the tracing information be stored?

> Remove HTrace support
> -
>
> Key: HADOOP-15566
> URL: https://issues.apache.org/jira/browse/HADOOP-15566
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 3.1.0
>Reporter: Todd Lipcon
>Priority: Major
>  Labels: security
> Attachments: Screen Shot 2018-06-29 at 11.59.16 AM.png, 
> ss-trace-s3a.png
>
>
> The HTrace incubator project has voted to retire itself and won't be making 
> further releases. The Hadoop project currently has various hooks with HTrace. 
> It seems in some cases (eg HDFS-13702) these hooks have had measurable 
> performance overhead. Given these two factors, I think we should consider 
> removing the HTrace integration. If there is someone willing to do the work, 
> replacing it with OpenTracing might be a better choice since there is an 
> active community.






[jira] [Updated] (HADOOP-16172) Update apache/hadoop:3 to 3.2.0 release

2019-03-19 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-16172:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Update apache/hadoop:3 to 3.2.0 release
> ---
>
> Key: HADOOP-16172
> URL: https://issues.apache.org/jira/browse/HADOOP-16172
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
> Attachments: HADOOP-16172-docker-hadoop-3.01.patch
>
>
> This ticket is opened to update apache/hadoop:3 from 3.1.1 to 3.2.0 release.






[jira] [Commented] (HADOOP-16172) Update apache/hadoop:3 to 3.2.0 release

2019-03-19 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16796267#comment-16796267
 ] 

Elek, Marton commented on HADOOP-16172:
---

+1, thanks for updating this.

Will commit to the docker-hadoop-3 branch soon.

> Update apache/hadoop:3 to 3.2.0 release
> ---
>
> Key: HADOOP-16172
> URL: https://issues.apache.org/jira/browse/HADOOP-16172
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
> Attachments: HADOOP-16172-docker-hadoop-3.01.patch
>
>
> This ticket is opened to update apache/hadoop:3 from 3.1.1 to 3.2.0 release.






[jira] [Updated] (HADOOP-16063) Docker based pseudo-cluster definitions and test scripts for Hdfs/Yarn

2019-03-13 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-16063:
--
Labels:   (was: newbie)

> Docker based pseudo-cluster definitions and test scripts for Hdfs/Yarn
> --
>
> Key: HADOOP-16063
> URL: https://issues.apache.org/jira/browse/HADOOP-16063
> Project: Hadoop Common
>  Issue Type: New Feature
>Reporter: Elek, Marton
>Priority: Major
>
> During the recent releases of Apache Hadoop Ozone we had multiple experiments 
> using docker/docker-compose to support the development of ozone.
> As of now the hadoop-ozone distribution contains two directories in addition 
> to the regular hadoop directories (bin, share/lib, etc.):
> h3. compose
> The ./compose directory of the distribution contains different types of 
> pseudo-cluster definitions. Starting an ozone cluster is as easy as "cd 
> compose/ozone && docker-compose up -d".
> The clusters can also be scaled up and down (docker-compose scale 
> datanode=3).
> There are multiple cluster definitions for different use cases (for example 
> ozone+s3 or hdfs+ozone).
> The docker-compose files are based on the apache/hadoop-runner image, which is 
> an "empty" image: it doesn't contain any hadoop distribution. Instead the 
> current hadoop build is used (../.. is mapped as a volume at /opt/hadoop).
> With this approach it's very easy to 1) start a cluster from the distribution 
> and 2) test any patch from the dev tree, as after any build a new cluster can 
> be started easily (with multiple nodes and datanodes).
> h3. smoketest
> We also started to use a simple robotframework-based test suite (see the 
> ./smoketest directory). It's a high-level test definition, very similar to the 
> smoketests which are executed manually by the contributors during a release 
> vote.
> But it's a formal definition that starts clusters from different 
> docker-compose definitions and executes simple shell scripts (and compares the 
> output).
>  
> I believe that both approaches helped a lot during the development of ozone, 
> and I propose to do the same improvements to the main hadoop distribution.
> I propose to provide docker-compose based example cluster definitions for 
> yarn/hdfs and for different use cases (simple hdfs, router-based federation, 
> etc.).
> It can help to understand the different configurations and to try out new 
> features with predefined config sets.
> Long term we can also add robot tests to help the release votes (basic 
> wordcount/MR tests could be scripted).






[jira] [Created] (HADOOP-16183) Use latest Yetus to support ozone specific build process

2019-03-13 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16183:
-

 Summary: Use latest Yetus to support ozone specific build process
 Key: HADOOP-16183
 URL: https://issues.apache.org/jira/browse/HADOOP-16183
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton
Assignee: Elek, Marton


In YETUS-816 the hadoop personality is improved to better support ozone 
specific changes.

Unfortunately the hadoop personality is part of the Yetus project and not the 
Hadoop project: we need a new yetus release or switch to an unreleased version.

In this patch I propose to use the latest commit from yetus (but use that fixed 
commit instead of updating all the time). 






[jira] [Commented] (HADOOP-16067) Incorrect Format Debug Statement KMSACLs

2019-02-28 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780407#comment-16780407
 ] 

Elek, Marton commented on HADOOP-16067:
---

+1. I will commit it to trunk soon.

Thank you, [~charanh], for the contribution.
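
For reference, a sketch of the corrected call (the committed patch may differ 
slightly): the user name becomes a regular format argument instead of being 
concatenated into the format string.

{code:java}
if (LOG.isDebugEnabled()) {
  // All three values are passed as arguments, so each {} placeholder is filled.
  LOG.debug("Checking user [{}] for: {}: {}",
      ugi.getShortUserName(), opType.toString(), acl.getAclString());
}
{code}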

> Incorrect Format Debug Statement KMSACLs
> 
>
> Key: HADOOP-16067
> URL: https://issues.apache.org/jira/browse/HADOOP-16067
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: kms
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: Charan Hebri
>Priority: Trivial
>  Labels: newbie, noob
> Attachments: HADOOP-16067.001.patch
>
>
> {code:java}
>   if (LOG.isDebugEnabled()) {
> LOG.debug("Checking user [{}] for: {}: {}" + ugi.getShortUserName(),
> opType.toString(), acl.getAclString());
>   }
> {code}
> The logging message here is incorrect because the first variable is being 
> concatenated to the string instead of being passed as an argument.
> {code:java}
> -- Notice the user name 'hdfs' at the end and the spare curly brackets
> 2019-01-23 13:27:45,244 DEBUG 
> org.apache.hadoop.crypto.key.kms.server.KMSACLs: Checking user [GENERATE_EEK] 
> for: hdfs supergroup: {}hdfs
> {code}
> [https://github.com/apache/hadoop/blob/a55d6bba71c81c1c4e9d8cd11f55c78f10a548b0/hadoop-common-project/hadoop-kms/src/main/java/org/apache/hadoop/crypto/key/kms/server/KMSACLs.java#L313-L316]






[jira] [Updated] (HADOOP-16067) Incorrect Format Debug Statement KMSACLs

2019-02-28 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-16067:
--
   Resolution: Fixed
Fix Version/s: 3.3.0
   Status: Resolved  (was: Patch Available)

> Incorrect Format Debug Statement KMSACLs
> 
>
> Key: HADOOP-16067
> URL: https://issues.apache.org/jira/browse/HADOOP-16067
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: kms
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: Charan Hebri
>Priority: Trivial
>  Labels: newbie, noob
> Fix For: 3.3.0
>
> Attachments: HADOOP-16067.001.patch
>
>
> {code:java}
>   if (LOG.isDebugEnabled()) {
> LOG.debug("Checking user [{}] for: {}: {}" + ugi.getShortUserName(),
> opType.toString(), acl.getAclString());
>   }
> {code}
> The logging message here is incorrect because the first variable is being 
> concatenated to the string instead of being passed as an argument.
> {code:java}
> -- Notice the user name 'hdfs' at the end and the spare curly brackets
> 2019-01-23 13:27:45,244 DEBUG 
> org.apache.hadoop.crypto.key.kms.server.KMSACLs: Checking user [GENERATE_EEK] 
> for: hdfs supergroup: {}hdfs
> {code}
> [https://github.com/apache/hadoop/blob/a55d6bba71c81c1c4e9d8cd11f55c78f10a548b0/hadoop-common-project/hadoop-kms/src/main/java/org/apache/hadoop/crypto/key/kms/server/KMSACLs.java#L313-L316]






[jira] [Commented] (HADOOP-16035) Jenkinsfile for Hadoop

2019-02-27 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779134#comment-16779134
 ] 

Elek, Marton commented on HADOOP-16035:
---

bq. Like I told you back in October-ish? via the hadoop personality. 

Link to that conversation (for reference): HDDS-891

> Jenkinsfile for Hadoop
> --
>
> Key: HADOOP-16035
> URL: https://issues.apache.org/jira/browse/HADOOP-16035
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-16035.00.patch, HADOOP-16035.01.patch
>
>
> In order to enable Github Branch Source plugin on Jenkins to test Github PRs 
> with Apache Yetus:
> - an account that can read Github
> - Apache Yetus 0.9.0+
> - a Jenkinsfile that uses the above






[jira] [Comment Edited] (HADOOP-16035) Jenkinsfile for Hadoop

2019-02-27 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779121#comment-16779121
 ] 

Elek, Marton edited comment on HADOOP-16035 at 2/27/19 10:34 AM:
-

The Hadoop personality improvement is not ready yet, and it's very annoying 
that this job is enabled for all PRs, not only for the HADOOP/HDFS/YARN jira 
projects, without proper support in place.

The biggest problem is that the \-P hdds flag is not enabled for the PR check 
even if the PR has the ozone label. This is not a problem of the personality, 
as in the personality we plan to do more advanced project list detection.

I think a quick fix would be to add \-P hdds to the Jenkinsfile somehow. Do you 
have any hint how we can do it?

 * Can we add it to the Jenkinsfile all the time? Yetus should handle the 
dependencies, and by default ozone/hdds won't be executed by Yetus if only 
HDFS/YARN projects are changed.
 * Or should we add \-P hdds in the personality? Is it supported in Yetus?


was (Author: elek):
The Hadoop personality improvement is not ready yet, and it's very annoying 
that this job is enabled for all PRs, not only for the HADOOP/HDFS/YARN jira 
projects, without proper support in place. 

> Jenkinsfile for Hadoop
> --
>
> Key: HADOOP-16035
> URL: https://issues.apache.org/jira/browse/HADOOP-16035
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-16035.00.patch, HADOOP-16035.01.patch
>
>
> In order to enable Github Branch Source plugin on Jenkins to test Github PRs 
> with Apache Yetus:
> - an account that can read Github
> - Apache Yetus 0.9.0+
> - a Jenkinsfile that uses the above






[jira] [Commented] (HADOOP-16035) Jenkinsfile for Hadoop

2019-02-27 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779121#comment-16779121
 ] 

Elek, Marton commented on HADOOP-16035:
---

The Hadoop personality improvement is not ready yet, and it's very annoying 
that this job is enabled for all PRs, not only for the HADOOP/HDFS/YARN jira 
projects, without proper support in place. 

> Jenkinsfile for Hadoop
> --
>
> Key: HADOOP-16035
> URL: https://issues.apache.org/jira/browse/HADOOP-16035
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-16035.00.patch, HADOOP-16035.01.patch
>
>
> In order to enable Github Branch Source plugin on Jenkins to test Github PRs 
> with Apache Yetus:
> - an account that can read Github
> - Apache Yetus 0.9.0+
> - a Jenkinsfile that uses the above






[jira] [Updated] (HADOOP-16146) Make start-build-env.sh safe in case of misusage of DOCKER_INTERACTIVE_RUN

2019-02-25 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-16146:
--
Status: Patch Available  (was: Open)

> Make start-build-env.sh safe in case of misusage of DOCKER_INTERACTIVE_RUN
> --
>
> Key: HADOOP-16146
> URL: https://issues.apache.org/jira/browse/HADOOP-16146
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>
> [~aw] reported the problem in HDDS-891:
> {quote}DOCKER_INTERACTIVE_RUN opens the door for users to set command line 
> options to docker. Most notably, -c and -v and a few others that share one 
> particular characteristic: they reference the file system. As soon as shell 
> code hits the file system, it is no longer safe to assume space delimited 
> options. In other words, -c /My Cool Filesystem/Docker Files/config.json or 
> -v /c_drive/Program Files/Data:/data may be something a user wants to do, but 
> the script now breaks because of the IFS assumptions.
> {quote}
> DOCKER_INTERACTIVE_RUN was used in jenkins to run the normal build process in 
> docker. In case DOCKER_INTERACTIVE_RUN was set to empty, the docker 
> container was started without the "-i -t" flags.
> It can be improved by checking the value of the environment variable and 
> allowing only a fixed set of values.






[jira] [Commented] (HADOOP-16035) Jenkinsfile for Hadoop

2019-02-25 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776682#comment-16776682
 ] 

Elek, Marton commented on HADOOP-16035:
---

FTR: This Jenkinsfile doesn't support the optional subprojects (like submarine 
and ozone) and generates a lot of noise via 
[https://builds.apache.org/job/hadoop-multibranch]. Is there any suggestion for 
how the hdds/ozone/submarine projects can be supported? (See for example this 
report: https://github.com/apache/hadoop/pull/513)

> Jenkinsfile for Hadoop
> --
>
> Key: HADOOP-16035
> URL: https://issues.apache.org/jira/browse/HADOOP-16035
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-16035.00.patch, HADOOP-16035.01.patch
>
>
> In order to enable Github Branch Source plugin on Jenkins to test Github PRs 
> with Apache Yetus:
> - an account that can read Github
> - Apache Yetus 0.9.0+
> - a Jenkinsfile that uses the above






[jira] [Created] (HADOOP-16146) Make start-build-env.sh safe in case of misusage of DOCKER_INTERACTIVE_RUN

2019-02-25 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16146:
-

 Summary: Make start-build-env.sh safe in case of misusage of 
DOCKER_INTERACTIVE_RUN
 Key: HADOOP-16146
 URL: https://issues.apache.org/jira/browse/HADOOP-16146
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Elek, Marton
Assignee: Elek, Marton


[~aw] reported the problem in HDDS-891:
{quote}DOCKER_INTERACTIVE_RUN opens the door for users to set command line 
options to docker. Most notably, -c and -v and a few others that share one 
particular characteristic: they reference the file system. As soon as shell 
code hits the file system, it is no longer safe to assume space delimited 
options. In other words, -c /My Cool Filesystem/Docker Files/config.json or -v 
/c_drive/Program Files/Data:/data may be something a user wants to do, but the 
script now breaks because of the IFS assumptions.
{quote}
DOCKER_INTERACTIVE_RUN was used in jenkins to run the normal build process in 
docker. In case DOCKER_INTERACTIVE_RUN was set to empty, the docker container 
was started without the "-i -t" flags.

It can be improved by checking the value of the environment variable and 
allowing only a fixed set of values.
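
To illustrate the proposed check (start-build-env.sh itself is a shell script; 
this Java sketch only demonstrates the allow-list idea, and all names here are 
assumptions):

{code:java}
import java.util.Set;

public final class DockerRunFlags {
  // Only a fixed set of values is accepted for DOCKER_INTERACTIVE_RUN.
  private static final Set<String> ALLOWED = Set.of("", "-i", "-t", "-i -t");

  public static String validate(String value) {
    if (!ALLOWED.contains(value)) {
      // Reject anything else instead of passing arbitrary options to docker.
      throw new IllegalArgumentException(
          "Unsupported DOCKER_INTERACTIVE_RUN value: " + value);
    }
    return value;
  }
}
{code}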






[jira] [Commented] (HADOOP-16091) Create hadoop/ozone docker images with inline build process

2019-02-11 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765148#comment-16765148
 ] 

Elek, Marton commented on HADOOP-16091:
---

Thanks for sharing the technical details, [~eyang].
 # For me the file-based activation is not enough. I wouldn't like to build a 
new docker image with each build. I think it should be activated with an 
explicit profile declaration.
 # With this approach, for the docker-based builds (e.g. release builds, 
jenkins builds) we need a docker-in-docker base image or we need to map 
docker.sock from the outside to the inside.
 # My questions are still open: I think we need a method to 
upgrade/modify/create images for existing releases, especially:
 ## adding security fixes to existing, released images
 ## creating new images for older releases
 # I think the containers are more reproducible if they are based on released 
tar files.

> Create hadoop/ozone docker images with inline build process
> ---
>
> Key: HADOOP-16091
> URL: https://issues.apache.org/jira/browse/HADOOP-16091
> Project: Hadoop Common
>  Issue Type: New Feature
>Reporter: Elek, Marton
>Priority: Major
>
> This is proposed by [~eyang] in 
> [this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
>  mailing thread.
> {quote}1, 3. There are 38 Apache projects hosting docker images on Docker hub 
> using Apache Organization. By browsing Apache github mirror. There are only 7 
> projects using a separate repository for docker image build. Popular projects 
> official images are not from Apache organization, such as zookeeper, tomcat, 
> httpd. We may not disrupt what other Apache projects are doing, but it looks 
> like inline build process is widely employed by majority of projects such as 
> Nifi, Brooklyn, thrift, karaf, syncope and others. The situation seems a bit 
> chaotic for Apache as a whole. However, Hadoop community can decide what is 
> best for Hadoop. My preference is to remove ozone from source tree naming, if 
> Ozone is intended to be subproject of Hadoop for long period of time. This 
> enables Hadoop community to host docker images for various subproject without 
> having to check out several source tree to trigger a grand build. However, 
> inline build process seems more popular than separated process. Hence, I 
> highly recommend making docker build inline if possible.
> {quote}
> The main challenges are also discussed in the thread:
> {code:java}
> 3. Technically it would be possible to add the Dockerfile to the source
> tree and publish the docker image together with the release by the
> release manager but it's also problematic:
> {code}
> a) there is no easy way to stage the images for the vote
>  c) it couldn't be flagged as automated on dockerhub
>  d) It couldn't support the critical updates.
>  * Updating existing images (for example in case of an ssl bug, rebuild
>  all the existing images with exactly the same payload but updated base
>  image/os environment)
>  * Creating image for older releases (We would like to provide images,
>  for hadoop 2.6/2.7/2.7/2.8/2.9. Especially for doing automatic testing
>  with different versions).
> {code:java}
>  {code}
> Point a) can be solved (as [~eyang] suggested) by using a personal docker 
> image during the vote and publishing it to dockerhub after the vote (in case 
> the permissions can be set by INFRA).
> Note: based on LEGAL-270 and the linked discussion, both approaches (inline 
> build process / external build process) are compatible with the apache 
> release.
> Note: HDDS-851 and HADOOP-14898 contain more information about these 
> problems.






[jira] [Updated] (HADOOP-16091) Create hadoop/ozone docker images with inline build process

2019-02-11 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-16091:
--
Description: 
This is proposed by [~eyang] in 
[this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
 mailing thread.
{quote}1, 3. There are 38 Apache projects hosting docker images on Docker hub 
using Apache Organization. By browsing Apache github mirror. There are only 7 
projects using a separate repository for docker image build. Popular projects 
official images are not from Apache organization, such as zookeeper, tomcat, 
httpd. We may not disrupt what other Apache projects are doing, but it looks 
like inline build process is widely employed by majority of projects such as 
Nifi, Brooklyn, thrift, karaf, syncope and others. The situation seems a bit 
chaotic for Apache as a whole. However, Hadoop community can decide what is 
best for Hadoop. My preference is to remove ozone from source tree naming, if 
Ozone is intended to be subproject of Hadoop for long period of time. This 
enables Hadoop community to host docker images for various subproject without 
having to check out several source tree to trigger a grand build. However, 
inline build process seems more popular than separated process. Hence, I highly 
recommend making docker build inline if possible.
{quote}
The main challenges are also discussed in the thread:
{code:java}
3. Technically it would be possible to add the Dockerfile to the source
tree and publish the docker image together with the release by the
release manager but it's also problematic:

{code}
a) there is no easy way to stage the images for the vote
 c) it couldn't be flagged as automated on dockerhub
 d) It couldn't support the critical updates.
 * Updating existing images (for example in case of an ssl bug, rebuild
 all the existing images with exactly the same payload but updated base
 image/os environment)

 * Creating image for older releases (We would like to provide images,
 for hadoop 2.6/2.7/2.7/2.8/2.9. Especially for doing automatic testing
 with different versions).

{code:java}
 {code}
Point a) can be solved (as [~eyang] suggested) by using a personal docker image 
during the vote and publishing it to dockerhub after the vote (in case the 
permission can be set by INFRA).

Note: based on LEGAL-270 and the linked discussion, both approaches (inline 
build process / external build process) are compatible with the apache release.

Note: HDDS-851 and HADOOP-14898 contain more information about these problems.

  was:
This is proposed by [~eyang] in 
[this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
 mailing thread.

bq. 1, 3. There are 38 Apache projects hosting docker images on Docker hub 
using Apache Organization.  By browsing Apache github mirror.  There are only 7 
projects using a separate repository for docker image build.  Popular projects 
official images are not from Apache organization, such as zookeeper, tomcat, 
httpd.  We may not disrupt what other Apache projects are doing, but it looks 
like inline build process is widely employed by majority of projects such as 
Nifi, Brooklyn, thrift, karaf, syncope and others.  The situation seems a bit 
chaotic for Apache as a whole.  However, Hadoop community can decide what is 
best for Hadoop.  My preference is to remove ozone from source tree naming, if 
Ozone is intended to be subproject of Hadoop for long period of time.  This 
enables Hadoop community to host docker images for various subproject without 
having to check out several source tree to trigger a grand build.  However, 
inline build process seems more popular than separated process.  Hence, I 
highly recommend making docker build inline if possible.

The main challenges are also discussed in the thread:

{code}
3. Technically it would be possible to add the Dockerfile to the source
tree and publish the docker image together with the release by the
release manager but it's also problematic:

{code}
  a) there is no easy way to stage the images for the vote
  c) it couldn't be flagged as automated on dockerhub
  d) It couldn't support the critical updates.


 * Updating existing images (for example in case of an ssl bug, rebuild
all the existing images with exactly the same payload but updated base
image/os environment)

 * Creating image for older releases (We would like to provide images,
for hadoop 2.6/2.7/2.7/2.8/2.9. Especially for doing automatic testing
with different versions).


{code}

Point a) can be solved (as [~eyang] suggested) by using a personal docker image 
during the vote and publishing it to dockerhub after the vote (in case the 
permission can be set by INFRA).

Note: based on LEGAL-270 and the linked discussion, both approaches (inline 
build process / external build process) are compatible with the apache release.

[jira] [Comment Edited] (HADOOP-16092) Move the source of hadoop/ozone containers to the same repository

2019-02-11 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765128#comment-16765128
 ] 

Elek, Marton edited comment on HADOOP-16092 at 2/11/19 4:30 PM:


Thanks for the comment, [~eyang].

It works together with HADOOP-16091, but it's independent of HADOOP-16091: not 
all of the containers can be created from maven. E.g. we have 
build/base-runner images.

Until now there was a strong limitation on dockerhub: if you defined a 
branch->dockertag mapping, it was not possible to use a backreference of the 
regular expression.

Let's say we have the following branch-tag mapping for the same repository:
||Branch name||container tag||
|hadooprunner-(.*)|{sourceref}|
|hadoop-(.*)|{sourceref}|

With these settings, a hadoop-2.7.0 branch was used to create a docker image 
with the tag hadoop-2.7.0; instead of apache/hadoop:2.7.0 we got 
apache/hadoop:hadoop-2.7.0. As a workaround we started to use a fixed mapping 
without regular expressions, but it made it very hard to support multiple 
versions (that's the reason why we have only hadoop:2 and hadoop:3).

I tested it again recently, and with the latest dockerhub version there is 
fortunately no such limitation. Now we can also use the \{\1} ref.
||Branch name||container tag||
|hadooprunner-(.*)|{\1}|
|hadoop-(.*)|{\1}|

Long story short, with this improvement we can move all the ozone images (from 
hadoop-docker-ozone repo) and hadoop images (from 3 branches of hadoop repo) to 
the same dedicated repository (hadoop-docker).

The hadoop-docker repository is not yet created, but if you create it via self 
service (it can be created only by members), I would be happy to move the 
existing images there under this jira.
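
To illustrate the mapping, group(1) in the standalone sketch below plays the 
role of dockerhub's \{\1} back-reference (illustrative only, not dockerhub 
code):

{code:java}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagMappingDemo {
  public static void main(String[] args) {
    Matcher m = Pattern.compile("hadoop-(.*)").matcher("hadoop-2.7.0");
    if (m.matches()) {
      // Prints apache/hadoop:2.7.0 instead of apache/hadoop:hadoop-2.7.0.
      System.out.println("apache/hadoop:" + m.group(1));
    }
  }
}
{code}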


was (Author: elek):
Thanks for the comment, [~eyang].

It works together with HADOOP-16091, but it's independent of HADOOP-16091: not 
all of the containers can be created from maven. E.g. we have 
build/base-runner images.

Until now there was a strong limitation on dockerhub: if you defined a 
branch->dockertag mapping, it was not possible to use a backreference of the 
regular expression.

Let's say we have the following branch-tag mapping for the same repository:
||Branch name||container tag||
|hadooprunner-(.*)|{sourceref}|
|ozonerunner-(.*)|{sourceref}|
|hadoop-(.*)|{sourceref}|
|ozone-(.*)|{sourceref}|

With these settings, a hadoop-2.7.0 branch was used to create a docker image 
with the tag hadoop-2.7.0; instead of apache/hadoop:2.7.0 we got 
apache/hadoop:hadoop-2.7.0. As a workaround we started to use a fixed mapping 
without regular expressions, but it made it very hard to support multiple 
versions (that's the reason why we have only hadoop:2 and hadoop:3).

I tested it again recently, and with the latest dockerhub version there is 
fortunately no such limitation. Now we can also use the \{\1} ref.
||Branch name||container tag||
|hadooprunner-(.*)|{\1}|
|ozonerunner-(.*)|{\1}|
|hadoop-(.*)|{\1}|
|ozone-(.*)|{\1}|

Long story short, with this improvement we can move all the ozone images (from 
hadoop-docker-ozone repo) and hadoop images (from 3 branches of hadoop repo) to 
the same dedicated repository (hadoop-docker).

The hadoop-docker repository is not yet created, but if you create it via self 
service, I would be happy to move the existing images there under this jira.

> Move the source of hadoop/ozone containers to the same repository
> -
>
> Key: HADOOP-16092
> URL: https://issues.apache.org/jira/browse/HADOOP-16092
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Priority: Major
>
> This is proposed by [~eyang] in 
> [this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
>  mailing thread.
> bq. Hadoop community can decide what is best for Hadoop.  My preference is to 
> remove ozone from source tree naming, if Ozone is intended to be subproject 
> of Hadoop for long period of time.  This enables Hadoop community to host 
> docker images for various subproject without having to check out several 
> source tree to trigger a grand build
> As of now the sources of the hadoop docker images are stored in the hadoop 
> git repository (docker-* branches) for hadoop and in the hadoop-docker-ozone 
> git repository for ozone (all branches).
> As discussed in HDDS-851, the biggest challenge to solve here is the 
> mapping between git branches and dockerhub tags: it's not possible to use the 
> captured part of a github branch name.
> For example it's not possible to define a rule to build all the ozone-(.*) 
> branches and use tag $1 for them. Without this support we need to create a 
> new mapping for every release manually (with the help of INFRA).
> Note: HADOOP-16091 can solve this problem as it doesn't require branch 
> mapping any more.

[jira] [Commented] (HADOOP-16092) Move the source of hadoop/ozone containers to the same repository

2019-02-11 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765128#comment-16765128
 ] 

Elek, Marton commented on HADOOP-16092:
---

Thanks for the comment, [~eyang].

It works together with HADOOP-16091, but it's independent of HADOOP-16091: not 
all of the containers can be created from maven. E.g. we have 
build/base-runner images.

Until now there was a strong limitation on dockerhub: if you defined a 
branch->dockertag mapping, it was not possible to use a backreference of the 
regular expression.

Let's say we have the following branch-tag mapping for the same repository:
||Branch name||container tag||
|hadooprunner-(.*)|{sourceref}|
|ozonerunner-(.*)|{sourceref}|
|hadoop-(.*)|{sourceref}|
|ozone-(.*)|{sourceref}|

With these settings, a hadoop-2.7.0 branch was used to create a docker image 
with the tag hadoop-2.7.0; instead of apache/hadoop:2.7.0 we got 
apache/hadoop:hadoop-2.7.0. As a workaround we started to use a fixed mapping 
without regular expressions, but it made it very hard to support multiple 
versions (that's the reason why we have only hadoop:2 and hadoop:3).

I tested it again recently, and with the latest dockerhub version there is 
fortunately no such limitation. Now we can also use the \{\1} ref.
||Branch name||container tag||
|hadooprunner-(.*)|{\1}|
|ozonerunner-(.*)|{\1}|
|hadoop-(.*)|{\1}|
|ozone-(.*)|{\1}|

Long story short, with this improvement we can move all the ozone images (from 
hadoop-docker-ozone repo) and hadoop images (from 3 branches of hadoop repo) to 
the same dedicated repository (hadoop-docker).

The hadoop-docker repository is not yet created, but if you create it via self 
service, I would be happy to move the existing images there under this jira.

> Move the source of hadoop/ozone containers to the same repository
> -
>
> Key: HADOOP-16092
> URL: https://issues.apache.org/jira/browse/HADOOP-16092
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Priority: Major
>
> This is proposed by [~eyang] in 
> [this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
>  mailing thread.
> bq. Hadoop community can decide what is best for Hadoop.  My preference is to 
> remove ozone from source tree naming, if Ozone is intended to be subproject 
> of Hadoop for long period of time.  This enables Hadoop community to host 
> docker images for various subproject without having to check out several 
> source tree to trigger a grand build
> As of now the sources of the hadoop docker images are stored in the hadoop 
> git repository (docker-* branches) for hadoop and in the hadoop-docker-ozone 
> git repository for ozone (all branches).
> As discussed in HDDS-851, the biggest challenge to solve here is the 
> mapping between git branches and dockerhub tags: it's not possible to use the 
> captured part of a github branch name.
> For example it's not possible to define a rule to build all the ozone-(.*) 
> branches and use tag $1 for them. Without this support we need to create a 
> new mapping for every release manually (with the help of INFRA).
> Note: HADOOP-16091 can solve this problem as it doesn't require branch 
> mapping any more.






[jira] [Commented] (HADOOP-14163) Refactor existing hadoop site to use more usable static website generator

2019-02-07 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16762945#comment-16762945
 ] 

Elek, Marton commented on HADOOP-14163:
---

Thank you very much for the reminder, [~xkrogen]. I was not aware of the 
HowToCommit page, only of HowToRelease.

Now I have updated the HowToCommit page as well.

> Refactor existing hadoop site to use more usable static website generator
> -
>
> Key: HADOOP-14163
> URL: https://issues.apache.org/jira/browse/HADOOP-14163
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: site
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HADOOP-14163-001.zip, HADOOP-14163-002.zip, 
> HADOOP-14163-003.zip, HADOOP-14163.004.patch, HADOOP-14163.005.patch, 
> HADOOP-14163.006.patch, HADOOP-14163.007.patch, HADOOP-14163.008.tar.gz, 
> HADOOP-14163.009.patch, HADOOP-14163.009.tar.gz, HADOOP-14163.010.tar.gz, 
> hadoop-site.tar.gz, hadop-site-rendered.tar.gz
>
>
> From the dev mailing list:
> "Publishing can be attacked via a mix of scripting and revamping the darned 
> website. Forrest is pretty bad compared to the newer static site generators 
> out there (e.g. need to write XML instead of markdown, it's hard to review a 
> staging site because of all the absolute links, hard to customize, did I 
> mention XML?), and the look and feel of the site is from the 00s. We don't 
> actually have that much site content, so it should be possible to migrate to 
> a new system."
> This issue is find a solution to migrate the old site to a new modern static 
> site generator using a more contemprary theme.
> Goals: 
>  * existing links should work (or at least redirected)
>  * It should be easy to add more content required by a release automatically 
> (most probably with creating separated markdown files)






[jira] [Created] (HADOOP-16092) Move the source of hadoop/ozone containers to the same repository

2019-02-05 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16092:
-

 Summary: Move the source of hadoop/ozone containers to the same 
repository
 Key: HADOOP-16092
 URL: https://issues.apache.org/jira/browse/HADOOP-16092
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton


This is proposed by [~eyang] in 
[this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
 mailing thread.

bq. Hadoop community can decide what is best for Hadoop.  My preference is to 
remove ozone from source tree naming, if Ozone is intended to be subproject of 
Hadoop for long period of time.  This enables Hadoop community to host docker 
images for various subproject without having to check out several source tree 
to trigger a grand build

As of now the sources of the hadoop docker images are stored in the hadoop git 
repository (docker-* branches) for hadoop, and in the hadoop-docker-ozone git 
repository for ozone (all branches).

As discussed in HDDS-851, the biggest challenge to solve here is the mapping 
between git branches and dockerhub tags: it's not possible to use the captured 
part of a github branch name.

For example it's not possible to define a rule to build all the ozone-(.*) 
branches and use tag $1 for them. Without this support we need to create a new 
mapping for every release manually (with the help of INFRA).

Note: HADOOP-16091 can solve this problem as it doesn't require branch mapping 
any more.






[jira] [Created] (HADOOP-16091) Create hadoop/ozone docker images with inline build process

2019-02-05 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16091:
-

 Summary: Create hadoop/ozone docker images with inline build 
process
 Key: HADOOP-16091
 URL: https://issues.apache.org/jira/browse/HADOOP-16091
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Elek, Marton


This is proposed by [~eyang] in 
[this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
 mailing thread.

bq. 1, 3. There are 38 Apache projects hosting docker images on Docker hub 
using Apache Organization.  By browsing Apache github mirror.  There are only 7 
projects using a separate repository for docker image build.  Popular projects 
official images are not from Apache organization, such as zookeeper, tomcat, 
httpd.  We may not disrupt what other Apache projects are doing, but it looks 
like inline build process is widely employed by majority of projects such as 
Nifi, Brooklyn, thrift, karaf, syncope and others.  The situation seems a bit 
chaotic for Apache as a whole.  However, Hadoop community can decide what is 
best for Hadoop.  My preference is to remove ozone from source tree naming, if 
Ozone is intended to be subproject of Hadoop for long period of time.  This 
enables Hadoop community to host docker images for various subproject without 
having to check out several source tree to trigger a grand build.  However, 
inline build process seems more popular than separated process.  Hence, I 
highly recommend making docker build inline if possible.

The main challenges are also discussed in the thread:

{code}
3. Technically it would be possible to add the Dockerfile to the source
tree and publish the docker image together with the release by the
release manager but it's also problematic:

{code}
  a) there is no easy way to stage the images for the vote
  c) it couldn't be flagged as automated on dockerhub
  d) It couldn't support the critical updates.


 * Updating existing images (for example in case of an ssl bug, rebuild
all the existing images with exactly the same payload but updated base
image/os environment)

 * Creating image for older releases (We would like to provide images,
for hadoop 2.6/2.7/2.7/2.8/2.9. Especially for doing automatic testing
with different versions).


{code}

Point a) can be solved (as [~eyang] suggested) by using a personal docker image 
during the vote and publishing it to dockerhub after the vote (in case the 
permission can be set by INFRA).

Note: based on LEGAL-270 and the linked discussion, both approaches (inline 
build process / external build process) are compatible with the apache release.

Note: HDDS-851 and HADOOP-14898 contain more information about these problems.






[jira] [Commented] (HADOOP-15566) Remove HTrace support

2019-01-26 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16752975#comment-16752975
 ] 

Elek, Marton commented on HADOOP-15566:
---

Thanks for the idea, [~cmccabe]. It's interesting.

Correct me if I am wrong, but as I see it, HTrace is not designed to be 
extensible. For example, Span is an interface, but Tracer always creates the 
MilliSpan implementation. To use HTrace as a lightweight layer and support 
multiple tracing implementations (such as opentracing or opencensus) we need to 
refactor the HTrace code. I have two problems with this approach:

 1) The new, refactored HTrace won't be compatible with the old HTrace. It 
would be hard to support the old HTrace.

 2) It would be equivalent to resurrecting HTrace, which has voted to retire. 
(The same thing can be done without importing the HTrace code into Hadoop, by 
refactoring it on the HTrace side.)

But the concern about creating a new layer is valid (even if Cassandra also 
followed this approach, as @mck wrote). For me it's hard to compare the 
complexity of maintaining our own lightweight abstraction layer and maintaining 
HTrace. (Even if the first one seems to be easier.)

I think the real alternative here is just to use OpenTracing (despite the 
concerns about its governance raised by [~michaelsembwever]) and to follow the 
approach which was prototyped by [~jojochuang], [~fabbri], and [~rizaon].

Or (as a first step) it could be added to the existing HTrace code, 
side-by-side, to evaluate it.
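
As a minimal sketch, such a side-by-side layer could be as small as the 
following (the names here are assumptions, not an existing Hadoop API):

{code:java}
/**
 * Hypothetical vendor-neutral tracing facade; backends such as OpenTracing
 * or OpenCensus would provide the implementations.
 */
public interface LightweightTracer {

  Span startSpan(String name);

  /** Try-with-resources friendly span abstraction. */
  interface Span extends AutoCloseable {
    Span annotate(String message);

    @Override
    void close(); // narrowed: no checked exception
  }
}
{code}
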
> Remove HTrace support
> -
>
> Key: HADOOP-15566
> URL: https://issues.apache.org/jira/browse/HADOOP-15566
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 3.1.0
>Reporter: Todd Lipcon
>Priority: Major
>  Labels: security
> Attachments: Screen Shot 2018-06-29 at 11.59.16 AM.png, 
> ss-trace-s3a.png
>
>
> The HTrace incubator project has voted to retire itself and won't be making 
> further releases. The Hadoop project currently has various hooks with HTrace. 
> It seems in some cases (eg HDFS-13702) these hooks have had measurable 
> performance overhead. Given these two factors, I think we should consider 
> removing the HTrace integration. If there is someone willing to do the work, 
> replacing it with OpenTracing might be a better choice since there is an 
> active community.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16064) Load configuration values from external sources

2019-01-22 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-16064:
--
Attachment: HADOOP-16064.001.patch

> Load configuration values from external sources
> ---
>
> Key: HADOOP-16064
> URL: https://issues.apache.org/jira/browse/HADOOP-16064
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HADOOP-16064.001.patch
>
>
> This is a proposal to improve the Configuration.java to load configuration 
> from external sources (kubernetes config map, external http request, any 
> cluster manager like ambari, etc.)
> I will attach a patch to illustrate the proposed solution, but please comment 
> the concept first, the patch is just poc and not fully implemented.
> *Goals:*
>  * Load the configuration files (core-site.xml/hdfs-site.xml/...) from 
> external locations instead of the classpath (classpath remains the default)
>  * Make the configuration loading extensible
>  * Make it backward-compatible, with minimal change in the existing 
> Configuration.java
> *Use-cases:*
>  1.) load configuration from the namenode ([http://namenode:9878/conf]). With 
> this approach only the namenode should be configured, other components 
> require only the url of the namenode
>  2.) Read configuration directly from kubernetes config-map (or mesos)
>  3.) Read configuration from any external cluster management (such as Apache 
> Ambari or any equivalent)
>  4.) as of now in the hadoop docker images we transform environment variables 
> (such as HDFS-SITE.XML_fs.defaultFs) to configuration xml files with the help 
> of a python script. With the proposed implementation it would be possible to 
> read the configuration directly from the system environment variables.
> *Problem:*
> The existing Configuration.java can read configuration from multiple sources. 
> But most of the time it's used to load predefined config names 
> ("core-site.xml" and "hdfs-site.xml") without configuration location. In this 
> case the files will be loaded from the classpath.
> I propose to add an additional option to define the default location of 
> core-site.xml and hdfs-site.xml (any configuration which is defined by string 
> name) and to use external sources instead of the classpath.
> The configuration loading requires implementation + configuration (where are 
> the external configs). We can't use regular configuration to configure the 
> config loader (chicken/egg).
> I propose to use a new environment variable HADOOP_CONF_SOURCE
> The environment variable could contain a URL, where the schema of the url can 
> define the config source and all the other parts can configure the access to 
> the resource.
> Examples:
> HADOOP_CONF_SOURCE=hadoop-[http://namenode:9878/conf]
> HADOOP_CONF_SOURCE=env://prefix
> HADOOP_CONF_SOURCE=k8s://config-map-name
> The ConfigurationSource interface can be as easy as:
> {code:java}
> /**
>  * Interface to load hadoop configuration from custom location.
>  */
> public interface ConfigurationSource {
>   /**
>    * Method will be called once with the defined configuration url.
>    *
>    * @param uri
>    */
>   void initialize(URI uri) throws IOException;
>   /**
>    * Method will be called to load a specific configuration resource.
>    *
>    * @param name of the configuration resource (eg. hdfs-site.xml)
>    * @return List of loaded configuration keys and values.
>    */
>   List readConfiguration(String name);
> }{code}
> We can choose the right implementation based on the schema of the uri with the 
> Java Service Provider Interface mechanism 
> (META-INF/services/org.apache.hadoop.conf.ConfigurationSource)
> It can be done with minimal modification in the Configuration.java (see the 
> attached patch as an example).
>  The patch contains two example implementations:
> *hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/location/Env.java*
> This can load configuration from environment variables based on a naming 
> convention (eg. HDFS-SITE.XML_hdfs.dfs.key=value)
> *hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/location/HadoopWeb.java*
>  This implementation can load the configuration from a /conf servlet of any 
> Hadoop components.
>  
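
To make the example above more concrete, here is a minimal sketch of an env:// 
backed source. This is a hypothetical illustration only: the 
EnvConfigurationSource class name and the Map.Entry element type are assumptions 
(the interface above elides the generic parameter of the returned List).

{code:java}
import java.net.URI;
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical env:// implementation of the proposed interface: variables
 * named like HDFS-SITE.XML_dfs.replication=3 are returned as entries of
 * the "hdfs-site.xml" resource.
 */
public class EnvConfigurationSource implements ConfigurationSource {

  private Map<String, String> env;

  @Override
  public void initialize(URI uri) {
    // For env:// everything is read from the process environment.
    env = System.getenv();
  }

  @Override
  public List<Map.Entry<String, String>> readConfiguration(String name) {
    String prefix = name.toUpperCase() + "_"; // eg. "HDFS-SITE.XML_"
    List<Map.Entry<String, String>> entries = new ArrayList<>();
    for (Map.Entry<String, String> e : env.entrySet()) {
      if (e.getKey().startsWith(prefix)) {
        entries.add(new SimpleEntry<>(
            e.getKey().substring(prefix.length()), e.getValue()));
      }
    }
    return entries;
  }
}
{code}

The matching implementation could then be selected with java.util.ServiceLoader 
based on the scheme of HADOOP_CONF_SOURCE, which is the META-INF/services 
mechanism mentioned in the description.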



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16064) Load configuration values from external sources

2019-01-22 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-16064:
--
Description: 
This is a proposal to improve the Configuration.java to load configuration from 
external sources (kubernetes config map, external http request, any cluster 
manager like ambari, etc.)

I will attach a patch to illustrate the proposed solution, but please comment 
the concept first, the patch is just poc and not fully implemented.

*Goals:*
 * Load the configuration files (core-site.xml/hdfs-site.xml/...) from external 
locations instead of the classpath (classpath remains the default)
 * Make the configuration loading extensible
 * Make it backward-compatible, with minimal change in the existing 
Configuration.java

*Use-cases:*

 1.) load configuration from the namenode ([http://namenode:9878/conf]). With 
this approach only the namenode should be configured, other components require 
only the url of the namenode

 2.) Read configuration directly from kubernetes config-map (or mesos)

 3.) Read configuration from any external cluster management (such as Apache 
Ambari or any equivalent)

 4.) as of now in the hadoop docker images we transform environment variables 
(such as HDFS-SITE.XML_fs.defaultFs) to configuration xml files with the help 
of a python script. With the proposed implementation it would be possible to 
read the configuration directly from the system environment variables.

*Problem:*

The existing Configuration.java can read configuration from multiple sources. 
But most of the time it's used to load predefined config names ("core-site.xml" 
and "hdfs-site.xml") without configuration location. In this case the files 
will be loaded from the classpath.

I propose to add an additional option to define the default location of 
core-site.xml and hdfs-site.xml (any configuration which is defined by string 
name) and to use external sources instead of the classpath.

The configuration loading requires implementation + configuration (where are 
the external configs). We can't use regular configuration to configure the 
config loader (chicken/egg).

I propose to use a new environment variable HADOOP_CONF_SOURCE

The environment variable could contain a URL, where the schema of the url can 
define the config source and all the other parts can configure the access to 
the resource.

Examples:

HADOOP_CONF_SOURCE=hadoop-[http://namenode:9878/conf]

HADOOP_CONF_SOURCE=env://prefix

HADOOP_CONF_SOURCE=k8s://config-map-name

The ConfigurationSource interface can be as easy as:
{code:java}
/**
 * Interface to load hadoop configuration from custom location.
 */
public interface ConfigurationSource {

  /**
   * Method will be called once with the defined configuration url.
   *
   * @param uri
   */
  void initialize(URI uri) throws IOException;

  /**
   * Method will be called to load a specific configuration resource.
   *
   * @param name of the configuration resource (eg. hdfs-site.xml)
   * @return List of loaded configuration keys and values.
   */
  List readConfiguration(String name);

}{code}
We can choose the right implementation based on the schema of the uri with the 
Java Service Provider Interface mechanism 
(META-INF/services/org.apache.hadoop.conf.ConfigurationSource)

It can be done with minimal modification in the Configuration.java (see the 
attached patch as an example).

 The patch contains two example implementations:

*hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/location/Env.java*

This can load configuration from environment variables based on a naming 
convention (eg. HDFS-SITE.XML_hdfs.dfs.key=value)

*hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/location/HadoopWeb.java*

 This implementation can load the configuration from a /conf servlet of any 
Hadoop components.

 

  was:
This is a proposal to improve the Configuration.java to load configuration from 
external sources (kubernetes config map, external http request, any cluster 
manager like ambari, etc.)

I will attach a patch to illustrate the proposed solution, but please comment 
the concept first, the patch is just poc and not fully implemented.

*Goals:*
 * Load the configuration files (core-site.xml/hdfs-site.xml/...) from 
external locations instead of the classpath (classpath remains the default)
 * Make the configuration loading extensible
 * Make it backward-compatible, with minimal change in the existing 
Configuration.java

*Use-cases:*

 1.) load configuration from the namenode ([http://namenode:9878/conf]). With 
this approach only the namenode should be configured, other components require 
only the url of the namenode

 2.) Read configuration directly from kubernetes config-map (or mesos)

 3.) Read configuration from any external cluster management (such as Apache 
Ambari or any equivalent)

 4.) as of now in the hadoop docker images we 

[jira] [Created] (HADOOP-16064) Load configuration values from external sources

2019-01-22 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16064:
-

 Summary: Load configuration values from external sources
 Key: HADOOP-16064
 URL: https://issues.apache.org/jira/browse/HADOOP-16064
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Elek, Marton


This is a proposal to improve the Configuration.java to load configuration from 
external sources (kubernetes config map, external http request, any cluster 
manager like ambari, etc.)

I will attach a patch to illustrate the proposed solution, but please comment 
the concept first, the patch is just poc and not fully implemented.

*Goals:*
 * Load the configuration files (core-site.xml/hdfs-site.xml/...) from 
external locations instead of the classpath (classpath remains the default)
 * Make the configuration loading extensible
 * Make it backward-compatible, with minimal change in the existing 
Configuration.java

*Use-cases:*

 1.) load configuration from the namenode ([http://namenode:9878/conf]). With 
this approach only the namenode should be configured, other components require 
only the url of the namenode

 2.) Read configuration directly from kubernetes config-map (or mesos)

 3.) Read configuration from any external cluster management (such as Apache 
Ambari or any equivalent)

 4.) as of now in the hadoop docker images we transform environment variables 
(such as HDFS-SITE.XML_fs.defaultFs) to configuration xml files with the help 
of a python script. With the proposed implementation it would be possible to 
read the configuration directly from the system environment variables.

*Problem:*

The existing Configuration.java can read configuration from multiple sources. 
But most of the time it's used to load predefined config names ("core-site.xml" 
and "hdfs-site.xml") without configuration location. In this case the files 
will be loaded from the classpath.

I propose to add an additional option to define the default location of 
core-site.xml and hdfs-site.xml (any configuration which is defined by string 
name) and to use external sources instead of the classpath.

The configuration loading requires implementation + configuration (where are 
the external configs). We can't use regular configuration to configure the 
config loader (chicken/egg).

I propose to use a new environment variable HADOOP_CONF_SOURCE

The environment variable could contain a URL, where the schema of the url can 
define the config source and all the other parts can configure the access to 
the resource.

Examples:

HADOOP_CONF_SOURCE=hadoop-[http://namenode:9878/conf]

HADOOP_CONF_SOURCE=env://prefix

HADOOP_CONF_SOURCE=k8s://config-map-name

The ConfigurationSource interface can be as easy as:
{code:java}
/**
 * Interface to load hadoop configuration from custom location.
 */
public interface ConfigurationSource {

  /**
   * Method will be called once with the defined configuration url.
   *
   * @param uri
   */
  void initialize(URI uri) throws IOException;

  /**
   * Method will be called to load a specific configuration resource.
   *
   * @param name of the configuration resource (eg. hdfs-site.xml)
   * @return List of loaded configuration keys and values.
   */
  List readConfiguration(String name);

}{code}
We can choose the right implementation based on the schema of the uri with the 
Java Service Provider Interface mechanism 
(META-INF/services/org.apache.hadoop.conf.ConfigurationSource)

It can be done with minimal modification in the Configuration.java (see the 
attached patch as an example).

 The patch contains two example implementations:

*hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/location/Env.java*

This can load configuration from environment variables based on a naming 
convention (eg. HDFS-SITE.XML_hdfs.dfs.key=value)

*hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/location/HadoopWeb.java*

 This implementation can load the configuration from a /conf servlet of any 
Hadoop components.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-16064) Load configuration values from external sources

2019-01-22 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton reassigned HADOOP-16064:
-

Assignee: Elek, Marton

> Load configuration values from external sources
> ---
>
> Key: HADOOP-16064
> URL: https://issues.apache.org/jira/browse/HADOOP-16064
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>
> This is a proposal to improve the Configuration.java to load configuration 
> from external sources (kubernetes config map, external http request, any 
> cluster manager like ambari, etc.)
> I will attach a patch to illustrate the proposed solution, but please comment 
> the concept first, the patch is just poc and not fully implemented.
> *Goals:*
>  * Load the configuration files (core-site.xml/hdfs-site.xml/...) from 
> external locations instead of the classpath (classpath remains the default)
>  * Make the configuration loading extensible
>  * Make it backward-compatible, with minimal change in the existing 
> Configuration.java
> *Use-cases:*
>  1.) load configuration from the namenode ([http://namenode:9878/conf]). With 
> this approach only the namenode should be configured, other components 
> require only the url of the namenode
>  2.) Read configuration directly from kubernetes config-map (or mesos)
>  3.) Read configuration from any external cluster management (such as Apache 
> Ambari or any equivalent)
>  4.) as of now in the hadoop docker images we transform environment variables 
> (such as HDFS-SITE.XML_fs.defaultFs) to configuration xml files with the help 
> of a python script. With the proposed implementation it would be possible to 
> read the configuration directly from the system environment variables.
> *Problem:*
> The existing Configuration.java can read configuration from multiple sources. 
> But most of the time it's used to load predefined config names 
> ("core-site.xml" and "hdfs-site.xml") without configuration location. In this 
> case the files will be loaded from the classpath.
> I propose to add an additional option to define the default location of 
> core-site.xml and hdfs-site.xml (any configuration which is defined by string 
> name) and to use external sources instead of the classpath.
> The configuration loading requires implementation + configuration (where are 
> the external configs). We can't use regular configuration to configure the 
> config loader (chicken/egg).
> I propose to use a new environment variable HADOOP_CONF_SOURCE
> The environment variable could contain a URL, where the schema of the url can 
> define the config source and all the other parts can configure the access to 
> the resource.
> Examples:
> HADOOP_CONF_SOURCE=hadoop-[http://namenode:9878/conf]
> HADOOP_CONF_SOURCE=env://prefix
> HADOOP_CONF_SOURCE=k8s://config-map-name
> The ConfigurationSource interface can be as easy as:
> {code:java}
> /**
>  * Interface to load hadoop configuration from custom location.
>  */
> public interface ConfigurationSource {
>   /**
>    * Method will be called once with the defined configuration url.
>    *
>    * @param uri
>    */
>   void initialize(URI uri) throws IOException;
>   /**
>    * Method will be called to load a specific configuration resource.
>    *
>    * @param name of the configuration resource (eg. hdfs-site.xml)
>    * @return List of loaded configuration keys and values.
>    */
>   List readConfiguration(String name);
> }{code}
> We can choose the right implementation based on the schema of the uri with the 
> Java Service Provider Interface mechanism 
> (META-INF/services/org.apache.hadoop.conf.ConfigurationSource)
> It can be done with minimal modification in the Configuration.java (see the 
> attached patch as an example).
>  The patch contains two example implementations:
> *hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/location/Env.java*
> This can load configuration from environment variables based on a naming 
> convention (eg. HDFS-SITE.XML_hdfs.dfs.key=value)
> *hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/location/HadoopWeb.java*
>  This implementation can load the configuration from a /conf servlet of any 
> Hadoop components.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14898) Create official Docker images for development and testing features

2019-01-22 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-14898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748677#comment-16748677
 ] 

Elek, Marton commented on HADOOP-14898:
---

Created a follow-up HADOOP-16063. Please check if you are interested.

> Create official Docker images for development and testing features 
> ---
>
> Key: HADOOP-14898
> URL: https://issues.apache.org/jira/browse/HADOOP-14898
> Project: Hadoop Common
>  Issue Type: New Feature
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HADOOP-14898.001.tar.gz, HADOOP-14898.002.tar.gz, 
> HADOOP-14898.003.tgz, docker_design.pdf
>
>
> This is the original mail from the mailing list:
> {code}
> TL;DR: I propose to create official hadoop images and upload them to the 
> dockerhub.
> GOAL/SCOPE: I would like to improve the existing documentation with easy-to-use 
> docker based recipes to start hadoop clusters with various configurations.
> The images also could be used to test experimental features. For example 
> ozone could be tested easily with these compose file and configuration:
> https://gist.github.com/elek/1676a97b98f4ba561c9f51fce2ab2ea6
> Or even the configuration could be included in the compose file:
> https://github.com/elek/hadoop/blob/docker-2.8.0/example/docker-compose.yaml
> I would like to create separate example compose files for federation, ha, 
> metrics usage, etc. to make it easier to try out and understand the features.
> CONTEXT: There is an existing Jira 
> https://issues.apache.org/jira/browse/HADOOP-13397
> But it’s about a tool to generate production quality docker images (multiple 
> types, in a flexible way). If there are no objections, I will create a separate issue 
> to create simplified docker images for rapid prototyping and investigating 
> new features. And register the branch to the dockerhub to create the images 
> automatically.
> MY BACKGROUND: I have been working with docker based hadoop/spark clusters for 
> quite a while and have run them successfully in different environments 
> (kubernetes, docker-swarm, nomad-based scheduling, etc.). My work is available 
> from here: https://github.com/flokkr but it can handle more complex use cases 
> (eg. instrumenting java processes with btrace, or read/reload configuration 
> from consul).
>  And IMHO in the official hadoop documentation it’s better to suggest to use 
> official apache docker images and not external ones (which could be changed).
> {code}
> The next list will enumerate the key decision points regarding docker 
> image creation
> A. automated dockerhub build  / jenkins build
> Docker images could be built on the dockerhub (a branch pattern should be 
> defined for a github repository and the location of the Docker files) or 
> could be built on a CI server and pushed.
> The second one is more flexible (it's easier to create a matrix build, for 
> example)
> The first one has the advantage that we can get an additional flag on the 
> dockerhub that the build is automated (and built from the source by the 
> dockerhub).
> The decision is easy as ASF supports the first approach: (see 
> https://issues.apache.org/jira/browse/INFRA-12781?focusedCommentId=15824096=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15824096)
> B. source: binary distribution or source build
> The second question is about creating the docker image. One option is to 
> build the software on the fly during the creation of the docker image the 
> other one is to use the binary releases.
> I suggest to use the second approach as:
> 1. In that case the hadoop:2.7.3 could contain exactly the same hadoop 
> distrubution as the downloadable one
> 2. We don't need to add development tools to the image, so the image can be 
> smaller (which is important as the goal for this image is getting 
> started as fast as possible)
> 3. The docker definition will be simpler (and easier to maintain)
> Usually this approach is used in other projects (I checked Apache Zeppelin 
> and Apache Nutch)
> C. branch usage
> Other question is the location of the Docker file. It could be on the 
> official source-code branches (branch-2, trunk, etc.) or we can create 
> separated branches for the dockerhub (eg. docker/2.7 docker/2.8 docker/3.0)
> For the first approach it's easier to find the docker images, but it's less 
> flexible. For example if we had a Dockerfile on the source code it should 
> be used for every release (for example the Docker file from the tag 
> release-3.0.0 should be used for the 3.0 hadoop docker image). In that case 
> the release process is much harder: in case of a Dockerfile error (which 
> can be tested on dockerhub only after the tagging), a new release should be 
> 

[jira] [Updated] (HADOOP-15259) Provide docker file for the development builds

2019-01-22 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-15259:
--
Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

Closing it as of now as there was no reviewer. Created HADOOP-16063 to do 
similar work.

> Provide docker file for the development builds
> --
>
> Key: HADOOP-15259
> URL: https://issues.apache.org/jira/browse/HADOOP-15259
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HADOOP-15259.001.patch
>
>
> Another use case for using docker images is creating a custom docker image 
> (base image + custom hadoop build). The custom image could be used to easily 
> test the hadoop build on an external dockerized cluster (eg. Kubernetes)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15257) Provide example docker compose file for developer builds

2019-01-22 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-15257:
--
Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

Closing it as of now as there was no reviewer. Created HADOOP-16063 to do 
similar work.

> Provide example docker compose file for developer builds
> 
>
> Key: HADOOP-15257
> URL: https://issues.apache.org/jira/browse/HADOOP-15257
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HADOOP-15257.001.patch
>
>
> This issue is about creating example docker-compose files which use the 
> latest build from the hadoop-dist directory.
> These docker-compose files would help to run a specific hadoop cluster based 
> on the latest custom build without the need to build a customized docker image 
> (by mounting hadoop from hadoop-dist into the container).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15258) Create example docker-compose file for documentations

2019-01-22 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-15258:
--
Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

Closing it as of now as there was no reviewer. Created HADOOP-16063 to do 
similar work.

> Create example docker-compose file for documentations
> -
>
> Key: HADOOP-15258
> URL: https://issues.apache.org/jira/browse/HADOOP-15258
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HADOOP-15258.001.patch
>
>
> Another use case for docker is to use it in the documentation. For example 
> in the HA documentation we can provide an example docker-compose file and 
> configuration with all the required settings to get started easily with 
> an HA cluster.
> 1. I would add an example to a documentation page
> 2. It will use the hadoop3 image (which contains the latest hadoop3) as the 
> user of the documentation may not build hadoop themselves



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-16063) Docker based pseudo-cluster definitions and test scripts for Hdfs/Yarn

2019-01-22 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16063:
-

 Summary: Docker based pseudo-cluster definitions and test scripts 
for Hdfs/Yarn
 Key: HADOOP-16063
 URL: https://issues.apache.org/jira/browse/HADOOP-16063
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Elek, Marton


During the recent releases of Apache Hadoop Ozone we had multiple experiments 
using docker/docker-compose to support the development of ozone.

As of now the hadoop-ozone distribution contains two directories in addition to 
the regular hadoop directories (bin, share/lib, etc.)
h3. compose

The ./compose directory of the distribution contains different types of 
pseudo-cluster definitions. Starting an ozone cluster is as easy as "cd 
compose/ozone && docker-compose up -d"

The clusters can also be scaled up and down (docker-compose scale datanode=3)

There are multiple cluster definitions for different use cases (for example 
ozone+s3 or hdfs+ozone).

The docker-compose files are based on the apache/hadoop-runner image, which is an 
"empty" image. It doesn't contain any hadoop distribution. Instead the current 
hadoop is used (the ../.. is mapped as a volume at /opt/hadoop)

With this approach it's very easy to 1) start a cluster from the distribution 
2) test any patch from the dev tree, as after any build a new cluster can be 
started easily (with multiple nodes and datanodes)
h3. smoketest

We also started to use a simple robotframework based test suite. (see 
./smoketest directory). It's a high level test definition very similar to the 
smoketests which are executed manually by the contributors during a release 
vote.

But it's a formal definition to start clusters from different docker-compose 
definitions and execute simple shell scripts (and compare the output).

 

I believe that both approaches helped a lot during the development of ozone and 
I propose to do the same improvements on the main hadoop distribution.

I propose to provide docker-compose based example cluster definitions for 
yarn/hdfs and for different use cases (simple hdfs, router based federation, 
etc.)

It can help to understand the different configurations and try out new features 
with a predefined config set.

Long term we can also add robottests to help the release votes (basic 
wordcount/mr tests could be scripted)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15205) maven release: missing source attachments for hadoop-mapreduce-client-core

2019-01-11 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16740875#comment-16740875
 ] 

Elek, Marton commented on HADOOP-15205:
---

As I see in the source the -Pdist profile is required for the source upload:

[https://github.com/apache/hadoop/blob/bf08f4abae43d706a305af3f14e00f01c00dba7c/hadoop-project/pom.xml#L1932]

I modified the [release 
guideline|https://wiki.apache.org/hadoop/HowToRelease#preview] to include the dist 
profile in the mvn deploy command execution.
I don't know what the plan is for fixing older releases.
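
For reference, the relevant invocation is roughly the following (a sketch only; 
signing and any other profiles required by the release process are omitted):

{code}
mvn deploy -Pdist -DskipTests
{code}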

 

> maven release: missing source attachments for hadoop-mapreduce-client-core
> --
>
> Key: HADOOP-15205
> URL: https://issues.apache.org/jira/browse/HADOOP-15205
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.9.0, 2.8.2, 2.8.3, 2.7.5, 3.0.0, 3.1.0, 3.0.1, 2.8.4, 
> 2.9.2, 2.8.5
>Reporter: Zoltan Haindrich
>Priority: Major
> Attachments: chk.bash
>
>
> I wanted to use the source attachment; however it looks like since 2.7.5 that 
> artifact is not present at maven central ; it looks like the last release 
> which had source attachments / javadocs was 2.7.4
> http://central.maven.org/maven2/org/apache/hadoop/hadoop-mapreduce-client-core/2.7.4/
> http://central.maven.org/maven2/org/apache/hadoop/hadoop-mapreduce-client-core/2.7.5/
> this seems to be not limited to mapreduce; as the same change is present for 
> yarn-common as well
> http://central.maven.org/maven2/org/apache/hadoop/hadoop-yarn-common/2.7.4/
> http://central.maven.org/maven2/org/apache/hadoop/hadoop-yarn-common/2.7.5/
> and also hadoop-common
> http://central.maven.org/maven2/org/apache/hadoop/hadoop-common/2.7.4/
> http://central.maven.org/maven2/org/apache/hadoop/hadoop-common/2.7.5/
> http://central.maven.org/maven2/org/apache/hadoop/hadoop-common/3.0.0/
> http://central.maven.org/maven2/org/apache/hadoop/hadoop-common/3.1.0/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16003) Migrate the Hadoop jenkins jobs to use new gitbox urls

2019-01-02 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732160#comment-16732160
 ] 

Elek, Marton commented on HADOOP-16003:
---

INFRA ticket is created: INFRA-17526

> Migrate the Hadoop jenkins jobs to use new gitbox urls
> --
>
> Key: HADOOP-16003
> URL: https://issues.apache.org/jira/browse/HADOOP-16003
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>
> As announced by the INFRA team, all the apache git repositories will be 
> migrated to use gitbox. I created this jira to sync on the required steps to 
> update the jenkins jobs, and record the changes.
> By default it could be as simple as changing the git url for all the jenkins 
> jobs under the Hadoop view:
> https://builds.apache.org/view/H-L/view/Hadoop/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-16003) Migrate the Hadoop jenkins jobs to use new gitbox urls

2018-12-30 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton reassigned HADOOP-16003:
-

Assignee: Elek, Marton

> Migrate the Hadoop jenkins jobs to use new gitbox urls
> --
>
> Key: HADOOP-16003
> URL: https://issues.apache.org/jira/browse/HADOOP-16003
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>
> As announced by the INFRA team, all the apache git repositories will be 
> migrated to use gitbox. I created this jira to sync on the required steps to 
> update the jenkins jobs, and record the changes.
> By default it could be as simple as changing the git url for all the jenkins 
> jobs under the Hadoop view:
> https://builds.apache.org/view/H-L/view/Hadoop/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16003) Migrate the Hadoop jenkins jobs to use new gitbox urls

2018-12-30 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16731085#comment-16731085
 ] 

Elek, Marton commented on HADOOP-16003:
---

I updated all the jenkins jobs under the hadoop view and ozone view:

https://builds.apache.org/view/H-L/view/Hadoop/
https://builds.apache.org/view/O/view/Ozone%20(Hadoop%20Ozone)/ 

Except the beam_PerformanceTests_* jobs.

I updated only the git urls at the beginning of the jenkins jobs.

I started a new trunk build:

https://builds.apache.org/view/H-L/view/Hadoop/job/Hadoop-trunk-Commit/15671/

It's in progress but seems to be working.

And reuploaded the patch of HADOOP-15996 to get a precommit result.

This is also an example for precommit + gitbox:

https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-HADOOP-Build/15713/console


> Migrate the Hadoop jenkins jobs to use new gitbox urls
> --
>
> Key: HADOOP-16003
> URL: https://issues.apache.org/jira/browse/HADOOP-16003
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Elek, Marton
>Priority: Major
>
> As announced by the INFRA team, all the apache git repositories will be 
> migrated to use gitbox. I created this jira to sync on the required steps to 
> update the jenkins jobs, and record the changes.
> By default it could be as simple as changing the git url for all the jenkins 
> jobs under the Hadoop view:
> https://builds.apache.org/view/H-L/view/Hadoop/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15996) Plugin interface to support more complex usernames in Hadoop

2018-12-30 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16731083#comment-16731083
 ] 

Elek, Marton commented on HADOOP-15996:
---

I uploaded the number 10 patch again as 11, without any modification, to test the 
git url migration.  (This was the most recent good patch.)

Sorry for the additional noise (an additional precommit report is expected here 
soon...)

> Plugin interface to support more complex usernames in Hadoop
> 
>
> Key: HADOOP-15996
> URL: https://issues.apache.org/jira/browse/HADOOP-15996
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: security
>Reporter: Eric Yang
>Assignee: Bolke de Bruin
>Priority: Major
> Attachments: 0001-HADOOP-15996-Make-auth-to-local-configurable.patch, 
> 0001-Make-allowing-or-configurable.patch, 
> 0001-Simple-trial-of-using-krb5.conf-for-auth_to_local-ru.patch, 
> 0002-HADOOP-15996-Make-auth-to-local-configurable.patch, 
> 0003-HADOOP-15996-Make-auth-to-local-configurable.patch, 
> 0004-HADOOP-15996-Make-auth-to-local-configurable.patch, 
> 0005-HADOOP-15996-Make-auth-to-local-configurable.patch, 
> HADOOP-15996.0005.patch, HADOOP-15996.0006.patch, HADOOP-15996.0007.patch, 
> HADOOP-15996.0008.patch, HADOOP-15996.0009.patch, HADOOP-15996.0010.patch, 
> HADOOP-15996.0011.patch
>
>
> Hadoop does not allow support of @ character in username in recent security 
> mailing list vote to revert HADOOP-12751.  Hadoop auth_to_local rule must 
> match to authorize user to login to Hadoop cluster.  This design does not 
> work well in a multi-realm environment where identical usernames between two 
> realms do not map to the same user.  There is also a possibility that a lossy 
> regex can incorrectly map users.  In the interest of supporting multi-realms, 
> it may be preferred to pass the principal name without rewrite to uniquely 
> distinguish users.  This jira is to revisit if Hadoop can support full 
> principal names without rewrite and provide a plugin to override Hadoop's 
> default implementation of auth_to_local for multi-realm use case.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15996) Plugin interface to support more complex usernames in Hadoop

2018-12-30 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-15996:
--
Attachment: HADOOP-15996.0011.patch

> Plugin interface to support more complex usernames in Hadoop
> 
>
> Key: HADOOP-15996
> URL: https://issues.apache.org/jira/browse/HADOOP-15996
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: security
>Reporter: Eric Yang
>Assignee: Bolke de Bruin
>Priority: Major
> Attachments: 0001-HADOOP-15996-Make-auth-to-local-configurable.patch, 
> 0001-Make-allowing-or-configurable.patch, 
> 0001-Simple-trial-of-using-krb5.conf-for-auth_to_local-ru.patch, 
> 0002-HADOOP-15996-Make-auth-to-local-configurable.patch, 
> 0003-HADOOP-15996-Make-auth-to-local-configurable.patch, 
> 0004-HADOOP-15996-Make-auth-to-local-configurable.patch, 
> 0005-HADOOP-15996-Make-auth-to-local-configurable.patch, 
> HADOOP-15996.0005.patch, HADOOP-15996.0006.patch, HADOOP-15996.0007.patch, 
> HADOOP-15996.0008.patch, HADOOP-15996.0009.patch, HADOOP-15996.0010.patch, 
> HADOOP-15996.0011.patch
>
>
> Hadoop does not allow support of @ character in username in recent security 
> mailing list vote to revert HADOOP-12751.  Hadoop auth_to_local rule must 
> match to authorize user to login to Hadoop cluster.  This design does not 
> work well in a multi-realm environment where identical usernames between two 
> realms do not map to the same user.  There is also a possibility that a lossy 
> regex can incorrectly map users.  In the interest of supporting multi-realms, 
> it may be preferred to pass the principal name without rewrite to uniquely 
> distinguish users.  This jira is to revisit if Hadoop can support full 
> principal names without rewrite and provide a plugin to override Hadoop's 
> default implementation of auth_to_local for multi-realm use case.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16009) Replace the url of the repository in Apache Hadoop source code

2018-12-30 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16731081#comment-16731081
 ] 

Elek, Marton commented on HADOOP-16009:
---

+1 LGTM.

The migration has been finished. I am updating the jenkins jobs right now.

I will upload this patch again to test the precommit build and commit after 
that. 

> Replace the url of the repository in Apache Hadoop source code
> --
>
> Key: HADOOP-16009
> URL: https://issues.apache.org/jira/browse/HADOOP-16009
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
> Fix For: 2.10.0, 2.7.8, 3.0.4, 3.3.0, 3.1.2, 2.8.6, 3.2.1, 2.9.3
>
> Attachments: HADOOP-16009.01.patch
>
>
> This issue is for the source code change in Apache Hadoop repository.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (HADOOP-16009) Replace the url of the repository in Apache Hadoop source code

2018-12-30 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-16009:
--
Comment: was deleted

(was: +1 LGTM.

The migration has been finished. I am updating the jenkins jobs right now.

I will upload this patch again to test the precommit build and commit after 
that. )

> Replace the url of the repository in Apache Hadoop source code
> --
>
> Key: HADOOP-16009
> URL: https://issues.apache.org/jira/browse/HADOOP-16009
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
> Fix For: 2.10.0, 2.7.8, 3.0.4, 3.3.0, 3.1.2, 2.8.6, 3.2.1, 2.9.3
>
> Attachments: HADOOP-16009.01.patch
>
>
> This issue is for the source code change in Apache Hadoop repository.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16003) Migrate the Hadoop jenkins jobs to use new gitbox urls

2018-12-20 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725705#comment-16725705
 ] 

Elek, Marton commented on HADOOP-16003:
---

Anytime is good for me. I propose to do it on the weekend of 29/30 of Dec. 
Commit bandwidth is usually lower because of the holidays in the western 
world, and we have time to announce it on the mailing list. If 29/30 doesn't 
work for INFRA, 31/12 is also ok. I am in UTC+1 and the best time for me is the 
night, so my proposal is 21:00 UTC on the 30th of December, if it's fine with 
you and INFRA.

> Migrate the Hadoop jenkins jobs to use new gitbox urls
> --
>
> Key: HADOOP-16003
> URL: https://issues.apache.org/jira/browse/HADOOP-16003
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Elek, Marton
>Priority: Major
>
> As announced by the INFRA team, all the apache git repositories will be 
> migrated to use gitbox. I created this jira to sync on the required steps to 
> update the jenkins jobs, and record the changes.
> By default it could be as simple as changing the git url for all the jenkins 
> jobs under the Hadoop view:
> https://builds.apache.org/view/H-L/view/Hadoop/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-16003) Migrate the Hadoop jenkins jobs to use new gitbox urls

2018-12-13 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-16003:
-

 Summary: Migrate the Hadoop jenkins jobs to use new gitbox urls
 Key: HADOOP-16003
 URL: https://issues.apache.org/jira/browse/HADOOP-16003
 Project: Hadoop Common
  Issue Type: Task
Reporter: Elek, Marton


As announced by the INFRA team, all the apache git repositories will be 
migrated to use gitbox. I created this jira to sync on the required steps to 
update the jenkins jobs, and record the changes.

By default it could be as simple as changing the git url for all the jenkins 
jobs under the Hadoop view:

https://builds.apache.org/view/H-L/view/Hadoop/




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15566) Remove HTrace support

2018-12-10 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714435#comment-16714435
 ] 

Elek, Marton commented on HADOOP-15566:
---

Thanks [~cmccabe], I agree with your points about the importance of 
compatibility and keeping the htrace support.

My proposal is:

1.) Create a lightweight Hadoop API for tracing where multiple 
implementations can be plugged in

2.) Provide a default implementation which uses the existing htrace code.

Implementation details:

a) Add a new optional bytes field to the RpcHeader. Different tracing 
libraries could require different sizes of serialized context:
{code:java}
diff --git a/hadoop-common-project/hadoop-common/src/main/proto/RpcHeader.proto b/hadoop-common-project/hadoop-common/src/main/proto/RpcHeader.proto
index aa146162896..e42f64eb631 100644
--- a/hadoop-common-project/hadoop-common/src/main/proto/RpcHeader.proto
+++ b/hadoop-common-project/hadoop-common/src/main/proto/RpcHeader.proto
@@ -61,9 +61,9 @@ enum RpcKindProto {
  * what span caused the new span we will create when this message is received.
  */
 message RPCTraceInfoProto {
   optional int64 traceId = 1; // parentIdHigh
   optional int64 parentId = 2; // parentIdLow

+  optional bytes tracingContext = 3; // generic tracing information
 }
{code}
This is a a backward-compatible change.

b) In the rpc Server.java a (htrace) TraceScope is initialized based on the rpc 
header and propagated as part of the RpcCall:
{code:java}
  RpcCall call = new RpcCall(this, header.getCallId(),
  header.getRetryCount(), rpcRequest,
  ProtoUtil.convert(header.getRpcKind()),
  header.getClientId().toByteArray(), traceScope, callerContext);
{code}
I propose to replace this traceScope with a hadoop specific TraceScope marker 
interface. The default implementation could be a simple class which contains 
the htrace implementation.

c) We can create a simple Tracing singleton (similar to the 
DefaultMetricsSystem):

Example call:
{code:java}
  try (TracingSpan context =
      HadoopTracing.INSTANCE.newContext(call.tracingSpan, "RpcServerCall")) {
    if (remoteUser != null) {
      remoteUser.doAs(call);
    } else {
      call.run();
    }
  }
{code}
d) HadoopTracing could be something like this:
{code:java}
package org.apache.hadoop.tracing;

public enum HadoopTracing {
  INSTANCE;

  private TracingProvider provider;

  public TracingSpan importContext(byte[] data) {
    return provider.importContext(data);
  }

  public byte[] exportContext() {
    return provider.exportContext();
  }

  public TracingSpan newContext(String name) {
    return provider.newContext(name);
  }

  public TracingSpan newContext(TracingSpan parentSpan, String name) {
    // delegate to the pluggable provider, same as the other methods
    return provider.newContext(parentSpan, name);
  }
}
{code}
e) We can add multiple TracingProviders (and provide one for Htrace for 
compatibility reasons.)

+1. Personally I prefer to use some utility which adds trace support to 
annotated methods. It could simplify the usage of the tracing but requires a 
java proxy. But this is an independent question.
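
For illustration, here is a minimal sketch of the provider side. All names are 
assumptions derived from the HadoopTracing calls above (including the marker 
interface and the service file location); none of this is committed code.

{code:java}
import java.util.ServiceLoader;

/** Marker for an open span; closeable so it works in try-with-resources. */
interface TracingSpan extends AutoCloseable {
  @Override
  void close(); // narrowed: closing a span should not throw
}

/**
 * Hypothetical SPI behind the HadoopTracing singleton: htrace, opentracing,
 * etc. bindings would each ship one implementation plus a
 * META-INF/services/org.apache.hadoop.tracing.TracingProvider entry.
 */
public interface TracingProvider {

  /** Rebuild a span context from the bytes received in the RPC header. */
  TracingSpan importContext(byte[] data);

  /** Serialize the current span context for the RPC header. */
  byte[] exportContext();

  /** Start a new root span. */
  TracingSpan newContext(String name);

  /** Start a child span of the given parent. */
  TracingSpan newContext(TracingSpan parentSpan, String name);

  /** Pick the first provider found on the classpath. */
  static TracingProvider load() {
    for (TracingProvider provider : ServiceLoader.load(TracingProvider.class)) {
      return provider;
    }
    throw new IllegalStateException("No TracingProvider on the classpath");
  }
}
{code}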

> Remove HTrace support
> -
>
> Key: HADOOP-15566
> URL: https://issues.apache.org/jira/browse/HADOOP-15566
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 3.1.0
>Reporter: Todd Lipcon
>Priority: Major
>  Labels: security
> Attachments: Screen Shot 2018-06-29 at 11.59.16 AM.png, 
> ss-trace-s3a.png
>
>
> The HTrace incubator project has voted to retire itself and won't be making 
> further releases. The Hadoop project currently has various hooks with HTrace. 
> It seems in some cases (eg HDFS-13702) these hooks have had measurable 
> performance overhead. Given these two factors, I think we should consider 
> removing the HTrace integration. If there is someone willing to do the work, 
> replacing it with OpenTracing might be a better choice since there is an 
> active community.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15256) Create docker images for latest stable hadoop3 build

2018-12-06 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-15256:
--
  Resolution: Fixed
Target Version/s:   (was: 3.3.0)
  Status: Resolved  (was: Patch Available)

I can confirm that it's working with the latest patches from HDDS-524. The 
apache/hadoop:3 image is used in the smoke tests of Hadoop Ozone. Closing 
this issue.

> Create docker images for latest stable hadoop3 build
> 
>
> Key: HADOOP-15256
> URL: https://issues.apache.org/jira/browse/HADOOP-15256
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HADOOP-15256-docker-hadoop-3.001.patch, 
> HADOOP-15256-docker-hadoop-3.003.patch
>
>
> Similar to the hadoop2 image we can provide a developer hadoop image which 
> contains the latest hadoop from the binary release.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15358) SFTPConnectionPool connections leakage

2018-11-23 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-15358:
--
Fix Version/s: 3.3.0

> SFTPConnectionPool connections leakage
> --
>
> Key: HADOOP-15358
> URL: https://issues.apache.org/jira/browse/HADOOP-15358
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.0.0
>Reporter: Mikhail Pryakhin
>Assignee: Mikhail Pryakhin
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: HADOOP-15358.001.patch
>
>
> Methods of SFTPFileSystem operate on poolable ChannelSftp instances, thus 
> some methods of SFTPFileSystem are chained together resulting in establishing 
> multiple connections to the SFTP server to accomplish one compound action, 
> those methods are listed below:
>  # mkdirs method
> the public mkdirs method acquires a new ChannelSftp from the pool [1]
> and then recursively creates directories, checking for the directory 
> existence beforehand by calling the method exists[2] which delegates to the 
> getFileStatus(ChannelSftp channel, Path file) method [3] and so on until it 
> ends up in returning the FileStatus instance [4]. The resource leakage 
> occurs in the method getWorkingDirectory which calls the getHomeDirectory 
> method [5] which in turn establishes a new connection to the sftp server 
> instead of using an already created connection. As the mkdirs method is 
> recursive this results in creating a huge number of connections.
>  # open method [6]. This method returns an instance of FSDataInputStream 
> which consumes SFTPInputStream instance which doesn't return an acquired 
> ChannelSftp instance back to the pool but instead it closes it[7]. This leads 
> to establishing another connection to an SFTP server when the next method is 
> called on the FileSystem instance.
> [1] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L658
> [2] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L321
> [3] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L202
> [4] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L290
> [5] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L640
> [6] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L504
> [7] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPInputStream.java#L123
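
For the records, the fix pattern suggested by the description is to add overloads 
which operate on the channel the caller already acquired, instead of opening a 
new connection inside the helper. A minimal sketch, assuming JSch's ChannelSftp 
API (not the exact patch):

{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.Path;

import com.jcraft.jsch.ChannelSftp;
import com.jcraft.jsch.SftpException;

/**
 * Sketch of the channel-reuse pattern: the recursive mkdirs/exists code path
 * can thread one pooled channel through all the helper calls.
 */
class ChannelReuseSketch {

  Path getHomeDirectory(ChannelSftp channel) throws IOException {
    try {
      // Reuses the already open channel: no second connection is made.
      return new Path(channel.getHome());
    } catch (SftpException e) {
      throw new IOException("Failed to get home directory", e);
    }
  }
}
{code}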



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15358) SFTPConnectionPool connections leakage

2018-11-23 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-15358:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to the trunk. Thanks for the contribution and sorry for the delayed 
review.

(I fixed the unused import + line length checkstyle issues during the commit)

> SFTPConnectionPool connections leakage
> --
>
> Key: HADOOP-15358
> URL: https://issues.apache.org/jira/browse/HADOOP-15358
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.0.0
>Reporter: Mikhail Pryakhin
>Assignee: Mikhail Pryakhin
>Priority: Critical
> Attachments: HADOOP-15358.001.patch
>
>
> Methods of SFTPFileSystem operate on poolable ChannelSftp instances, thus 
> some methods of SFTPFileSystem are chained together resulting in establishing 
> multiple connections to the SFTP server to accomplish one compound action, 
> those methods are listed below:
>  # mkdirs method
> the public mkdirs method acquires a new ChannelSftp from the pool [1]
> and then recursively creates directories, checking for the directory 
> existence beforehand by calling the method exists[2] which delegates to the 
> getFileStatus(ChannelSftp channel, Path file) method [3] and so on until it 
> ends up in returning the FilesStatus instance [4]. The resource leakage 
> occurs in the method getWorkingDirectory which calls the getHomeDirectory 
> method [5] which in turn establishes a new connection to the sftp server 
> instead of using an already created connection. As the mkdirs method is 
> recursive this results in creating a huge number of connections.
>  # open method [6]. This method returns an instance of FSDataInputStream 
> which consumes SFTPInputStream instance which doesn't return an acquired 
> ChannelSftp instance back to the pool but instead it closes it[7]. This leads 
> to establishing another connection to an SFTP server when the next method is 
> called on the FileSystem instance.
> [1] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L658
> [2] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L321
> [3] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L202
> [4] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L290
> [5] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L640
> [6] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L504
> [7] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPInputStream.java#L123
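A minimal sketch of the reuse pattern the fix follows, assuming a simplified 
pool and a placeholder SftpChannel type standing in for JSch's ChannelSftp 
(illustrative only, not the actual HADOOP-15358 patch):

{code}
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch only: acquire one pooled channel per public
// operation, thread it through the recursive helper, and return it
// to the pool exactly once. SftpChannel stands in for ChannelSftp.
public class ChannelReuseSketch {

  static class SftpChannel { }

  private final Deque<SftpChannel> pool = new ArrayDeque<>();

  private SftpChannel acquire() {
    SftpChannel c = pool.poll();
    return c != null ? c : new SftpChannel(); // "connect" lazily
  }

  private void release(SftpChannel c) {
    pool.push(c); // back to the pool instead of closing the channel
  }

  // Public API: exactly one acquire/release pair per call.
  public boolean mkdirs(String path) {
    SftpChannel channel = acquire();
    try {
      return mkdirs(channel, path);
    } finally {
      release(channel);
    }
  }

  // The recursive helper reuses the caller's channel instead of
  // acquiring a new connection for every directory level.
  private boolean mkdirs(SftpChannel channel, String path) {
    int slash = path.lastIndexOf('/');
    if (slash > 0) {
      mkdirs(channel, path.substring(0, slash)); // same channel, no leak
    }
    // ... create the directory for 'path' over 'channel' here ...
    return true;
  }
}
{code}

The same principle applies to the open case: on close, the stream should hand 
its channel back to the pool instead of disconnecting it.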



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15358) SFTPConnectionPool connections leakage

2018-11-23 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16696528#comment-16696528
 ] 

Elek, Marton commented on HADOOP-15358:
---

+1. It looks good to me.

Very precise problem definition and a clean implementation with a unit test. 
It's backward compatible, as the old methods (which open the new connections) 
are still there.

The unit test passes, and I tested it with 'dfs ls' and it worked well. I also 
observed the recursive behaviour with the Java debugger.

Will commit it to trunk soon... 

> SFTPConnectionPool connections leakage
> --
>
> Key: HADOOP-15358
> URL: https://issues.apache.org/jira/browse/HADOOP-15358
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.0.0
>Reporter: Mikhail Pryakhin
>Assignee: Mikhail Pryakhin
>Priority: Critical
> Attachments: HADOOP-15358.001.patch
>
>
> Methods of SFTPFileSystem operate on poolable ChannelSftp instances, and 
> some methods of SFTPFileSystem are chained together, resulting in multiple 
> connections to the SFTP server being established to accomplish one compound 
> action. Those methods are listed below:
>  # mkdirs method
> The public mkdirs method acquires a new ChannelSftp from the pool [1]
> and then recursively creates directories, checking for directory existence 
> beforehand by calling the exists method [2], which delegates to the 
> getFileStatus(ChannelSftp channel, Path file) method [3], and so on until it 
> ends up returning the FileStatus instance [4]. The resource leakage occurs 
> in the getWorkingDirectory method, which calls the getHomeDirectory method 
> [5], which in turn establishes a new connection to the SFTP server instead 
> of reusing the already created connection. As the mkdirs method is 
> recursive, this results in a huge number of connections being created.
>  # open method [6]. This method returns an FSDataInputStream instance that 
> wraps an SFTPInputStream instance, which doesn't return the acquired 
> ChannelSftp instance back to the pool but instead closes it [7]. This leads 
> to another connection to the SFTP server being established when the next 
> method is called on the FileSystem instance.
> [1] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L658
> [2] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L321
> [3] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L202
> [4] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L290
> [5] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L640
> [6] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L504
> [7] 
> https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPInputStream.java#L123



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15367) Update the initialization code in the docker hadoop-runner baseimage

2018-11-06 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676681#comment-16676681
 ] 

Elek, Marton commented on HADOOP-15367:
---

It's committed to the docker-hadoop-runner branch. I removed the Fix Version; 
sorry for the confusion.

> Update the initialization code in the docker hadoop-runner baseimage 
> -
>
> Key: HADOOP-15367
> URL: https://issues.apache.org/jira/browse/HADOOP-15367
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HADOOP-15367-hadoop-docker-runner.001.patch, 
> HADOOP-15367-hadoop-docker-runner.002.patch
>
>
> The hadoop-runner baseimage contains initialization code for both the HDFS 
> namenode/datanode and the Ozone/Hdds scm/ksm.
> The script name for Ozone/Hdds has changed (from oz to ozone), therefore we 
> need to update the base image.
> This commit would also be a test for the dockerhub automated build.
> Please apply the patch on top of the _docker-hadoop-runner_ branch. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15367) Update the initialization code in the docker hadoop-runner baseimage

2018-11-06 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-15367:
--
Fix Version/s: (was: 3.2.0)

> Update the initialization code in the docker hadoop-runner baseimage 
> -
>
> Key: HADOOP-15367
> URL: https://issues.apache.org/jira/browse/HADOOP-15367
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HADOOP-15367-hadoop-docker-runner.001.patch, 
> HADOOP-15367-hadoop-docker-runner.002.patch
>
>
> The hadoop-runner baseimage contains initialization code for both the HDFS 
> namenode/datanode and the Ozone/Hdds scm/ksm.
> The script name for Ozone/Hdds has changed (from oz to ozone), therefore we 
> need to update the base image.
> This commit would also be a test for the dockerhub automated build.
> Please apply the patch on top of the _docker-hadoop-runner_ branch. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15339) Support additional key/value properties in JMX bean registration

2018-11-02 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672886#comment-16672886
 ] 

Elek, Marton commented on HADOOP-15339:
---

Thanks, [~anu]. Committed to branch-3.1. The next 3.1.x release (3.1.2) will 
be compatible with the Ozone UI features.

> Support additional key/value properties in JMX bean registration
> -
>
> Key: HADOOP-15339
> URL: https://issues.apache.org/jira/browse/HADOOP-15339
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HADOOP-15339-branch-3.1.004.patch, 
> HADOOP-15339.001.patch, HADOOP-15339.002.patch, HADOOP-15339.003.patch
>
>
> org.apache.hadoop.metrics2.util.MBeans.register is a utility function to 
> register objects to the JMX registry with a given name prefix and name.
> JMX supports additional key/value pairs which can be part of the address of 
> the JMX bean. For example: 
> _java.lang:type=MemoryManager,name=CodeCacheManager_
> Using such tags we can query a group of MBeans; for example, we can add the 
> same tag to similar MBeans from the namenode and the datanode.
> This patch adds a small modification to support custom key/value pairs and 
> also introduces a new unit test for the MBeans utility, which was missing 
> until now.
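
For context, a minimal plain-JMX sketch of the kind of tagged registration 
this enables; it uses only the standard javax.management API (not Hadoop's 
MBeans helper), and the domain and property names below are illustrative:

{code}
import java.lang.management.ManagementFactory;
import java.util.Hashtable;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class JmxTagSketch {

  // Standard MBean pattern: interface name = class name + "MBean".
  public interface ExampleMBean {
    int getValue();
  }

  public static class Example implements ExampleMBean {
    @Override
    public int getValue() { return 42; }
  }

  public static void main(String[] args) throws Exception {
    // The extra key/value pairs become part of the bean address,
    // e.g. Hadoop:name=Example,component=ozone
    Hashtable<String, String> props = new Hashtable<>();
    props.put("name", "Example");
    props.put("component", "ozone"); // the additional tag

    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    server.registerMBean(new Example(), new ObjectName("Hadoop", props));

    // Beans sharing the tag can now be queried as a group:
    System.out.println(
        server.queryNames(new ObjectName("Hadoop:component=ozone,*"), null));
  }
}
{code}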



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15339) Support additional key/value properties in JMX bean registration

2018-11-02 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-15339:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Support additional key/value properties in JMX bean registration
> -
>
> Key: HADOOP-15339
> URL: https://issues.apache.org/jira/browse/HADOOP-15339
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HADOOP-15339-branch-3.1.004.patch, 
> HADOOP-15339.001.patch, HADOOP-15339.002.patch, HADOOP-15339.003.patch
>
>
> org.apache.hadoop.metrics2.util.MBeans.register is a utility function to 
> register objects to the JMX registry with a given name prefix and name.
> JMX supports additional key/value pairs which can be part of the address of 
> the JMX bean. For example: 
> _java.lang:type=MemoryManager,name=CodeCacheManager_
> Using such tags we can query a group of MBeans; for example, we can add the 
> same tag to similar MBeans from the namenode and the datanode.
> This patch adds a small modification to support custom key/value pairs and 
> also introduces a new unit test for the MBeans utility, which was missing 
> until now.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15339) Support additional key/value properties in JMX bean registration

2018-10-29 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-15339:
--
Attachment: HADOOP-15339-branch-3.1.004.patch

> Support additional key/value properties in JMX bean registration
> -
>
> Key: HADOOP-15339
> URL: https://issues.apache.org/jira/browse/HADOOP-15339
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HADOOP-15339-branch-3.1.004.patch, 
> HADOOP-15339.001.patch, HADOOP-15339.002.patch, HADOOP-15339.003.patch
>
>
> org.apache.hadoop.metrics2.util.MBeans.register is a utility function to 
> register objects to the JMX registry with a given name prefix and name.
> JMX supports additional key/value pairs which can be part of the address of 
> the JMX bean. For example: 
> _java.lang:type=MemoryManager,name=CodeCacheManager_
> Using such tags we can query a group of MBeans; for example, we can add the 
> same tag to similar MBeans from the namenode and the datanode.
> This patch adds a small modification to support custom key/value pairs and 
> also introduces a new unit test for the MBeans utility, which was missing 
> until now.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15339) Support additional key/value properties in JMX bean registration

2018-10-29 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-15339:
--
Status: Patch Available  (was: Reopened)

The commit could be cherry-picked cleanly, but I re-uploaded the patch to get 
an up-to-date Jenkins response.

> Support additional key/value properties in JMX bean registration
> -
>
> Key: HADOOP-15339
> URL: https://issues.apache.org/jira/browse/HADOOP-15339
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HADOOP-15339-branch-3.1.004.patch, 
> HADOOP-15339.001.patch, HADOOP-15339.002.patch, HADOOP-15339.003.patch
>
>
> org.apache.hadoop.metrics2.util.MBeans.register is a utility function to 
> register objects to the JMX registry with a given name prefix and name.
> JMX supports additional key/value pairs which can be part of the address of 
> the JMX bean. For example: 
> _java.lang:type=MemoryManager,name=CodeCacheManager_
> Using such tags we can query a group of MBeans; for example, we can add the 
> same tag to similar MBeans from the namenode and the datanode.
> This patch adds a small modification to support custom key/value pairs and 
> also introduces a new unit test for the MBeans utility, which was missing 
> until now.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15339) Support additional key/value properties in JMX bean registration

2018-10-29 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-15339:
--
Target Version/s: 3.2.0, 3.1.2  (was: 3.2.0)

> Support additional key/value properties in JMX bean registration
> -
>
> Key: HADOOP-15339
> URL: https://issues.apache.org/jira/browse/HADOOP-15339
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HADOOP-15339-branch-3.1.004.patch, 
> HADOOP-15339.001.patch, HADOOP-15339.002.patch, HADOOP-15339.003.patch
>
>
> org.apache.hadoop.metrics2.util.MBeans.register is a utility function to 
> register objects to the JMX registry with a given name prefix and name.
> JMX supports additional key/value pairs which can be part of the address of 
> the JMX bean. For example: 
> _java.lang:type=MemoryManager,name=CodeCacheManager_
> Using such tags we can query a group of MBeans; for example, we can add the 
> same tag to similar MBeans from the namenode and the datanode.
> This patch adds a small modification to support custom key/value pairs and 
> also introduces a new unit test for the MBeans utility, which was missing 
> until now.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Reopened] (HADOOP-15339) Support additional key/value properties in JMX bean registration

2018-10-29 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton reopened HADOOP-15339:
---

Since the commit, we have been using this change from ozone/hdds and it has 
worked well.

This change is required for a working ozone/hdds web UI, as the shared code 
path tags the common JMX beans with generic key/value tags.

I am reopening this issue and propose backporting it to branch-3.1 to make it 
easier to use hdds/ozone with older Hadoop versions.
 # It's a small change.
 # It's backward compatible.
 # It's safe to use (no issues during the last 6 months).
 # There are no conflicts for the cherry-pick.

 

> Support additional key/value properties in JMX bean registration
> -
>
> Key: HADOOP-15339
> URL: https://issues.apache.org/jira/browse/HADOOP-15339
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HADOOP-15339.001.patch, HADOOP-15339.002.patch, 
> HADOOP-15339.003.patch
>
>
> org.apache.hadoop.metrics2.util.MBeans.register is a utility function to 
> register objects to the JMX registry with a given name prefix and name.
> JMX supports additional key/value pairs which can be part of the address of 
> the JMX bean. For example: 
> _java.lang:type=MemoryManager,name=CodeCacheManager_
> Using such tags we can query a group of MBeans; for example, we can add the 
> same tag to similar MBeans from the namenode and the datanode.
> This patch adds a small modification to support custom key/value pairs and 
> also introduces a new unit test for the MBeans utility, which was missing 
> until now.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15857) Remove ozonefs class name definition from core-default.xml

2018-10-16 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16651730#comment-16651730
 ] 

Elek, Marton commented on HADOOP-15857:
---

Oh, thanks [~jnp], you are right; sorry, I missed it. I uploaded the addendum 
patch and will commit it if there are no objections.

> Remove ozonefs class name definition from core-default.xml
> --
>
> Key: HADOOP-15857
> URL: https://issues.apache.org/jira/browse/HADOOP-15857
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Blocker
> Fix For: 3.2.0
>
> Attachments: HADOOP-15857-branch-3.2.001.patch, 
> HADOOP-15857-branch-3.2.addendum.patch
>
>
> The Ozone file system is being renamed from o3:// to o3fs:// in HDDS-651, 
> but branch-3.2 still contains a reference to o3://.
> The easiest way to fix it is to remove the fs.o3.impl definition from 
> core-default.xml in branch-3.2, since as of HDDS-654 the file system can be 
> registered via the Service Provider Interface (META-INF/services...)
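
For reference, the SPI registration mentioned above works by shipping a 
provider-configuration file inside the filesystem jar, so no fs.*.impl entry 
is needed in core-default.xml. A sketch of such a file follows; the exact 
implementation class name is an assumption here:

{code}
# src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem
org.apache.hadoop.fs.ozone.OzoneFileSystem
{code}

With this resource on the classpath, FileSystem can discover the scheme via 
java.util.ServiceLoader at runtime.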



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15857) Remove ozonefs class name definition from core-default.xml

2018-10-16 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-15857:
--
Attachment: HADOOP-15857-branch-3.2.addendum.patch

> Remove ozonefs class name definition from core-default.xml
> --
>
> Key: HADOOP-15857
> URL: https://issues.apache.org/jira/browse/HADOOP-15857
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Blocker
> Fix For: 3.2.0
>
> Attachments: HADOOP-15857-branch-3.2.001.patch, 
> HADOOP-15857-branch-3.2.addendum.patch
>
>
> The Ozone file system is being renamed from o3:// to o3fs:// in HDDS-651, 
> but branch-3.2 still contains a reference to o3://.
> The easiest way to fix it is to remove the fs.o3.impl definition from 
> core-default.xml in branch-3.2, since as of HDDS-654 the file system can be 
> registered via the Service Provider Interface (META-INF/services...)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15857) Remove ozonefs class name definition from core-default.xml

2018-10-16 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16651695#comment-16651695
 ] 

Elek, Marton commented on HADOOP-15857:
---

Thank you very much, [~sunilg], for including it in the release at the last 
minute...

> Remove ozonefs class name definition from core-default.xml
> --
>
> Key: HADOOP-15857
> URL: https://issues.apache.org/jira/browse/HADOOP-15857
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Blocker
> Fix For: 3.2.0
>
> Attachments: HADOOP-15857-branch-3.2.001.patch
>
>
> The Ozone file system is being renamed from o3:// to o3fs:// in HDDS-651, 
> but branch-3.2 still contains a reference to o3://.
> The easiest way to fix it is to remove the fs.o3.impl definition from 
> core-default.xml in branch-3.2, since as of HDDS-654 the file system can be 
> registered via the Service Provider Interface (META-INF/services...)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15857) Remove ozonefs class name definition from core-default.xml

2018-10-16 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-15857:
--
Status: Patch Available  (was: Open)

> Remove ozonefs class name definition from core-default.xml
> --
>
> Key: HADOOP-15857
> URL: https://issues.apache.org/jira/browse/HADOOP-15857
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HADOOP-15857-branch-3.2.001.patch
>
>
> The Ozone file system is being renamed from o3:// to o3fs:// in HDDS-651, 
> but branch-3.2 still contains a reference to o3://.
> The easiest way to fix it is to remove the fs.o3.impl definition from 
> core-default.xml in branch-3.2, since as of HDDS-654 the file system can be 
> registered via the Service Provider Interface (META-INF/services...)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15857) Remove ozonefs class name definition from core-default.xml

2018-10-16 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-15857:
--
Attachment: HADOOP-15857-branch-3.2.001.patch

> Remove ozonefs class name definition from core-default.xml
> --
>
> Key: HADOOP-15857
> URL: https://issues.apache.org/jira/browse/HADOOP-15857
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HADOOP-15857-branch-3.2.001.patch
>
>
> The Ozone file system is being renamed from o3:// to o3fs:// in HDDS-651, 
> but branch-3.2 still contains a reference to o3://.
> The easiest way to fix it is to remove the fs.o3.impl definition from 
> core-default.xml in branch-3.2, since as of HDDS-654 the file system can be 
> registered via the Service Provider Interface (META-INF/services...)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15857) Remove ozonefs class name definition from core-default.xml

2018-10-16 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15857:
-

 Summary: Remove ozonefs class name definition from core-default.xml
 Key: HADOOP-15857
 URL: https://issues.apache.org/jira/browse/HADOOP-15857
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Reporter: Elek, Marton
Assignee: Elek, Marton


The Ozone file system is being renamed from o3:// to o3fs:// in HDDS-651, but 
branch-3.2 still contains a reference to o3://.

The easiest way to fix it is to remove the fs.o3.impl definition from 
core-default.xml in branch-3.2, since as of HDDS-654 the file system can be 
registered via the Service Provider Interface (META-INF/services...)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15791) Remove Ozone related sources from the 3.2 branch

2018-10-04 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638396#comment-16638396
 ] 

Elek, Marton commented on HADOOP-15791:
---

+1. I tested it again: built it and started an HDFS cluster successfully. 
Should be good enough.

> Remove Ozone related sources from the 3.2 branch
> 
>
> Key: HADOOP-15791
> URL: https://issues.apache.org/jira/browse/HADOOP-15791
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HADOOP-15791-branch-3.2.002.patch, HADOOP-15791.001.patch
>
>
> As discussed in HDDS-341 and written in the original Ozone merge proposal, 
> we can remove all the ozone/hdds projects from the 3.2 release branch.
> {quote}
>  * On trunk (as opposed to release branches) HDSL will be a separate module 
> in Hadoop's source tree. This will enable the HDSL to work on their trunk and 
> the Hadoop trunk without making releases for every change.
>   * Hadoop's trunk will only build HDSL if a non-default profile is enabled.
>   * When Hadoop creates a release branch, the RM will delete the HDSL module 
> from the branch.
> {quote}
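
To illustrate the non-default profile mentioned in the quote, an opt-in build 
of trunk with the Ozone/HDSL modules would look roughly like this; the profile 
name hdds is an assumption based on the Ozone build instructions of the time:

{code}
# default build: the Ozone/HDSL modules are skipped
mvn clean package -DskipTests
# opt-in build with the non-default profile enabled (profile name assumed)
mvn clean package -DskipTests -Phdds
{code}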



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15791) Remove Ozone related sources from the 3.2 branch

2018-10-04 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16637838#comment-16637838
 ] 

Elek, Marton commented on HADOOP-15791:
---

I think this is a limitation of Yetus: it first collects the required modules 
to build and only afterwards applies the patch, and I just deleted a lot of 
projects.

I think it's safe to ignore all of the ./hadoop-ozone and ./hadoop-hdds 
findbugs/asf errors; just make sure that you have no hadoop-ozone and 
hadoop-hdds directories after applying the patch.

> Remove Ozone related sources from the 3.2 branch
> 
>
> Key: HADOOP-15791
> URL: https://issues.apache.org/jira/browse/HADOOP-15791
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HADOOP-15791-branch-3.2.002.patch, HADOOP-15791.001.patch
>
>
> As discussed in HDDS-341 and written in the original Ozone merge proposal, 
> we can remove all the ozone/hdds projects from the 3.2 release branch.
> {quote}
>  * On trunk (as opposed to release branches) HDSL will be a separate module 
> in Hadoop's source tree. This will enable the HDSL to work on their trunk and 
> the Hadoop trunk without making releases for every change.
>   * Hadoop's trunk will only build HDSL if a non-default profile is enabled.
>   * When Hadoop creates a release branch, the RM will delete the HDSL module 
> from the branch.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15791) Remove Ozone related sources from the 3.2 branch

2018-10-02 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636262#comment-16636262
 ] 

Elek, Marton commented on HADOOP-15791:
---

Rebased on top of the fresh branch-3.2...

> Remove Ozone related sources from the 3.2 branch
> 
>
> Key: HADOOP-15791
> URL: https://issues.apache.org/jira/browse/HADOOP-15791
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HADOOP-15791-branch-3.2.002.patch, HADOOP-15791.001.patch
>
>
> As discussed in HDDS-341 and written in the original Ozone merge proposal, 
> we can remove all the ozone/hdds projects from the 3.2 release branch.
> {quote}
>  * On trunk (as opposed to release branches) HDSL will be a separate module 
> in Hadoop's source tree. This will enable the HDSL to work on their trunk and 
> the Hadoop trunk without making releases for every change.
>   * Hadoop's trunk will only build HDSL if a non-default profile is enabled.
>   * When Hadoop creates a release branch, the RM will delete the HDSL module 
> from the branch.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15791) Remove Ozone related sources from the 3.2 branch

2018-10-02 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-15791:
--
Attachment: HADOOP-15791-branch-3.2.002.patch

> Remove Ozone related sources from the 3.2 branch
> 
>
> Key: HADOOP-15791
> URL: https://issues.apache.org/jira/browse/HADOOP-15791
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HADOOP-15791-branch-3.2.002.patch, HADOOP-15791.001.patch
>
>
> As discussed in HDDS-341 and written in the original Ozone merge proposal, 
> we can remove all the ozone/hdds projects from the 3.2 release branch.
> {quote}
>  * On trunk (as opposed to release branches) HDSL will be a separate module 
> in Hadoop's source tree. This will enable the HDSL to work on their trunk and 
> the Hadoop trunk without making releases for every change.
>   * Hadoop's trunk will only build HDSL if a non-default profile is enabled.
>   * When Hadoop creates a release branch, the RM will delete the HDSL module 
> from the branch.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15805) Hadoop logo not shown correctly in old site

2018-10-02 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-15805:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Just committed to the asf-site branch of the hadoop-site repository. The URL 
is working again: https://hadoop.apache.org/images/hadoop-logo.jpg

Thanks [~linyiqun] for the report and [~Sandeep Nemuri] for the patch.

> Hadoop logo not shown correctly in old site
> 
>
> Key: HADOOP-15805
> URL: https://issues.apache.org/jira/browse/HADOOP-15805
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.1.1
>Reporter: Yiqun Lin
>Assignee: Sandeep Nemuri
>Priority: Major
> Attachments: Error-page.jpg, HADOOP-15805.001.patch
>
>
> The Hadoop logo is not shown correctly on the old site. The old site pages 
> use the address {{[http://hadoop.apache.org/images/hadoop-logo.jpg]}} to 
> show the Hadoop logo, but this address is outdated. The right address now is 
> [http://hadoop.apache.org/hadoop-logo.jpg].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15805) Hadoop logo not shown correctly in old site

2018-10-02 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635385#comment-16635385
 ] 

Elek, Marton commented on HADOOP-15805:
---

+1. Let's try to be backward compatible when possible; there is no cost to 
keeping the image in the old location.


> Hadoop logo not shown correctly in old site
> 
>
> Key: HADOOP-15805
> URL: https://issues.apache.org/jira/browse/HADOOP-15805
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.1.1
>Reporter: Yiqun Lin
>Assignee: Sandeep Nemuri
>Priority: Major
> Attachments: Error-page.jpg, HADOOP-15805.001.patch
>
>
> The Hadoop logo is not shown correctly on the old site. The old site pages 
> use the address {{[http://hadoop.apache.org/images/hadoop-logo.jpg]}} to 
> show the Hadoop logo, but this address is outdated. The right address now is 
> [http://hadoop.apache.org/hadoop-logo.jpg].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15805) Hadoop logo not shown correctly in old site

2018-10-01 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16633965#comment-16633965
 ] 

Elek, Marton commented on HADOOP-15805:
---

Unfortunately, I can't give you the permission; [~anu] or [~arpiagariu] can 
do it (AFAIK).

> Hadoop logo not shown correctly in old site
> 
>
> Key: HADOOP-15805
> URL: https://issues.apache.org/jira/browse/HADOOP-15805
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.1.1
>Reporter: Yiqun Lin
>Priority: Major
> Attachments: Error-page.jpg
>
>
> The Hadoop logo is not shown correctly on the old site. The old site pages 
> use the address {{[http://hadoop.apache.org/images/hadoop-logo.jpg]}} to 
> show the Hadoop logo, but this address is outdated. The right address now is 
> [http://hadoop.apache.org/hadoop-logo.jpg].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15805) Hadoop logo not shown correctly in old site

2018-09-30 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16633380#comment-16633380
 ] 

Elek, Marton commented on HADOOP-15805:
---

Sure, let's copy/move the logo to the right location inside static/ to get 
exactly the same URL as before.

[~Sandeep Nemuri], are you working on a patch? I can review/commit it if you 
have one...

> Hadoop logo not shown correctly in old site
> 
>
> Key: HADOOP-15805
> URL: https://issues.apache.org/jira/browse/HADOOP-15805
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.1.1
>Reporter: Yiqun Lin
>Priority: Major
> Attachments: Error-page.jpg
>
>
> The Hadoop logo is not shown correctly on the old site. The old site pages 
> use the address {{[http://hadoop.apache.org/images/hadoop-logo.jpg]}} to 
> show the Hadoop logo, but this address is outdated. The right address now is 
> [http://hadoop.apache.org/hadoop-logo.jpg].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15791) Remove Ozone related sources from the 3.2 branch

2018-09-26 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629633#comment-16629633
 ] 

Elek, Marton commented on HADOOP-15791:
---

Prepared the patch, but please commit it only after the 3.2 branch cut (and 
don't commit it to trunk).

I will put it into the Patch Available state to get Jenkins feedback.

I successfully started an HDFS cluster from the distribution, so the 
classpath and shell scripts should be fine... 


> Remove Ozone related sources from the 3.2 branch
> 
>
> Key: HADOOP-15791
> URL: https://issues.apache.org/jira/browse/HADOOP-15791
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HADOOP-15791.001.patch
>
>
> As discussed in HDDS-341 and written in the original Ozone merge proposal, 
> we can remove all the ozone/hdds projects from the 3.2 release branch.
> {quote}
>  * On trunk (as opposed to release branches) HDSL will be a separate module 
> in Hadoop's source tree. This will enable the HDSL to work on their trunk and 
> the Hadoop trunk without making releases for every change.
>   * Hadoop's trunk will only build HDSL if a non-default profile is enabled.
>   * When Hadoop creates a release branch, the RM will delete the HDSL module 
> from the branch.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15791) Remove Ozone related sources from the 3.2 branch

2018-09-26 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-15791:
--
Status: Patch Available  (was: Open)

> Remove Ozone related sources from the 3.2 branch
> 
>
> Key: HADOOP-15791
> URL: https://issues.apache.org/jira/browse/HADOOP-15791
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HADOOP-15791.001.patch
>
>
> As discussed in HDDS-341 and written in the original Ozone merge proposal, 
> we can remove all the ozone/hdds projects from the 3.2 release branch.
> {quote}
>  * On trunk (as opposed to release branches) HDSL will be a separate module 
> in Hadoop's source tree. This will enable the HDSL to work on their trunk and 
> the Hadoop trunk without making releases for every change.
>   * Hadoop's trunk will only build HDSL if a non-default profile is enabled.
>   * When Hadoop creates a release branch, the RM will delete the HDSL module 
> from the branch.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15791) Remove Ozone related sources from the 3.2 branch

2018-09-26 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-15791:
--
Attachment: HADOOP-15791.001.patch

> Remove Ozone related sources from the 3.2 branch
> 
>
> Key: HADOOP-15791
> URL: https://issues.apache.org/jira/browse/HADOOP-15791
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HADOOP-15791.001.patch
>
>
> As discussed in HDDS-341 and written in the original Ozone merge proposal, 
> we can remove all the ozone/hdds projects from the 3.2 release branch.
> {quote}
>  * On trunk (as opposed to release branches) HDSL will be a separate module 
> in Hadoop's source tree. This will enable the HDSL to work on their trunk and 
> the Hadoop trunk without making releases for every change.
>   * Hadoop's trunk will only build HDSL if a non-default profile is enabled.
>   * When Hadoop creates a release branch, the RM will delete the HDSL module 
> from the branch.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15791) Remove Ozone related sources from the 3.2 branch

2018-09-25 Thread Elek, Marton (JIRA)
Elek, Marton created HADOOP-15791:
-

 Summary: Remove Ozone related sources from the 3.2 branch
 Key: HADOOP-15791
 URL: https://issues.apache.org/jira/browse/HADOOP-15791
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Elek, Marton
Assignee: Elek, Marton


As discussed in HDDS-341 and written in the original Ozone merge proposal, we 
can remove all the ozone/hdds projects from the 3.2 release branch.

{quote}
 * On trunk (as opposed to release branches) HDSL will be a separate module in 
Hadoop's source tree. This will enable the HDSL to work on their trunk and the 
Hadoop trunk without making releases for every change.
  * Hadoop's trunk will only build HDSL if a non-default profile is enabled.
  * When Hadoop creates a release branch, the RM will delete the HDSL module 
from the branch.
{quote}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14163) Refactor existing hadoop site to use more usable static website generator

2018-09-19 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621540#comment-16621540
 ] 

Elek, Marton commented on HADOOP-14163:
---

Sorry for the inconvenience, [~djp]. I didn't notice this earlier.

Please try out the following command:

{code}
git clone https://gitbox.apache.org/repos/asf/hadoop-site.git -b asf-site
{code}

(The asf-site branch is used and we have no master branch, which could be the 
problem.)

> Refactor existing hadoop site to use more usable static website generator
> -
>
> Key: HADOOP-14163
> URL: https://issues.apache.org/jira/browse/HADOOP-14163
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: site
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HADOOP-14163-001.zip, HADOOP-14163-002.zip, 
> HADOOP-14163-003.zip, HADOOP-14163.004.patch, HADOOP-14163.005.patch, 
> HADOOP-14163.006.patch, HADOOP-14163.007.patch, HADOOP-14163.008.tar.gz, 
> HADOOP-14163.009.patch, HADOOP-14163.009.tar.gz, HADOOP-14163.010.tar.gz, 
> hadoop-site.tar.gz, hadop-site-rendered.tar.gz
>
>
> From the dev mailing list:
> "Publishing can be attacked via a mix of scripting and revamping the darned 
> website. Forrest is pretty bad compared to the newer static site generators 
> out there (e.g. need to write XML instead of markdown, it's hard to review a 
> staging site because of all the absolute links, hard to customize, did I 
> mention XML?), and the look and feel of the site is from the 00s. We don't 
> actually have that much site content, so it should be possible to migrate to 
> a new system."
> This issue is to find a solution for migrating the old site to a modern 
> static site generator using a more contemporary theme.
> Goals: 
>  * existing links should work (or at least be redirected)
>  * it should be easy to automatically add more content required by a release 
> (most probably by creating separate markdown files)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14164) Update the skin of maven-site during doc generation

2018-09-18 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HADOOP-14164:
--
Labels: newbie  (was: )

> Update the skin of maven-site during doc generation
> ---
>
> Key: HADOOP-14164
> URL: https://issues.apache.org/jira/browse/HADOOP-14164
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build, documentation
>Reporter: Elek, Marton
>Priority: Major
>  Labels: newbie
>
> Together with the improvements to the Hadoop site (HADOOP-14163), I suggest 
> improving the theme used by the maven-site plugin for all the Hadoop 
> documentation.
> One possible option is using the Reflow skin:
> http://andriusvelykis.github.io/reflow-maven-skin/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org


