[jira] [Reopened] (HADOOP-15821) Move Hadoop YARN Registry to Hadoop Registry

2018-10-20 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang reopened HADOOP-15821:


> Move Hadoop YARN Registry to Hadoop Registry
> 
>
> Key: HADOOP-15821
> URL: https://issues.apache.org/jira/browse/HADOOP-15821
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-15821.000.patch, HADOOP-15821.001.patch, 
> HADOOP-15821.002.patch, HADOOP-15821.003.patch, HADOOP-15821.004.patch, 
> HADOOP-15821.005.patch, HADOOP-15821.006.patch, HADOOP-15821.007.patch, 
> HADOOP-15821.008.patch, HADOOP-15821.009.patch
>
>
> Currently, Hadoop YARN Registry is in YARN. However, this can be used by 
> other parts of the project (e.g., HDFS). In addition, it does not have any 
> real dependency on YARN.
> We should move it into commons and make it Hadoop Registry.






[jira] [Commented] (HADOOP-15821) Move Hadoop YARN Registry to Hadoop Registry

2018-10-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657671#comment-16657671
 ] 

Eric Yang commented on HADOOP-15821:


[~elgoiri] Yes, it needs the relativePath tag for correctness.  
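
For reference, a minimal sketch of what that looks like in a module pom (the 
parent coordinates are from the project; the relative path value below is 
illustrative, not the exact value from the patch):

{code:xml}
<parent>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-project</artifactId>
  <version>3.3.0-SNAPSHOT</version>
  <!-- Without relativePath, Maven resolves the parent from the repository
       rather than the local checkout, which can pick up a stale parent pom. -->
  <relativePath>../../hadoop-project</relativePath>
</parent>
{code}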

> Move Hadoop YARN Registry to Hadoop Registry
> 
>
> Key: HADOOP-15821
> URL: https://issues.apache.org/jira/browse/HADOOP-15821
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-15821.000.patch, HADOOP-15821.001.patch, 
> HADOOP-15821.002.patch, HADOOP-15821.003.patch, HADOOP-15821.004.patch, 
> HADOOP-15821.005.patch, HADOOP-15821.006.patch, HADOOP-15821.007.patch, 
> HADOOP-15821.008.patch, HADOOP-15821.009.patch
>
>
> Currently, Hadoop YARN Registry is in YARN. However, this can be used by 
> other parts of the project (e.g., HDFS). In addition, it does not have any 
> real dependency on YARN.
> We should move it into commons and make it Hadoop Registry.






[jira] [Commented] (HADOOP-15821) Move Hadoop YARN Registry to Hadoop Registry

2018-10-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657581#comment-16657581
 ] 

Eric Yang commented on HADOOP-15821:


+1 

[~elgoiri] Thank you for the patch.  Patch 9 works as intended.  The unit test 
failures don't appear to be related to this patch.  I will commit this to 
trunk shortly.

> Move Hadoop YARN Registry to Hadoop Registry
> 
>
> Key: HADOOP-15821
> URL: https://issues.apache.org/jira/browse/HADOOP-15821
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HADOOP-15821.000.patch, HADOOP-15821.001.patch, 
> HADOOP-15821.002.patch, HADOOP-15821.003.patch, HADOOP-15821.004.patch, 
> HADOOP-15821.005.patch, HADOOP-15821.006.patch, HADOOP-15821.007.patch, 
> HADOOP-15821.008.patch, HADOOP-15821.009.patch
>
>
> Currently, Hadoop YARN Registry is in YARN. However, this can be used by 
> other parts of the project (e.g., HDFS). In addition, it does not have any 
> real dependency on YARN.
> We should move it into commons and make it Hadoop Registry.






[jira] [Updated] (HADOOP-15821) Move Hadoop YARN Registry to Hadoop Registry

2018-10-19 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated HADOOP-15821:
---
Target Version/s: 3.3.0
   Fix Version/s: 3.3.0

> Move Hadoop YARN Registry to Hadoop Registry
> 
>
> Key: HADOOP-15821
> URL: https://issues.apache.org/jira/browse/HADOOP-15821
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-15821.000.patch, HADOOP-15821.001.patch, 
> HADOOP-15821.002.patch, HADOOP-15821.003.patch, HADOOP-15821.004.patch, 
> HADOOP-15821.005.patch, HADOOP-15821.006.patch, HADOOP-15821.007.patch, 
> HADOOP-15821.008.patch, HADOOP-15821.009.patch
>
>
> Currently, Hadoop YARN Registry is in YARN. However, this can be used by 
> other parts of the project (e.g., HDFS). In addition, it does not have any 
> real dependency on YARN.
> We should move it into commons and make it Hadoop Registry.






[jira] [Updated] (HADOOP-15821) Move Hadoop YARN Registry to Hadoop Registry

2018-10-19 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated HADOOP-15821:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Move Hadoop YARN Registry to Hadoop Registry
> 
>
> Key: HADOOP-15821
> URL: https://issues.apache.org/jira/browse/HADOOP-15821
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HADOOP-15821.000.patch, HADOOP-15821.001.patch, 
> HADOOP-15821.002.patch, HADOOP-15821.003.patch, HADOOP-15821.004.patch, 
> HADOOP-15821.005.patch, HADOOP-15821.006.patch, HADOOP-15821.007.patch, 
> HADOOP-15821.008.patch, HADOOP-15821.009.patch
>
>
> Currently, Hadoop YARN Registry is in YARN. However, this can be used by 
> other parts of the project (e.g., HDFS). In addition, it does not have any 
> real dependency on YARN.
> We should move it into commons and make it Hadoop Registry.






[jira] [Comment Edited] (HADOOP-15832) Upgrade BouncyCastle to 1.60

2018-10-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655708#comment-16655708
 ] 

Eric Yang edited comment on HADOOP-15832 at 10/18/18 6:26 PM:
--

Looking at the output of mvn dependency:tree:

{code}
[INFO] +- org.apache.hadoop:hadoop-minicluster:jar:3.3.0-SNAPSHOT:test
[INFO] |  +- org.apache.hadoop:hadoop-hdfs:test-jar:tests:3.3.0-SNAPSHOT:test
[INFO] |  +- org.apache.hadoop:hadoop-yarn-server-tests:test-jar:tests:3.3.0-SNAPSHOT:test
[INFO] |  |  +- org.apache.hadoop:hadoop-yarn-server-nodemanager:jar:3.3.0-SNAPSHOT:test
[INFO] |  |  +- org.apache.hadoop:hadoop-yarn-server-resourcemanager:jar:3.3.0-SNAPSHOT:test
[INFO] |  |  |  \- org.apache.hadoop:hadoop-yarn-server-applicationhistoryservice:jar:3.3.0-SNAPSHOT:test
[INFO] |  |  |     +- org.objenesis:objenesis:jar:1.0:test
[INFO] |  |  |     \- de.ruedigermoeller:fst:jar:2.50:test
[INFO] |  |  |        \- com.cedarsoftware:java-util:jar:1.9.0:test
[INFO] |  |  |           \- com.cedarsoftware:json-io:jar:2.5.1:test
[INFO] |  |  \- org.apache.hadoop:hadoop-yarn-server-timelineservice:jar:3.3.0-SNAPSHOT:test
[INFO] |  |     \- org.apache.commons:commons-csv:jar:1.0:test
[INFO] |  +- org.apache.hadoop:hadoop-mapreduce-client-jobclient:test-jar:tests:3.3.0-SNAPSHOT:test
[INFO] |  |  \- org.apache.hadoop:hadoop-mapreduce-client-common:jar:3.3.0-SNAPSHOT:test
[INFO] |  +- org.apache.hadoop:hadoop-mapreduce-client-app:jar:3.3.0-SNAPSHOT:test
[INFO] |  |  +- org.apache.hadoop:hadoop-yarn-server-web-proxy:jar:3.3.0-SNAPSHOT:test
[INFO] |  |  \- org.apache.hadoop:hadoop-mapreduce-client-shuffle:jar:3.3.0-SNAPSHOT:test
[INFO] |  +- org.apache.hadoop:hadoop-mapreduce-client-core:jar:3.3.0-SNAPSHOT:test
[INFO] |  +- org.apache.hadoop:hadoop-mapreduce-client-jobclient:jar:3.3.0-SNAPSHOT:test
[INFO] |  \- org.apache.hadoop:hadoop-mapreduce-client-hs:jar:3.3.0-SNAPSHOT:test
{code}

The hadoop-yarn-server-web-proxy module is included by 
hadoop-mapreduce-client-app, but the bouncycastle jar files are excluded in 
the hadoop-yarn-server-web-proxy project.  If another project depends on the 
minicluster, it will bring in the client jars as well as the minicluster jar 
file AND the yarn-server-web-proxy jar file.  This causes unit tests that 
depend on the minicluster to reference the non-shaded version of the 
hadoop-yarn-server-web-proxy classes while the transitive dependencies are 
missing.
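
A hedged sketch of the kind of dependency wiring that produces this situation 
(the artifact coordinates come from the tree above; the exact exclusion in the 
real poms is an assumption):

{code:xml}
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-yarn-server-web-proxy</artifactId>
  <exclusions>
    <!-- bouncycastle is excluded here, so a consumer of the minicluster
         sees the non-shaded web-proxy classes without their bouncycastle
         transitive dependencies. -->
    <exclusion>
      <groupId>org.bouncycastle</groupId>
      <artifactId>*</artifactId>
    </exclusion>
  </exclusions>
</dependency>
{code}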


was (Author: eyang):
Looking at the output of mvn dependency:tree:

{code}
[INFO] +- org.apache.hadoop:hadoop-minicluster:jar:3.3.0-SNAPSHOT:test
[INFO] |  +- org.apache.hadoop:hadoop-hdfs:test-jar:tests:3.3.0-SNAPSHOT:test
[INFO] |  +- org.apache.hadoop:hadoop-yarn-server-tests:test-jar:tests:3.3.0-SNAPSHOT:test
[INFO] |  |  +- org.apache.hadoop:hadoop-yarn-server-nodemanager:jar:3.3.0-SNAPSHOT:test
[INFO] |  |  +- org.apache.hadoop:hadoop-yarn-server-resourcemanager:jar:3.3.0-SNAPSHOT:test
[INFO] |  |  |  \- org.apache.hadoop:hadoop-yarn-server-applicationhistoryservice:jar:3.3.0-SNAPSHOT:test
[INFO] |  |  |     +- org.objenesis:objenesis:jar:1.0:test
[INFO] |  |  |     \- de.ruedigermoeller:fst:jar:2.50:test
[INFO] |  |  |        \- com.cedarsoftware:java-util:jar:1.9.0:test
[INFO] |  |  |           \- com.cedarsoftware:json-io:jar:2.5.1:test
[INFO] |  |  \- org.apache.hadoop:hadoop-yarn-server-timelineservice:jar:3.3.0-SNAPSHOT:test
[INFO] |  |     \- org.apache.commons:commons-csv:jar:1.0:test
[INFO] |  +- org.apache.hadoop:hadoop-mapreduce-client-jobclient:test-jar:tests:3.3.0-SNAPSHOT:test
[INFO] |  |  \- org.apache.hadoop:hadoop-mapreduce-client-common:jar:3.3.0-SNAPSHOT:test
[INFO] |  +- org.apache.hadoop:hadoop-mapreduce-client-app:jar:3.3.0-SNAPSHOT:test
[INFO] |  |  +- org.apache.hadoop:hadoop-yarn-server-web-proxy:jar:3.3.0-SNAPSHOT:test
[INFO] |  |  \- org.apache.hadoop:hadoop-mapreduce-client-shuffle:jar:3.3.0-SNAPSHOT:test
[INFO] |  +- org.apache.hadoop:hadoop-mapreduce-client-core:jar:3.3.0-SNAPSHOT:test
[INFO] |  +- org.apache.hadoop:hadoop-mapreduce-client-jobclient:jar:3.3.0-SNAPSHOT:test
[INFO] |  \- org.apache.hadoop:hadoop-mapreduce-client-hs:jar:3.3.0-SNAPSHOT:test
{code}

The hadoop-yarn-server-web-proxy module is included by 
hadoop-mapreduce-client-app, but the bouncycastle jar files are excluded in 
the hadoop-yarn-server-web-proxy project.  If another project depends on the 
minicluster, it will bring in the client jars as well as the minicluster jar 
file AND the yarn-server-web-proxy jar file.  This causes unit tests that 
depend on the minicluster to reference the non-shaded version of the 
hadoop-yarn-server-web-proxy classes while the transitive dependencies are 
missing.

> Upgrade BouncyCastle to 1.60
> 
>
> Key: HADOOP-15832
> URL: https://issues.apache.org/jira/browse/HADOOP-15832
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.3.0
>

[jira] [Commented] (HADOOP-15832) Upgrade BouncyCastle to 1.60

2018-10-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655708#comment-16655708
 ] 

Eric Yang commented on HADOOP-15832:


Looking at the output of mvn dependency:tree:

{code}
[INFO] +- org.apache.hadoop:hadoop-minicluster:jar:3.3.0-SNAPSHOT:test
[INFO] |  +- org.apache.hadoop:hadoop-hdfs:test-jar:tests:3.3.0-SNAPSHOT:test
[INFO] |  +- org.apache.hadoop:hadoop-yarn-server-tests:test-jar:tests:3.3.0-SNAPSHOT:test
[INFO] |  |  +- org.apache.hadoop:hadoop-yarn-server-nodemanager:jar:3.3.0-SNAPSHOT:test
[INFO] |  |  +- org.apache.hadoop:hadoop-yarn-server-resourcemanager:jar:3.3.0-SNAPSHOT:test
[INFO] |  |  |  \- org.apache.hadoop:hadoop-yarn-server-applicationhistoryservice:jar:3.3.0-SNAPSHOT:test
[INFO] |  |  |     +- org.objenesis:objenesis:jar:1.0:test
[INFO] |  |  |     \- de.ruedigermoeller:fst:jar:2.50:test
[INFO] |  |  |        \- com.cedarsoftware:java-util:jar:1.9.0:test
[INFO] |  |  |           \- com.cedarsoftware:json-io:jar:2.5.1:test
[INFO] |  |  \- org.apache.hadoop:hadoop-yarn-server-timelineservice:jar:3.3.0-SNAPSHOT:test
[INFO] |  |     \- org.apache.commons:commons-csv:jar:1.0:test
[INFO] |  +- org.apache.hadoop:hadoop-mapreduce-client-jobclient:test-jar:tests:3.3.0-SNAPSHOT:test
[INFO] |  |  \- org.apache.hadoop:hadoop-mapreduce-client-common:jar:3.3.0-SNAPSHOT:test
[INFO] |  +- org.apache.hadoop:hadoop-mapreduce-client-app:jar:3.3.0-SNAPSHOT:test
[INFO] |  |  +- org.apache.hadoop:hadoop-yarn-server-web-proxy:jar:3.3.0-SNAPSHOT:test
[INFO] |  |  \- org.apache.hadoop:hadoop-mapreduce-client-shuffle:jar:3.3.0-SNAPSHOT:test
[INFO] |  +- org.apache.hadoop:hadoop-mapreduce-client-core:jar:3.3.0-SNAPSHOT:test
[INFO] |  +- org.apache.hadoop:hadoop-mapreduce-client-jobclient:jar:3.3.0-SNAPSHOT:test
[INFO] |  \- org.apache.hadoop:hadoop-mapreduce-client-hs:jar:3.3.0-SNAPSHOT:test
{code}

The hadoop-yarn-server-web-proxy module is included by 
hadoop-mapreduce-client-app, but the bouncycastle jar files are excluded in 
the hadoop-yarn-server-web-proxy project.  If another project depends on the 
minicluster, it will bring in the client jars as well as the minicluster jar 
file AND the yarn-server-web-proxy jar file.  This causes unit tests that 
depend on the minicluster to reference the non-shaded version of the 
hadoop-yarn-server-web-proxy classes while the transitive dependencies are 
missing.

> Upgrade BouncyCastle to 1.60
> 
>
> Key: HADOOP-15832
> URL: https://issues.apache.org/jira/browse/HADOOP-15832
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.3.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-15832.001.patch, HADOOP-15832.addendum.patch
>
>
> As part of my work on YARN-6586, I noticed that we're using a very old 
> version of BouncyCastle:
> {code:xml}
> <dependency>
>   <groupId>org.bouncycastle</groupId>
>   <artifactId>bcprov-jdk16</artifactId>
>   <version>1.46</version>
>   <scope>test</scope>
> </dependency>
> {code}
> The *-jdk16 artifacts have been discontinued and are not recommended (see 
> [http://bouncy-castle.1462172.n4.nabble.com/Bouncycaslte-bcprov-jdk15-vs-bcprov-jdk16-td4656252.html]).
>  
>  In particular, the newest release, 1.46, is from 2011! 
>  [https://mvnrepository.com/artifact/org.bouncycastle/bcprov-jdk16]
> The currently maintained and recommended artifacts are *-jdk15on:
>  [https://www.bouncycastle.org/latest_releases.html]
>  They're currently on version 1.60, released only a few months ago.
> We should update BouncyCastle to the *-jdk15on artifacts and the 1.60 
> release. It's currently a test-only artifact, so there should be no 
> backwards-compatibility issues with updating this. It's also needed for 
> YARN-6586, where we'll actually be shipping it.






[jira] [Comment Edited] (HADOOP-15832) Upgrade BouncyCastle to 1.60

2018-10-17 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654307#comment-16654307
 ] 

Eric Yang edited comment on HADOOP-15832 at 10/17/18 11:33 PM:
---

[~rkanter], MiniYARNCluster is the one failing.

In 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api
 project, run:

{code}
mvn clean test -Dtest=TestCleanupAfterKill
{code}

I tried to include the bouncycastle dependency in the shade-plugin:

{code}
org.bouncycastle:*
{code}

So far, nothing works.  This may be related to YARN-8448, where the dependency 
was introduced yesterday.
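
For context, that include pattern would sit in the shade-plugin's artifactSet; 
a minimal sketch (only the org.bouncycastle:* pattern comes from the attempt 
above, the surrounding plugin wiring is an assumption):

{code:xml}
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <artifactSet>
      <includes>
        <!-- pull all bouncycastle artifacts into the shaded jar -->
        <include>org.bouncycastle:*</include>
      </includes>
    </artifactSet>
  </configuration>
</plugin>
{code}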


was (Author: eyang):
[~rkanter], MiniYARNCluster is the one failing.

In 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api
 project, run:

{code}
mvn clean test -Dtest=TestCleanupAfterKill
{code}

I tried to include the bouncycastle dependency and removed the shade-plugin:

{code}
org.bouncycastle:*
{code}

So far, nothing works.  This may be related to YARN-8448, where the dependency 
was introduced yesterday.

> Upgrade BouncyCastle to 1.60
> 
>
> Key: HADOOP-15832
> URL: https://issues.apache.org/jira/browse/HADOOP-15832
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.3.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-15832.001.patch, HADOOP-15832.addendum.patch
>
>
> As part of my work on YARN-6586, I noticed that we're using a very old 
> version of BouncyCastle:
> {code:xml}
> <dependency>
>   <groupId>org.bouncycastle</groupId>
>   <artifactId>bcprov-jdk16</artifactId>
>   <version>1.46</version>
>   <scope>test</scope>
> </dependency>
> {code}
> The *-jdk16 artifacts have been discontinued and are not recommended (see 
> [http://bouncy-castle.1462172.n4.nabble.com/Bouncycaslte-bcprov-jdk15-vs-bcprov-jdk16-td4656252.html]).
>  
>  In particular, the newest release, 1.46, is from 2011! 
>  [https://mvnrepository.com/artifact/org.bouncycastle/bcprov-jdk16]
> The currently maintained and recommended artifacts are *-jdk15on:
>  [https://www.bouncycastle.org/latest_releases.html]
>  They're currently on version 1.60, released only a few months ago.
> We should update BouncyCastle to the *-jdk15on artifacts and the 1.60 
> release. It's currently a test-only artifact, so there should be no 
> backwards-compatibility issues with updating this. It's also needed for 
> YARN-6586, where we'll actually be shipping it.






[jira] [Comment Edited] (HADOOP-15832) Upgrade BouncyCastle to 1.60

2018-10-17 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654307#comment-16654307
 ] 

Eric Yang edited comment on HADOOP-15832 at 10/17/18 10:32 PM:
---

[~rkanter], MiniYARNCluster is the one failing.

In 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api
 project, run:

{code}
mvn clean test -Dtest=TestCleanupAfterKill
{code}

I tried to include the bouncycastle dependency and removed the shade-plugin:

{code}
org.bouncycastle:*
{code}

So far, nothing works.  This may be related to YARN-8448, where the dependency 
was introduced yesterday.


was (Author: eyang):
[~rkanter], MiniYARNCluster is the one failing.

In 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api
 project, run:

{code}
mvn clean test -Dtest=TestCleanupAfterKill
{code}

I tried to include the bouncycastle dependency and removed the shade-plugin:

{code}
org.bouncycastle:bcprov-jdk15on
{code}

So far, nothing works.  This may be related to YARN-8448, where the dependency 
was introduced yesterday.

> Upgrade BouncyCastle to 1.60
> 
>
> Key: HADOOP-15832
> URL: https://issues.apache.org/jira/browse/HADOOP-15832
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.3.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-15832.001.patch, HADOOP-15832.addendum.patch
>
>
> As part of my work on YARN-6586, I noticed that we're using a very old 
> version of BouncyCastle:
> {code:xml}
> <dependency>
>   <groupId>org.bouncycastle</groupId>
>   <artifactId>bcprov-jdk16</artifactId>
>   <version>1.46</version>
>   <scope>test</scope>
> </dependency>
> {code}
> The *-jdk16 artifacts have been discontinued and are not recommended (see 
> [http://bouncy-castle.1462172.n4.nabble.com/Bouncycaslte-bcprov-jdk15-vs-bcprov-jdk16-td4656252.html]).
>  
>  In particular, the newest release, 1.46, is from 2011! 
>  [https://mvnrepository.com/artifact/org.bouncycastle/bcprov-jdk16]
> The currently maintained and recommended artifacts are *-jdk15on:
>  [https://www.bouncycastle.org/latest_releases.html]
>  They're currently on version 1.60, released only a few months ago.
> We should update BouncyCastle to the *-jdk15on artifacts and the 1.60 
> release. It's currently a test-only artifact, so there should be no 
> backwards-compatibility issues with updating this. It's also needed for 
> YARN-6586, where we'll actually be shipping it.






[jira] [Commented] (HADOOP-15832) Upgrade BouncyCastle to 1.60

2018-10-17 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654307#comment-16654307
 ] 

Eric Yang commented on HADOOP-15832:


[~rkanter], MiniYARNCluster is the one failing.

In 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api
 project, run:

{code}
mvn clean test -Dtest=TestCleanupAfterKill
{code}

I tried to include the bouncycastle dependency and removed the shade-plugin:

{code}
org.bouncycastle:bcprov-jdk15on
{code}

So far, nothing works.  This may be related to YARN-8448, where the dependency 
was introduced yesterday.

> Upgrade BouncyCastle to 1.60
> 
>
> Key: HADOOP-15832
> URL: https://issues.apache.org/jira/browse/HADOOP-15832
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.3.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-15832.001.patch, HADOOP-15832.addendum.patch
>
>
> As part of my work on YARN-6586, I noticed that we're using a very old 
> version of BouncyCastle:
> {code:xml}
> <dependency>
>   <groupId>org.bouncycastle</groupId>
>   <artifactId>bcprov-jdk16</artifactId>
>   <version>1.46</version>
>   <scope>test</scope>
> </dependency>
> {code}
> The *-jdk16 artifacts have been discontinued and are not recommended (see 
> [http://bouncy-castle.1462172.n4.nabble.com/Bouncycaslte-bcprov-jdk15-vs-bcprov-jdk16-td4656252.html]).
>  
>  In particular, the newest release, 1.46, is from 2011! 
>  [https://mvnrepository.com/artifact/org.bouncycastle/bcprov-jdk16]
> The currently maintained and recommended artifacts are *-jdk15on:
>  [https://www.bouncycastle.org/latest_releases.html]
>  They're currently on version 1.60, released only a few months ago.
> We should update BouncyCastle to the *-jdk15on artifacts and the 1.60 
> release. It's currently a test-only artifact, so there should be no 
> backwards-compatibility issues with updating this. It's also needed for 
> YARN-6586, where we'll actually be shipping it.






[jira] [Commented] (HADOOP-15832) Upgrade BouncyCastle to 1.60

2018-10-17 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654265#comment-16654265
 ] 

Eric Yang commented on HADOOP-15832:


[~rkanter] The minicluster is throwing a ClassNotFoundException for 
org.bouncycastle.operator.OperatorCreationException, which is used by 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/ProxyCA.java.

The minicluster fails during RM initialization.

> Upgrade BouncyCastle to 1.60
> 
>
> Key: HADOOP-15832
> URL: https://issues.apache.org/jira/browse/HADOOP-15832
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.3.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-15832.001.patch, HADOOP-15832.addendum.patch
>
>
> As part of my work on YARN-6586, I noticed that we're using a very old 
> version of BouncyCastle:
> {code:xml}
> <dependency>
>   <groupId>org.bouncycastle</groupId>
>   <artifactId>bcprov-jdk16</artifactId>
>   <version>1.46</version>
>   <scope>test</scope>
> </dependency>
> {code}
> The *-jdk16 artifacts have been discontinued and are not recommended (see 
> [http://bouncy-castle.1462172.n4.nabble.com/Bouncycaslte-bcprov-jdk15-vs-bcprov-jdk16-td4656252.html]).
>  
>  In particular, the newest release, 1.46, is from 2011! 
>  [https://mvnrepository.com/artifact/org.bouncycastle/bcprov-jdk16]
> The currently maintained and recommended artifacts are *-jdk15on:
>  [https://www.bouncycastle.org/latest_releases.html]
>  They're currently on version 1.60, released only a few months ago.
> We should update BouncyCastle to the *-jdk15on artifacts and the 1.60 
> release. It's currently a test-only artifact, so there should be no 
> backwards-compatibility issues with updating this. It's also needed for 
> YARN-6586, where we'll actually be shipping it.






[jira] [Commented] (HADOOP-15821) Move Hadoop YARN Registry to Hadoop Registry

2018-10-05 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640429#comment-16640429
 ] 

Eric Yang commented on HADOOP-15821:


[~elgoiri] The YARN classpath is missing 
$HADOOP_HOME/share/hadoop/tools/lib/hadoop-registry*.jar, which prevents the 
YARN ResourceManager from working properly.
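
One way to confirm the gap (an illustrative check, assuming a standard tarball 
layout):

{code}
# List the YARN classpath entries and look for the relocated registry jar
yarn classpath | tr ':' '\n' | grep -i registry
{code}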

> Move Hadoop YARN Registry to Hadoop Registry
> 
>
> Key: HADOOP-15821
> URL: https://issues.apache.org/jira/browse/HADOOP-15821
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HADOOP-15821.000.patch, HADOOP-15821.001.patch, 
> HADOOP-15821.002.patch, HADOOP-15821.003.patch, HADOOP-15821.004.patch, 
> HADOOP-15821.005.patch
>
>
> Currently, Hadoop YARN Registry is in YARN. However, this can be used by 
> other parts of the project (e.g., HDFS). In addition, it does not have any 
> real dependency on YARN.
> We should move it into commons and make it Hadoop Registry.






[jira] [Commented] (HADOOP-15821) Move Hadoop YARN Registry to Hadoop Registry

2018-10-05 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640280#comment-16640280
 ] 

Eric Yang commented on HADOOP-15821:


[~elgoiri] For ServiceDiscovery.md, it is probably best for it to stay in 
YARN services and to move only the "Configure Registry DNS", "Start Registry 
DNS Server", and "Make your cluster use Registry DNS" topics to 
hadoop-registry.

> Move Hadoop YARN Registry to Hadoop Registry
> 
>
> Key: HADOOP-15821
> URL: https://issues.apache.org/jira/browse/HADOOP-15821
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HADOOP-15821.000.patch, HADOOP-15821.001.patch, 
> HADOOP-15821.002.patch, HADOOP-15821.003.patch
>
>
> Currently, Hadoop YARN Registry is in YARN. However, this can be used by 
> other parts of the project (e.g., HDFS). In addition, it does not have any 
> real dependency on YARN.
> We should move it into commons and make it Hadoop Registry.






[jira] [Commented] (HADOOP-15821) Move Hadoop YARN Registry to Hadoop Registry

2018-10-04 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638990#comment-16638990
 ] 

Eric Yang commented on HADOOP-15821:


{quote}Had to change YarnConfiguration, YarnUncaughtExceptionHandler and 
YarnRuntimeException.{quote}

The class name change is not a problem.  There is no regression from the 
user's point of view.

{quote}Backwards compatibility: I'm not sure if this is public, should we 
create bindings for backwards compatibility?{quote}

The configuration keys already use the hadoop.registry.* prefix, so it is 
probably OK to move them to core-default.xml without having to do much.
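
For example, a key such as hadoop.registry.zk.quorum could move over 
unchanged (the value shown is its usual default; treat the exact entry as a 
sketch):

{code:xml}
<property>
  <name>hadoop.registry.zk.quorum</name>
  <value>localhost:2181</value>
  <description>ZooKeeper quorum used by the registry.</description>
</property>
{code}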

{quote}References to YARN: in the current patch, there are a couple references 
to YARN, should we move them?{quote}

I am in favor of s/YARN/Hadoop/ for the remaining references.

> Move Hadoop YARN Registry to Hadoop Registry
> 
>
> Key: HADOOP-15821
> URL: https://issues.apache.org/jira/browse/HADOOP-15821
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HADOOP-15821.000.patch, HADOOP-15821.001.patch
>
>
> Currently, Hadoop YARN Registry is in YARN. However, this can be used by 
> other parts of the project (e.g., HDFS). In addition, it does not have any 
> real dependency on YARN.
> We should move it into commons and make it Hadoop Registry.






[jira] [Commented] (HADOOP-15765) Can not find login module class for IBM due to hard codes

2018-09-21 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16623890#comment-16623890
 ] 

Eric Yang commented on HADOOP-15765:


[~jiangjianfei] How about detecting the IBM Java version to decide which 
security class to use?  I believe that was the approach taken by the existing 
developers.
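
A minimal sketch of that kind of vendor probe (Hadoop has a similar check in 
org.apache.hadoop.util.PlatformName; the module names below are illustrative 
examples, not an exhaustive mapping):

{code:java}
// Illustrative vendor detection for choosing a JAAS login module.
public final class LoginModuleSelector {
  private static final boolean IBM_JAVA =
      System.getProperty("java.vendor", "").contains("IBM");

  public static String osLoginModule() {
    if (IBM_JAVA) {
      // Older IBM JDKs ship OS-specific modules such as
      // com.ibm.security.auth.module.LinuxLoginModule; newer ones
      // provide com.ibm.security.auth.module.JAASLoginModule.
      return "com.ibm.security.auth.module.JAASLoginModule";
    }
    return "com.sun.security.auth.module.UnixLoginModule";
  }
}
{code}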

> Can not find login module class for IBM due to hard codes
> -
>
> Key: HADOOP-15765
> URL: https://issues.apache.org/jira/browse/HADOOP-15765
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.3
>Reporter: Jianfei Jiang
>Priority: Major
> Attachments: HADOOP-15765_000.patch
>
>
> Because of differences between the various IBM JDK versions, the login module 
> class is sometimes different. However, the class for a specific JDK (no matter 
> the version) is hard-coded in the Hadoop code. We have faced errors like the 
> following:
> *javax.security.auth.login.LoginException: unable to find LoginModule class: 
> com.ibm.security.auth.module.LinuxLoginModule*
>  
> Should we make the value a configuration option that can be set by users?






[jira] [Commented] (HADOOP-15774) Discovery of HA servers

2018-09-21 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16623860#comment-16623860
 ] 

Eric Yang commented on HADOOP-15774:


RegistryDNS is a tool that provides DNS resolution.  If namenode hostnames 
follow a certain convention, then it might be possible to provide a multi-A 
record with information about all namenodes.  This assumes that a namenode 
would know how to find the location of the ZooKeeper servers via a 
configuration file in order to publish itself.  If the ZooKeeper quorum can 
only be discovered through configuration, then there is still no real service 
discovery.

It would be nice to move RegistryDNS from YARN to Hadoop common, but a 
dynamically spawned ZooKeeper server is currently broken.  ZooKeeper needs to 
be fixed in the following areas to meet the requirements of service discovery 
for the Hadoop core system:

# ZooKeeper client session affinity.  If a ZooKeeper server is spawned 
somewhere else, existing clients still try to connect to the old IP.  
ZOOKEEPER-2929
# ZooKeeper Kerberos security fails but continues to allow ZooKeeper 
connections.  ZOOKEEPER-1634

YARN's usage of ZooKeeper and RegistryDNS is less prone to these ZooKeeper 
defects because the ZooKeeper deployment is assumed to be static, and Curator 
handles some odd conditions for ZooKeeper connections.  A Curator dependency 
might generate a wave of realignment for an HDFS dependency on ZooKeeper, only 
to get subpar results.

An alternative approach is to use multicast DNS for daemon processes that 
support service discovery, similar to how Apple products can find devices in 
the surrounding environment.  There is an implementation of service discovery 
in [this 
project|https://github.com/macroadster/HMS/blob/master/beacon/src/main/java/org/apache/hms/beacon/Beacon.java]
 for ZooKeeper.  It was canned 6 years ago when [~aw] and others gave feedback 
that multicast DNS was too expensive and had possible vulnerabilities.  Many 
of those issues have been addressed by multicast DNS implementations in the 
last 6 years.  I am not sure whether it is worthwhile to revisit multicast DNS 
for service discovery purposes.

DNS or multicast DNS is most likely the best approach for providing discovery 
of HA servers.  However, this information is provided to raise awareness of 
potential pitfalls in the current implementations.



> Discovery of HA servers
> ---
>
> Key: HADOOP-15774
> URL: https://issues.apache.org/jira/browse/HADOOP-15774
> Project: Hadoop Common
>  Issue Type: New Feature
>Reporter: Íñigo Goiri
>Priority: Major
> Attachments: Discovery Service.pdf
>
>
> Currently, Hadoop relies on configuration files to specify the servers.
> This requires maintaining these configuration files and propagating the 
> changes.
> Hadoop should have a framework to provide discovery.
> For example, in HDFS, we could define the Namenodes in a shared location and 
> the DNs would use the framework to find the Namenodes.






[jira] [Commented] (HADOOP-15765) Can not find login module class for IBM due to hard codes

2018-09-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620769#comment-16620769
 ] 

Eric Yang commented on HADOOP-15765:


If the security class is configurable via a config file, then the config file 
must be owned by root or be read-only to the user who runs the JVM.  This 
prevents runtime hacking to subvert the security class.  In Hadoop, there is 
very little security checking to ensure that a config value comes from a 
read-only source.  It is best to avoid configurable security class loading.  
I think updating the existing hard-coded list is still the preferred solution.
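
A hedged illustration of the kind of check that would be required before 
trusting such a value (the path and the root-only policy are examples, not 
anything Hadoop currently does):

{code:java}
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.UserPrincipal;

public final class ConfigTrustCheck {
  // Refuse to load a configurable security class unless the config file
  // is owned by root (example policy; the path is illustrative).
  public static void requireRootOwned(String file) throws Exception {
    Path conf = Paths.get(file);
    UserPrincipal owner = Files.getOwner(conf);
    if (!"root".equals(owner.getName())) {
      throw new SecurityException("Refusing to load login module class from "
          + conf + " owned by " + owner.getName());
    }
  }
}
{code}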

> Can not find login module class for IBM due to hard codes
> -
>
> Key: HADOOP-15765
> URL: https://issues.apache.org/jira/browse/HADOOP-15765
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.3
>Reporter: Jianfei Jiang
>Priority: Major
> Attachments: HADOOP-15765_000.patch
>
>
> Because of differences between the various IBM JDK versions, the login module 
> class is sometimes different. However, the class for a specific JDK (no matter 
> the version) is hard-coded in the Hadoop code. We have faced errors like the 
> following:
> *javax.security.auth.login.LoginException: unable to find LoginModule class: 
> com.ibm.security.auth.module.LinuxLoginModule*
>  
> Should we make the value a configuration option that can be set by users?






[jira] [Commented] (HADOOP-15758) Filesystem.get(URI, Configuration, user) API not working with proxy users

2018-09-17 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16617983#comment-16617983
 ] 

Eric Yang commented on HADOOP-15758:


[~hgadre] {quote}(b) if a ticket cache path is not specified and user name is 
provided, it creates a remote user {quote}

The ticket cache must be verified prior to creating a remote user.  Without a 
valid ticket, the Java code should not have access to create a remote user.  
The proxy user check must be in place on the server side to prevent a 
security hole.

{quote}application provide the user name as well as the ticket cache path. The 
question is should it treat this as a proxy user scenario?{quote}

This seems like a valid use case that Spark and Hive would depend on.
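
A hedged sketch of the flow under discussion (the ticket cache path, 
principal, and user names are placeholders):

{code:java}
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.UserGroupInformation;

public class ProxyUserExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Log in from the ticket cache first; path and principal are examples.
    UserGroupInformation realUser = UserGroupInformation.getUGIFromTicketCache(
        "/tmp/krb5cc_1000", "hive@EXAMPLE.COM");
    // createProxyUser (rather than createRemoteUser) keeps the real user
    // attached, so the server-side proxy-user authorization check applies.
    UserGroupInformation proxyUser =
        UserGroupInformation.createProxyUser("enduser", realUser);
    FileSystem fs = proxyUser.doAs(
        (PrivilegedExceptionAction<FileSystem>) () -> FileSystem.get(conf));
    System.out.println(fs);
  }
}
{code}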

> Filesystem.get(URI, Configuration, user) API not working with proxy users
> -
>
> Key: HADOOP-15758
> URL: https://issues.apache.org/jira/browse/HADOOP-15758
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.0
>Reporter: Hrishikesh Gadre
>Assignee: Hrishikesh Gadre
>Priority: Major
> Attachments: HADOOP-15758-001.patch
>
>
> A user reported that the Filesystem.get API is not working as expected when 
> they use the 'FileSystem.get(URI, Configuration, user)' method signature - 
> but 'FileSystem.get(URI, Configuration)' works fine. The user is trying to 
> use this method signature to mimic proxy user functionality e.g. provide 
> ticket cache based kerberos credentials (using KRB5CCNAME env variable) for 
> the proxy user and then in the java program pass name of the user to be 
> impersonated. The alternative, to use [proxy users 
> functionality|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html]
>  in Hadoop works as expected.
>  
> Since FileSystem.get(URI, Configuration, user) is a public API and it does 
> not restrict its usage in this fashion, we should ideally make it work or add 
> docs to discourage its usage to implement proxy users.
>  






[jira] [Updated] (HADOOP-14295) Authentication proxy filter may fail authorization because of getRemoteAddr

2018-09-17 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated HADOOP-14295:
---
Target Version/s:   (was: 3.3.0)
  Status: Open  (was: Patch Available)

AuthenticationWithProxyUserFilter.java was reverted from trunk due to the 
security mailing list discussion about reverting HADOOP-13119.  This prevents 
this patch from being resolved, and the patch needs to be cancelled for now.

> Authentication proxy filter may fail authorization because of getRemoteAddr
> ---
>
> Key: HADOOP-14295
> URL: https://issues.apache.org/jira/browse/HADOOP-14295
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.8.1, 3.0.0-alpha2, 2.7.4
>Reporter: Jeffrey E  Rodriguez
>Assignee: Jeffrey E  Rodriguez
>Priority: Critical
> Attachments: HADOOP-14295.002.patch, HADOOP-14295.003.patch, 
> HADOOP-14295.004.patch, hadoop-14295.001.patch
>
>
> When we turn on Hadoop UI Kerberos and try to access the Datanode /logs, the 
> proxy (Knox) would get an authorization failure, and its host would show as 
> 127.0.0.1 even though Knox wasn't local to the Datanode.  Error message:
> {quote}
> "2017-04-08 07:01:23,029 ERROR security.AuthenticationWithProxyUserFilter 
> (AuthenticationWithProxyUserFilter.java:getRemoteUser(94)) - Unable to verify 
> proxy user: Unauthorized connection for super-user: knox from IP 127.0.0.1"
> {quote}
> We were able to figure out that the Datanode has Jetty listening on localhost 
> and that Netty is used to serve requests to the DataNode; this was a measure 
> to improve performance because of Netty's async NIO design.
> I propose to add a check for the x-forwarded-for header, since proxies 
> usually inject that header, before we do a getRemoteAddr.






[jira] [Commented] (HADOOP-15725) FileSystem.deleteOnExit should check user permissions

2018-09-07 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607365#comment-16607365
 ] 

Eric Yang commented on HADOOP-15725:


[~oshevchenko] Thank you for the example.  I confirmed that if the program is 
running as the root user, and HDFS is also running as the root user, the 
delete call is carried out.  If the Java code is running as a different user 
than the HDFS super user, then the file doesn't get deleted.  It does look 
like a bug: using the super user to impersonate a less privileged user to 
perform an operation, and having the operation still carried out with super 
user privilege, is a concerning problem that needs to be addressed.  Both 
secure and insecure clusters seem to have the same bug, in that the super 
user impersonating a less privileged user doesn't prevent invocation of the 
super user's power.

You are correct that this is not acceptable for an insecure cluster either.  
Until this bug is fixed, the rule of thumb is to avoid running programs as 
the HDFS super user; it is too powerful.  At least on a Kerberos-enabled 
cluster, a user must log in to the KDC as the HDFS super user to exploit this 
bug, so it is not easily carried out by an unauthorized user.
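
A hedged sketch of the permission check proposed in the description below 
(placement inside FileSystem and the exact semantics are assumptions; 
FsAction.WRITE on the parent is used as an approximation of delete 
permission):

{code:java}
// Hypothetical guard in FileSystem#deleteOnExit: fail fast if the current
// user cannot write the parent directory instead of silently queueing the
// path for deletion at close time.
public boolean deleteOnExit(Path f) throws IOException {
  access(f.getParent(), FsAction.WRITE); // throws AccessControlException
  synchronized (deleteOnExit) {
    deleteOnExit.add(f);
  }
  return true;
}
{code}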

> FileSystem.deleteOnExit should check user permissions
> -
>
> Key: HADOOP-15725
> URL: https://issues.apache.org/jira/browse/HADOOP-15725
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Oleksandr Shevchenko
>Priority: Major
>  Labels: Security
> Attachments: deleteOnExitReproduce
>
>
> For now, we are able to add any file to the FileSystem deleteOnExit list. 
> This leads to security problems. Some user ("Intruder") can get a file system 
> instance which was created by another user ("Owner") and mark any files for 
> deletion, even if "Intruder" doesn't have any access to those files. Later, 
> when "Owner" invokes the close method (or the JVM is shut down, since we have 
> a ShutdownHook which closes all file systems), the marked files will be 
> deleted successfully, since the deletion is done on behalf of "Owner" (or on 
> behalf of the user who ran the program).
> I attached the patch [^deleteOnExitReproduce] which reproduces this 
> possibility, and I was also able to reproduce it on a cluster with both the 
> Local and Distributed file systems:
> {code:java}
> import java.security.PrivilegedExceptionAction;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.security.UserGroupInformation;
>
> public class Main {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = new Configuration();
>     conf.set("fs.default.name", "hdfs://node:9000");
>     conf.set("fs.hdfs.impl",
>         org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
>     final FileSystem fs = FileSystem.get(conf);
>     System.out.println(fs);
>
>     Path f = new Path("/user/root/testfile");
>     System.out.println(f);
>
>     // "hive" has no access to /user/root/testfile, yet it can still mark
>     // it for deletion on the FileSystem instance created by root.
>     UserGroupInformation hive = UserGroupInformation.createRemoteUser("hive");
>     hive.doAs((PrivilegedExceptionAction<Boolean>) () -> fs.deleteOnExit(f));
>
>     fs.close();
>   }
> }
> {code}
> Result:
> {noformat}
> root@node:/# hadoop fs -put testfile /user/root
> root@node:/# hadoop fs -chmod 700 /user/root/testfile
> root@node:/# hadoop fs -ls /user/root
> Found 1 items
> -rw--- 1 root supergroup 0 2018-09-06 18:07 /user/root/testfile
> root@node:/# java -jar testDeleteOther.jar 
> log4j:WARN No appenders could be found for logger 
> (org.apache.hadoop.conf.Configuration.deprecation).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
> info.
> DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_309539034_1, ugi=root 
> (auth:SIMPLE)]]
> /user/root/testfile
> []
> root@node:/# hadoop fs -ls /user/root
> root@node:/# 
> {noformat}
> We should add a user permission check before marking a file for deletion.
> Could someone evaluate this? If no one objects, I would like to start
> working on this.
> Thanks a lot for any comments.






[jira] [Commented] (HADOOP-15725) FileSystem.deleteOnExit should check user permissions

2018-09-06 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606317#comment-16606317
 ] 

Eric Yang commented on HADOOP-15725:


I believe this is a non-Kerberos setup issue.  In a non-Kerberos cluster, 
UserGroupInformation.isSecurityEnabled() == false, and it will bypass all 
security checks in the Hadoop code base.  Therefore, if you use 
UserGroupInformation.createRemoteUser(), there is no security enforcement of 
the impersonation check, and the operation is allowed to be carried out as 
the root user.  Try this on a Kerberos setup and see if you get the same 
result.  The same operation shouldn't work on a Kerberos-enabled cluster.

> FileSystem.deleteOnExit should check user permissions
> -
>
> Key: HADOOP-15725
> URL: https://issues.apache.org/jira/browse/HADOOP-15725
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Oleksandr Shevchenko
>Priority: Major
>  Labels: Security
> Attachments: deleteOnExitReproduce
>
>
> For now, we are able to add any file to the FileSystem deleteOnExit list. 
> This leads to security problems. Some user ("Intruder") can get a file system 
> instance which was created by another user ("Owner") and mark any files for 
> deletion, even if "Intruder" doesn't have any access to those files. Later, 
> when "Owner" invokes the close method (or the JVM is shut down, since we have 
> a ShutdownHook which closes all file systems), the marked files will be 
> deleted successfully, since the deletion is done on behalf of "Owner" (or on 
> behalf of the user who ran the program).
> I attached the patch [^deleteOnExitReproduce] which reproduces this 
> possibility, and I was also able to reproduce it on a cluster with both the 
> Local and Distributed file systems:
> {code:java}
> import java.security.PrivilegedExceptionAction;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.security.UserGroupInformation;
>
> public class Main {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = new Configuration();
>     conf.set("fs.default.name", "hdfs://node:9000");
>     conf.set("fs.hdfs.impl",
>         org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
>     final FileSystem fs = FileSystem.get(conf);
>     System.out.println(fs);
>
>     Path f = new Path("/user/root/testfile");
>     System.out.println(f);
>
>     // "hive" has no access to /user/root/testfile, yet it can still mark
>     // it for deletion on the FileSystem instance created by root.
>     UserGroupInformation hive = UserGroupInformation.createRemoteUser("hive");
>     hive.doAs((PrivilegedExceptionAction<Boolean>) () -> fs.deleteOnExit(f));
>
>     fs.close();
>   }
> }
> {code}
> Result:
> {noformat}
> root@node:/# hadoop fs -put testfile /user/root
> root@node:/# hadoop fs -chmod 700 /user/root/testfile
> root@node:/# hadoop fs -ls /user/root
> Found 1 items
> -rw--- 1 root supergroup 0 2018-09-06 18:07 /user/root/testfile
> root@node:/# java -jar testDeleteOther.jar 
> log4j:WARN No appenders could be found for logger 
> (org.apache.hadoop.conf.Configuration.deprecation).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
> info.
> DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_309539034_1, ugi=root 
> (auth:SIMPLE)]]
> /user/root/testfile
> []
> root@node:/# hadoop fs -ls /user/root
> root@node:/# 
> {noformat}
> We should add a user permission check before marking a file for deletion.
> Could someone evaluate this? If no one objects, I would like to start
> working on this.
> Thanks a lot for any comments.






[jira] [Resolved] (HADOOP-15670) UserGroupInformation TGT renewer thread doesn't use monotonically increasing time for calculating interval to sleep

2018-08-16 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang resolved HADOOP-15670.

Resolution: Not A Problem

> UserGroupInformation TGT renewer thread doesn't use monotonically increasing 
> time for calculating interval to sleep
> ---
>
> Key: HADOOP-15670
> URL: https://issues.apache.org/jira/browse/HADOOP-15670
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 3.1.0, 2.9.1
>Reporter: Hrishikesh Gadre
>Assignee: Hrishikesh Gadre
>Priority: Minor
>
> As per the [documentation of Time#now() 
> method|https://github.com/apache/hadoop/blob/74411ce0ce7336c0f7bb5793939fdd64a5dcdef6/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Time.java#L49-L57],
>  it should not be used for calculating duration or interval to sleep. But the 
> TGT renewer thread in UserGroupInformation object doesn't follow this 
> recommendation,
> [https://github.com/apache/hadoop/blob/74411ce0ce7336c0f7bb5793939fdd64a5dcdef6/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java#L892-L899]
> This should be fixed to use Time.monotonicNow() API instead.
>  






[jira] [Commented] (HADOOP-15670) UserGroupInformation TGT renewer thread doesn't use monotonically increasing time for calculating interval to sleep

2018-08-16 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583182#comment-16583182
 ] 

Eric Yang commented on HADOOP-15670:


[~hgadre] There is a KDC configuration, in minutes, for how much clock skew 
Kerberos V5 tolerates between the time on the client clock and the time on 
the domain controller that provides Kerberos authentication.  The nanosecond 
or millisecond difference in the current timestamp is inconsequential to the 
renewal sleep thread, because the UGI renewal thread always renews minutes 
ahead of the expiration time.  Hence, this is not a real problem.
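
For reference, the renewal wakeup is computed as a fraction of the ticket 
lifetime, roughly like the following paraphrase of the UGI logic (the 
constant and method shape are a sketch, not the exact source):

{code:java}
// Paraphrase of UserGroupInformation#getRefreshTime: wake up at 80% of the
// ticket lifetime, so millisecond-scale differences between Time.now() and
// Time.monotonicNow() cannot matter at this scale.
public final class RefreshTime {
  private static final float TICKET_RENEW_WINDOW = 0.80f;

  static long getRefreshTime(long startMillis, long endMillis) {
    return startMillis
        + (long) ((endMillis - startMillis) * TICKET_RENEW_WINDOW);
  }
}
{code}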

> UserGroupInformation TGT renewer thread doesn't use monotonically increasing 
> time for calculating interval to sleep
> ---
>
> Key: HADOOP-15670
> URL: https://issues.apache.org/jira/browse/HADOOP-15670
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 3.1.0, 2.9.1
>Reporter: Hrishikesh Gadre
>Assignee: Hrishikesh Gadre
>Priority: Minor
>
> As per the [documentation of Time#now() 
> method|https://github.com/apache/hadoop/blob/74411ce0ce7336c0f7bb5793939fdd64a5dcdef6/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Time.java#L49-L57],
>  it should not be used for calculating duration or interval to sleep. But the 
> TGT renewer thread in UserGroupInformation object doesn't follow this 
> recommendation,
> [https://github.com/apache/hadoop/blob/74411ce0ce7336c0f7bb5793939fdd64a5dcdef6/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java#L892-L899]
> This should be fixed to use Time.monotonicNow() API instead.
>  






[jira] [Updated] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE

2018-07-26 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated HADOOP-15593:
---
   Resolution: Fixed
Fix Version/s: 3.1.1
   3.2.0
   Status: Resolved  (was: Patch Available)

Thank you [~gabor.bota] for the patch.
Thank you [~xiaochen] for the review.

> UserGroupInformation TGT renewer throws NPE
> ---
>
> Key: HADOOP-15593
> URL: https://issues.apache.org/jira/browse/HADOOP-15593
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Blocker
> Fix For: 3.2.0, 3.1.1
>
> Attachments: HADOOP-15593.001.patch, HADOOP-15593.002.patch, 
> HADOOP-15593.003.patch, HADOOP-15593.004.patch, HADOOP-15593.005.patch
>
>
> Found the following NPE thrown in UGI tgt renewer. The NPE was thrown within 
> an exception handler so the original exception was hidden, though it's likely 
> caused by an expired tgt.
> {noformat}
> 18/07/02 10:30:57 ERROR util.SparkUncaughtExceptionHandler: Uncaught 
> exception in thread Thread[TGT Renewer for f...@example.com,5,main]
> java.lang.NullPointerException
> at 
> javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at 
> org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:748){noformat}
> Suspect it's related to [https://bugs.openjdk.java.net/browse/JDK-8154889].
> The relevant code was added in HADOOP-13590. File this jira to handle the 
> exception better.






[jira] [Commented] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE

2018-07-26 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559019#comment-16559019
 ] 

Eric Yang commented on HADOOP-15593:


[~xiaochen] Thank you for the review.
+1 patch 5 looks good to me, will commit shortly.

> UserGroupInformation TGT renewer throws NPE
> ---
>
> Key: HADOOP-15593
> URL: https://issues.apache.org/jira/browse/HADOOP-15593
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Blocker
> Attachments: HADOOP-15593.001.patch, HADOOP-15593.002.patch, 
> HADOOP-15593.003.patch, HADOOP-15593.004.patch, HADOOP-15593.005.patch
>
>
> Found the following NPE thrown in UGI tgt renewer. The NPE was thrown within 
> an exception handler so the original exception was hidden, though it's likely 
> caused by an expired tgt.
> {noformat}
> 18/07/02 10:30:57 ERROR util.SparkUncaughtExceptionHandler: Uncaught 
> exception in thread Thread[TGT Renewer for f...@example.com,5,main]
> java.lang.NullPointerException
> at 
> javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at 
> org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:748){noformat}
> Suspect it's related to [https://bugs.openjdk.java.net/browse/JDK-8154889].
> The relevant code was added in HADOOP-13590. File this jira to handle the 
> exception better.






[jira] [Commented] (HADOOP-15600) Set default proxy user settings to non-routable IP addresses and default users group

2018-07-25 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16556923#comment-16556923
 ] 

Eric Yang commented on HADOOP-15600:


For good intent and practical purposes, the default ASF release may assume 
admin == root, like Red Hat or macOS does.  The more exotic features that 
[~daryn] listed are a pass-through for admin users.


> Set default proxy user settings to non-routable IP addresses and default 
> users group
> 
>
> Key: HADOOP-15600
> URL: https://issues.apache.org/jira/browse/HADOOP-15600
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Reporter: Eric Yang
>Priority: Major
>
> The default settings that restrict which cluster nodes may communicate with 
> peer nodes are controlled by hadoop.proxyuser.[hdfs|yarn].hosts and 
> hadoop.proxyuser.[hdfs|yarn].groups.  These settings default to open, which 
> allows any host to impersonate any user.
> The proposal is to default the settings to:
> {code}
> 
>   hadoop.proxyuser.hdfs.hosts
>   
> 127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16
> 
> 
>   hadoop.proxyuser.hdfs.groups
>   wheel
> 
> 
>   hadoop.proxyuser.yarn.hosts
>   
> 127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16
> 
> 
>   hadoop.proxyuser.yarn.groups
>   users
> 
> {code}
> This will allow the cluster to default to a closed network and default 
> "users" group to reduce risks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15600) Set default proxy user settings to non-routable IP addresses and default users group

2018-07-25 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated HADOOP-15600:
---
Description: 
The default settings that restrict which cluster nodes can communicate with peer 
nodes are controlled by hadoop.proxyuser.[hdfs|yarn].hosts and 
hadoop.proxyuser.[hdfs|yarn].groups.  These settings default to open, which 
allows any host to impersonate any user.

The proposal is to default the settings to:

{code}
<property>
  <name>hadoop.proxyuser.hdfs.hosts</name>
  <value>127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16</value>
</property>
<property>
  <name>hadoop.proxyuser.hdfs.groups</name>
  <value>wheel</value>
</property>
<property>
  <name>hadoop.proxyuser.yarn.hosts</name>
  <value>127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16</value>
</property>
<property>
  <name>hadoop.proxyuser.yarn.groups</name>
  <value>users</value>
</property>
{code}

This will allow the cluster to default to a closed network and the default 
"users" group to reduce risks.

  was:
The default settings that restrict which cluster nodes can communicate with peer 
nodes are controlled by hadoop.proxyuser.[hdfs|yarn].hosts and 
hadoop.proxyuser.[hdfs|yarn].groups.  These settings default to open, which 
allows any host to impersonate any user.

The proposal is to default the settings to:

{code}
<property>
  <name>hadoop.proxyuser.hdfs.hosts</name>
  <value>127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16</value>
</property>
<property>
  <name>hadoop.proxyuser.hdfs.groups</name>
  <value>users</value>
</property>
<property>
  <name>hadoop.proxyuser.yarn.hosts</name>
  <value>127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16</value>
</property>
<property>
  <name>hadoop.proxyuser.yarn.groups</name>
  <value>users</value>
</property>
{code}

This will allow the cluster to default to a closed network and the default 
"users" group to reduce risks.


> Set default proxy user settings to non-routable IP addresses and default 
> users group
> 
>
> Key: HADOOP-15600
> URL: https://issues.apache.org/jira/browse/HADOOP-15600
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Reporter: Eric Yang
>Priority: Major
>
> The default settings that restrict which cluster nodes can communicate with 
> peer nodes are controlled by hadoop.proxyuser.[hdfs|yarn].hosts and 
> hadoop.proxyuser.[hdfs|yarn].groups.  These settings default to open, which 
> allows any host to impersonate any user.
> The proposal is to default the settings to:
> {code}
> <property>
>   <name>hadoop.proxyuser.hdfs.hosts</name>
>   <value>127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16</value>
> </property>
> <property>
>   <name>hadoop.proxyuser.hdfs.groups</name>
>   <value>wheel</value>
> </property>
> <property>
>   <name>hadoop.proxyuser.yarn.hosts</name>
>   <value>127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16</value>
> </property>
> <property>
>   <name>hadoop.proxyuser.yarn.groups</name>
>   <value>users</value>
> </property>
> {code}
> This will allow the cluster to default to a closed network and the default 
> "users" group to reduce risks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15600) Set default proxy user settings to non-routable IP addresses and default users group

2018-07-25 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16556907#comment-16556907
 ] 

Eric Yang commented on HADOOP-15600:


[~daryn] Thank you for your reply, but not everyone has the time to dedicate 
their lives to tending servers and tweaking every configuration to keep the 
cluster secure.  Hadoop has become difficult to secure because the default 
configuration is open.  It argues against common sense to require that anyone 
who tries to deploy a Hadoop cluster be a Hadoop security expert before they 
can deploy.  There are already bots developed to attack Hadoop clusters.  It is 
not practical to demand that users have the perfect configuration before they 
can start any Hadoop daemon.  Bigtop and Ambari already have default proxy user 
configurations.  It would be equally dangerous to run without proxy user 
configuration, because the node manager would be able to read/write data node 
blocks when both are owned by the same user.  What do you think people do when 
they deploy KMS?  Do they deploy the KMS server as the hdfs user or another 
user?  There are scenarios with blurry and vague suggestions, and we both know 
that you don't have all the answers.  The community needs to reach better 
default settings and tighten security in code to help new users fend off bot 
attacks.  This is the intent of the JIRA.  It would not be productive to start 
a religious debate that admin != root.  Please keep the discussion civil.  
Thanks

> Set default proxy user settings to non-routable IP addresses and default 
> users group
> 
>
> Key: HADOOP-15600
> URL: https://issues.apache.org/jira/browse/HADOOP-15600
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Reporter: Eric Yang
>Priority: Major
>
> The default settings that restrict which cluster nodes can communicate with 
> peer nodes are controlled by hadoop.proxyuser.[hdfs|yarn].hosts and 
> hadoop.proxyuser.[hdfs|yarn].groups.  These settings default to open, which 
> allows any host to impersonate any user.
> The proposal is to default the settings to:
> {code}
> <property>
>   <name>hadoop.proxyuser.hdfs.hosts</name>
>   <value>127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16</value>
> </property>
> <property>
>   <name>hadoop.proxyuser.hdfs.groups</name>
>   <value>users</value>
> </property>
> <property>
>   <name>hadoop.proxyuser.yarn.hosts</name>
>   <value>127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16</value>
> </property>
> <property>
>   <name>hadoop.proxyuser.yarn.groups</name>
>   <value>users</value>
> </property>
> {code}
> This will allow the cluster to default to a closed network and the default 
> "users" group to reduce risks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE

2018-07-25 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16556408#comment-16556408
 ] 

Eric Yang commented on HADOOP-15593:


[~xiaochen] Any concern with patch 005?

> UserGroupInformation TGT renewer throws NPE
> ---
>
> Key: HADOOP-15593
> URL: https://issues.apache.org/jira/browse/HADOOP-15593
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Blocker
> Attachments: HADOOP-15593.001.patch, HADOOP-15593.002.patch, 
> HADOOP-15593.003.patch, HADOOP-15593.004.patch, HADOOP-15593.005.patch
>
>
> Found the following NPE thrown in UGI tgt renewer. The NPE was thrown within 
> an exception handler so the original exception was hidden, though it's likely 
> caused by expired tgt.
> {noformat}
> 18/07/02 10:30:57 ERROR util.SparkUncaughtExceptionHandler: Uncaught 
> exception in thread Thread[TGT Renewer for f...@example.com,5,main]
> java.lang.NullPointerException
> at 
> javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at 
> org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:748){noformat}
> Suspect it's related to [https://bugs.openjdk.java.net/browse/JDK-8154889].
> The relevant code was added in HADOOP-13590. File this jira to handle the 
> exception better.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE

2018-07-24 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555140#comment-16555140
 ] 

Eric Yang commented on HADOOP-15593:


[~xiaochen] If tgt.getEndTime().getTime() triggered the NullPointerException, 
then tgtEndTime = now; since nextRefresh is always smaller than now, the 
renewal thread will not try to renew once more, and the thread stops earlier 
than expected.
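
A minimal, self-contained sketch of that control flow (the names and the 
refresh heuristic below are illustrative, not the actual UserGroupInformation 
code):

{code}
public class RenewerEarlyStopSketch {
  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    long tgtEndTime = now;                   // the NPE fallback value under discussion
    long nextRefresh = tgtEndTime - 60_000L; // hypothetical: refresh before end time
    // with tgtEndTime == now, nextRefresh is always in the past, so the
    // renewer takes the "expired" branch and exits instead of retrying
    if (now > nextRefresh) {
      System.out.println("TGT treated as expired; renew thread stops early.");
    }
  }
}
{code}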

> UserGroupInformation TGT renewer throws NPE
> ---
>
> Key: HADOOP-15593
> URL: https://issues.apache.org/jira/browse/HADOOP-15593
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Blocker
> Attachments: HADOOP-15593.001.patch, HADOOP-15593.002.patch, 
> HADOOP-15593.003.patch, HADOOP-15593.004.patch
>
>
> Found the following NPE thrown in UGI tgt renewer. The NPE was thrown within 
> an exception handler so the original exception was hidden, though it's likely 
> caused by expired tgt.
> {noformat}
> 18/07/02 10:30:57 ERROR util.SparkUncaughtExceptionHandler: Uncaught 
> exception in thread Thread[TGT Renewer for f...@example.com,5,main]
> java.lang.NullPointerException
> at 
> javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at 
> org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:748){noformat}
> Suspect it's related to [https://bugs.openjdk.java.net/browse/JDK-8154889].
> The relevant code was added in HADOOP-13590. File this jira to handle the 
> exception better.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE

2018-07-24 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554903#comment-16554903
 ] 

Eric Yang commented on HADOOP-15593:


[~xiaochen] Good catch on the logic.  I think your suggestion to check 
isDestroyed to stop the renew thread makes sense.  Patch 004 is incomplete.  If 
a tgt is destroyed, it cannot be renewed.  This will stop the renew thread:

{code}
  if (now > nextRefresh) {
LOG.error("TGT is expired. Aborting renew thread for {}.",
getUserName());
return;
  }
{code}

This part of the code needs to be removed for the renewal thread to retry.
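
A hedged sketch of the retry-oriented behavior argued for here (hypothetical 
names; not the committed patch): once the early return above is removed, the 
loop can sleep and attempt relogin again instead of exiting.

{code}
public class RenewerRetrySketch {
  public static void main(String[] args) throws InterruptedException {
    // bounded to three attempts here only so the demo terminates
    for (int attempt = 1; attempt <= 3; attempt++) {
      if (tryRenew()) {
        return;
      }
      System.out.println("Renewal failed; retrying instead of aborting.");
      Thread.sleep(1000L); // stands in for kerberosMinSecondsBeforeRelogin
    }
  }

  private static boolean tryRenew() {
    return false; // simulate a KDC outage so the retry path is exercised
  }
}
{code}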

> UserGroupInformation TGT renewer throws NPE
> ---
>
> Key: HADOOP-15593
> URL: https://issues.apache.org/jira/browse/HADOOP-15593
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Blocker
> Attachments: HADOOP-15593.001.patch, HADOOP-15593.002.patch, 
> HADOOP-15593.003.patch, HADOOP-15593.004.patch
>
>
> Found the following NPE thrown in UGI tgt renewer. The NPE was thrown within 
> an exception handler so the original exception was hidden, though it's likely 
> caused by expired tgt.
> {noformat}
> 18/07/02 10:30:57 ERROR util.SparkUncaughtExceptionHandler: Uncaught 
> exception in thread Thread[TGT Renewer for f...@example.com,5,main]
> java.lang.NullPointerException
> at 
> javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at 
> org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:748){noformat}
> Suspect it's related to [https://bugs.openjdk.java.net/browse/JDK-8154889].
> The relevant code was added in HADOOP-13590. File this jira to handle the 
> exception better.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE

2018-07-24 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554659#comment-16554659
 ] 

Eric Yang commented on HADOOP-15593:


[~xiaochen] The renewal thread is supposed to run until max_renewable_life has 
been reached.  If the ticket end time is expired or unknown, but 
max_renewable_life has not been reached, we would want the renewal loop to run.  
Maybe a KDC outage caused endTime = null, but a retry should be attempted.  
Hadoop doesn't seem to have logic to check max_renewable_life; therefore, it 
may keep trying to prevent a service outage for now.  We probably want to open 
another ticket for an enhancement to respect max_renewable_life.  It is safer 
to retry than to have the cluster go down because Kerberos tickets cannot be 
renewed.
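
A hedged sketch of that suggested enhancement (illustrative only; as noted 
above, Hadoop does not do this today): KerberosTicket.getRenewTill() exposes 
the ticket's renewable lifetime, so the loop could stop retrying once that 
point has passed.

{code}
import java.util.Date;
import javax.security.auth.kerberos.KerberosTicket;

class RenewableLifeCheck {
  static boolean withinRenewableLife(KerberosTicket tgt) {
    Date renewTill = tgt.getRenewTill(); // null when the ticket is not renewable
    return renewTill != null && renewTill.after(new Date());
  }
}
{code}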

> UserGroupInformation TGT renewer throws NPE
> ---
>
> Key: HADOOP-15593
> URL: https://issues.apache.org/jira/browse/HADOOP-15593
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Blocker
> Attachments: HADOOP-15593.001.patch, HADOOP-15593.002.patch, 
> HADOOP-15593.003.patch, HADOOP-15593.004.patch
>
>
> Found the following NPE thrown in UGI tgt renewer. The NPE was thrown within 
> an exception handler so the original exception was hidden, though it's likely 
> caused by expired tgt.
> {noformat}
> 18/07/02 10:30:57 ERROR util.SparkUncaughtExceptionHandler: Uncaught 
> exception in thread Thread[TGT Renewer for f...@example.com,5,main]
> java.lang.NullPointerException
> at 
> javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at 
> org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:748){noformat}
> Suspect it's related to [https://bugs.openjdk.java.net/browse/JDK-8154889].
> The relevant code was added in HADOOP-13590. File this jira to handle the 
> exception better.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE

2018-07-24 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554453#comment-16554453
 ] 

Eric Yang commented on HADOOP-15593:


[~gabor.bota] Thank you for the patch.  In patch 003, you have a tgt null check 
before entering the while loop.  The NPE will be thrown only for endTime = new 
Date(null).  Unless the tgt is destroyed or uninitialized, the tgt always has 
an end time.  This is the reason that the proposal can work.  The patch 004 
proposal works equally well.  Therefore, +1 from my point of view.

> UserGroupInformation TGT renewer throws NPE
> ---
>
> Key: HADOOP-15593
> URL: https://issues.apache.org/jira/browse/HADOOP-15593
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Blocker
> Attachments: HADOOP-15593.001.patch, HADOOP-15593.002.patch, 
> HADOOP-15593.003.patch, HADOOP-15593.004.patch
>
>
> Found the following NPE thrown in UGI tgt renewer. The NPE was thrown within 
> an exception handler so the original exception was hidden, though it's likely 
> caused by expired tgt.
> {noformat}
> 18/07/02 10:30:57 ERROR util.SparkUncaughtExceptionHandler: Uncaught 
> exception in thread Thread[TGT Renewer for f...@example.com,5,main]
> java.lang.NullPointerException
> at 
> javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at 
> org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:748){noformat}
> Suspect it's related to [https://bugs.openjdk.java.net/browse/JDK-8154889].
> The relevant code was added in HADOOP-13590. File this jira to handle the 
> exception better.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE

2018-07-23 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553145#comment-16553145
 ] 

Eric Yang commented on HADOOP-15593:


Two possible approaches to fix the findbugs error:
{code}
// guard the dereference: check tgt for null before calling getEndTime()
Date endTime = (tgt == null) ? null : tgt.getEndTime();
if (endTime != null && !tgt.isDestroyed()) {
  tgtEndTime = endTime.getTime();
} else {
  tgtEndTime = now;
}
{code}

or

{code}
try {
  tgtEndTime = tgt.getEndTime().getTime();
} catch (NullPointerException npe) {
  LOG.warn("NPE thrown while getting KerberosTicket endTime. The "
      + "endTime will be set to Time.now()");
  tgtEndTime = now;
}
{code}

Both will work equally well.
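
For completeness, a combined null-safe variant of the two approaches (an 
illustrative sketch, not the committed patch):

{code}
import java.util.Date;
import javax.security.auth.kerberos.KerberosTicket;

class TgtEndTimeSketch {
  static long tgtEndTimeOrNow(KerberosTicket tgt, long now) {
    if (tgt != null && !tgt.isDestroyed()) {
      Date endTime = tgt.getEndTime();
      if (endTime != null) {
        return endTime.getTime();
      }
    }
    return now; // the fallback discussed in this thread
  }
}
{code}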

> UserGroupInformation TGT renewer throws NPE
> ---
>
> Key: HADOOP-15593
> URL: https://issues.apache.org/jira/browse/HADOOP-15593
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Blocker
> Attachments: HADOOP-15593.001.patch, HADOOP-15593.002.patch, 
> HADOOP-15593.003.patch
>
>
> Found the following NPE thrown in UGI tgt renewer. The NPE was thrown within 
> an exception handler so the original exception was hidden, though it's likely 
> caused by expired tgt.
> {noformat}
> 18/07/02 10:30:57 ERROR util.SparkUncaughtExceptionHandler: Uncaught 
> exception in thread Thread[TGT Renewer for f...@example.com,5,main]
> java.lang.NullPointerException
> at 
> javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at 
> org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:748){noformat}
> Suspect it's related to [https://bugs.openjdk.java.net/browse/JDK-8154889].
> The relevant code was added in HADOOP-13590. File this jira to handle the 
> exception better.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE

2018-07-20 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551442#comment-16551442
 ] 

Eric Yang edited comment on HADOOP-15593 at 7/21/18 12:04 AM:
--

What if we catch the null pointer exception and reset tgtEndTime to now?  When 
tgtEndTime is undefined for any reason, resetting tgtEndTime to now should 
have no ill effect within the scope of getNextTgtRenewalTime.


was (Author: eyang):
What if we catch the null pointer and reset tgtEndTime to now?  When tgtEndTime 
is undefined for any reason, resetting tgtEndTime to now should have no ill 
effect within the scope of getNextTgtRenewalTime.

> UserGroupInformation TGT renewer throws NPE
> ---
>
> Key: HADOOP-15593
> URL: https://issues.apache.org/jira/browse/HADOOP-15593
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Blocker
> Attachments: HADOOP-15593.001.patch, HADOOP-15593.002.patch
>
>
> Found the following NPE thrown in UGI tgt renewer. The NPE was thrown within 
> an exception handler so the original exception was hidden, though it's likely 
> caused by expired tgt.
> {noformat}
> 18/07/02 10:30:57 ERROR util.SparkUncaughtExceptionHandler: Uncaught 
> exception in thread Thread[TGT Renewer for f...@example.com,5,main]
> java.lang.NullPointerException
> at 
> javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at 
> org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:748){noformat}
> Suspect it's related to [https://bugs.openjdk.java.net/browse/JDK-8154889].
> The relevant code was added in HADOOP-13590. File this jira to handle the 
> exception better.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE

2018-07-20 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551442#comment-16551442
 ] 

Eric Yang commented on HADOOP-15593:


What if we catch the null pointer and reset tgtEndTime to now?  When tgtEndTime 
is undefined for any reason, resetting tgtEndTime to now should have no ill 
effect within the scope of getNextTgtRenewalTime.

> UserGroupInformation TGT renewer throws NPE
> ---
>
> Key: HADOOP-15593
> URL: https://issues.apache.org/jira/browse/HADOOP-15593
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Blocker
> Attachments: HADOOP-15593.001.patch, HADOOP-15593.002.patch
>
>
> Found the following NPE thrown in UGI tgt renewer. The NPE was thrown within 
> an exception handler so the original exception was hidden, though it's likely 
> caused by expired tgt.
> {noformat}
> 18/07/02 10:30:57 ERROR util.SparkUncaughtExceptionHandler: Uncaught 
> exception in thread Thread[TGT Renewer for f...@example.com,5,main]
> java.lang.NullPointerException
> at 
> javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at 
> org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:748){noformat}
> Suspect it's related to [https://bugs.openjdk.java.net/browse/JDK-8154889].
> The relevant code was added in HADOOP-13590. File this jira to handle the 
> exception better.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE

2018-07-20 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16550908#comment-16550908
 ] 

Eric Yang commented on HADOOP-15593:


[~gabor.bota] Thank you for creating the second issue for tracking.  Jenkins 
Precommit build is triggered for patch 02.

> UserGroupInformation TGT renewer throws NPE
> ---
>
> Key: HADOOP-15593
> URL: https://issues.apache.org/jira/browse/HADOOP-15593
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Blocker
> Attachments: HADOOP-15593.001.patch, HADOOP-15593.002.patch
>
>
> Found the following NPE thrown in UGI tgt renewer. The NPE was thrown within 
> an exception handler so the original exception was hidden, though it's likely 
> caused by expired tgt.
> {noformat}
> 18/07/02 10:30:57 ERROR util.SparkUncaughtExceptionHandler: Uncaught 
> exception in thread Thread[TGT Renewer for f...@example.com,5,main]
> java.lang.NullPointerException
> at 
> javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at 
> org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:748){noformat}
> Suspect it's related to [https://bugs.openjdk.java.net/browse/JDK-8154889].
> The relevant code was added in HADOOP-13590. File this jira to handle the 
> exception better.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE

2018-07-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549994#comment-16549994
 ] 

Eric Yang commented on HADOOP-15593:


[~leftnoteasy] This issue could cause YarnRegistryDNS to stop working.  It is 
best to have this issue included in the 3.1.1 release.

> UserGroupInformation TGT renewer throws NPE
> ---
>
> Key: HADOOP-15593
> URL: https://issues.apache.org/jira/browse/HADOOP-15593
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Blocker
> Attachments: HADOOP-15593.001.patch, HADOOP-15593.002.patch
>
>
> Found the following NPE thrown in UGI tgt renewer. The NPE was thrown within 
> an exception handler so the original exception was hidden, though it's likely 
> caused by expired tgt.
> {noformat}
> 18/07/02 10:30:57 ERROR util.SparkUncaughtExceptionHandler: Uncaught 
> exception in thread Thread[TGT Renewer for f...@example.com,5,main]
> java.lang.NullPointerException
> at 
> javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at 
> org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:748){noformat}
> Suspect it's related to [https://bugs.openjdk.java.net/browse/JDK-8154889].
> The relevant code was added in HADOOP-13590. File this jira to handle the 
> exception better.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE

2018-07-19 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated HADOOP-15593:
---
Target Version/s: 3.1.1
Priority: Blocker  (was: Critical)

> UserGroupInformation TGT renewer throws NPE
> ---
>
> Key: HADOOP-15593
> URL: https://issues.apache.org/jira/browse/HADOOP-15593
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Blocker
> Attachments: HADOOP-15593.001.patch, HADOOP-15593.002.patch
>
>
> Found the following NPE thrown in UGI tgt renewer. The NPE was thrown within 
> an exception handler so the original exception was hidden, though it's likely 
> caused by expired tgt.
> {noformat}
> 18/07/02 10:30:57 ERROR util.SparkUncaughtExceptionHandler: Uncaught 
> exception in thread Thread[TGT Renewer for f...@example.com,5,main]
> java.lang.NullPointerException
> at 
> javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at 
> org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:748){noformat}
> Suspect it's related to [https://bugs.openjdk.java.net/browse/JDK-8154889].
> The relevant code was added in HADOOP-13590. File this jira to handle the 
> exception better.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE

2018-07-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549472#comment-16549472
 ] 

Eric Yang commented on HADOOP-15593:


[~gabor.bota] I know you are trying to retain the existing behavior, but I 
think there are bugs in the existing logic.  The calculation of nextRefresh is 
based on:

{code}
nextRefresh = Math.max(getRefreshTime(tgt),
  now + kerberosMinSecondsBeforeRelogin);
{code}

Most of the time nextRefresh = getRefreshTime(tgt).  If renewal happens exactly 
at refreshTime while parallel operations are using the expired ticket, there is 
a time gap during which some operations might not proceed until the next tgt is 
obtained.  Ideally, we want to keep the service uninterrupted; therefore, 
getNextTgtRenewalTime is supposed to calculate a time a few minutes before the 
Kerberos tgt expires to determine the nextRefresh time.  It looks like we are 
not using the getNextTgtRenewalTime method to calculate nextRefresh and instead 
opt to use the ticket expiration time as the baseline for nextRefresh.  I think 
the patch 2 approach can create a time gap and then strain the KDC server when 
the ticket cannot be renewed.  It would be better to calculate nextRefresh 
based on getNextTgtRenewalTime.
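
For reference, a sketch of the renew-window idea: schedule the refresh at a 
fraction of the ticket lifetime so it lands well before expiry.  The 0.80 
window below is an assumption for illustration, not a quote of the committed 
code.

{code}
import javax.security.auth.kerberos.KerberosTicket;

class RefreshTimeSketch {
  private static final float TICKET_RENEW_WINDOW = 0.80f;

  static long getRefreshTime(KerberosTicket tgt) {
    long start = tgt.getStartTime().getTime();
    long end = tgt.getEndTime().getTime();
    // renew at 80% of the ticket window, i.e. a few minutes before expiry
    return start + (long) ((end - start) * TICKET_RENEW_WINDOW);
  }
}
{code}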

> UserGroupInformation TGT renewer throws NPE
> ---
>
> Key: HADOOP-15593
> URL: https://issues.apache.org/jira/browse/HADOOP-15593
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Critical
> Attachments: HADOOP-15593.001.patch, HADOOP-15593.002.patch
>
>
> Found the following NPE thrown in UGI tgt renewer. The NPE was thrown within 
> an exception handler so the original exception was hidden, though it's likely 
> caused by expired tgt.
> {noformat}
> 18/07/02 10:30:57 ERROR util.SparkUncaughtExceptionHandler: Uncaught 
> exception in thread Thread[TGT Renewer for f...@example.com,5,main]
> java.lang.NullPointerException
> at 
> javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at 
> org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:748){noformat}
> Suspect it's related to [https://bugs.openjdk.java.net/browse/JDK-8154889].
> The relevant code was added in HADOOP-13590. File this jira to handle the 
> exception better.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15610) Hadoop Docker Image Pip Install Fails

2018-07-18 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated HADOOP-15610:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

[~aw] Thank you for the review.
[~jackbearden] Thank you for the patch.

I just committed this to trunk and branch-3.1.

> Hadoop Docker Image Pip Install Fails
> -
>
> Key: HADOOP-15610
> URL: https://issues.apache.org/jira/browse/HADOOP-15610
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jack Bearden
>Assignee: Jack Bearden
>Priority: Critical
>  Labels: docker, trunk
> Attachments: HADOOP-15610.001.patch, HADOOP-15610.002.patch, 
> HADOOP-15610.003.patch, HADOOP-15610.004.patch, HADOOP-15610.005.patch
>
>
> The Hadoop Docker image on trunk does not build. The pip package on the 
> Ubuntu Xenial repo is out of date and fails by throwing the following error 
> when attempting to install pylint:
> "You are using pip version 8.1.1, however version 10.0.1 is available"
> The following patch fixes this issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15610) Hadoop Docker Image Pip Install Fails

2018-07-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548505#comment-16548505
 ] 

Eric Yang commented on HADOOP-15610:


Jenkins shows:

{code}
Collecting pylint==1.9.2
  Downloading 
https://files.pythonhosted.org/packages/f2/95/0ca03c818ba3cd14f2dd4e95df5b7fa232424b7fc6ea1748d27f293bc007/pylint-1.9.2-py2.py3-none-any.whl
 (690kB)
Collecting singledispatch; python_version < "3.4" (from pylint==1.9.2)
  Downloading 
https://files.pythonhosted.org/packages/c5/10/369f50bcd4621b263927b0a1519987a04383d4a98fb10438042ad410cf88/singledispatch-3.4.0.3-py2.py3-none-any.whl
Collecting isort>=4.2.5 (from pylint==1.9.2)
  Downloading 
https://files.pythonhosted.org/packages/41/d8/a945da414f2adc1d9e2f7d6e7445b27f2be42766879062a2e63616ad4199/isort-4.3.4-py2-none-any.whl
 (45kB)
Collecting configparser; python_version == "2.7" (from pylint==1.9.2)
  Downloading 
https://files.pythonhosted.org/packages/7c/69/c2ce7e91c89dc073eb1aa74c0621c3eefbffe8216b3f9af9d3885265c01c/configparser-3.5.0.tar.gz
Collecting backports.functools-lru-cache; python_version == "2.7" (from 
pylint==1.9.2)
  Downloading 
https://files.pythonhosted.org/packages/03/8e/2424c0e65c4a066e28f539364deee49b6451f8fcd4f718fefa50cc3dcf48/backports.functools_lru_cache-1.5-py2.py3-none-any.whl
Collecting mccabe (from pylint==1.9.2)
  Downloading 
https://files.pythonhosted.org/packages/87/89/479dc97e18549e21354893e4ee4ef36db1d237534982482c3681ee6e7b57/mccabe-0.6.1-py2.py3-none-any.whl
Collecting astroid<2.0,>=1.6 (from pylint==1.9.2)
  Downloading 
https://files.pythonhosted.org/packages/0e/9b/18b08991c8c6aaa827faf394f4468b8fee41db1f73aa5157f9f5fb2e69c3/astroid-1.6.5-py2.py3-none-any.whl
 (293kB)
Collecting six (from pylint==1.9.2)
  Downloading 
https://files.pythonhosted.org/packages/67/4b/141a581104b1f6397bfa78ac9d43d8ad29a7ca43ea90a2d863fe3056e86a/six-1.11.0-py2.py3-none-any.whl
Collecting futures (from isort>=4.2.5->pylint==1.9.2)
  Downloading 
https://files.pythonhosted.org/packages/2d/99/b2c4e9d5a30f6471e410a146232b4118e697fa3ffc06d6a65efde84debd0/futures-3.2.0-py2-none-any.whl
Collecting enum34>=1.1.3; python_version < "3.4" (from 
astroid<2.0,>=1.6->pylint==1.9.2)
  Downloading 
https://files.pythonhosted.org/packages/c5/db/e56e6b4bbac7c4a06de1c50de6fe1ef3810018ae11732a50f15f62c7d050/enum34-1.1.6-py2-none-any.whl
Collecting wrapt (from astroid<2.0,>=1.6->pylint==1.9.2)
  Downloading 
https://files.pythonhosted.org/packages/a0/47/66897906448185fcb77fc3c2b1bc20ed0ecca81a0f2f88eda3fc5a34fc3d/wrapt-1.10.11.tar.gz
Collecting lazy-object-proxy (from astroid<2.0,>=1.6->pylint==1.9.2)
  Downloading 
https://files.pythonhosted.org/packages/52/7e/f0f570ba363e15251bb9fd452257ec2aff91be0187a08a893afbd8ae225f/lazy_object_proxy-1.3.1-cp27-cp27mu-manylinux1_x86_64.whl
 (56kB)
Building wheels for collected packages: configparser, wrapt
  Running setup.py bdist_wheel for configparser: started
  Running setup.py bdist_wheel for configparser: finished with status 'done'
  Stored in directory: 
/root/.cache/pip/wheels/a3/61/79/424ef897a2f3b14684a7de5d89e8600b460b89663e6ce9d17c
  Running setup.py bdist_wheel for wrapt: started
  Running setup.py bdist_wheel for wrapt: finished with status 'done'
  Stored in directory: 
/root/.cache/pip/wheels/48/5d/04/22361a593e70d23b1f7746d932802efe1f0e523376a74f321e
Successfully built configparser wrapt
Installing collected packages: six, singledispatch, futures, isort, 
configparser, backports.functools-lru-cache, mccabe, enum34, wrapt, 
lazy-object-proxy, astroid, pylint
Successfully installed astroid-1.6.5 backports.functools-lru-cache-1.5 
configparser-3.5.0 enum34-1.1.6 futures-3.2.0 isort-4.3.4 
lazy-object-proxy-1.3.1 mccabe-0.6.1 pylint-1.9.2 singledispatch-3.4.0.3 
six-1.11.0 wrapt-1.10.11
{code}

Running setup.py is equivalent to recompiling for some Python modules.  It is 
not as static as .deb packages, but patch 5 will work until the next breakage.  
+1 for patch 5 too.
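
Roughly what pip is doing in the log above for source-only packages such as 
configparser and wrapt, done by hand (file names are examples):

{code}
tar xzf configparser-3.5.0.tar.gz
cd configparser-3.5.0
python setup.py bdist_wheel   # the "recompile" step: builds a .whl
pip install dist/*.whl
{code}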

> Hadoop Docker Image Pip Install Fails
> -
>
> Key: HADOOP-15610
> URL: https://issues.apache.org/jira/browse/HADOOP-15610
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jack Bearden
>Assignee: Jack Bearden
>Priority: Critical
>  Labels: docker, trunk
> Attachments: HADOOP-15610.001.patch, HADOOP-15610.002.patch, 
> HADOOP-15610.003.patch, HADOOP-15610.004.patch, HADOOP-15610.005.patch
>
>
> The Hadoop Docker image on trunk does not build. The pip package on the 
> Ubuntu Xenial repo is out of date and fails by throwing the following error 
> when attempting to install pylint:
> "You are using pip version 8.1.1, however version 10.0.1 is available"
> The following patch fixes this issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HADOOP-15610) Hadoop Docker Image Pip Install Fails

2018-07-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548467#comment-16548467
 ] 

Eric Yang commented on HADOOP-15610:


[~aw] Hadoop has very little Python code (3 files, to be exact).  The delta 
between Pylint 1.5 and 1.9 is the following new features:

- generated-members
- redefining-builtins-modules
- unsupported-assignment-operation
- unsupported-delete-operation
- trailing-newlines
- useless-super-delegation
- used-prior-global-declaration in Python 3.6 syntax
- python 3 eq-without-hash
- lots of python 3 checks

I don't think any of the existing Python code requires any of the new features.  
Trailing-newlines is probably the only check that may be encountered, if the 
Python files in the Hadoop source code are modified.  I think this is still 
safer than using a pylint that fails the encoding test, and it saves quite a 
bit of time by not recompiling pylint with fluid dependencies that might change 
between purges of the Docker image cache.

> Hadoop Docker Image Pip Install Fails
> -
>
> Key: HADOOP-15610
> URL: https://issues.apache.org/jira/browse/HADOOP-15610
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jack Bearden
>Assignee: Jack Bearden
>Priority: Critical
>  Labels: docker, trunk
> Attachments: HADOOP-15610.001.patch, HADOOP-15610.002.patch, 
> HADOOP-15610.003.patch, HADOOP-15610.004.patch, HADOOP-15610.005.patch
>
>
> The Hadoop Docker image on trunk does not build. The pip package on the 
> Ubuntu Xenial repo is out of date and fails by throwing the following error 
> when attempting to install pylint:
> "You are using pip version 8.1.1, however version 10.0.1 is available"
> The following patch fixes this issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15610) Hadoop Docker Image Pip Install Fails

2018-07-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548439#comment-16548439
 ] 

Eric Yang commented on HADOOP-15610:


[~aw] Reason?

> Hadoop Docker Image Pip Install Fails
> -
>
> Key: HADOOP-15610
> URL: https://issues.apache.org/jira/browse/HADOOP-15610
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jack Bearden
>Assignee: Jack Bearden
>Priority: Critical
>  Labels: docker, trunk
> Attachments: HADOOP-15610.001.patch, HADOOP-15610.002.patch, 
> HADOOP-15610.003.patch, HADOOP-15610.004.patch
>
>
> The Hadoop Docker image on trunk does not build. The pip package on the 
> Ubuntu Xenial repo is out of date and fails by throwing the following error 
> when attempting to install pylint:
> "You are using pip version 8.1.1, however version 10.0.1 is available"
> The following patch fixes this issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15610) Hadoop Docker Image Pip Install Fails

2018-07-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548390#comment-16548390
 ] 

Eric Yang commented on HADOOP-15610:


+1 for patch 004 pending Jenkins reports.

> Hadoop Docker Image Pip Install Fails
> -
>
> Key: HADOOP-15610
> URL: https://issues.apache.org/jira/browse/HADOOP-15610
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jack Bearden
>Assignee: Jack Bearden
>Priority: Critical
>  Labels: docker, trunk
> Attachments: HADOOP-15610.001.patch, HADOOP-15610.002.patch, 
> HADOOP-15610.003.patch, HADOOP-15610.004.patch
>
>
> The Hadoop Docker image on trunk does not build. The pip package on the 
> Ubuntu Xenial repo is out of date and fails by throwing the following error 
> when attempting to install pylint:
> "You are using pip version 8.1.1, however version 10.0.1 is available"
> The following patch fixes this issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE

2018-07-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548383#comment-16548383
 ] 

Eric Yang commented on HADOOP-15593:


[~gabor.bota] Thank you for the patch.  It might be safer to have 
nextRefreshTime be earlier than the actual end time to prevent disruption of 
service.

> UserGroupInformation TGT renewer throws NPE
> ---
>
> Key: HADOOP-15593
> URL: https://issues.apache.org/jira/browse/HADOOP-15593
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Critical
> Attachments: HADOOP-15593.001.patch, HADOOP-15593.002.patch
>
>
> Found the following NPE thrown in UGI tgt renewer. The NPE was thrown within 
> an exception handler so the original exception was hidden, though it's likely 
> caused by expired tgt.
> {noformat}
> 18/07/02 10:30:57 ERROR util.SparkUncaughtExceptionHandler: Uncaught 
> exception in thread Thread[TGT Renewer for f...@example.com,5,main]
> java.lang.NullPointerException
> at 
> javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at 
> org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:748){noformat}
> Suspect it's related to [https://bugs.openjdk.java.net/browse/JDK-8154889].
> The relevant code was added in HADOOP-13590. File this jira to handle the 
> exception better.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15610) Hadoop Docker Image Pip Install Fails

2018-07-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548185#comment-16548185
 ] 

Eric Yang commented on HADOOP-15610:


[~jackbearden] [~aw] Can we use the pylint provided by Ubuntu 16.04?  apt-get 
-q install -y pylint provides pylint 1.5.2.  Is this version adequate for 
Yetus?

> Hadoop Docker Image Pip Install Fails
> -
>
> Key: HADOOP-15610
> URL: https://issues.apache.org/jira/browse/HADOOP-15610
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jack Bearden
>Assignee: Jack Bearden
>Priority: Critical
>  Labels: docker, trunk
> Attachments: HADOOP-15610.001.patch, HADOOP-15610.002.patch, 
> HADOOP-15610.003.patch
>
>
> The Hadoop Docker image on trunk does not build. The pip package on the 
> Ubuntu Xenial repo is out of date and fails by throwing the following error 
> when attempting to install pylint:
> "You are using pip version 8.1.1, however version 10.0.1 is available"
> The following patch fixes this issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15610) Hadoop Docker Image Pip Install Fails

2018-07-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548102#comment-16548102
 ] 

Eric Yang commented on HADOOP-15610:


I thought pylint might have trouble with Python 2, but the [pylint 
community|https://github.com/PyCQA/pylint/issues/1763] claims to support Python 
2.7 until 2020.  As long as the Yetus community verifies that the failed test 
cases don't impact the reported results, I think this patch can be committed.

> Hadoop Docker Image Pip Install Fails
> -
>
> Key: HADOOP-15610
> URL: https://issues.apache.org/jira/browse/HADOOP-15610
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jack Bearden
>Assignee: Jack Bearden
>Priority: Minor
>  Labels: docker, trunk
> Attachments: HADOOP-15610.001.patch, HADOOP-15610.002.patch
>
>
> The Hadoop Docker image on trunk does not build. The pip package on the 
> Ubuntu Xenial repo is out of date and fails by throwing the following error 
> when attempting to install pylint:
> "You are using pip version 8.1.1, however version 10.0.1 is available"
> The following patch fixes this issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15610) Hadoop Docker Image Pip Install Fails

2018-07-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548075#comment-16548075
 ] 

Eric Yang commented on HADOOP-15610:


I think the error comes from pylint compiling itself against the wrong version 
of Python in the Docker image:

{code}
pylint --version
No config file found, using default configuration
pylint 1.9.2, 
astroid 1.6.5
Python 2.7.12 (default, Dec  4 2017, 14:50:18)
{code}

We probably need to point pip at the python3 binary so that the pylint package 
is compiled against Python 3.5.
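
One way to do that (illustrative): invoke pip through the python3 interpreter 
so pylint is built against Python 3.5 instead of 2.7.

{code}
python3 -m pip install pylint
{code}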

> Hadoop Docker Image Pip Install Fails
> -
>
> Key: HADOOP-15610
> URL: https://issues.apache.org/jira/browse/HADOOP-15610
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jack Bearden
>Assignee: Jack Bearden
>Priority: Minor
>  Labels: docker, trunk
> Attachments: HADOOP-15610.001.patch, HADOOP-15610.002.patch
>
>
> The Hadoop Docker image on trunk does not build. The pip package on the 
> Ubuntu Xenial repo is out of date and fails by throwing the following error 
> when attempting to install pylint:
> "You are using pip version 8.1.1, however version 10.0.1 is available"
> The following patch fixes this issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-15610) Hadoop Docker Image Pip Install Fails

2018-07-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548039#comment-16548039
 ] 

Eric Yang edited comment on HADOOP-15610 at 7/18/18 4:29 PM:
-

[~jackbearden] Thank you for the patch.  This patch passes the pylint check, 
but the compilation of pylint shows some test compilation errors.

{code}
  Compiling 
/tmp/pip-build-TXYLMB/astroid/astroid/tests/testdata/python2/data/invalid_encoding.py
 ...
File 
"/tmp/pip-build-TXYLMB/astroid/astroid/tests/testdata/python2/data/invalid_encoding.py",
 line 0
  SyntaxError: unknown encoding: lala
  
  Compiling 
/tmp/pip-build-TXYLMB/astroid/astroid/tests/testdata/python3/data/invalid_encoding.py
 ...
File 
"/tmp/pip-build-TXYLMB/astroid/astroid/tests/testdata/python3/data/invalid_encoding.py",
 line 0
  SyntaxError: unknown encoding: lala
  

  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/abstract_class_instantiated_py3.py
 ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/abstract_class_instantiated_py3.py",
 line 14
  class GoodClass(object, metaclass=abc.ABCMeta):
   ^
  SyntaxError: invalid syntax
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/abstract_method_py3.py ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/abstract_method_py3.py", 
line 35
  class Structure(object, metaclass=abc.ABCMeta):
   ^
  SyntaxError: invalid syntax
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/arguments_differ_py3.py ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/arguments_differ_py3.py", 
line 4
  def kwonly_1(self, first, *, second, third):
 ^
  SyntaxError: invalid syntax
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/async_functions.py ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/async_functions.py", line 5
  async def next(): # [redefined-builtin]
  ^
  SyntaxError: invalid syntax
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/bad_continuation_py36.py ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/bad_continuation_py36.py", 
line 3
  async def upload_post(
  ^
  SyntaxError: invalid syntax
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/bad_except_order.py ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/bad_except_order.py", line 
27
  __revision__ += 1
  SyntaxError: default 'except:' must be last
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/bad_exception_context.py ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/bad_exception_context.py", 
line 14
  raise IndexError from 1 # [bad-exception-context]
  ^
  SyntaxError: invalid syntax
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/bugfix_local_scope_metaclass_1177.py
 ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/bugfix_local_scope_metaclass_1177.py",
 line 9
  class Class(metaclass=Meta):
   ^
  SyntaxError: invalid syntax
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/class_members_py30.py ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/class_members_py30.py", 
line 34
  class TestMetaclass(object, metaclass=ABCMeta):
   ^
  SyntaxError: invalid syntax
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/continue_in_finally.py ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/continue_in_finally.py", 
line 9
  continue # [continue-in-finally]
  SyntaxError: 'continue' not supported inside 'finally' clause
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/disable_msg_github_issue_1389.py
 ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/disable_msg_github_issue_1389.py",
 line 10
  place: PlaceId
   ^
  SyntaxError: invalid syntax
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/duplicate_argument_name.py 
...
  SyntaxError: duplicate argument '_' in function definition 
(duplicate_argument_name.py, line 4)
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/exec_used_py3.py ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/exec_used_py3.py", line 4
  exec('a = 1', globals={}) # [exec-used]
   ^
  SyntaxError: invalid syntax
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/formatted_string_literal_with_if_py36.py
 ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/formatted_string_literal_with_if_py36.py",
 line 4
  f'{"+" if True else "-"}'
  ^
  

[jira] [Commented] (HADOOP-15610) Hadoop Docker Image Pip Install Fails

2018-07-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548039#comment-16548039
 ] 

Eric Yang commented on HADOOP-15610:


[~jackbearden] Thank you for the patch.  This patch passes the pylint check, 
but the compilation of pylint shows some test compilation errors.

{code}
  Compiling 
/tmp/pip-build-TXYLMB/astroid/astroid/tests/testdata/python2/data/invalid_encoding.py
 ...
File 
"/tmp/pip-build-TXYLMB/astroid/astroid/tests/testdata/python2/data/invalid_encoding.py",
 line 0
  SyntaxError: unknown encoding: lala
  
  Compiling 
/tmp/pip-build-TXYLMB/astroid/astroid/tests/testdata/python3/data/invalid_encoding.py
 ...
File 
"/tmp/pip-build-TXYLMB/astroid/astroid/tests/testdata/python3/data/invalid_encoding.py",
 line 0
  SyntaxError: unknown encoding: lala
  

  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/abstract_class_instantiated_py3.py
 ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/abstract_class_instantiated_py3.py",
 line 14
  class GoodClass(object, metaclass=abc.ABCMeta):
   ^
  SyntaxError: invalid syntax
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/abstract_method_py3.py ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/abstract_method_py3.py", 
line 35
  class Structure(object, metaclass=abc.ABCMeta):
   ^
  SyntaxError: invalid syntax
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/arguments_differ_py3.py ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/arguments_differ_py3.py", 
line 4
  def kwonly_1(self, first, *, second, third):
 ^
  SyntaxError: invalid syntax
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/async_functions.py ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/async_functions.py", line 5
  async def next(): # [redefined-builtin]
  ^
  SyntaxError: invalid syntax
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/bad_continuation_py36.py ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/bad_continuation_py36.py", 
line 3
  async def upload_post(
  ^
  SyntaxError: invalid syntax
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/bad_except_order.py ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/bad_except_order.py", line 
27
  __revision__ += 1
  SyntaxError: default 'except:' must be last
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/bad_exception_context.py ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/bad_exception_context.py", 
line 14
  raise IndexError from 1 # [bad-exception-context]
  ^
  SyntaxError: invalid syntax
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/bugfix_local_scope_metaclass_1177.py
 ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/bugfix_local_scope_metaclass_1177.py",
 line 9
  class Class(metaclass=Meta):
   ^
  SyntaxError: invalid syntax
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/class_members_py30.py ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/class_members_py30.py", 
line 34
  class TestMetaclass(object, metaclass=ABCMeta):
   ^
  SyntaxError: invalid syntax
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/continue_in_finally.py ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/continue_in_finally.py", 
line 9
  continue # [continue-in-finally]
  SyntaxError: 'continue' not supported inside 'finally' clause
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/disable_msg_github_issue_1389.py
 ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/disable_msg_github_issue_1389.py",
 line 10
  place: PlaceId
   ^
  SyntaxError: invalid syntax
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/duplicate_argument_name.py 
...
  SyntaxError: duplicate argument '_' in function definition 
(duplicate_argument_name.py, line 4)
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/exec_used_py3.py ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/exec_used_py3.py", line 4
  exec('a = 1', globals={}) # [exec-used]
   ^
  SyntaxError: invalid syntax
  
  Compiling 
/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/formatted_string_literal_with_if_py36.py
 ...
File 
"/tmp/pip-build-TXYLMB/pylint/pylint/test/functional/formatted_string_literal_with_if_py36.py",
 line 4
  f'{"+" if True else "-"}'
  ^
  SyntaxError: invalid syntax
  
  Compiling 

[jira] [Created] (HADOOP-15601) Change yarn.admin.acl setting to be more restricted

2018-07-11 Thread Eric Yang (JIRA)
Eric Yang created HADOOP-15601:
--

 Summary: Change yarn.admin.acl setting to be more restricted
 Key: HADOOP-15601
 URL: https://issues.apache.org/jira/browse/HADOOP-15601
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Reporter: Eric Yang


yarn.admin.acl defaults to *, which means everyone is a YARN administrator by 
default.  It is probably better to default yarn.admin.acl to the user who runs 
the YARN framework, to reduce the attack surface.
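For illustration only, a possible yarn-site.xml default along these lines, 
assuming the YARN daemons run as the yarn user (the exact value would depend 
on the deployment):

{code}
<property>
  <name>yarn.admin.acl</name>
  <!-- Illustrative only: restrict admin operations to the yarn service user
       instead of the wide-open default "*". -->
  <value>yarn</value>
</property>
{code}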



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15600) Set default proxy user settings to non-routable IP addresses and default users group

2018-07-11 Thread Eric Yang (JIRA)
Eric Yang created HADOOP-15600:
--

 Summary: Set default proxy user settings to non-routable IP 
addresses and default users group
 Key: HADOOP-15600
 URL: https://issues.apache.org/jira/browse/HADOOP-15600
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Reporter: Eric Yang


The settings that restrict which cluster nodes can communicate with peer nodes 
are hadoop.proxyuser.[hdfs|yarn].hosts and hadoop.proxyuser.[hdfs|yarn].groups.  
These settings are open by default, which allows any host to impersonate any 
user.

The proposal is to default the settings to:

{code}
<property>
  <name>hadoop.proxyuser.hdfs.hosts</name>
  <value>127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16</value>
</property>

<property>
  <name>hadoop.proxyuser.hdfs.groups</name>
  <value>users</value>
</property>

<property>
  <name>hadoop.proxyuser.yarn.hosts</name>
  <value>127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16</value>
</property>

<property>
  <name>hadoop.proxyuser.yarn.groups</name>
  <value>users</value>
</property>
{code}

This will default the cluster to a closed network and to the "users" group, 
reducing risk.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-15597) UserGroupInformation class throws NPE when Kerberos TGT expired

2018-07-10 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang resolved HADOOP-15597.

Resolution: Duplicate

> UserGroupInformation class throws NPE when Kerberos TGT expired
> ---
>
> Key: HADOOP-15597
> URL: https://issues.apache.org/jira/browse/HADOOP-15597
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.9.0, 3.0.0, 3.1.0, 2.9.1, 3.0.1, 3.0.2, 2.9.2
>Reporter: Eric Yang
>Priority: Critical
>
> UserGroupInformation throws a NullPointerException when the TGT renewer 
> cannot determine the ticket expiration time:
> {code}
> Thread Thread[TGT Renewer for rm/host1.example@example.com,5,main] threw 
> an Exception.
> java.lang.NullPointerException
> at 
> javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at 
> org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> The error occurs when Hadoop daemon processes use the UGI class to do service 
> TGT renewal.  The code is written so that reattachMetrics() must be called by 
> the main program to initialize UGI metrics, but metrics initialization is 
> only done by the ResourceManager; other Hadoop processes do not call 
> reattachMetrics().  The runtime exception can interrupt Hadoop services, as 
> observed in YARN RegistryDNS (YARN-8514).  It would be nice if metrics 
> initialization happened in the UGI class without relying on the calling 
> program.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15597) UserGroupInformation class throws NPE when Kerberos TGT expired

2018-07-10 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539322#comment-16539322
 ] 

Eric Yang commented on HADOOP-15597:


[~xyao] Yes, this is a dupe.  Close this one.  Thank you for spotting this.

> UserGroupInformation class throws NPE when Kerberos TGT expired
> ---
>
> Key: HADOOP-15597
> URL: https://issues.apache.org/jira/browse/HADOOP-15597
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.9.0, 3.0.0, 3.1.0, 2.9.1, 3.0.1, 3.0.2, 2.9.2
>Reporter: Eric Yang
>Priority: Critical
>
> UserGroupInformation throws a NullPointerException when the TGT renewer 
> cannot determine the ticket expiration time:
> {code}
> Thread Thread[TGT Renewer for rm/host1.example@example.com,5,main] threw 
> an Exception.
> java.lang.NullPointerException
> at 
> javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at 
> org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> The error occurs when Hadoop daemon processes use the UGI class to do service 
> TGT renewal.  The code is written so that reattachMetrics() must be called by 
> the main program to initialize UGI metrics, but metrics initialization is 
> only done by the ResourceManager; other Hadoop processes do not call 
> reattachMetrics().  The runtime exception can interrupt Hadoop services, as 
> observed in YARN RegistryDNS (YARN-8514).  It would be nice if metrics 
> initialization happened in the UGI class without relying on the calling 
> program.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15597) UserGroupInformation class throws NPE when Kerberos TGT expired

2018-07-10 Thread Eric Yang (JIRA)
Eric Yang created HADOOP-15597:
--

 Summary: UserGroupInformation class throws NPE when Kerberos TGT 
expired
 Key: HADOOP-15597
 URL: https://issues.apache.org/jira/browse/HADOOP-15597
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 3.0.2, 3.0.1, 2.9.1, 3.1.0, 3.0.0, 2.9.0, 2.9.2
Reporter: Eric Yang


UserGroupInformation throws a NullPointerException when the TGT renewer cannot 
determine the ticket expiration time:

{code}
Thread Thread[TGT Renewer for rm/host1.example@example.com,5,main] threw an 
Exception.

java.lang.NullPointerException
at 
javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
at 
org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
at java.lang.Thread.run(Thread.java:745)
{code}

The error occurs when Hadoop daemon processes use the UGI class to do service 
TGT renewal.  The code is written so that reattachMetrics() must be called by 
the main program to initialize UGI metrics, but metrics initialization is only 
done by the ResourceManager; other Hadoop processes do not call 
reattachMetrics().  The runtime exception can interrupt Hadoop services, as 
observed in YARN RegistryDNS (YARN-8514).  It would be nice if metrics 
initialization happened in the UGI class without relying on the calling 
program.
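For illustration, a minimal sketch of the defensive pattern, assuming a 
hypothetical standalone renewer thread.  This is not the actual UGI code; all 
names are illustrative:

{code}
// Illustrative sketch only -- not the actual UserGroupInformation code.
// A TGT renewal loop that survives a missing expiration time instead of
// letting a NullPointerException kill the renewer thread.
import java.util.Date;
import javax.security.auth.kerberos.KerberosTicket;

public class SafeTgtRenewer implements Runnable {
  private static final long RETRY_MS = 60_000L;  // back-off between retries
  private final KerberosTicket tgt;              // hypothetical: the service TGT

  public SafeTgtRenewer(KerberosTicket tgt) {
    this.tgt = tgt;
  }

  @Override
  public void run() {
    while (!Thread.currentThread().isInterrupted()) {
      try {
        Date end = tgt.getEndTime();             // may return null or throw
        if (end == null) {
          Thread.sleep(RETRY_MS);                // back off instead of crashing
          continue;
        }
        Thread.sleep(Math.max(end.getTime() - System.currentTimeMillis(), 0L));
        tgt.refresh();                           // renew once the ticket expires
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();      // preserve interrupt status
      } catch (Exception e) {                    // includes the NPE seen above
        // log and back off rather than letting the renewer thread die
        try {
          Thread.sleep(RETRY_MS);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
        }
      }
    }
  }
}
{code}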



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15588) Add proxy acl check for AuthenticationFilter

2018-07-09 Thread Eric Yang (JIRA)
Eric Yang created HADOOP-15588:
--

 Summary: Add proxy acl check for AuthenticationFilter
 Key: HADOOP-15588
 URL: https://issues.apache.org/jira/browse/HADOOP-15588
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: common
Reporter: Eric Yang


It would be nice if AuthenticationFilter could check the proxy user and proxy 
host settings.  This would help determine whether a request is coming from an 
authorized remote server.
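For illustration, such a check might lean on the existing ProxyUsers utility; 
a minimal sketch under that assumption, not the proposed implementation:

{code}
// Illustrative sketch only: authorize a proxy user against the configured
// hadoop.proxyuser.*.hosts / .groups ACLs from inside a filter.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.authorize.AuthorizationException;
import org.apache.hadoop.security.authorize.ProxyUsers;

public class ProxyAclCheck {
  static void checkProxyUser(Configuration conf, String proxyUser,
      String endUser, String remoteAddr) throws AuthorizationException {
    // pick up the hadoop.proxyuser.* settings from configuration
    ProxyUsers.refreshSuperUserGroupsConfiguration(conf);
    UserGroupInformation realUgi =
        UserGroupInformation.createRemoteUser(proxyUser);
    UserGroupInformation proxyUgi =
        UserGroupInformation.createProxyUser(endUser, realUgi);
    // throws AuthorizationException when the remote address or the end
    // user's group is not allowed by the proxyuser ACLs
    ProxyUsers.authorize(proxyUgi, remoteAddr);
  }
}
{code}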



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15587) Securing ASF Hadoop releases out of the box

2018-07-09 Thread Eric Yang (JIRA)
Eric Yang created HADOOP-15587:
--

 Summary: Securing ASF Hadoop releases out of the box
 Key: HADOOP-15587
 URL: https://issues.apache.org/jira/browse/HADOOP-15587
 Project: Hadoop Common
  Issue Type: Wish
  Components: build, common, documentation
Reporter: Eric Yang


[Mail 
thread|http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201807.mbox/%3cdc06cefa-fe2b-4ca3-b9a9-1d6df0421...@hortonworks.com%3E]
 started by Steve Loughran on the mailing lists about making the default 
Hadoop release more secure; the list of improvements includes:
 # Change default proxy acl settings to non-routable IPs.
 # Implement proxy acl check for HTTP protocol.
 # Change yarn.admin.acl setting to be more restricted.
 # Review settings that need to be lock down by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15527) loop until TIMEOUT before sending kill -9

2018-06-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517359#comment-16517359
 ] 

Eric Yang commented on HADOOP-15527:


[~xiaochen] Good catch, sorry about the wrong location.  This has been fixed in 
trunk and branch-3.1.

> loop until TIMEOUT before sending kill -9
> -
>
> Key: HADOOP-15527
> URL: https://issues.apache.org/jira/browse/HADOOP-15527
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>Priority: Major
> Fix For: 3.2.0, 3.1.1
>
> Attachments: HADOOP-15527.1.txt, HADOOP-15527.2.txt, HADOOP-15527.txt
>
>
> I'm seeing that sometimes daemons keep running for a little while even after 
> "kill -9" from daemon-stop scripts.
> Debugging more, I see several instances of "ERROR: Unable to kill ${pid}".
> Saw this specifically with ResourceManager & NodeManager -  {{yarn --daemon 
> stop nodemanager}}. Though it is possible that other daemons may run into 
> this too.
> Saw this on both Centos as well as Ubuntu.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15527) loop until TIMEOUT before sending kill -9

2018-06-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516130#comment-16516130
 ] 

Eric Yang commented on HADOOP-15527:


[~ajisakaa] Thanks for catching the mistake.  The process_with_sigterm_trap.sh 
script has been committed to trunk and branch-3.1.

> loop until TIMEOUT before sending kill -9
> -
>
> Key: HADOOP-15527
> URL: https://issues.apache.org/jira/browse/HADOOP-15527
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>Priority: Major
> Fix For: 3.2.0, 3.1.1
>
> Attachments: HADOOP-15527.1.txt, HADOOP-15527.2.txt, HADOOP-15527.txt
>
>
> I'm seeing that sometimes daemons keep running for a little while even after 
> "kill -9" from daemon-stop scripts.
> Debugging more, I see several instances of "ERROR: Unable to kill ${pid}".
> Saw this specifically with ResourceManager & NodeManager -  {{yarn --daemon 
> stop nodemanager}}. Though it is possible that other daemons may run into 
> this too.
> Saw this on both Centos as well as Ubuntu.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15527) Sometimes daemons keep running even after "kill -9" from daemon-stop script

2018-06-12 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510496#comment-16510496
 ] 

Eric Yang commented on HADOOP-15527:


The Jenkins failure is due to a protoc version mismatch, not caused by this 
commit.

Thank you [~vinodkv] for the patch.

> Sometimes daemons keep running even after "kill -9" from daemon-stop script
> ---
>
> Key: HADOOP-15527
> URL: https://issues.apache.org/jira/browse/HADOOP-15527
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>Priority: Major
> Fix For: 3.2.0, 3.1.1
>
> Attachments: HADOOP-15527.1.txt, HADOOP-15527.2.txt, HADOOP-15527.txt
>
>
> I'm seeing that sometimes daemons keep running for a little while even after 
> "kill -9" from daemon-stop scripts.
> Debugging more, I see several instances of "ERROR: Unable to kill ${pid}".
> Saw this specifically with ResourceManager & NodeManager -  {{yarn --daemon 
> stop nodemanager}}. Though it is possible that other daemons may run into 
> this too.
> Saw this on both Centos as well as Ubuntu.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15527) Sometimes daemons keep running even after "kill -9" from daemon-stop script

2018-06-12 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated HADOOP-15527:
---
   Resolution: Fixed
Fix Version/s: 3.1.1
   3.2.0
   Status: Resolved  (was: Patch Available)

> Sometimes daemons keep running even after "kill -9" from daemon-stop script
> ---
>
> Key: HADOOP-15527
> URL: https://issues.apache.org/jira/browse/HADOOP-15527
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>Priority: Major
> Fix For: 3.2.0, 3.1.1
>
> Attachments: HADOOP-15527.1.txt, HADOOP-15527.2.txt, HADOOP-15527.txt
>
>
> I'm seeing that sometimes daemons keep running for a little while even after 
> "kill -9" from daemon-stop scripts.
> Debugging more, I see several instances of "ERROR: Unable to kill ${pid}".
> Saw this specifically with ResourceManager & NodeManager -  {{yarn --daemon 
> stop nodemanager}}. Though it is possible that other daemons may run into 
> this too.
> Saw this on both Centos as well as Ubuntu.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15527) Sometimes daemons keep running even after "kill -9" from daemon-stop script

2018-06-12 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510466#comment-16510466
 ] 

Eric Yang commented on HADOOP-15527:


+1 looks good to me.

> Sometimes daemons keep running even after "kill -9" from daemon-stop script
> ---
>
> Key: HADOOP-15527
> URL: https://issues.apache.org/jira/browse/HADOOP-15527
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>Priority: Major
> Attachments: HADOOP-15527.1.txt, HADOOP-15527.2.txt, HADOOP-15527.txt
>
>
> I'm seeing that sometimes daemons keep running for a little while even after 
> "kill -9" from daemon-stop scripts.
> Debugging more, I see several instances of "ERROR: Unable to kill ${pid}".
> Saw this specifically with ResourceManager & NodeManager -  {{yarn --daemon 
> stop nodemanager}}. Though it is possible that other daemons may run into 
> this too.
> Saw this on both Centos as well as Ubuntu.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15527) Sometimes daemons keep running even after "kill -9" from daemon-stop script

2018-06-12 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510151#comment-16510151
 ] 

Eric Yang commented on HADOOP-15527:


{code}
timeout=$(printf "%.0f\n" ${timeout})
{code}

Can be simplified to:

{code}
timeout=$((0 + ${timeout}))
{code}

This avoids the shellcheck warning.


> Sometimes daemons keep running even after "kill -9" from daemon-stop script
> ---
>
> Key: HADOOP-15527
> URL: https://issues.apache.org/jira/browse/HADOOP-15527
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>Priority: Major
> Attachments: HADOOP-15527.1.txt, HADOOP-15527.txt
>
>
> I'm seeing that sometimes daemons keep running for a little while even after 
> "kill -9" from daemon-stop scripts.
> Debugging more, I see several instances of "ERROR: Unable to kill ${pid}".
> Saw this specifically with ResourceManager & NodeManager -  {{yarn --daemon 
> stop nodemanager}}. Though it is possible that other daemons may run into 
> this too.
> Saw this on both Centos as well as Ubuntu.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15527) Sometimes daemons keep running even after "kill -9" from daemon-stop script

2018-06-12 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509764#comment-16509764
 ] 

Eric Yang commented on HADOOP-15527:


In the batch file, hadoop_stop_daemon is renamed to 
hadoop_stop_daemon_changing_pid.  Is this change necessary?

> Sometimes daemons keep running even after "kill -9" from daemon-stop script
> ---
>
> Key: HADOOP-15527
> URL: https://issues.apache.org/jira/browse/HADOOP-15527
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>Priority: Major
> Attachments: HADOOP-15527.txt
>
>
> I'm seeing that sometimes daemons keep running for a little while even after 
> "kill -9" from daemon-stop scripts.
> Debugging more, I see several instances of "ERROR: Unable to kill ${pid}".
> Saw this specifically with ResourceManager & NodeManager -  {{yarn --daemon 
> stop nodemanager}}. Though it is possible that other daemons may run into 
> this too.
> Saw this on both Centos as well as Ubuntu.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15527) Sometimes daemons keep running even after "kill -9" from daemon-stop script

2018-06-11 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509044#comment-16509044
 ] 

Eric Yang commented on HADOOP-15527:


JDK 8 added new APIs to control OS processes, notably the destroyForcibly 
method.  However, this tooling is somewhat OS dependent, and terminating child 
processes is best effort.  It can leave dangling child processes around until 
they are notified that the parent process is shutting down.  When kill -9 is 
executed, ps -p output may still list the child threads, and this can be 
mistaken for the parent process still being alive.

Java 9 has another set of improvements in this area; there is a blog on 
[process 
handling|https://javax0.wordpress.com/2017/07/19/process-handling-in-java-9/] 
that covers them.  That might improve the child process handling.  For the 
Hadoop shell script improvement, we probably want to make sure child threads 
are not counted in ps -p output, or test /proc/[pid] to identify the liveness 
of the process, and loop on that check to ensure the process is gone before 
the script exits.
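A minimal sketch of such a loop, assuming the caller passes the pid and a 
timeout in seconds (illustrative only, not the committed script):

{code}
# Illustrative sketch only: wait for a process to really exit before the
# caller escalates to kill -9.
wait_for_process_to_die() {
  local pid=$1
  local timeout=${2:-10}    # seconds to wait before giving up
  local waited=0
  while kill -0 "${pid}" >/dev/null 2>&1; do  # process still alive?
    if [[ "${waited}" -ge "${timeout}" ]]; then
      return 1                                # still alive; caller may kill -9
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0                                    # process is gone
}
{code}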

> Sometimes daemons keep running even after "kill -9" from daemon-stop script
> ---
>
> Key: HADOOP-15527
> URL: https://issues.apache.org/jira/browse/HADOOP-15527
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>Priority: Major
>
> I'm seeing that sometimes daemons keep running for a little while even after 
> "kill -9" from daemon-stop scripts.
> Debugging more, I see several instances of "ERROR: Unable to kill ${pid}".
> Saw this specifically with ResourceManager & NodeManager -  {{yarn --daemon 
> stop nodemanager}}. Though it is possible that other daemons may run into 
> this too.
> Saw this on both Centos as well as Ubuntu.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15518) Authentication filter calling handler after request already authenticated

2018-06-11 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508318#comment-16508318
 ] 

Eric Yang commented on HADOOP-15518:


[~sunilg] Multiple AuthenticationFilters configured with different service 
principal names is a corner case that shouldn't exist, but the code allows it 
to happen.  Comments in this JIRA and YARN-8108 should explain why this is an 
unsupported use case.

The casting problem occurs when using the same HTTP principal while YARN code 
activates multiple filters based on AuthenticationFilter.  The token casting 
issue didn't exist prior to this patch.  The patch assumes that filters based 
on AuthenticationFilter produce compatible tokens, but RMAuthenticationFilter 
and AuthenticationFilter don't produce the same type of token; thus the 
casting problem occurs.  The problem can be eliminated by applying the same 
type of AuthenticationFilter on a server port.  YARN-8108 can fix the YARN 
ResourceManager.  There might be other places in Hadoop with similar problems, 
such as KMSAuthenticationFilter and DelegationTokenAuthenticationFilter, which 
need to be reviewed to understand the impact of this change.

> Authentication filter calling handler after request already authenticated
> -
>
> Key: HADOOP-15518
> URL: https://issues.apache.org/jira/browse/HADOOP-15518
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.7.1
>Reporter: Kevin Minder
>Assignee: Kevin Minder
>Priority: Major
> Attachments: HADOOP-15518-001.patch
>
>
> The hadoop-auth AuthenticationFilter will invoke its handler even if a prior 
> successful authentication has occurred in the current request.  This 
> primarily affects situations where multiple authentication mechanisms have 
> been configured.  For example, when core-site.xml has 
> hadoop.http.authentication.type=kerberos and yarn-site.xml has 
> yarn.timeline-service.http-authentication.type=kerberos the result is an 
> attempt to perform two Kerberos authentications for the same request.  This 
> in turn results in Kerberos triggering a replay attack detection.  The 
> javadocs for AuthenticationHandler 
> ([https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/server/AuthenticationHandler.java)]
>  indicate for the authenticate method that
> {quote}This method is invoked by the AuthenticationFilter only if the HTTP 
> client request is not yet authenticated.
> {quote}
> This does not appear to be the case in practice.
> I've created a patch and tested on a limited number of functional use cases 
> (e.g. the timeline-service issue noted above).  If there is general agreement 
> that the change is valid I'll add unit tests to the patch.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15518) Authentication filter calling handler after request already authenticated

2018-06-08 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506669#comment-16506669
 ] 

Eric Yang commented on HADOOP-15518:


There seems to be a problem if multiple filters extended from 
AuthenticationFilter, the token casting are incompatible.

Browser shows this error message:
{code}
HTTP ERROR 500

Problem accessing /proxy/application_1528498597648_0001/. Reason:

Server Error
Caused by:

java.lang.ClassCastException: 
org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter$1$1
 cannot be cast to 
org.apache.hadoop.security.authentication.server.AuthenticationToken
at 
org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.doFilter(DelegationTokenAuthenticationFilter.java:250)
at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:597)
at 
org.apache.hadoop.yarn.server.security.http.RMAuthenticationFilter.doFilter(RMAuthenticationFilter.java:82)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:649)
at 
org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.doFilter(DelegationTokenAuthenticationFilter.java:304)
at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:597)
at 
org.apache.hadoop.yarn.server.security.http.RMAuthenticationFilter.doFilter(RMAuthenticationFilter.java:82)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
at 
org.apache.hadoop.security.http.CrossOriginFilter.doFilter(CrossOriginFilter.java:98)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
at 
org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1608)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:534)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
at 
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
at java.lang.Thread.run(Thread.java:748)
{code}

> Authentication filter calling handler after request already authenticated
> -
>
> Key: HADOOP-15518
> URL: https://issues.apache.org/jira/browse/HADOOP-15518
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.7.1
>Reporter: Kevin Minder
>Assignee: Kevin Minder
>Priority: 

[jira] [Commented] (HADOOP-15518) Authentication filter calling handler after request already authenticated

2018-06-08 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506506#comment-16506506
 ] 

Eric Yang commented on HADOOP-15518:


One foot note about this change.  If multiple AuthenticationFilters are 
configured, and service principal names are different.  TGS granted to remote 
client is the principal name of the first AuthenticationFilter that gets 
triggered.  This may look unexpected when auditing where user has been through 
klist.  The ability to configure different HTTP principals on the same server 
port configuration shouldn't exist, but developer should be aware of the API 
imperfection to avoid getting to this hole.

> Authentication filter calling handler after request already authenticated
> -
>
> Key: HADOOP-15518
> URL: https://issues.apache.org/jira/browse/HADOOP-15518
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.7.1
>Reporter: Kevin Minder
>Assignee: Kevin Minder
>Priority: Major
> Attachments: HADOOP-15518-001.patch
>
>
> The hadoop-auth AuthenticationFilter will invoke its handler even if a prior 
> successful authentication has occurred in the current request.  This 
> primarily affects situations where multiple authentication mechanisms have 
> been configured.  For example, when core-site.xml has 
> hadoop.http.authentication.type=kerberos and yarn-site.xml has 
> yarn.timeline-service.http-authentication.type=kerberos the result is an 
> attempt to perform two Kerberos authentications for the same request.  This 
> in turn results in Kerberos triggering a replay attack detection.  The 
> javadocs for AuthenticationHandler 
> ([https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/server/AuthenticationHandler.java)]
>  indicate for the authenticate method that
> {quote}This method is invoked by the AuthenticationFilter only if the HTTP 
> client request is not yet authenticated.
> {quote}
> This does not appear to be the case in practice.
> I've created a patch and tested on a limited number of functional use cases 
> (e.g. the timeline-service issue noted above).  If there is general agreement 
> that the change is valid I'll add unit tests to the patch.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15518) Authentication filter calling handler after request already authenticated

2018-06-08 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated HADOOP-15518:
---
Status: Patch Available  (was: Open)

> Authentication filter calling handler after request already authenticated
> -
>
> Key: HADOOP-15518
> URL: https://issues.apache.org/jira/browse/HADOOP-15518
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.7.1
>Reporter: Kevin Minder
>Assignee: Kevin Minder
>Priority: Major
> Attachments: HADOOP-15518-001.patch
>
>
> The hadoop-auth AuthenticationFilter will invoke its handler even if a prior 
> successful authentication has occurred in the current request.  This 
> primarily affects situations where multiple authentication mechanisms have 
> been configured.  For example, when core-site.xml has 
> hadoop.http.authentication.type=kerberos and yarn-site.xml has 
> yarn.timeline-service.http-authentication.type=kerberos the result is an 
> attempt to perform two Kerberos authentications for the same request.  This 
> in turn results in Kerberos triggering a replay attack detection.  The 
> javadocs for AuthenticationHandler 
> ([https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/server/AuthenticationHandler.java)]
>  indicate for the authenticate method that
> {quote}This method is invoked by the AuthenticationFilter only if the HTTP 
> client request is not yet authenticated.
> {quote}
> This does not appear to be the case in practice.
> I've created a patch and tested on a limited number of functional use cases 
> (e.g. the timeline-service issue noted above).  If there is general agreement 
> that the change is valid I'll add unit tests to the patch.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15518) Authentication filter calling handler after request already authenticated

2018-06-07 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16505081#comment-16505081
 ] 

Eric Yang commented on HADOOP-15518:


[~kminder] The race condition is not a criticism of this patch.  It is a 
byproduct of having multiple instances of AuthenticationFilter: the 
authenticate method is called more than once because there is no check that 
the current request is already authenticated.  My comment above shows the 
state prior to this patch.  Thank you for the patch.

> Authentication filter calling handler after request already authenticated
> -
>
> Key: HADOOP-15518
> URL: https://issues.apache.org/jira/browse/HADOOP-15518
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.7.1
>Reporter: Kevin Minder
>Assignee: Kevin Minder
>Priority: Major
> Attachments: HADOOP-15518-001.patch
>
>
> The hadoop-auth AuthenticationFilter will invoke its handler even if a prior 
> successful authentication has occurred in the current request.  This 
> primarily affects situations where multiple authentication mechanisms have 
> been configured.  For example, when core-site.xml has 
> hadoop.http.authentication.type=kerberos and yarn-site.xml has 
> yarn.timeline-service.http-authentication.type=kerberos the result is an 
> attempt to perform two Kerberos authentications for the same request.  This 
> in turn results in Kerberos triggering a replay attack detection.  The 
> javadocs for AuthenticationHandler 
> ([https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/server/AuthenticationHandler.java)]
>  indicate for the authenticate method that
> {quote}This method is invoked by the AuthenticationFilter only if the HTTP 
> client request is not yet authenticated.
> {quote}
> This does not appear to be the case in practice.
> I've created a patch and tested on a limited number of functional use cases 
> (e.g. the timeline-service issue noted above).  If there is general agreement 
> that the change is valid I'll add unit tests to the patch.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15518) Authentication filter calling handler after request already authenticated

2018-06-07 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16504999#comment-16504999
 ] 

Eric Yang commented on HADOOP-15518:


[~lmccay] {quote}it is unclear to me where the race condition is. My assumption 
is that the filters are invoked linearly so a second filter shouldn't be 
invoked until the request state is set properly.{quote}

Your assumption is correct, and the diagrams below might help explain the race 
condition.

High-level sequence of events in a normal setup:

| Time | Browser | HttpRequest | HttpResponse |
| 1 | Send WWW-Authenticate 1 | | |
| 2 | | AuthenticationFilter checks WWW-Authenticate 1 |  |
| 3 | | Call authenticate to verify WWW-Authenticate 1 ticket with KDC | |
| 4 | | Set User principal and remote user via Java security callbacks | |
| 5 | | | AuthenticationFilter writes WWW-Authenticate 2 |
| 6 | | | Business logic |
| 7 | Received WWW-Authenticate 2 | | |

Sequence of events with duplicated AuthenticationFilters:

| Time | Browser | HttpRequest | HttpResponse |
| 1 | WWW-Authenticate 1 | | |
| 2 | | AuthenticationFilter Instance 1 Check WWW-Authenticate 1 |  |
| 3 | | Call authenticate to verify WWW-Authenticate 1 ticket with KDC | |
| 4 | | Set User principal and remote user via Java security callbacks | |
| 5 | | | AuthenticationFilter Instance 1 writes WWW-Authenticate 2 |
| 6 | | AuthenticationFilter Instance 2 Check WWW-Authenticate 1 | AuthenticationFilter Instance 2 rewrites HTTP status with 403 |

The browser has never retrieved the WWW-Authenticate 2 header from the server 
because the HttpResponse is still buffered on the server side.  The race 
condition is that the HttpRequest at time 6 is still using the existing ticket 
1 instead of the new ticket 2 issued at time 5; the second filter is invoked 
at time 6 using outdated data.

> Authentication filter calling handler after request already authenticated
> -
>
> Key: HADOOP-15518
> URL: https://issues.apache.org/jira/browse/HADOOP-15518
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.7.1
>Reporter: Kevin Minder
>Assignee: Kevin Minder
>Priority: Major
> Attachments: HADOOP-15518-001.patch
>
>
> The hadoop-auth AuthenticationFilter will invoke its handler even if a prior 
> successful authentication has occurred in the current request.  This 
> primarily affects situations where multiple authentication mechanisms have 
> been configured.  For example, when core-site.xml has 
> hadoop.http.authentication.type=kerberos and yarn-site.xml has 
> yarn.timeline-service.http-authentication.type=kerberos the result is an 
> attempt to perform two Kerberos authentications for the same request.  This 
> in turn results in Kerberos triggering a replay attack detection.  The 
> javadocs for AuthenticationHandler 
> ([https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/server/AuthenticationHandler.java)]
>  indicate for the authenticate method that
> {quote}This method is invoked by the AuthenticationFilter only if the HTTP 
> client request is not yet authenticated.
> {quote}
> This does not appear to be the case in practice.
> I've created a patch and tested on a limited number of functional use cases 
> (e.g. the timeline-service issue noted above).  If there is general agreement 
> that the change is valid I'll add unit tests to the patch.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15518) Authentication filter calling handler after request already authenticated

2018-06-07 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16504881#comment-16504881
 ] 

Eric Yang commented on HADOOP-15518:


+1 The patch looks good to me.

> Authentication filter calling handler after request already authenticated
> -
>
> Key: HADOOP-15518
> URL: https://issues.apache.org/jira/browse/HADOOP-15518
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.7.1
>Reporter: Kevin Minder
>Priority: Major
> Attachments: HADOOP-15518-001.patch
>
>
> The hadoop-auth AuthenticationFilter will invoke its handler even if a prior 
> successful authentication has occurred.  This primarily affects situations 
> where multiple authentication mechanisms have been configured.  For example, 
> when core-site.xml has hadoop.http.authentication.type and yarn-site.xml 
> has yarn.timeline-service.http-authentication.type the result is an attempt 
> to perform two Kerberos authentications for the same request.  This in turn 
> results in Kerberos triggering a replay attack detection.  The javadocs for 
> AuthenticationHandler 
> ([https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/server/AuthenticationHandler.java)]
>  indicate for the authenticate method that
> {quote}This method is invoked by the AuthenticationFilter only if the HTTP 
> client request is not yet authenticated.
> {quote}
> This does not appear to be the case in practice.
> I've created a patch and tested on a limited number of functional use cases 
> (e.g. the timeline-service issue noted above).  If there is general agreement 
> that the change is valid I'll add unit tests to the patch.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15518) Authentication filter calling handler after request already authenticated

2018-06-07 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16504877#comment-16504877
 ] 

Eric Yang commented on HADOOP-15518:


This is a race condition when multiple instances of AuthenticationFilter are 
chained together by accident: the token has already been checked once by the 
first instance of AuthenticationFilter, but has not yet been committed to the 
HttpResponse, so the "authenticate" method gets called twice.  The javadoc is 
not wrong, except that validating that the current request is already 
authenticated is not trivial when the token has not been received by the 
browser yet.  It might be possible to check either HttpRequest.getRemoteUser() 
or the HttpResponse commit state to see if authentication has already 
occurred, to prevent the race condition.
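A minimal sketch of such a guard, assuming a helper that the filter could 
consult before invoking its AuthenticationHandler (illustrative only):

{code}
// Illustrative sketch only: a guard an AuthenticationFilter could apply
// before invoking its handler, so a request already authenticated by an
// earlier filter instance is not re-negotiated.
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public final class AlreadyAuthenticatedGuard {
  static boolean alreadyAuthenticated(HttpServletRequest request,
                                      HttpServletResponse response) {
    // getRemoteUser() is non-null once an earlier filter set the principal;
    // isCommitted() means a WWW-Authenticate reply is already on the wire.
    return request.getRemoteUser() != null || response.isCommitted();
  }
}
{code}

The filter's doFilter could then skip its handler and pass the request down 
the chain whenever this returns true.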

> Authentication filter calling handler after request already authenticated
> -
>
> Key: HADOOP-15518
> URL: https://issues.apache.org/jira/browse/HADOOP-15518
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.7.1
>Reporter: Kevin Minder
>Priority: Major
> Attachments: HADOOP-15518-001.patch
>
>
> The hadoop-auth AuthenticationFilter will invoke its handler even if a prior 
> successful authentication has occurred.  This primarily affects situations 
> where multiple authentication mechanisms have been configured.  For example, 
> when core-site.xml has hadoop.http.authentication.type and yarn-site.xml 
> has yarn.timeline-service.http-authentication.type the result is an attempt 
> to perform two Kerberos authentications for the same request.  This in turn 
> results in Kerberos triggering a replay attack detection.  The javadocs for 
> AuthenticationHandler 
> ([https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/server/AuthenticationHandler.java)]
>  indicate for the authenticate method that
> {quote}This method is invoked by the AuthenticationFilter only if the HTTP 
> client request is not yet authenticated.
> {quote}
> This does not appear to be the case in practice.
> I've created a patch and tested on a limited number of functional use cases 
> (e.g. the timeline-service issue noted above).  If there is general agreement 
> that the change is valid I'll add unit tests to the patch.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15284) Could not determine real path of mount

2018-03-02 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16384323#comment-16384323
 ] 

Eric Yang commented on HADOOP-15284:


The filecache directory might not exist if the user has never run a job that 
deposits files in the filecache directory.

> Could not determine real path of mount
> --
>
> Key: HADOOP-15284
> URL: https://issues.apache.org/jira/browse/HADOOP-15284
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Eric Yang
>Priority: Major
>
> Docker container is failing to launch in trunk.  The root cause is:
> {code}
> [COMPINSTANCE sleeper-1 : container_1520032931921_0001_01_20]: 
> [2018-03-02 23:26:09.196]Exception from container-launch.
> Container id: container_1520032931921_0001_01_20
> Exit code: 29
> Exception message: image: hadoop/centos:latest is trusted in hadoop registry.
> Could not determine real path of mount 
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache'
> Could not determine real path of mount 
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache'
> Invalid docker mount 
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache:/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache',
>  realpath=/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache
> Error constructing docker command, docker error code=12, error 
> message='Invalid docker mount'
> Shell output: main : command provided 4
> main : run as user is hbase
> main : requested yarn user is hbase
> Creating script paths...
> Creating local dirs...
> [2018-03-02 23:26:09.240]Diagnostic message from attempt 0 : [2018-03-02 
> 23:26:09.240]
> [2018-03-02 23:26:09.240]Container exited with a non-zero exit code 29.
> [2018-03-02 23:26:39.278]Could not find 
> nmPrivate/application_1520032931921_0001/container_1520032931921_0001_01_20//container_1520032931921_0001_01_20.pid
>  in any of the directories
> [COMPONENT sleeper]: Failed 11 times, exceeded the limit - 10. Shutting down 
> now...
> {code}
> The filecache can not be mounted because it doesn't exist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15284) Could not determine real path of mount

2018-03-02 Thread Eric Yang (JIRA)
Eric Yang created HADOOP-15284:
--

 Summary: Could not determine real path of mount
 Key: HADOOP-15284
 URL: https://issues.apache.org/jira/browse/HADOOP-15284
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Eric Yang


Docker container is failing to launch in trunk.  The root cause is:

{code}
[COMPINSTANCE sleeper-1 : container_1520032931921_0001_01_20]: [2018-03-02 
23:26:09.196]Exception from container-launch.
Container id: container_1520032931921_0001_01_20
Exit code: 29
Exception message: image: hadoop/centos:latest is trusted in hadoop registry.
Could not determine real path of mount 
'/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache'
Could not determine real path of mount 
'/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache'
Invalid docker mount 
'/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache:/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache',
 realpath=/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache
Error constructing docker command, docker error code=12, error message='Invalid 
docker mount'
Shell output: main : command provided 4
main : run as user is hbase
main : requested yarn user is hbase
Creating script paths...
Creating local dirs...
[2018-03-02 23:26:09.240]Diagnostic message from attempt 0 : [2018-03-02 
23:26:09.240]
[2018-03-02 23:26:09.240]Container exited with a non-zero exit code 29.
[2018-03-02 23:26:39.278]Could not find 
nmPrivate/application_1520032931921_0001/container_1520032931921_0001_01_20//container_1520032931921_0001_01_20.pid
 in any of the directories
[COMPONENT sleeper]: Failed 11 times, exceeded the limit - 10. Shutting down 
now...
{code}

The filecache can not be mounted because it doesn't exist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13119) Add ability to secure log servlet using proxy users

2018-02-28 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated HADOOP-13119:
---
Hadoop Flags: Incompatible change

> Add ability to secure log servlet using proxy users
> ---
>
> Key: HADOOP-13119
> URL: https://issues.apache.org/jira/browse/HADOOP-13119
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.8.0, 2.7.4
>Reporter: Jeffrey E  Rodriguez
>Assignee: Yuanbo Liu
>Priority: Major
>  Labels: security
> Fix For: 2.9.0, 2.7.4, 3.0.0-alpha4, 2.8.2
>
> Attachments: HADOOP-13119.001.patch, HADOOP-13119.002.patch, 
> HADOOP-13119.003.patch, HADOOP-13119.004.patch, HADOOP-13119.005.patch, 
> HADOOP-13119.005.patch, screenshot-1.png
>
>
> User Hadoop on secure mode.
> login as kdc user, kinit.
> start firefox and enable Kerberos
> access http://localhost:50070/logs/
> Get 403 authorization errors.
> Only the hdfs user could access logs.
> Would expect, as a user, to be able to access the logs link in the web 
> interface.
> Same results if using curl:
> curl -v  --negotiate -u tester:  http://localhost:50070/logs/
>  HTTP/1.1 403 User tester is unauthorized to access this page.
> so:
> 1. either don't show links if only the hdfs user is able to access them.
> 2. provide a mechanism to add users to the web application realm.
> 3. note that we pass authentication, so the issue is authorization to 
> /logs/
> suspect that the /logs/ path is secured in the web descriptor, so users by 
> default don't have access to secure paths.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-15222) Refine proxy user authorization to support multiple ACL list

2018-02-27 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379563#comment-16379563
 ] 

Eric Yang edited comment on HADOOP-15222 at 2/28/18 1:12 AM:
-

Today, Hadoop offers two roles: cluster admin and normal user.  A new system 
monitor role might be required for separation of duty at service hosting 
companies.  The following table shows a rough sketch of the roles required for 
Hadoop web endpoints:

HDFS
| /logs | cluster admin |
| /jmx | system monitor |
| /conf | cluster admin |
| /stacks | system monitor |

YARN
| /logs | cluster admin |
| /jmx | system monitor |
| /conf | cluster admin |

This separation will prevent leaks of customer information.


was (Author: eyang):
Today, hadoop offers two roles, cluster admin, and normal users.  New system 
admin roles might be required for separation of duty for service hosting 
companies.  The following table shows a rough sketch of roles required to map 
to Hadoop web applications:

HDFS
| /logs | cluster admin |
| /jmx | system monitor |
| /conf | cluster admin |
| /stacks | system monitor |

YARN
| /logs | cluster admin |
| /jmx | system monitor |
| /conf | cluster admin |

This separation will prevent leaks of customer information.

> Refine proxy user authorization to support multiple ACL list
> 
>
> Key: HADOOP-15222
> URL: https://issues.apache.org/jira/browse/HADOOP-15222
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Eric Yang
>Priority: Major
>
> This Jira covers follow-up work for HADOOP-14077.  The original goal of 
> HADOOP-14077 is to support multiple ACL lists.  The original problem is a 
> separation-of-duty use case where the company hosting the Hadoop cluster 
> monitors it through jmx, while application logs and hdfs contents should not 
> be visible to the hosting company's system administrators.  Proxy user 
> authorization in AuthenticationFilter must therefore provide a way to 
> authorize normal users and admin users using separate proxy user ACL lists.  
> HADOOP-14060 suggested configuring AuthenticationFilterWithProxyUser this way:
> AuthenticationFilterWithProxyUser->StaticUserWebFilter->AuthenticationFilterWithProxyUser
> This lets the second AuthenticationFilterWithProxyUser validate the 
> credentials claimed by both the proxy user and the end user.
> However, there is a side effect: unauthorized users are not properly rejected 
> with a 403 FORBIDDEN message if no other web filter is configured to handle 
> the required authorization work.
> This JIRA is intended to discuss the work of HADOOP-14077, by either 
> combining StaticUserWebFilter + second AuthenticationFilterWithProxyUser into 
> an AuthorizationFilterWithProxyUser as a final filter to evict unauthorized 
> users, or reverting both HADOOP-14077 and HADOOP-13119 to eliminate the false 
> positive in user authorization and impersonation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15222) Refine proxy user authorization to support multiple ACL list

2018-02-27 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379563#comment-16379563
 ] 

Eric Yang commented on HADOOP-15222:


Today, Hadoop offers two roles: cluster admin and normal user.  New system 
admin roles might be required for separation of duty at service hosting 
companies.  The following table shows a rough sketch of the roles required for 
Hadoop web endpoints:

HDFS
| /logs | cluster admin |
| /jmx | system monitor |
| /conf | cluster admin |
| /stacks | system monitor |

YARN
| /logs | cluster admin |
| /jmx | system monitor |
| /conf | cluster admin |

This separation will prevent leaks of customer information.

> Refine proxy user authorization to support multiple ACL list
> 
>
> Key: HADOOP-15222
> URL: https://issues.apache.org/jira/browse/HADOOP-15222
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Eric Yang
>Priority: Major
>
> This Jira covers follow-up work for HADOOP-14077.  The original goal of 
> HADOOP-14077 is to support multiple ACL lists.  The original problem is a 
> separation-of-duty use case where the company hosting the Hadoop cluster 
> monitors it through jmx, while application logs and hdfs contents should not 
> be visible to the hosting company's system administrators.  Proxy user 
> authorization in AuthenticationFilter must therefore provide a way to 
> authorize normal users and admin users using separate proxy user ACL lists.  
> HADOOP-14060 suggested configuring AuthenticationFilterWithProxyUser this way:
> AuthenticationFilterWithProxyUser->StaticUserWebFilter->AuthenticationFilterWithProxyUser
> This lets the second AuthenticationFilterWithProxyUser validate the 
> credentials claimed by both the proxy user and the end user.
> However, there is a side effect: unauthorized users are not properly rejected 
> with a 403 FORBIDDEN message if no other web filter is configured to handle 
> the required authorization work.
> This JIRA is intended to discuss the work of HADOOP-14077, by either 
> combining StaticUserWebFilter + second AuthenticationFilterWithProxyUser into 
> an AuthorizationFilterWithProxyUser as a final filter to evict unauthorized 
> users, or reverting both HADOOP-14077 and HADOOP-13119 to eliminate the false 
> positive in user authorization and impersonation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15222) Refine proxy user authorization to support multiple ACL list

2018-02-26 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated HADOOP-15222:
---
Description: 
This JIRA tracks follow-up work for HADOOP-14077.  The original goal of 
HADOOP-14077 was to add the ability to support multiple ACL lists.  The 
original problem is a separation-of-duty use case in which the company hosting 
the Hadoop cluster monitors the cluster through jmx.  Application logs and HDFS 
contents should not be visible to the hosting company's system administrators.  
When checking proxy user authorization in AuthenticationFilter, there must be a 
way to authorize normal users and admin users using separate proxy user ACL 
lists.  It was suggested in HADOOP-14060 to configure 
AuthenticationFilterWithProxyUser this way:

AuthenticationFilterWithProxyUser->StaticUserWebFilter->AuthenticationFilterWithProxyUser

This enables the second AuthenticationFilterWithProxyUser to validate the 
credentials claimed by both the proxy user and the end user.

However, there is a side effect: unauthorized users are not properly rejected 
with a 403 FORBIDDEN message if no other web filter is configured to handle the 
required authorization work.

This JIRA is intended to discuss the work of HADOOP-14077: either combine 
StaticUserWebFilter and the second AuthenticationFilterWithProxyUser into an 
AuthorizationFilterWithProxyUser that acts as a final filter to reject 
unauthorized users, or revert both HADOOP-14077 and HADOOP-13119 to eliminate 
the false positive in user authorization and impersonation.

  was:
This JIRA tracks follow-up work for HADOOP-14077.  The original goal of 
HADOOP-14077 was to add the ability to support multiple ACL lists.  When 
checking proxy user authorization in AuthenticationFilter, there must be a way 
to authorize normal users and admin users using separate proxy user ACL lists.  
It was suggested in HADOOP-14060 to configure AuthenticationFilterWithProxyUser 
this way:

AuthenticationFilterWithProxyUser->StaticUserWebFilter->AuthenticationFilterWithProxyUser

This enables the second AuthenticationFilterWithProxyUser to validate the 
credentials claimed by both the proxy user and the end user.

However, there is a side effect: unauthorized users are not properly rejected 
with a 403 FORBIDDEN message if no other web filter is configured to handle the 
required authorization work.

This JIRA is intended to discuss the work of HADOOP-14077: either combine 
StaticUserWebFilter and the second AuthenticationFilterWithProxyUser into an 
AuthorizationFilterWithProxyUser that acts as a final filter to reject 
unauthorized users, or revert both HADOOP-14077 and HADOOP-13119 to eliminate 
the false positive in user authorization.


> Refine proxy user authorization to support multiple ACL list
> 
>
> Key: HADOOP-15222
> URL: https://issues.apache.org/jira/browse/HADOOP-15222
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Eric Yang
>Priority: Major
>
> This JIRA tracks follow-up work for HADOOP-14077.  The original goal of 
> HADOOP-14077 was to add the ability to support multiple ACL lists.  The 
> original problem is a separation-of-duty use case in which the company 
> hosting the Hadoop cluster monitors the cluster through jmx.  Application 
> logs and HDFS contents should not be visible to the hosting company's 
> system administrators.  When checking proxy user authorization in 
> AuthenticationFilter, there must be a way to authorize normal users and 
> admin users using separate proxy user ACL lists.  It was suggested in 
> HADOOP-14060 to configure AuthenticationFilterWithProxyUser this way:
> AuthenticationFilterWithProxyUser->StaticUserWebFilter->AuthenticationFilterWithProxyUser
> This enables the second AuthenticationFilterWithProxyUser to validate the 
> credentials claimed by both the proxy user and the end user.
> However, there is a side effect: unauthorized users are not properly 
> rejected with a 403 FORBIDDEN message if no other web filter is configured 
> to handle the required authorization work.
> This JIRA is intended to discuss the work of HADOOP-14077: either combine 
> StaticUserWebFilter and the second AuthenticationFilterWithProxyUser into an 
> AuthorizationFilterWithProxyUser that acts as a final filter to reject 
> unauthorized users, or revert both HADOOP-14077 and HADOOP-13119 to 
> eliminate the false positive in user authorization and impersonation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14728) Configuring AuthenticationFilterInitializer throws IllegalArgumentException: Null user

2018-02-26 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16377850#comment-16377850
 ] 

Eric Yang commented on HADOOP-14728:


The null user is not introduced by HADOOP-13119 but by HADOOP-14077.  An 
AuthorizationException will be thrown when the proxy user's Kerberos ticket is 
not valid.

There are two conditions under which a null user is returned:
1.  The guest user is not associated with any group in the proxy user ACL.
2.  The guest user is coming from an address that is not allowed in the proxy 
user ACL.

HADOOP-14077 decided to return null to enable additional filter chains to check 
other proxy ACLs, or to let other challenge/response filters take place.  The 
purpose is to channel doAs users to other proxy ACL lists on demand, as in the 
sketch below.
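A minimal sketch of the null-return behavior described above, assuming the 
ProxyUsers.authorize() check from org.apache.hadoop.security.authorize; the 
class and method names are illustrative, not the actual HADOOP-14077 patch:

{code}
import java.io.IOException;
import javax.servlet.http.HttpServletRequest;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.authorize.AuthorizationException;
import org.apache.hadoop.security.authorize.ProxyUsers;

public class ProxyUserCheckSketch {
  public static String getProxyAuthorizedUser(HttpServletRequest request)
      throws IOException {
    String doAsUser = request.getParameter("doAs");
    if (doAsUser == null) {
      return null;
    }
    try {
      UserGroupInformation proxyUgi = UserGroupInformation.createProxyUser(
          doAsUser, UserGroupInformation.getLoginUser());
      // Throws AuthorizationException when the proxy user may not impersonate
      // doAsUser (no matching group) or the request comes from a disallowed host.
      ProxyUsers.authorize(proxyUgi, request.getRemoteAddr());
      return proxyUgi.getShortUserName();
    } catch (AuthorizationException e) {
      // Conditions 1 and 2 above: return null so a later filter in the chain
      // can consult another proxy ACL list.
      return null;
    }
  }
}
{code}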

> Configuring AuthenticationFilterInitializer throws IllegalArgumentException: 
> Null user
> --
>
> Key: HADOOP-14728
> URL: https://issues.apache.org/jira/browse/HADOOP-14728
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Krishna Pandey
>Priority: Major
> Attachments: HADOOP-14728.01.patch
>
>
> Configured AuthenticationFilterInitializer and started a cluster. When 
> accessing YARN UI using doAs, encountering following error. 
> URL : http://localhost:25005/cluster??doAs=guest
> {noformat}
> org.apache.hadoop.security.authentication.util.SignerException: Invalid 
> signature
> 2017-08-01 15:34:22,163 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error 
> handling URI: /cluster
> java.lang.IllegalArgumentException: Null user
>   at 
> org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1499)
>   at 
> org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1486)
>   at 
> org.apache.hadoop.security.AuthenticationWithProxyUserFilter$1.getRemoteOrProxyUser(AuthenticationWithProxyUserFilter.java:82)
>   at 
> org.apache.hadoop.security.AuthenticationWithProxyUserFilter$1.getRemoteUser(AuthenticationWithProxyUserFilter.java:92)
>   at 
> javax.servlet.http.HttpServletRequestWrapper.getRemoteUser(HttpServletRequestWrapper.java:207)
>   at 
> javax.servlet.http.HttpServletRequestWrapper.getRemoteUser(HttpServletRequestWrapper.java:207)
>   at 
> org.apache.hadoop.yarn.webapp.view.HeaderBlock.render(HeaderBlock.java:28)
>   at 
> org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
>   at 
> org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
>   at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
>   at 
> org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
>   at 
> org.apache.hadoop.yarn.webapp.hamlet2.HamletImpl$EImp._v(HamletImpl.java:117)
>   at org.apache.hadoop.yarn.webapp.hamlet2.Hamlet$TD.__(Hamlet.java:848)
>   at 
> org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:61)
>   at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
>   at org.apache.hadoop.yarn.webapp.Dispatcher.render(Dispatcher.java:206)
>   at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:165)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>   at 
> com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)
>   at 
> com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Reopened] (HADOOP-14077) Improve the patch of HADOOP-13119

2018-02-13 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang reopened HADOOP-14077:


> Improve the patch of HADOOP-13119
> -
>
> Key: HADOOP-14077
> URL: https://issues.apache.org/jira/browse/HADOOP-14077
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Reporter: Yuanbo Liu
>Assignee: Yuanbo Liu
>Priority: Major
> Fix For: 3.0.0-alpha4
>
> Attachments: HADOOP-14077.001.patch, HADOOP-14077.002.patch, 
> HADOOP-14077.003.patch
>
>
> For some links(such as "/jmx, /stack"), blocking the links in filter chain 
> due to impersonation issue is not friendly for users. For example, user "sam" 
> is not allowed to be impersonated by user "knox", and the link "/jmx" doesn't 
> need any user to do authorization by default. It only needs user "knox" to do 
> authentication, in this case, it's not right to  block the access in SPNEGO 
> filter. We intend to check impersonation permission when the method 
> "getRemoteUser" of request is used, so that such kind of links("/jmx, 
> /stack") would not be blocked by mistake.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14077) Improve the patch of HADOOP-13119

2018-02-13 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362666#comment-16362666
 ] 

Eric Yang commented on HADOOP-14077:


[~yuanbo] The Hadoop security team has brought to my attention that this 
feature has the potential to weaken security.  When a user is not authorized in 
the first proxy user list, the AuthorizationException is captured and null is 
returned.  This allows the second proxy list to be checked if users chain 
StaticUserWebFilter and another AuthenticationFilterWithProxyUser together, per 
your comment in 
[HADOOP-14060|https://issues.apache.org/jira/browse/HADOOP-14060?focusedCommentId=15875737=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15875737].
  However, this procedure can enable a replay attack that uses the proxy user 
credential to fool other services, because the end user credential is not 
authorized for the first proxy user in the first place.  For this reason, I 
have no choice but to revert this commit.  Sorry that I failed to spot the 
problem in the first round of review.

Reverting this change may impact managed services, for example where the 
cluster system administrators and users are from two different companies.  You 
may need to review whether your clusters depend on this feature.

> Improve the patch of HADOOP-13119
> -
>
> Key: HADOOP-14077
> URL: https://issues.apache.org/jira/browse/HADOOP-14077
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Reporter: Yuanbo Liu
>Assignee: Yuanbo Liu
>Priority: Major
> Fix For: 3.0.0-alpha4
>
> Attachments: HADOOP-14077.001.patch, HADOOP-14077.002.patch, 
> HADOOP-14077.003.patch
>
>
> For some links(such as "/jmx, /stack"), blocking the links in filter chain 
> due to impersonation issue is not friendly for users. For example, user "sam" 
> is not allowed to be impersonated by user "knox", and the link "/jmx" doesn't 
> need any user to do authorization by default. It only needs user "knox" to do 
> authentication, in this case, it's not right to  block the access in SPNEGO 
> filter. We intend to check impersonation permission when the method 
> "getRemoteUser" of request is used, so that such kind of links("/jmx, 
> /stack") would not be blocked by mistake.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15222) Refine proxy user authorization to support multiple ACL list

2018-02-12 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361799#comment-16361799
 ] 

Eric Yang commented on HADOOP-15222:


[~lmccay] Sorry, until a better proposal for securing /logs and /jmx is 
feasible, there is not a good enough reason to justify reverting HADOOP-13119.  
[~arpitagarwal]'s report was not valid against HADOOP-13119, and HADOOP-13119 
does provide better security by granting authorized users, rather than 
anonymous users, access to /logs.  I cannot agree to reverting HADOOP-13119 at 
this time.


> Refine proxy user authorization to support multiple ACL list
> 
>
> Key: HADOOP-15222
> URL: https://issues.apache.org/jira/browse/HADOOP-15222
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Eric Yang
>Priority: Major
>
> This JIRA tracks follow-up work for HADOOP-14077.  The original goal of 
> HADOOP-14077 was to add the ability to support multiple ACL lists.  When 
> checking proxy user authorization in AuthenticationFilter, there must be a 
> way to authorize normal users and admin users using separate proxy user ACL 
> lists.  It was suggested in HADOOP-14060 to configure 
> AuthenticationFilterWithProxyUser this way:
> AuthenticationFilterWithProxyUser->StaticUserWebFilter->AuthenticationFilterWithProxyUser
> This enables the second AuthenticationFilterWithProxyUser to validate the 
> credentials claimed by both the proxy user and the end user.
> However, there is a side effect: unauthorized users are not properly 
> rejected with a 403 FORBIDDEN message if no other web filter is configured 
> to handle the required authorization work.
> This JIRA is intended to discuss the work of HADOOP-14077: either combine 
> StaticUserWebFilter and the second AuthenticationFilterWithProxyUser into an 
> AuthorizationFilterWithProxyUser that acts as a final filter to reject 
> unauthorized users, or revert both HADOOP-14077 and HADOOP-13119 to 
> eliminate the false positive in user authorization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13119) Add ability to secure log servlet using proxy users

2018-02-12 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361794#comment-16361794
 ] 

Eric Yang commented on HADOOP-13119:


[~arpitagarwal] The test case is invalid.  Your curl command does not contain 
--negotiate -u :, and the null user can only happen if HADOOP-14077 is applied.

> Add ability to secure log servlet using proxy users
> ---
>
> Key: HADOOP-13119
> URL: https://issues.apache.org/jira/browse/HADOOP-13119
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.8.0, 2.7.4
>Reporter: Jeffrey E  Rodriguez
>Assignee: Yuanbo Liu
>Priority: Major
>  Labels: security
> Fix For: 2.9.0, 2.7.4, 3.0.0-alpha4, 2.8.2
>
> Attachments: HADOOP-13119.001.patch, HADOOP-13119.002.patch, 
> HADOOP-13119.003.patch, HADOOP-13119.004.patch, HADOOP-13119.005.patch, 
> HADOOP-13119.005.patch, screenshot-1.png
>
>
> User Hadoop on secure mode.
> login as kdc user, kinit.
> start firefox and enable Kerberos
> access http://localhost:50070/logs/
> Get 403 authorization errors.
> only hdfs user could access logs.
> Would expect as a user to be able to web interface logs link.
> Same results if using curl:
> curl -v  --negotiate -u tester:  http://localhost:50070/logs/
>  HTTP/1.1 403 User tester is unauthorized to access this page.
> so:
> 1. either don't show links if hdfs user  is able to access.
> 2. provide mechanism to add users to web application realm.
> 3. note that we are pass authentication so the issue is authorization to 
> /logs/
> suspect that /logs/ path is secure in webdescriptor so suspect users by 
> default don't have access to secure paths.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15222) Refine proxy user authorization to support multiple ACL list

2018-02-12 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361731#comment-16361731
 ] 

Eric Yang commented on HADOOP-15222:


[~lmccay] Thank you for the summary.  This is aligned with the original problem 
statement.  Role-based ACLs, as in a standard J2EE web application, would be 
the right approach to solving the authorization problem.  Users can describe in 
web.xml which URL resources are allowed for which roles, and roles are mapped 
to groups of users (see the illustrative fragment below).  It would be nice to 
do the same in Hadoop, but Hadoop web applications don't quite follow the J2EE 
design pattern, which makes the problem hard to solve.  We could start by 
turning Hadoop's Jetty setup from Java code back into configuration that maps 
to roles; doing so might take 2-3 years of hard labour.  There might be better 
ways to resolve this issue that we need to explore.
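For example, a standard J2EE deployment descriptor can express the endpoint 
table from my earlier comment roughly like this; the resource names and role 
names are hypothetical:

{code:title=web.xml}
<security-constraint>
  <web-resource-collection>
    <web-resource-name>admin-endpoints</web-resource-name>
    <url-pattern>/logs/*</url-pattern>
    <url-pattern>/conf</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <role-name>cluster-admin</role-name>
  </auth-constraint>
</security-constraint>

<security-constraint>
  <web-resource-collection>
    <web-resource-name>monitoring-endpoints</web-resource-name>
    <url-pattern>/jmx</url-pattern>
    <url-pattern>/stacks</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <role-name>system-monitor</role-name>
  </auth-constraint>
</security-constraint>
{code}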

HADOOP-13119 was backported to the Hadoop 2.8.x line as a new feature.  Do we 
revert HADOOP-13119 from 2.8.x, or do we keep HADOOP-13119 as the temporary 
solution until the new work is completed?

> Refine proxy user authorization to support multiple ACL list
> 
>
> Key: HADOOP-15222
> URL: https://issues.apache.org/jira/browse/HADOOP-15222
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Eric Yang
>Priority: Major
>
> This JIRA tracks follow-up work for HADOOP-14077.  The original goal of 
> HADOOP-14077 was to add the ability to support multiple ACL lists.  When 
> checking proxy user authorization in AuthenticationFilter, there must be a 
> way to authorize normal users and admin users using separate proxy user ACL 
> lists.  It was suggested in HADOOP-14060 to configure 
> AuthenticationFilterWithProxyUser this way:
> AuthenticationFilterWithProxyUser->StaticUserWebFilter->AuthenticationFilterWithProxyUser
> This enables the second AuthenticationFilterWithProxyUser to validate the 
> credentials claimed by both the proxy user and the end user.
> However, there is a side effect: unauthorized users are not properly 
> rejected with a 403 FORBIDDEN message if no other web filter is configured 
> to handle the required authorization work.
> This JIRA is intended to discuss the work of HADOOP-14077: either combine 
> StaticUserWebFilter and the second AuthenticationFilterWithProxyUser into an 
> AuthorizationFilterWithProxyUser that acts as a final filter to reject 
> unauthorized users, or revert both HADOOP-14077 and HADOOP-13119 to 
> eliminate the false positive in user authorization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15222) Refine proxy user authorization to support multiple ACL list

2018-02-12 Thread Eric Yang (JIRA)
Eric Yang created HADOOP-15222:
--

 Summary: Refine proxy user authorization to support multiple ACL 
list
 Key: HADOOP-15222
 URL: https://issues.apache.org/jira/browse/HADOOP-15222
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 3.0.0
Reporter: Eric Yang


This JIRA tracks follow-up work for HADOOP-14077.  The original goal of 
HADOOP-14077 was to add the ability to support multiple ACL lists.  When 
checking proxy user authorization in AuthenticationFilter, there must be a way 
to authorize normal users and admin users using separate proxy user ACL lists.  
It was suggested in HADOOP-14060 to configure AuthenticationFilterWithProxyUser 
this way:

AuthenticationFilterWithProxyUser->StaticUserWebFilter->AuthenticationFilterWithProxyUser

This enables the second AuthenticationFilterWithProxyUser to validate the 
credentials claimed by both the proxy user and the end user.

However, there is a side effect: unauthorized users are not properly rejected 
with a 403 FORBIDDEN message if no other web filter is configured to handle the 
required authorization work.

This JIRA is intended to discuss the work of HADOOP-14077: either combine 
StaticUserWebFilter and the second AuthenticationFilterWithProxyUser into an 
AuthorizationFilterWithProxyUser that acts as a final filter to reject 
unauthorized users, or revert both HADOOP-14077 and HADOOP-13119 to eliminate 
the false positive in user authorization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-15162) UserGroupInformation.createRemoteUser hardcode authentication method to SIMPLE

2018-01-10 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang resolved HADOOP-15162.

Resolution: Not A Problem

Closing this as not a problem.  It was a bad assumption that SIMPLE security 
mode doesn't check the proxy ACL; I verified that SIMPLE security mode also 
checks the proxy ACL.  UGI.createRemoteUser(remoteUser) has no effect on the 
proxy ACL check.  Thanks to [~jlowe] and [~daryn] for the advice and 
recommendations.

> UserGroupInformation.createRemoteUser hardcode authentication method to SIMPLE
> --
>
> Key: HADOOP-15162
> URL: https://issues.apache.org/jira/browse/HADOOP-15162
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Reporter: Eric Yang
>
> {{UserGroupInformation.createRemoteUser(String user)}} is hard-coded to the 
> SIMPLE authentication method by HADOOP-10683.  This bypasses the proxyuser 
> ACL check and the isSecurityEnabled check, and allows the caller to 
> impersonate anyone.  This method could be abused in the main code base, 
> which can cause parts of Hadoop to become insecure without a proxyuser 
> check, in both SIMPLE and Kerberos enabled environments.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15162) UserGroupInformation.createRemoteUser hardcode authentication method to SIMPLE

2018-01-09 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16318749#comment-16318749
 ] 

Eric Yang commented on HADOOP-15162:


[~daryn] {quote}
Are you writing your own custom http server and authentication filter?
{quote}

No.  This JIRA serves to provide information for less experienced developers: 
the proxy ACL must be verified to enforce perimeter security.  Code written as:

{code}
proxyUser = UserGroupInformation.getLoginUser();
ugi = UserGroupInformation.createProxyUser(remoteUser, proxyUser);
{code}

without using UGI.createRemoteUser(remoteUser) is equally good.  There is no 
need for an isSecurityEnabled() check, and there is no need to explicitly call 
UGI.createRemoteUser(remoteUser).  Users only get to shoot themselves in the 
foot if {{hadoop.http.authentication.simple.anonymous.allowed}} is 
misconfigured, which allows anyone to impersonate someone else.  I would 
propose deprecating the createRemoteUser(remoteUser) API because it creates 
confusion about how code should be written.

> UserGroupInformation.createRemoteUser hardcode authentication method to SIMPLE
> --
>
> Key: HADOOP-15162
> URL: https://issues.apache.org/jira/browse/HADOOP-15162
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Reporter: Eric Yang
>
> {{UserGroupInformation.createRemoteUser(String user)}} is hard-coded to the 
> SIMPLE authentication method by HADOOP-10683.  This bypasses the proxyuser 
> ACL check and the isSecurityEnabled check, and allows the caller to 
> impersonate anyone.  This method could be abused in the main code base, 
> which can cause parts of Hadoop to become insecure without a proxyuser 
> check, in both SIMPLE and Kerberos enabled environments.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15162) UserGroupInformation.createRemoteUser hardcode authentication method to SIMPLE

2018-01-08 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16317583#comment-16317583
 ] 

Eric Yang commented on HADOOP-15162:


[~daryn] Thank you for your reply.  

{quote}
Based on the snippets of code that conclude with "if authentication are in 
place, server side code can be simplified to [...] 
UserGroupInformation.createRemoteUser(remoteUser);", I think you are suggesting 
that createRemote should auto-magically create a proxy user with the login 
user? If you say yes, I'll provide a litany of reasons why that'd be completely 
broken. If no, please more concisely state your use case.{quote}

The proxy user credential should be verified to determine whether it can 
impersonate.  In my usage, I am writing a component for YARN, and the end user 
credential is verified in the HTTP request.  If code is written as 
UGI.createRemoteUser(remoteUser), should there be a check to determine whether 
the current service user can proxy?  Some Hadoop PMC members told me no, 
because they assumed that when isSecurityEnabled == false there should be no 
proxy ACL check.  If this type of assumption is applied, then we will have 
components talking to other components without honoring the proxy user ACL, 
leading to parts of Hadoop being completely insecure.  This is the reason I 
think createRemoteUser defaulting the authentication method to SIMPLE is a bad 
practice.  The server should decide which authentication method to use, set up 
that authentication method, and verify the proxy ACL explicitly, as sketched 
below.
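A minimal sketch of that explicit setup, assuming the existing 
UserGroupInformation and ProxyUsers.authorize() APIs from 
org.apache.hadoop.security; the method name createVerifiedProxyUser is 
hypothetical:

{code}
import java.io.IOException;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.authorize.ProxyUsers;

public class ExplicitProxyCheckSketch {
  public static UserGroupInformation createVerifiedProxyUser(
      String remoteUser, String remoteAddr) throws IOException {
    // Verify the proxy ACL explicitly, regardless of isSecurityEnabled().
    UserGroupInformation proxyUser = UserGroupInformation.getLoginUser();
    UserGroupInformation ugi =
        UserGroupInformation.createProxyUser(remoteUser, proxyUser);
    // ProxyUsers.authorize throws AuthorizationException (an IOException)
    // when the login user may not impersonate remoteUser from remoteAddr.
    ProxyUsers.authorize(ugi, remoteAddr);
    return ugi;
  }
}
{code}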


> UserGroupInformation.createRemoteUser hardcode authentication method to SIMPLE
> --
>
> Key: HADOOP-15162
> URL: https://issues.apache.org/jira/browse/HADOOP-15162
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Reporter: Eric Yang
>
> {{UserGroupInformation.createRemoteUser(String user)}} is hard-coded to the 
> SIMPLE authentication method by HADOOP-10683.  This bypasses the proxyuser 
> ACL check and the isSecurityEnabled check, and allows the caller to 
> impersonate anyone.  This method could be abused in the main code base, 
> which can cause parts of Hadoop to become insecure without a proxyuser 
> check, in both SIMPLE and Kerberos enabled environments.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15162) UserGroupInformation.createRemoteUser hardcode authentication method to SIMPLE

2018-01-08 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16317386#comment-16317386
 ] 

Eric Yang commented on HADOOP-15162:


In summary, the proxy user ACL should be checked for simple security instead of 
relying on isSecurityEnabled().  {{isSecurityEnabled()}} gives a false sense 
that the proxy user ACL shouldn't be checked, which leads to the use of 
UserGroupInformation.createRemoteUser(remoteUser) in server code; that is a bad 
practice because the credential of the current server user is not verified.  Is 
this something that needs to be improved, or do we mark this as won't fix and 
make sure people always use the proper proxy user idiom for server-side code?

{code}
proxyUser = UserGroupInformation.getLoginUser();
ugi = UserGroupInformation.createProxyUser(remoteUser, proxyUser);
{code}


> UserGroupInformation.createRemoteUser hardcode authentication method to SIMPLE
> --
>
> Key: HADOOP-15162
> URL: https://issues.apache.org/jira/browse/HADOOP-15162
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Reporter: Eric Yang
>
> {{UserGroupInformation.createRemoteUser(String user)}} is hard-coded to the 
> SIMPLE authentication method by HADOOP-10683.  This bypasses the proxyuser 
> ACL check and the isSecurityEnabled check, and allows the caller to 
> impersonate anyone.  This method could be abused in the main code base, 
> which can cause parts of Hadoop to become insecure without a proxyuser 
> check, in both SIMPLE and Kerberos enabled environments.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15162) UserGroupInformation.createRemoteUser hardcode authentication method to SIMPLE

2018-01-08 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16316758#comment-16316758
 ] 

Eric Yang commented on HADOOP-15162:


Hi [~daryn],

{quote}
If you have a specific risk case, please take it up on the security list. Don't 
irresponsibly post publicly.
{quote}

There is no security hole yet if the cluster is deployed with Kerberos, that 
is, with isSecurityEnabled == true.  If I were disclosing a real security hole, 
it would definitely have been sent to the security mailing list first.  I do 
not think this issue is worth sounding the alarm over yet.  These have been 
known issues with SIMPLE security since the Hadoop 0.20 security releases.  I 
am only observing code changes over the past couple of years, and some security 
holes are about to be opened up by inexperienced developers following incorrect 
discipline.  Without proper information to educate the public, fear will only 
cause panic and prevent progress.  I hope you understand that my intention is 
to mitigate the risk by disclosing information that leads to progress, rather 
than letting fear drive people away.

> UserGroupInformation.createRemoteUser hardcode authentication method to SIMPLE
> --
>
> Key: HADOOP-15162
> URL: https://issues.apache.org/jira/browse/HADOOP-15162
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Reporter: Eric Yang
>
> {{UserGroupInformation.createRemoteUser(String user)}} is hard-coded to the 
> SIMPLE authentication method by HADOOP-10683.  This bypasses the proxyuser 
> ACL check and the isSecurityEnabled check, and allows the caller to 
> impersonate anyone.  This method could be abused in the main code base, 
> which can cause parts of Hadoop to become insecure without a proxyuser 
> check, in both SIMPLE and Kerberos enabled environments.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15162) UserGroupInformation.createRemoteUser hardcode authentication method to SIMPLE

2018-01-08 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16316725#comment-16316725
 ] 

Eric Yang commented on HADOOP-15162:


Hi [~jlowe],  WebHDFS and YARN allow impersonation through the use of 
?user.name=foobar in the HTTP URL.  This allows SIMPLE security mode to run as 
any other user without a check.  If the cluster is configured with the Linux 
Container Executor, this can be carried out as a privilege escalation exploit 
in combination with the vulnerability found in YARN-7590.
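To make the impersonation vector concrete, a request of the following shape is 
all it takes under SIMPLE security; the host, port, and path here are 
hypothetical:

{code}
# Claim the identity "hdfs" with nothing but a query parameter:
curl "http://namenode.example.com:9870/webhdfs/v1/tmp?op=LISTSTATUS&user.name=hdfs"
{code}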

Trust and verification are very important processes for enforcing 
authentication security.  The server side must do either a username/password 
challenge or token validation to enforce security, like you said.  In the 
Hadoop implementation of SIMPLE security, there is no authentication challenge, 
in the form of either a username/password prompt or a token challenge, to the 
client.  Therefore, the Hadoop doAs call trusts everyone who claims to be 
someone else when using SIMPLE security.

The fix must intercept RPC or HTTP requests to add an authentication challenge 
that enforces security.  createRemoteUser should be intelligent enough to know 
which authentication method to apply in order to avoid security holes.  Without 
addressing these issues, we are encouraging developers to write code such as:

{code}
UserGroupInformation proxyUser;
UserGroupInformation ugi;
String remoteUser = request.getRemoteUser();
try {
  if (UserGroupInformation.isSecurityEnabled()) {
    // Kerberos enabled: impersonate the remote user via the service login user.
    proxyUser = UserGroupInformation.getLoginUser();
    ugi = UserGroupInformation.createProxyUser(remoteUser, proxyUser);
  } else {
    // SIMPLE: trust the claimed remote user as-is, with no proxy ACL check.
    ugi = UserGroupInformation.createRemoteUser(remoteUser);
  }
  return ugi;
} catch (IOException e) {
  throw new AccessControlException(e.getCause());
}
{code}

If security is not enabled, this allows proxying without checking whether the 
current user is allowed to proxy.  Unfortunately, SIMPLE security is tied to no 
security at all, due to an improper interpretation of the isSecurityEnabled 
method dating back to the Hadoop 0.20 security releases.

If authentication is in place, the server-side code can be simplified to:

{code}
UserGroupInformation ugi;
String remoteUser = request.getRemoteUser();
try {
  ugi = UserGroupInformation.createRemoteUser(remoteUser);
  return ugi;
} catch (IOException e) {
  throw new AccessControlException(e.getCause());
}
{code}

createRemoteUser(String user) could get the current login user and then call 
createProxyUser.  At minimum, the proxy user ACL list would be verified for 
simple security to set a security perimeter, and this could be combined with an 
authentication challenge to provide simple security (see the hypothetical 
sketch below).
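A hypothetical sketch of that idea follows; this is not the actual UGI API, the 
remoteAddr parameter is an addition for illustration, and the ACL check shown 
assumes ProxyUsers.authorize() from org.apache.hadoop.security.authorize:

{code}
// Hypothetical replacement: createRemoteUser wraps the remote user in a
// proxy-user check against the current login user, even in SIMPLE mode.
public static UserGroupInformation createRemoteUser(
    String user, String remoteAddr) throws IOException {
  UserGroupInformation proxyUser = UserGroupInformation.getLoginUser();
  UserGroupInformation ugi =
      UserGroupInformation.createProxyUser(user, proxyUser);
  // At minimum, verify the proxy user ACL to establish a security perimeter;
  // throws AuthorizationException when impersonation is not allowed.
  ProxyUsers.authorize(ugi, remoteAddr);
  return ugi;
}
{code}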

Client code can assign any arbitrary user and trigger an authentication 
challenge when communicating with the server side.  This is what happens when 
Kerberos security is enabled.  It would be nice if the same practice could 
apply to SIMPLE security without opening up security holes, regardless of 
whether Kerberos security is enabled.

The very first design of Hadoop security attempted to solve the replay attack.  
Kerberos security, or a combination of proxy user/host ACL lists, can serve the 
same purpose.  For some reason, during the implementation, both Kerberos and 
the proxy ACL list became enforced only when Kerberos security is enabled.  
Kerberos and the proxy ACL are in fact redundant checks.  This left SIMPLE 
security completely open: no security and no proxy check.

If developers blindly use the current implementation of 
UserGroupInformation.createRemoteUser(remoteUser), parts of Hadoop can be 
opened up to run without any security, without anyone knowing.  This is not 
good security practice; hence, we probably want to mitigate this risk by 
improving the logic in createRemoteUser and reviewing whether the SIMPLE 
security definition needs to be revised.

If isSecurityEnabled() were based on hadoop.security.authentication == null, 
then HTTP basic authentication plus the proxy ACL, and Kerberos, could be two 
methods that enforce security.  This would provide a way to apply security 
measures in the cloud without deploying Kerberos.

> UserGroupInformation.createRemoteUser hardcode authentication method to SIMPLE
> --
>
> Key: HADOOP-15162
> URL: https://issues.apache.org/jira/browse/HADOOP-15162
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Reporter: Eric Yang
>
> {{UserGroupInformation.createRemoteUser(String user)}} is hard-coded to the 
> SIMPLE authentication method by HADOOP-10683.  This bypasses the proxyuser 
> ACL check and the isSecurityEnabled check, and allows the caller to 
> impersonate anyone.  This method could be abused in the main code base, 
> which can cause parts of Hadoop to become insecure without a proxyuser 
> check, in both SIMPLE and Kerberos enabled environments.

[jira] [Created] (HADOOP-15162) UserGroupInformation.createRemoteUser hardcode authentication method to SIMPLE

2018-01-05 Thread Eric Yang (JIRA)
Eric Yang created HADOOP-15162:
--

 Summary: UserGroupInformation.createRemoteUser hardcode 
authentication method to SIMPLE
 Key: HADOOP-15162
 URL: https://issues.apache.org/jira/browse/HADOOP-15162
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Reporter: Eric Yang


{{UserGroupInformation.createRemoteUser(String user)}} is hard-coded to the 
SIMPLE authentication method by HADOOP-10683.  This bypasses the proxyuser ACL 
check and the isSecurityEnabled check, and allows the caller to impersonate 
anyone.  This method could be abused in the main code base, which can cause 
parts of Hadoop to become insecure without a proxyuser check, in both SIMPLE 
and Kerberos enabled environments.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-15128) TestViewFileSystem tests are broken in trunk

2017-12-19 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16296316#comment-16296316
 ] 

Eric Yang edited comment on HADOOP-15128 at 12/20/17 1:15 AM:
--

In HADOOP-2381, the discussion thread contains good insights into why the 
original code was written as it was.  We should throw an exception when 
permission information cannot be loaded, and the owner, group, and permission 
information should not be null.  These safeguards make sure the file system 
security is not compromised.  The recent change allows other classes to extend 
the RawLocalFileSystem base class and circumvent security.  HADOOP-10054 must 
be reverted to prevent a security hole.  There are other ways to fix 
ViewFsFileStatus.toString() to make sure the path is initialized properly 
without modifying the FileStatus class.


was (Author: eyang):
In HADOOP-2381, the discussion thread contains good insights into why the 
original code was written as it was.  We should throw an exception when 
permission information cannot be loaded, and the owner, group, and permission 
information should not be null.  These safeguards make sure the file system 
security is not compromised.  The recent change allows other classes to extend 
the RawLocalFileSystem base class and circumvent security.  HADOOP-10054 must 
be reverted to prevent a security hole.  There are other ways to fix 
ViewFsFileStatus.toString() to make sure the path is initialized properly 
without modifying the RawLocalFileSystem class.

> TestViewFileSystem tests are broken in trunk
> 
>
> Key: HADOOP-15128
> URL: https://issues.apache.org/jira/browse/HADOOP-15128
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: viewfs
>Affects Versions: 3.1.0
>Reporter: Anu Engineer
>Assignee: Hanisha Koneru
>
> The fix in Hadoop-10054 seems to have caused a test failure. Please take a 
> look. Thanks [~eyang] for reporting this.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15128) TestViewFileSystem tests are broken in trunk

2017-12-18 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16296316#comment-16296316
 ] 

Eric Yang commented on HADOOP-15128:


In HADOOP-2381, the discussion thread contains good insights into why the 
original code was written as it was.  We should throw an exception when 
permission information cannot be loaded, and the owner, group, and permission 
information should not be null.  These safeguards make sure the file system 
security is not compromised.  The recent change allows other classes to extend 
the RawLocalFileSystem base class and circumvent security.  HADOOP-10054 must 
be reverted to prevent a security hole.  There are other ways to fix 
ViewFsFileStatus.toString() to make sure the path is initialized properly 
without modifying the RawLocalFileSystem class.

> TestViewFileSystem tests are broken in trunk
> 
>
> Key: HADOOP-15128
> URL: https://issues.apache.org/jira/browse/HADOOP-15128
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: viewfs
>Affects Versions: 3.1.0
>Reporter: Anu Engineer
>Assignee: Hanisha Koneru
>
> The fix in Hadoop-10054 seems to have caused a test failure. Please take a 
> look. Thanks [~eyang] for reporting this.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Reopened] (HADOOP-10054) ViewFsFileStatus.toString() is broken

2017-12-18 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang reopened HADOOP-10054:


This patch broke trunk development.  Please run the unit test:

{code}
mvn clean test -Dtest=TestViewFileSystemLocalFileSystem
{code}

> ViewFsFileStatus.toString() is broken
> -
>
> Key: HADOOP-10054
> URL: https://issues.apache.org/jira/browse/HADOOP-10054
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 2.0.5-alpha
>Reporter: Paul Han
>Assignee: Hanisha Koneru
>Priority: Minor
> Fix For: 3.0.1
>
> Attachments: HADOOP-10054.001.patch, HADOOP-10054.002.patch
>
>
> ViewFsFileStatus.toString() is broken.  The following code snippet:
> {code}
> FileStatus stat = somefunc(); // somefunc() returns an instance of 
> ViewFsFileStatus
> System.out.println("path:" + stat.getPath());
> System.out.println(stat.toString());
> {code}
> produces the output:
> {code}
> path:viewfs://x.com/user/X/tmp-48
> ViewFsFileStatus{path=null; isDirectory=false; length=0; replication=0; 
> blocksize=0; modification_time=0; access_time=0; owner=; group=; 
> permission=rw-rw-rw-; isSymlink=false}
> {code}
> Note that "path=null" is not correct.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-15008) Metrics sinks may emit too frequently if multiple sink periods are configured

2017-11-13 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16249863#comment-16249863
 ] 

Eric Yang edited comment on HADOOP-15008 at 11/13/17 11:39 PM:
---

+1  Verified that the test failure is not related to this patch.  The values 
output at the different intervals appear to be correct.


was (Author: eyang):
+1 Verified the test failure is not related to this patch.  The value output 
from different different intervals seem to have correct values.

> Metrics sinks may emit too frequently if multiple sink periods are configured
> -
>
> Key: HADOOP-15008
> URL: https://issues.apache.org/jira/browse/HADOOP-15008
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 2.2.0, 3.0.0-beta1
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Minor
> Fix For: 3.1.0
>
> Attachments: HADOOP-15008.000.patch
>
>
> If there are multiple metrics sink periods configured, depending on what 
> those periods are, some sinks may emit too frequently. For example with the 
> following:
> {code:title=hadoop-metrics2.properties}
> namenode.sink.file10.class=org.apache.hadoop.metrics2.sink.FileSink
> namenode.sink.file5.class=org.apache.hadoop.metrics2.sink.FileSink
> namenode.sink.file10.filename=namenode-metrics_per10.out
> namenode.sink.file5.filename=namenode-metrics_per5.out
> namenode.sink.file10.period=10
> namenode.sink.file5.period=5
> {code}
> I get the following:
> {code}
> ± for f in namenode-metrics_per*.out; do echo "$f" && grep 
> "metricssystem.MetricsSystem" $f | awk '{last=curr; curr=$1} END { print 
> curr-last }'; done
> namenode-metrics_per10.out
> 5000
> namenode-metrics_per5.out
> 5000
> {code}
> i.e., for both metrics files, each record is 5000 ms apart, even though one 
> of the sinks has been configured to emit at 10s intervals.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15008) Metrics sinks may emit too frequently if multiple sink periods are configured

2017-11-13 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated HADOOP-15008:
---
Affects Version/s: 2.2.0
   3.0.0-beta1
 Hadoop Flags: Incompatible change, Reviewed
Fix Version/s: 3.1.0

> Metrics sinks may emit too frequently if multiple sink periods are configured
> -
>
> Key: HADOOP-15008
> URL: https://issues.apache.org/jira/browse/HADOOP-15008
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 2.2.0, 3.0.0-beta1
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Minor
> Fix For: 3.1.0
>
> Attachments: HADOOP-15008.000.patch
>
>
> If there are multiple metrics sink periods configured, depending on what 
> those periods are, some sinks may emit too frequently. For example with the 
> following:
> {code:title=hadoop-metrics2.properties}
> namenode.sink.file10.class=org.apache.hadoop.metrics2.sink.FileSink
> namenode.sink.file5.class=org.apache.hadoop.metrics2.sink.FileSink
> namenode.sink.file10.filename=namenode-metrics_per10.out
> namenode.sink.file5.filename=namenode-metrics_per5.out
> namenode.sink.file10.period=10
> namenode.sink.file5.period=5
> {code}
> I get the following:
> {code}
> ± for f in namenode-metrics_per*.out; do echo "$f" && grep 
> "metricssystem.MetricsSystem" $f | awk '{last=curr; curr=$1} END { print 
> curr-last }'; done
> namenode-metrics_per10.out
> 5000
> namenode-metrics_per5.out
> 5000
> {code}
> i.e., for both metrics files, each record is 5000 ms apart, even though one 
> of the sinks has been configured to emit at 10s intervals.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15008) Metrics sinks may emit too frequently if multiple sink periods are configured

2017-11-13 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated HADOOP-15008:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

I just committed this.  Thank you, Erik.

> Metrics sinks may emit too frequently if multiple sink periods are configured
> -
>
> Key: HADOOP-15008
> URL: https://issues.apache.org/jira/browse/HADOOP-15008
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 2.2.0, 3.0.0-beta1
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Minor
> Fix For: 3.1.0
>
> Attachments: HADOOP-15008.000.patch
>
>
> If there are multiple metrics sink periods configured, depending on what 
> those periods are, some sinks may emit too frequently. For example with the 
> following:
> {code:title=hadoop-metrics2.properties}
> namenode.sink.file10.class=org.apache.hadoop.metrics2.sink.FileSink
> namenode.sink.file5.class=org.apache.hadoop.metrics2.sink.FileSink
> namenode.sink.file10.filename=namenode-metrics_per10.out
> namenode.sink.file5.filename=namenode-metrics_per5.out
> namenode.sink.file10.period=10
> namenode.sink.file5.period=5
> {code}
> I get the following:
> {code}
> ± for f in namenode-metrics_per*.out; do echo "$f" && grep 
> "metricssystem.MetricsSystem" $f | awk '{last=curr; curr=$1} END { print 
> curr-last }'; done
> namenode-metrics_per10.out
> 5000
> namenode-metrics_per5.out
> 5000
> {code}
> i.e., for both metrics files, each record is 5000 ms apart, even though one 
> of the sinks has been configured to emit at 10s intervals.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org


