[jira] [Commented] (KNOX-736) Support NIS Schema in Demo LDAP

2017-10-31 Thread Sandeep More (JIRA)

[ 
https://issues.apache.org/jira/browse/KNOX-736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226850#comment-16226850
 ] 

Sandeep More commented on KNOX-736:
---

Hello [~hkropp], sorry for the delay in getting to this patch. It turns out
that, as part of the KNOX-1088 refactoring, the POSIX schema was enabled 
[see|https://github.com/apache/knox/blob/master/gateway-demo-ldap/src/main/java/org/apache/hadoop/gateway/security/ldap/SimpleLdapDirectoryServer.java#L93]
 

Let me know if you think this still needs to be fixed.

> Support NIS Schema in Demo LDAP
> ---
>
> Key: KNOX-736
> URL: https://issues.apache.org/jira/browse/KNOX-736
> Project: Apache Knox
>  Issue Type: Improvement
>Reporter: Henning Kropp
>Assignee: Henning Kropp
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: KNOX-736.diff
>
>
> With support of the NIS schema users in the directory could be created with 
> POSIX attributes. With POSIX attributes LDAP users could be mapped to the OS 
> level for authentication with PAM. This could support testing and 
> demonstrating the PAM authentication in Knox (KNOX-537)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (KNOX-736) Support NIS Schema in Demo LDAP

2017-10-31 Thread Sandeep More (JIRA)

[ 
https://issues.apache.org/jira/browse/KNOX-736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226850#comment-16226850
 ] 

Sandeep More edited comment on KNOX-736 at 10/31/17 2:13 PM:
-

Hello [~hkropp] sorry for the delay in getting to this patch. It turns out as a 
part of KNOX-1088 refactoring, posix schema was enabled 
[see|https://github.com/apache/knox/blob/master/gateway-demo-ldap/src/main/java/org/apache/hadoop/gateway/security/ldap/SimpleLdapDirectoryServer.java#L93]
 

Let me know if you think this still needs to be fixed.


was (Author: smore):
Hello [~hkropp] sorry for the delay in getting to this patch. It turns out as a 
part of KNOX-1088 refactoring, Posix schema was enabled 
[see|https://github.com/apache/knox/blob/master/gateway-demo-ldap/src/main/java/org/apache/hadoop/gateway/security/ldap/SimpleLdapDirectoryServer.java#L93]
 

Let me know if you think this still needs to be fixed.

> Support NIS Schema in Demo LDAP
> ---
>
> Key: KNOX-736
> URL: https://issues.apache.org/jira/browse/KNOX-736
> Project: Apache Knox
>  Issue Type: Improvement
>Reporter: Henning Kropp
>Assignee: Henning Kropp
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: KNOX-736.diff
>
>
> With support of the NIS schema users in the directory could be created with 
> POSIX attributes. With POSIX attributes LDAP users could be mapped to the OS 
> level for authentication with PAM. This could support testing and 
> demonstrating the PAM authentication in Knox (KNOX-537)





[jira] [Comment Edited] (KNOX-736) Support NIS Schema in Demo LDAP

2017-10-31 Thread Sandeep More (JIRA)

[ 
https://issues.apache.org/jira/browse/KNOX-736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226850#comment-16226850
 ] 

Sandeep More edited comment on KNOX-736 at 10/31/17 2:13 PM:
-

Hello [~hkropp] sorry for the delay in getting to this patch. It turns out as a 
part of KNOX-1088 refactoring, Posix schema was enabled 
[see|https://github.com/apache/knox/blob/master/gateway-demo-ldap/src/main/java/org/apache/hadoop/gateway/security/ldap/SimpleLdapDirectoryServer.java#L93]
 

Let me know if you think this still needs to be fixed.


was (Author: smore):
Hello [~hkropp] sorry for the delay in getting to this patch. It turns out as a 
part of KNOX-1088 refactoring Posix schema was enabled 
[see|https://github.com/apache/knox/blob/master/gateway-demo-ldap/src/main/java/org/apache/hadoop/gateway/security/ldap/SimpleLdapDirectoryServer.java#L93]
 

Let me know if you think this still needs to be fixed.

> Support NIS Schema in Demo LDAP
> ---
>
> Key: KNOX-736
> URL: https://issues.apache.org/jira/browse/KNOX-736
> Project: Apache Knox
>  Issue Type: Improvement
>Reporter: Henning Kropp
>Assignee: Henning Kropp
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: KNOX-736.diff
>
>
> With support of the NIS schema users in the directory could be created with 
> POSIX attributes. With POSIX attributes LDAP users could be mapped to the OS 
> level for authentication with PAM. This could support testing and 
> demonstrating the PAM authentication in Knox (KNOX-537)





[jira] [Commented] (KNOX-719) Knox support for Yarn Resource Manager HA

2017-10-31 Thread Nixon Rodrigues (JIRA)

[ 
https://issues.apache.org/jira/browse/KNOX-719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227169#comment-16227169
 ] 

Nixon Rodrigues commented on KNOX-719:
--

[~jeffreyr97], can you provide documentation on configuring the YARN Resource 
Manager in a Knox topology for accessing the YARN RM UI?

Thanks
Nixon

> Knox support for Yarn Resource Manager HA
> -
>
> Key: KNOX-719
> URL: https://issues.apache.org/jira/browse/KNOX-719
> Project: Apache Knox
>  Issue Type: Bug
>Affects Versions: 0.6.0, 0.7.0, 0.8.0, 0.9.0, 0.10.0, 0.11.0
>Reporter: Jeffrey E  Rodriguez
>Assignee: Jeffrey E  Rodriguez
> Fix For: 0.12.0
>
> Attachments: KNOX-719-4.patch, KNOX_719_5.patch
>
>
> This would support both REST/UI YARN Resource Manager HA. Based on other HA 
> providers added in Knox.





[jira] [Created] (KNOX-1091) Knox Audit Logging - duplicate correlation ids

2017-10-31 Thread Sandeep More (JIRA)
Sandeep More created KNOX-1091:
--

 Summary: Knox Audit Logging - duplicate correlation ids
 Key: KNOX-1091
 URL: https://issues.apache.org/jira/browse/KNOX-1091
 Project: Apache Knox
  Issue Type: Bug
  Components: Server
Reporter: Kevin Risden
 Fix For: 0.15.0


From the Knox User list thread "Multiple topology audit logging", it came to 
my attention that Knox seems to be logging duplicate correlation ids. 
Separating out the topic specifically here to dig a bit deeper.

While looking at our Knox audit logs (Knox 0.9 on HDP 2.5) the "correlation id" 
doesn't seem to be unique across requests. Is this to be expected? Here is a 
snippet (anonymized):

grep 7557c91b-2a48-4e09-aefc-44e9892372da /var/knox/gateway-audit.log
{code}
17/10/10 12:50:09 ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASEaccess|uri|/gateway/HADOOPTEST/hbase/hbase/NAMESPACE1:TABLE1/ID1//|unavailable|Request method: GET
17/10/10 12:50:09 ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASE|USER1|||authentication|uri|/gateway/HADOOPPROD/hbase/NAMESPACE2:TABLE2/multiget?row=ID2%2fd%3araw&|success|
17/10/10 12:50:09 ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASE|USER1|||authentication|uri|/gateway/HADOOPPROD/hbase/NAMESPACE2:TABLE2/multiget?row=ID2%2fd%3araw&|success|Groups: []
17/10/10 12:50:09 ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASE|USER1|||dispatch|uri|http://WEBHBASE.example.com:8084/NAMESPACE2:TABLE2/multiget?doAs=USER1&row=ID2%2Fd%3Araw|unavailable|Request method: GET
17/10/10 12:50:09 ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASE|USER1|||dispatch|uri|http://WEBHBASE.example.com:8084/NAMESPACE2:TABLE2/multiget?doAs=USER1&row=ID2%2Fd%3Araw|success|Response status: 200
17/10/10 12:50:09 ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASE|USER1|||access|uri|/gateway/HADOOPPROD/hbase/NAMESPACE2:TABLE2/multiget?row=ID2%2fd%3araw&|success|Response status: 200
17/10/10 12:50:09 ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASEauthentication|principal|USER2|failure|LDAP authentication failed.
17/10/10 12:50:09 ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASEaccess|uri|/gateway/HADOOPTEST/hbase/hbase/NAMESPACE1:TABLE2/ID1//|success|Response status: 401
{code}

The things to highlight here for the same correlation id:
* different topologies are being used
* different uris are being used
* different users are being used

Some of the things that we have configured that could impact results:
* authentication caching
* multiple Knox servers
* load balancer in front of Knox
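
To check how widespread this is, one could group the audit lines by correlation id and list the topologies seen for each id. The following is only a rough sketch under assumptions: the class CorrelationIdCheck is hypothetical (not part of Knox), the correlation id is taken to be the first UUID on a line, and the topology is taken to be the path segment after /gateway/. Dispatch lines, which carry the backend URL, are simply skipped.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Knox: groups audit lines by correlation id
// and collects the topologies seen for each id.
class CorrelationIdCheck {
    // Assumption: the correlation id is the first UUID on the line.
    private static final Pattern UUID_PATTERN = Pattern.compile(
        "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}");
    // Assumption: the topology is the path segment right after "/gateway/".
    private static final Pattern TOPOLOGY_PATTERN = Pattern.compile("/gateway/([^/|]+)/");

    static Map<String, Set<String>> topologiesById(List<String> lines) {
        Map<String, Set<String>> result = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher id = UUID_PATTERN.matcher(line);
            Matcher topology = TOPOLOGY_PATTERN.matcher(line);
            // Lines without an id or a gateway URI (e.g. dispatch lines) are skipped.
            if (id.find() && topology.find()) {
                result.computeIfAbsent(id.group(), k -> new LinkedHashSet<>())
                      .add(topology.group(1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "17/10/10 12:50:09 ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASE|USER1|||access|uri|/gateway/HADOOPPROD/hbase/NAMESPACE2:TABLE2/multiget?row=ID2%2fd%3araw&|success|Response status: 200",
            "17/10/10 12:50:09 ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASEaccess|uri|/gateway/HADOOPTEST/hbase/hbase/NAMESPACE1:TABLE2/ID1//|success|Response status: 401");
        // Report every correlation id that appears under more than one topology.
        topologiesById(lines).forEach((id, topologies) -> {
            if (topologies.size() > 1) {
                System.out.println(id + " spans topologies " + topologies);
            }
        });
    }
}
```

Running something like this over the audit logs of each Knox server separately might also help tell whether the duplicates come from a single instance or only show up once logs from several servers are combined.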





[jira] [Created] (KNOX-1092) There is a carriage return in the hiveserver2 ha documentation

2017-10-31 Thread David Villarreal (JIRA)
David Villarreal created KNOX-1092:
--

 Summary: There is a carriage return in the hiveserver2 ha 
documentation
 Key: KNOX-1092
 URL: https://issues.apache.org/jira/browse/KNOX-1092
 Project: Apache Knox
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: David Villarreal


There is a carriage return before zookeeperNamespace in the HiveServer2 HA 
documentation:

<provider>
    <role>ha</role>
    <name>HaProvider</name>
    <enabled>true</enabled>
    <param>
        <name>HIVE</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true;zookeeperEnsemble=machine1:2181,machine2:2181,machine3:2181;
   zookeeperNamespace=hiveserver2</value>
    </param>
</provider>

With this carriage return, zookeeperNamespace becomes null in Knox, causing the 
following stack trace:

Caused by: java.lang.NullPointerException
    at java.net.URI$Parser.parse(URI.java:3042)
    at java.net.URI.<init>(URI.java:588)
    at java.net.URI.create(URI.java:850)
    at org.apache.hadoop.gateway.ha.provider.impl.DefaultURLManager.markFailed(DefaultURLManager.java:87)
    at org.apache.hadoop.gateway.ha.provider.impl.DefaultHaProvider.markFailedURL(DefaultHaProvider.java:91)
    at org.apache.hadoop.gateway.ha.dispatch.DefaultHaDispatch.failoverRequest(DefaultHaDispatch.java:107)
    at org.apache.hadoop.gateway.ha.dispatch.DefaultHaDispatch.executeRequest(DefaultHaDispatch.java:94)


Please correct the documentation so that the ZooKeeper parameters are all on a 
single line, without the carriage return.
http://knox.apache.org/books/knox-0-13-0/user-guide.html#HiveServer2+HA
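
One plausible way such a line break could turn into a null value is sketched below. This is an illustration only and not Knox's actual parsing code: the class HaParamParsing and its parse method are hypothetical, and the sketch simply assumes the value is split on ";" and "=" without trimming whitespace.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, NOT Knox's parser: shows how an untrimmed key that
// starts with a line break can no longer be found under its expected name.
class HaParamParsing {
    static Map<String, String> parse(String value, boolean trim) {
        Map<String, String> params = new HashMap<>();
        for (String pair : value.split(";")) {
            String[] kv = pair.split("=", 2);
            if (kv.length == 2) {
                String key = trim ? kv[0].trim() : kv[0];
                String val = trim ? kv[1].trim() : kv[1];
                params.put(key, val);
            }
        }
        return params;
    }

    public static void main(String[] args) {
        // Same value as in the documentation, including the line break.
        String value = "maxFailoverAttempts=3;failoverSleep=1000;enabled=true;"
            + "zookeeperEnsemble=machine1:2181,machine2:2181,machine3:2181;\n"
            + "   zookeeperNamespace=hiveserver2";
        // Without trimming, the key is "\n   zookeeperNamespace", so the lookup misses.
        System.out.println(parse(value, false).get("zookeeperNamespace")); // prints "null"
        // With trimming, the key is found.
        System.out.println(parse(value, true).get("zookeeperNamespace"));  // prints "hiveserver2"
    }
}
```

Whether or not the parser is made tolerant of whitespace, fixing the documentation avoids the problem for anyone copying the example as-is.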





[jira] [Created] (KNOX-1093) KNOX Not Handling safemode state of one of the NameNode In HA state

2017-10-31 Thread Rajesh Chandramohan (JIRA)
Rajesh Chandramohan created KNOX-1093:
-

 Summary: KNOX Not Handling safemode state of one of the NameNode 
In HA state 
 Key: KNOX-1093
 URL: https://issues.apache.org/jira/browse/KNOX-1093
 Project: Apache Knox
  Issue Type: Bug
  Components: Server
Affects Versions: 0.10.0
Reporter: Rajesh Chandramohan



Per the code in WebHdfsHaDispatch.java, when a SafeModeException occurs, the 
dispatch calls retryRequest(), which in turn calls executeRequest(), just as a 
failover request does. However, the NameNode used by the thread does not change 
across any of its iterations, up to maxRetryAttempts=300 with retrySleep=1000 
(1 second), i.e. up to 300 x 1 s = 5 minutes.
After at most 5 minutes, client retries should pick the right NameNode, at 
least on the next attempt.
But if we need to copy a set of files within a stipulated time, X% of the 
connections land on this NameNode and fail. Can we handle that better?

{code:java}
try {
  inboundResponse = executeOutboundRequest(outboundRequest);
  writeOutboundResponse(outboundRequest, inboundRequest, outboundResponse, inboundResponse);
} catch (StandbyException e) {
  // Standby NameNode: fail over to another NameNode.
  LOG.errorReceivedFromStandbyNode(e);
  failoverRequest(outboundRequest, inboundRequest, outboundResponse, inboundResponse, e);
} catch (SafeModeException e) {
  // NameNode in safe mode: retry, but the NameNode does not change.
  LOG.errorReceivedFromSafeModeNode(e);
  retryRequest(outboundRequest, inboundRequest, outboundResponse, inboundResponse, e);
} catch (IOException e) {
  LOG.errorConnectingToServer(outboundRequest.getURI().toString(), e);
  failoverRequest(outboundRequest, inboundRequest, outboundResponse, inboundResponse, e);
}
{code}


The SafeModeException handling in the Knox HA dispatch code needs to change so 
that it flags the NameNode that is stuck in safe mode, maintains a do-not-try 
queue, and redirects all further connections only to the healthy active 
NameNode. This way we can handle the X% of failures. What do you think?
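
The do-not-try idea could look roughly like the sketch below. It is only an illustration under assumptions: the class SafeModeAwareSelector, its methods, and the quarantine period are hypothetical names, not part of the Knox code base, and the current time is passed in explicitly to keep the example easy to test.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch, not Knox code: keeps a "do not try" list of NameNode
// URLs that reported safe mode and skips them until a quarantine period ends.
class SafeModeAwareSelector {
    private final List<String> urls;
    private final long quarantineMillis;
    // URL -> time (in millis) until which the URL should be skipped.
    private final Map<String, Long> doNotTryUntil = new HashMap<>();

    SafeModeAwareSelector(List<String> urls, long quarantineMillis) {
        this.urls = List.copyOf(urls);
        this.quarantineMillis = quarantineMillis;
    }

    // Called when a request to this URL failed with a safe mode error.
    synchronized void markSafeMode(String url, long nowMillis) {
        doNotTryUntil.put(url, nowMillis + quarantineMillis);
    }

    // Returns the first URL that is not currently quarantined, if any.
    synchronized Optional<String> pick(long nowMillis) {
        for (String url : urls) {
            Long until = doNotTryUntil.get(url);
            if (until == null || nowMillis >= until) {
                return Optional.of(url);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        SafeModeAwareSelector selector = new SafeModeAwareSelector(
            List.of("http://nn1.example.com:50070", "http://nn2.example.com:50070"), 60_000L);
        long now = System.currentTimeMillis();
        selector.markSafeMode("http://nn1.example.com:50070", now);
        // While nn1 is quarantined, new requests go to nn2.
        System.out.println(selector.pick(now).orElse("none available"));
    }
}
```

With something along these lines, all threads would share the knowledge that a NameNode is in safe mode, instead of each thread retrying against it on its own.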





[jira] [Updated] (KNOX-1093) KNOX Not Handling safemode state of one of the NameNode In HA state

2017-10-31 Thread Rajesh Chandramohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/KNOX-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Chandramohan updated KNOX-1093:
--
Description: 
Per the code in WebHdfsHaDispatch.java, when a SafeModeException occurs, the 
dispatch calls retryRequest(), which in turn calls executeRequest(), just as a 
failover request does. However, the NameNode used by the thread does not change 
across any of its iterations, up to maxRetryAttempts=300 with retrySleep=1000 
(1 second), i.e. up to 300 x 1 s = 5 minutes.
After at most 5 minutes, client retries should pick the right NameNode, at 
least on the next attempt.
But if we need to copy a set of files within a stipulated time, X% of the 
connections land on this NameNode and fail. Can we handle that better?

{code:java}
try {
  inboundResponse = executeOutboundRequest(outboundRequest);
  writeOutboundResponse(outboundRequest, inboundRequest, outboundResponse, inboundResponse);
} catch (StandbyException e) {
  // Standby NameNode: fail over to another NameNode.
  LOG.errorReceivedFromStandbyNode(e);
  failoverRequest(outboundRequest, inboundRequest, outboundResponse, inboundResponse, e);
} catch (SafeModeException e) {
  // NameNode in safe mode: retry, but the NameNode does not change.
  LOG.errorReceivedFromSafeModeNode(e);
  retryRequest(outboundRequest, inboundRequest, outboundResponse, inboundResponse, e);
} catch (IOException e) {
  LOG.errorConnectingToServer(outboundRequest.getURI().toString(), e);
  failoverRequest(outboundRequest, inboundRequest, outboundResponse, inboundResponse, e);
}
{code}


The SafeModeException handling in the Knox HA dispatch code needs to change so 
that it flags the NameNode that is stuck in safe mode, maintains a do-not-try 
queue, and redirects all further connections only to the healthy active 
NameNode. This way we can handle the X% of failures. What do you think?

  was:

 per your code WebHdfsHaDispatch.java , When Safemode exception happened it 
calls the retryRequest() method. which also calls executeRequest() method as 
like failover request but the namenode info is not changing for the thread for 
all of its iteration until maxRetryAttempts=300 
and retrySleep=1000 ( 1 sec ) 
After Max 5 minutes , client retries should pick the right namenode atleast in 
next attempt.
 But in this case if we need to copy a set of files in stipulated time there is 
X% os connections falls into these namenode and fails. Can we candle that better

{code:java}
try {
 inboundResponse = executeOutboundRequest(outboundRequest);
 writeOutboundResponse(outboundRequest, inboundRequest, 
outboundResponse, inboundResponse);
  } catch (StandbyException e) {
 LOG.errorReceivedFromStandbyNode(e);
 failoverRequest(outboundRequest, inboundRequest, outboundResponse, 
inboundResponse, e);
  } catch (SafeModeException e) {
 LOG.errorReceivedFromSafeModeNode(e);
 retryRequest(outboundRequest, inboundRequest, outboundResponse, 
inboundResponse, e);
  } catch (IOException e) {
 LOG.errorConnectingToServer(outboundRequest.getURI().toString(), e);
 failoverRequest(outboundRequest, inboundRequest, outboundResponse, 
inboundResponse, e);
  }
   }
{code}


Need to change the logic in SafeModeexception state in  KNOX HADispatch code to 
flag the namenode which is stuck in safemode  and maintain don't try queue and 
redirect all further connection only to healthy active namenode . This way X5 
of failures we can handle. What do we think


> KNOX Not Handling safemode state of one of the NameNode In HA state 
> 
>
> Key: KNOX-1093
> URL: https://issues.apache.org/jira/browse/KNOX-1093
> Project: Apache Knox
>  Issue Type: Bug
>  Components: Server
>Affects Versions: 0.10.0
>Reporter: Rajesh Chandramohan
>
> Per the code in WebHdfsHaDispatch.java, when a SafeModeException occurs, the 
> dispatch calls retryRequest(), which in turn calls executeRequest(), just as 
> a failover request does. However, the NameNode used by the thread does not 
> change across any of its iterations, up to maxRetryAttempts=300 with 
> retrySleep=1000 (1 second), i.e. up to 300 x 1 s = 5 minutes.
> After at most 5 minutes, client retries should pick the right NameNode, at 
> least on the next attempt.
> But if we need to copy a set of files within a stipulated time, X% of the 
> connections land on this NameNode and fail. Can we handle that better?
> {code:java}
> try {
>  inboundResponse = executeOutboundRequest(outboundRequest);
>  writeOutboundResponse(outboundRequest, inboundRequest, 
> outboundResponse, inboundResponse);
>   } catch (StandbyException e) {
>  LOG.errorReceivedFromStandbyNode(e);
>  failoverRequest(outboundRequest, inboundRequest, outboundResponse, 
> inboundResponse, e);
>   } catch (SafeModeException e) {
>  LOG.errorReceivedFromSafeModeNode(e);
>  retr

[jira] [Commented] (KNOX-1091) Knox Audit Logging - duplicate correlation ids

2017-10-31 Thread Kevin Risden (JIRA)

[ 
https://issues.apache.org/jira/browse/KNOX-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16233525#comment-16233525
 ] 

Kevin Risden commented on KNOX-1091:


More context is available in the email thread: 
https://mail-archives.apache.org/mod_mbox/knox-user/201710.mbox/%3CCAJU9nmifQR_D%3D9yVwbXVJ62VKqczZX8a4BedK6Dznwkk%3D1%2BnMw%40mail.gmail.com%3E

> Knox Audit Logging - duplicate correlation ids
> --
>
> Key: KNOX-1091
> URL: https://issues.apache.org/jira/browse/KNOX-1091
> Project: Apache Knox
>  Issue Type: Bug
>  Components: Server
>Reporter: Kevin Risden
>Priority: Major
> Fix For: 0.15.0
>
>
> From the Knox User list thread: "Multiple topology audit logging", it came to 
> my attention that Knox seems to be logging duplicate correlation ids. 
> Separating out the topic specifically here to dig a bit deeper.
> While looking at our Knox audit logs (Knox 0.9 on HDP 2.5) the "correlation 
> id" doesn't seem to be unique across requests. Is this to be expected? Here 
> is a snippet (anonymized):
> grep 7557c91b-2a48-4e09-aefc-44e9892372da /var/knox/gateway-audit.log
>  {code}
> 17/10/10 12:50:09 
> ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASEaccess|uri|/gateway/HADOOPTEST/hbase/hbase/NAMESPACE1:TABLE1/ID1//|unavailable|Request
>  method: GET
> 17/10/10 12:50:09 
> ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASE|USER1|||authentication|uri|/gateway/HADOOPPROD/hbase/NAMESPACE2:TABLE2/multiget?row=ID2%2fd%3araw&|success|
> 17/10/10 12:50:09 
> ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASE|USER1|||authentication|uri|/gateway/HADOOPPROD/hbase/NAMESPACE2:TABLE2/multiget?row=ID2%2fd%3araw&|success|Groups:
>  []
> 17/10/10 12:50:09 
> ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASE|USER1|||dispatch|uri|http://WEBHBASE.example.com:8084/NAMESPACE2:TABLE2/multiget?doAs=USER1&row=ID2%2Fd%3Araw|unavailable|Request
>  method: GET
> 17/10/10 12:50:09 
> ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASE|USER1|||dispatch|uri|http://WEBHBASE.example.com:8084/NAMESPACE2:TABLE2/multiget?doAs=USER1&row=ID2%2Fd%3Araw|success|Response
>  status: 200
> 17/10/10 12:50:09 
> ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASE|USER1|||access|uri|/gateway/HADOOPPROD/hbase/NAMESPACE2:TABLE2/multiget?row=ID2%2fd%3araw&|success|Response
>  status: 200
> 17/10/10 12:50:09 
> ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASEauthentication|principal|USER2|failure|LDAP
>  authentication failed.
> 17/10/10 12:50:09 
> ||7557c91b-2a48-4e09-aefc-44e9892372da|audit|WEBHBASEaccess|uri|/gateway/HADOOPTEST/hbase/hbase/NAMESPACE1:TABLE2/ID1//|success|Response
>  status: 401
> {code}
> The things to highlight here for the same correlation id:
> * different topologies are being used
> * different uris are being used
> * different users are being used
> Some of the things that we have configured that could impact results:
> * authentication caching
> * multiple Knox servers
> * load balancer in front of Knox





[jira] [Created] (KNOX-1094) Knox loses inner exception in IllegalArgumentException issues during AD authentications

2017-10-31 Thread Pravin Bhagade (JIRA)
Pravin Bhagade created KNOX-1094:


 Summary: Knox loses inner exception in IllegalArgumentException 
issues during AD authentications
 Key: KNOX-1094
 URL: https://issues.apache.org/jira/browse/KNOX-1094
 Project: Apache Knox
  Issue Type: Bug
  Components: Server
Affects Versions: 0.12.0
Reporter: Pravin Bhagade
Priority: Normal


When Knox is configured to use Active Directory and an IllegalArgumentException 
is raised from a specific point in the code, the inner exception is lost, which 
makes it difficult to diagnose the issue. 

The exception is the one at line 733 of 
https://github.com/hortonworks/knox-release/blob/HDP-2.6.2.17-tag/gateway-provider-security-shiro/src/main/java/org/apache/hadoop/gateway/shirorealm/KnoxLdapRealm.java

{code:java}
} catch (NamingException e) {
  throw new IllegalArgumentException("Hit NamingException: " + e.getMessage());
}
{code}

Is it possible to change the code to preserve the inner exception (by passing 
it as the Throwable cause argument)?
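
A minimal sketch of the requested change is shown below. The class PreserveCause and the lookup() method are hypothetical stand-ins for the realm code. The only API relied on is the standard IllegalArgumentException(String, Throwable) constructor, which keeps the original exception as the cause.

```java
import javax.naming.NamingException;

// Hypothetical stand-in for the realm code; only the exception wrapping matters here.
class PreserveCause {
    // Simulates an LDAP call that fails.
    static void lookup() throws NamingException {
        throw new NamingException("simulated LDAP failure");
    }

    // Wraps the NamingException and passes it along as the cause,
    // so the original stack trace is kept.
    static IllegalArgumentException wrap(NamingException e) {
        return new IllegalArgumentException("Hit NamingException: " + e.getMessage(), e);
    }

    public static void main(String[] args) {
        try {
            lookup();
        } catch (NamingException e) {
            IllegalArgumentException wrapped = wrap(e);
            // The stack trace now includes a "Caused by: javax.naming.NamingException" section.
            wrapped.printStackTrace();
        }
    }
}
```

With the cause attached, the gateway log would show where in the LDAP interaction the NamingException originated, rather than just its message.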




