[jira] [Commented] (TEZ-3904) an API to update tokens for Tez AM and the DAG

2018-06-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511824#comment-16511824
 ] 

ASF GitHub Bot commented on TEZ-3904:
-

Github user bjmb commented on a diff in the pull request:

https://github.com/apache/tez/pull/22#discussion_r195270901
  
--- Diff: tez-api/src/main/java/org/apache/tez/client/TezClientUtils.java ---
@@ -438,6 +438,26 @@ static Credentials setupDAGCredentials(DAG dag, Credentials sessionCredentials,
 return dagCredentials;
   }
 
+  @Private
+  @VisibleForTesting
+  public static Credentials createAMCredentials(ApplicationId appId,
--- End diff --

I pulled this out because I was going to reuse it, but I will put it back 
in if I don't.


> an API to update tokens for Tez AM and the DAG
> --
>
> Key: TEZ-3904
> URL: https://issues.apache.org/jira/browse/TEZ-3904
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> Nothing is permanent in this world, least of all delegation tokens.
> The current way around token expiration (the one where you cannot keep 
> renewing anymore) in Hive when Tez AM is used in session mode is to cycle Tez 
> AM. It may happen though that a query is running at that time, and so the AM 
> cannot be restarted with new tokens. We let the query run its course and it 
> usually dies because it tries to do something with an expired token.
> To get around that, we cycle AMs a few hours before tokens are going to 
> expire.
> However, that is still not ideal because it puts an upper bound on safe Hive 
> query runtime (a query longer than 3 hours with current config may fail due 
> to an expired token if its timing is unlucky), and also precludes setting 
> tokens to expire much faster than the standard 7-day time frame.
> There should be a mechanism to replace tokens in the AM, including for a 
> running DAG.
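
The timing constraint in the description can be made concrete with a small worked example. This is only a sketch under stated assumptions: `TokenWindowSketch` and `finishesBeforeExpiry` are hypothetical names, the dates are made up, and the 7-day lifetime and 3-hour margin are simply the figures quoted above.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of Tez or Hive.
final class TokenWindowSketch {

  // True if a query started at queryStart and running for runtime
  // ends no later than the token expiry time.
  static boolean finishesBeforeExpiry(Instant queryStart, Duration runtime, Instant tokenExpiry) {
    return !queryStart.plus(runtime).isAfter(tokenExpiry);
  }

  public static void main(String[] args) {
    Instant issued = Instant.parse("2018-06-01T00:00:00Z"); // made-up issue time
    Instant expiry = issued.plus(Duration.ofDays(7));       // 7-day token lifetime
    Duration margin = Duration.ofHours(3);                  // AM is cycled this long before expiry
    Instant cyclePoint = expiry.minus(margin);

    // Worst case: the query is submitted just as the AM is due to be cycled,
    // so it still runs with the old tokens.
    System.out.println(finishesBeforeExpiry(cyclePoint, Duration.ofHours(2), expiry));
    System.out.println(finishesBeforeExpiry(cyclePoint, Duration.ofHours(4), expiry));
  }
}
```

In the worst case the safe runtime equals the margin itself, which is why widening the margin (or shortening the token lifetime) makes the workaround less attractive.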



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (TEZ-3904) an API to update tokens for Tez AM and the DAG

2018-06-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511821#comment-16511821
 ] 

ASF GitHub Bot commented on TEZ-3904:
-

Github user bjmb commented on a diff in the pull request:

https://github.com/apache/tez/pull/22#discussion_r195270455
  
--- Diff: tez-runtime-internals/src/main/java/org/apache/tez/runtime/task/TezTaskRunner2.java ---
@@ -132,7 +135,7 @@ public TezTaskRunner2(Configuration tezConf, UserGroupInformation ugi, String[]
 ObjectRegistry objectRegistry, String pid,
 ExecutionContext executionContext, long memAvailable,
 boolean updateSysCounters, HadoopShim hadoopShim,
-TezExecutors sharedExecutor) throws
+TezExecutors sharedExecutor, SystemEventHandler systemEventHandler) throws
--- End diff --

Same with this API




[jira] [Commented] (TEZ-3904) an API to update tokens for Tez AM and the DAG

2018-06-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511817#comment-16511817
 ] 

ASF GitHub Bot commented on TEZ-3904:
-

GitHub user bjmb opened a pull request:

https://github.com/apache/tez/pull/22

[WIP][TEZ-3904] An API to update tokens for Tez AM and the DAG

This PR has ended up bigger than I expected, so I wanted to ask for some 
feedback before going forward. This is what it mainly does:
* Adds `updateAMCredentials` and `updateDAGCredentials` to `TezClient`. 
This is not the final API, but these functions will be useful anyway to send 
the credentials.
* Adds two corresponding functions to the `DAGClientAMProtocol`.
* Adds a `SystemEventHandler` in the runtime internals to process 
`UpdateCredentialsEvent`s.
* Adds the `UpdateCredentialsEvent` event and some transitions to the 
`TaskImpl` state machine that represent the credentials being updated.
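
To make the event flow in the list above easier to picture, here is a minimal, self-contained sketch. It is not the code from the pull request: the classes `UpdateCredentialsEventSketch`, `SystemEventHandlerSketch` and `CredentialsUpdateDemo` are hypothetical stand-ins, tokens are modelled as plain alias-to-string maps rather than Hadoop `Credentials`, and the merge behaviour (overwrite matching aliases, keep the rest) is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the UpdateCredentialsEvent named in the PR.
// Tokens are modelled as alias -> opaque string, not Hadoop Credentials.
final class UpdateCredentialsEventSketch {
  private final Map<String, String> tokens;

  UpdateCredentialsEventSketch(Map<String, String> tokens) {
    this.tokens = new HashMap<>(tokens); // defensive copy
  }

  Map<String, String> getTokens() {
    return tokens;
  }
}

// Hypothetical stand-in for a runtime-side handler of system events.
final class SystemEventHandlerSketch {
  private final Map<String, String> current = new ConcurrentHashMap<>();

  SystemEventHandlerSketch(Map<String, String> initial) {
    current.putAll(initial);
  }

  // Assumed semantics: overwrite tokens with matching aliases,
  // keep the tokens the event does not mention.
  void handle(UpdateCredentialsEventSketch event) {
    current.putAll(event.getTokens());
  }

  String tokenFor(String alias) {
    return current.get(alias);
  }
}

final class CredentialsUpdateDemo {
  public static void main(String[] args) {
    Map<String, String> initial = new HashMap<>();
    initial.put("hdfs", "old-hdfs-token");
    initial.put("hbase", "old-hbase-token");
    SystemEventHandlerSketch handler = new SystemEventHandlerSketch(initial);

    // The client side would send the renewed tokens; here the event is built directly.
    Map<String, String> renewed = new HashMap<>();
    renewed.put("hdfs", "new-hdfs-token");
    handler.handle(new UpdateCredentialsEventSketch(renewed));

    System.out.println(handler.tokenFor("hdfs"));  // replaced by the event
    System.out.println(handler.tokenFor("hbase")); // untouched by the event
  }
}
```

Whether an update should merge into the existing credentials or replace them wholesale is a design choice this sketch does not settle.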

TODO:
* Add more tests. I will do this after the feedback.
* Log the credentials change to history?
* I've added some logic to renew the session credentials; should these 
credentials be updated as well?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bjmb/tez TEZ-3904

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/tez/pull/22.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22


commit 6b50f27fda0c1a8698f0aed0f5e63f5f8413ba78
Author: Jaume Marhuenda 
Date:   2018-06-08T18:12:44Z

[TEZ-3904] An API to update tokens for Tez AM and the DAG






[jira] [Commented] (TEZ-3904) an API to update tokens for Tez AM and the DAG

2018-06-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511820#comment-16511820
 ] 

ASF GitHub Bot commented on TEZ-3904:
-

Github user bjmb commented on a diff in the pull request:

https://github.com/apache/tez/pull/22#discussion_r195270418
  
--- Diff: tez-runtime-internals/src/main/java/org/apache/tez/runtime/LogicalIOProcessorRuntimeTask.java ---
@@ -160,14 +161,15 @@
   private final boolean initializeProcessorFirst;
   private final boolean initializeProcessorIOSerially;
   private final TezExecutors sharedExecutor;
+  private final SystemEventHandler systemEventHandler;
 
   public LogicalIOProcessorRuntimeTask(TaskSpec taskSpec, int appAttemptNumber,
   Configuration tezConf, String[] localDirs, TezUmbilical tezUmbilical,
   Map serviceConsumerMetadata, Map envMap,
   Multimap startedInputsMap, ObjectRegistry objectRegistry,
   String pid, ExecutionContext ExecutionContext, long memAvailable,
   boolean updateSysCounters, HadoopShim hadoopShim,
-  TezExecutors sharedExecutor) throws IOException {
+  TezExecutors sharedExecutor, SystemEventHandler systemEventHandler) throws IOException {
--- End diff --

This API changes; is that a problem?




[jira] [Commented] (TEZ-3904) an API to update tokens for Tez AM and the DAG

2018-06-08 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506317#comment-16506317
 ] 

Sergey Shelukhin commented on TEZ-3904:
---

Yeah that's the idea, although in the case of Tez the actual renewer may not 
live fully in Tez (since some tokens, like the ones for HBase/etc., are 
originally obtained by Hive, and some are obtained by Tez based on paths). 
It might make sense to allow the users of Tez (i.e. Hive) to supply a function to 
get the former instead of the tokens themselves, so Tez could get the new 
tokens at any time.
The containers would also need to get tokens.
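
The "supply a function instead of the tokens themselves" idea could look roughly like the sketch below. Everything here is hypothetical: `TokenRefresherSketch` is not a Tez class, tokens are modelled as alias-to-string maps, and the two suppliers merely stand in for "tokens obtained by the caller (e.g. Hive)" and "tokens obtained by the framework based on paths".

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: the framework holds suppliers, not token snapshots,
// so it can ask for fresh tokens whenever it needs them.
final class TokenRefresherSketch {
  private final Supplier<Map<String, String>> callerTokens; // e.g. HBase tokens fetched by the caller
  private final Supplier<Map<String, String>> pathTokens;   // e.g. tokens the framework gets from paths

  TokenRefresherSketch(Supplier<Map<String, String>> callerTokens,
                       Supplier<Map<String, String>> pathTokens) {
    this.callerTokens = callerTokens;
    this.pathTokens = pathTokens;
  }

  // Invokes both suppliers again and merges the results into a new map.
  Map<String, String> refresh() {
    Map<String, String> merged = new HashMap<>(pathTokens.get());
    merged.putAll(callerTokens.get());
    return merged;
  }

  public static void main(String[] args) {
    int[] generation = {0}; // counts how often the caller-side supplier is invoked
    TokenRefresherSketch refresher = new TokenRefresherSketch(
        () -> Collections.singletonMap("hbase", "hbase-token-" + (++generation[0])),
        () -> Collections.singletonMap("hdfs", "hdfs-token"));

    System.out.println(refresher.refresh().get("hbase")); // first generation
    System.out.println(refresher.refresh().get("hbase")); // second generation, fetched anew
  }
}
```

The refreshed map would still have to be distributed to the AM and to running containers; the sketch only covers how the new tokens are obtained.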



[jira] [Commented] (TEZ-3904) an API to update tokens for Tez AM and the DAG

2018-06-07 Thread Jaume M (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16505201#comment-16505201
 ] 

Jaume M commented on TEZ-3904:
--

Spark seems to [push from the 
driver|https://github.com/apache/spark/blob/e76b0124fbe463def00b1dffcfd8fd47e04772fe/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/security/AMCredentialRenewer.scala#L37]
 new delegation tokens when they are close to expiring. [~sershe], would the 
containers started by Tez where the DAG is running also have to get new 
credentials?



[jira] [Commented] (TEZ-3904) an API to update tokens for Tez AM and the DAG

2018-05-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484426#comment-16484426
 ] 

Sergey Shelukhin commented on TEZ-3904:
---

Not that I know of. However, MR is deprecated in Hive, so we basically only 
care about Tez and Spark. Not sure what Spark does for that, if it uses 
delegation tokens at all.



[jira] [Commented] (TEZ-3904) an API to update tokens for Tez AM and the DAG

2018-05-22 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483952#comment-16483952
 ] 

Jonathan Eagles commented on TEZ-3904:
--

[~sershe], does MR or any other runtime framework allow for updating tokens via 
a separate API?
