[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-05-13 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r872305484


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -45,10 +56,25 @@ public class KerberosDelegationTokenManager implements 
DelegationTokenManager {
 
 private final Configuration configuration;
 
+private final KerberosRenewalPossibleProvider 
kerberosRenewalPossibleProvider;
+
 @VisibleForTesting final Map 
delegationTokenProviders;
 
-public KerberosDelegationTokenManager(Configuration configuration) {
+private final ScheduledExecutor scheduledExecutor;
+
+private final ExecutorService ioExecutor;

Review Comment:
   Added.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-05-13 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r872223118


##
pom.xml:
##
@@ -125,7 +125,7 @@ under the License.
4.13.2
5.8.1
0.22.0
-   2.21.0
+   3.4.6

Review Comment:
   This is just a POC and open for discussion. Please see 
[this](https://github.com/apache/flink/pull/19372#discussion_r849691193) 
comment where I've considered all of the possible solutions which I'm aware of. 
As a result static method mocking seems like the least painful solution if we 
want to mock `UGI`.
   
   All in all if we decide not to do that I can remove the extra commit which 
contains the Mockito version upgrade. As a consequence either we drop 
`testStartTGTRenewalShouldScheduleRenewal` or I would like to hear a different 
approach to mock `UGI`. If we decide to go on w/ this then I agree this must be 
a separate jira.
   
   I'm basically fine w/ either approaches though my personal opinion is that 
in this exceptional case static function mocking would be the least painful 
solution with super compact code.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-05-13 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r872215096


##
flink-connectors/flink-connector-kinesis/src/test/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumerTest.java:
##
@@ -97,13 +98,14 @@
 import static org.junit.Assert.assertThat;
 import static org.junit.Assert.assertTrue;
 import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.mockStatic;
 import static org.mockito.Mockito.never;
 import static org.mockito.Mockito.spy;
 import static org.mockito.Mockito.when;
 
 /** Suite of FlinkKinesisConsumer tests for the methods called throughout the 
source life cycle. */
 @RunWith(PowerMockRunner.class)
-@PrepareForTest({FlinkKinesisConsumer.class, KinesisConfigUtil.class})
+@PrepareForTest(FlinkKinesisConsumer.class)
 public class FlinkKinesisConsumerTest extends TestLogger {

Review Comment:
   This is Mockito version change related which we may or may not do in a 
separate jira.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-21 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r855434579


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -110,13 +139,62 @@ public void obtainDelegationTokens(Credentials 
credentials) {
  * task managers.
  */
 @Override
-public void start() {
-LOG.info("Starting renewal task");
+public void start() throws Exception {
+checkNotNull(scheduledExecutor, "Scheduled executor must not be null");
+checkNotNull(executorService, "Executor service must not be null");
+checkState(tgtRenewalFuture == null, "Manager is already started");
+
+if (!kerberosRenewalPossibleProvider.isRenewalPossible()) {
+LOG.info("Renewal is NOT possible, skipping to start renewal 
task");
+return;
+}
+
+startTGTRenewal();
+}
+
+private void startTGTRenewal() throws IOException {
+LOG.debug("Starting credential renewal task");
+
+UserGroupInformation currentUser = 
UserGroupInformation.getCurrentUser();
+if (currentUser.isFromKeytab()) {
+// In Hadoop 2.x, renewal of the keytab-based login seems to be 
automatic, but in Hadoop
+// 3.x, it is configurable (see 
hadoop.kerberos.keytab.login.autorenewal.enabled, added
+// in HADOOP-9567). This task will make sure that the user stays 
logged in regardless of
+// that configuration's value. Note that 
checkTGTAndReloginFromKeytab() is a no-op if
+// the TGT does not need to be renewed yet.
+long tgtRenewalPeriod = 
configuration.get(KERBEROS_RELOGIN_PERIOD).toMillis();
+tgtRenewalFuture =
+scheduledExecutor.scheduleAtFixedRate(
+() ->
+executorService.execute(
+() -> {
+try {
+LOG.debug("Renewing TGT");
+
currentUser.checkTGTAndReloginFromKeytab();

Review Comment:
   Seems like fixed all the issues and all the tests passed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-21 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r855078086


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -110,13 +139,62 @@ public void obtainDelegationTokens(Credentials 
credentials) {
  * task managers.
  */
 @Override
-public void start() {
-LOG.info("Starting renewal task");
+public void start() throws Exception {
+checkNotNull(scheduledExecutor, "Scheduled executor must not be null");
+checkNotNull(executorService, "Executor service must not be null");
+checkState(tgtRenewalFuture == null, "Manager is already started");
+
+if (!kerberosRenewalPossibleProvider.isRenewalPossible()) {
+LOG.info("Renewal is NOT possible, skipping to start renewal 
task");
+return;
+}
+
+startTGTRenewal();
+}
+
+private void startTGTRenewal() throws IOException {
+LOG.debug("Starting credential renewal task");
+
+UserGroupInformation currentUser = 
UserGroupInformation.getCurrentUser();
+if (currentUser.isFromKeytab()) {
+// In Hadoop 2.x, renewal of the keytab-based login seems to be 
automatic, but in Hadoop
+// 3.x, it is configurable (see 
hadoop.kerberos.keytab.login.autorenewal.enabled, added
+// in HADOOP-9567). This task will make sure that the user stays 
logged in regardless of
+// that configuration's value. Note that 
checkTGTAndReloginFromKeytab() is a no-op if
+// the TGT does not need to be renewed yet.
+long tgtRenewalPeriod = 
configuration.get(KERBEROS_RELOGIN_PERIOD).toMillis();
+tgtRenewalFuture =
+scheduledExecutor.scheduleAtFixedRate(
+() ->
+executorService.execute(
+() -> {
+try {
+LOG.debug("Renewing TGT");
+
currentUser.checkTGTAndReloginFromKeytab();

Review Comment:
   The first try failed w/ mockito 3.4.0 because I ran into the 
[following](https://github.com/mockito/mockito/commit/12ba5936b736a2886220f9ae4a6492558dba4e14)
 issue.
   Upgraded to the latest patch version of mockito (3.4.6) to have all such 
fixes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-21 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r854939573


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -110,13 +139,62 @@ public void obtainDelegationTokens(Credentials 
credentials) {
  * task managers.
  */
 @Override
-public void start() {
-LOG.info("Starting renewal task");
+public void start() throws Exception {
+checkNotNull(scheduledExecutor, "Scheduled executor must not be null");
+checkNotNull(executorService, "Executor service must not be null");
+checkState(tgtRenewalFuture == null, "Manager is already started");
+
+if (!kerberosRenewalPossibleProvider.isRenewalPossible()) {
+LOG.info("Renewal is NOT possible, skipping to start renewal 
task");
+return;
+}
+
+startTGTRenewal();
+}
+
+private void startTGTRenewal() throws IOException {
+LOG.debug("Starting credential renewal task");
+
+UserGroupInformation currentUser = 
UserGroupInformation.getCurrentUser();
+if (currentUser.isFromKeytab()) {
+// In Hadoop 2.x, renewal of the keytab-based login seems to be 
automatic, but in Hadoop
+// 3.x, it is configurable (see 
hadoop.kerberos.keytab.login.autorenewal.enabled, added
+// in HADOOP-9567). This task will make sure that the user stays 
logged in regardless of
+// that configuration's value. Note that 
checkTGTAndReloginFromKeytab() is a no-op if
+// the TGT does not need to be renewed yet.
+long tgtRenewalPeriod = 
configuration.get(KERBEROS_RELOGIN_PERIOD).toMillis();
+tgtRenewalFuture =
+scheduledExecutor.scheduleAtFixedRate(
+() ->
+executorService.execute(
+() -> {
+try {
+LOG.debug("Renewing TGT");
+
currentUser.checkTGTAndReloginFromKeytab();

Review Comment:
   In the meantime I've had a deeper look at the mock framework versions and 
[here](https://gist.github.com/gaborgsomogyi/4e8b120cbebde6d2a6903e5fcccbbaff) 
is the extract.
   
   I've added a mockito version upgrade based unit test which we may or may not 
agree on.
   This single test works but not sure about all other tests, waiting on 
jenkins to show it to us.
   If we say this is not the direction or all other tests are not working then 
I can roll it back easily.
   As a general note from my side. Even if we would choose this solution I 
think it would be good to split it up to 3 PRs:
   * This PR w/o startTGTRenewal test
   * Upgrade mockito version
   * Add startTGTRenewal test
   
   If this not works or you disagree plz suggest a way.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-20 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r854030684


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -110,13 +139,62 @@ public void obtainDelegationTokens(Credentials 
credentials) {
  * task managers.
  */
 @Override
-public void start() {
-LOG.info("Starting renewal task");
+public void start() throws Exception {
+checkNotNull(scheduledExecutor, "Scheduled executor must not be null");
+checkNotNull(executorService, "Executor service must not be null");
+checkState(tgtRenewalFuture == null, "Manager is already started");
+
+if (!kerberosRenewalPossibleProvider.isRenewalPossible()) {
+LOG.info("Renewal is NOT possible, skipping to start renewal 
task");
+return;
+}
+
+startTGTRenewal();
+}
+
+private void startTGTRenewal() throws IOException {
+LOG.debug("Starting credential renewal task");
+
+UserGroupInformation currentUser = 
UserGroupInformation.getCurrentUser();
+if (currentUser.isFromKeytab()) {
+// In Hadoop 2.x, renewal of the keytab-based login seems to be 
automatic, but in Hadoop
+// 3.x, it is configurable (see 
hadoop.kerberos.keytab.login.autorenewal.enabled, added
+// in HADOOP-9567). This task will make sure that the user stays 
logged in regardless of
+// that configuration's value. Note that 
checkTGTAndReloginFromKeytab() is a no-op if
+// the TGT does not need to be renewed yet.
+long tgtRenewalPeriod = 
configuration.get(KERBEROS_RELOGIN_PERIOD).toMillis();
+tgtRenewalFuture =
+scheduledExecutor.scheduleAtFixedRate(
+() ->
+executorService.execute(
+() -> {
+try {
+LOG.debug("Renewing TGT");
+
currentUser.checkTGTAndReloginFromKeytab();

Review Comment:
   @dmvk did you have a chance to consider the situation in-depth? I've my own 
suggestion which may or may not intersect your opinion. Namely if it's not too 
horror complex upgrading to mockito 3.4.0 (where static function mocking is 
introduced) and choosing bullet point 1 is my preference. Though no idea why 
mockito is super old, maybe there was no agreement to upgrade that?! If you 
have some insights please share.
   
   If that would be an overkill then I would vote on bullet point 3 because all 
the other options would add hard to maintain and brittle solutions.
   
   WDYT?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-20 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r854030684


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -110,13 +139,62 @@ public void obtainDelegationTokens(Credentials 
credentials) {
  * task managers.
  */
 @Override
-public void start() {
-LOG.info("Starting renewal task");
+public void start() throws Exception {
+checkNotNull(scheduledExecutor, "Scheduled executor must not be null");
+checkNotNull(executorService, "Executor service must not be null");
+checkState(tgtRenewalFuture == null, "Manager is already started");
+
+if (!kerberosRenewalPossibleProvider.isRenewalPossible()) {
+LOG.info("Renewal is NOT possible, skipping to start renewal 
task");
+return;
+}
+
+startTGTRenewal();
+}
+
+private void startTGTRenewal() throws IOException {
+LOG.debug("Starting credential renewal task");
+
+UserGroupInformation currentUser = 
UserGroupInformation.getCurrentUser();
+if (currentUser.isFromKeytab()) {
+// In Hadoop 2.x, renewal of the keytab-based login seems to be 
automatic, but in Hadoop
+// 3.x, it is configurable (see 
hadoop.kerberos.keytab.login.autorenewal.enabled, added
+// in HADOOP-9567). This task will make sure that the user stays 
logged in regardless of
+// that configuration's value. Note that 
checkTGTAndReloginFromKeytab() is a no-op if
+// the TGT does not need to be renewed yet.
+long tgtRenewalPeriod = 
configuration.get(KERBEROS_RELOGIN_PERIOD).toMillis();
+tgtRenewalFuture =
+scheduledExecutor.scheduleAtFixedRate(
+() ->
+executorService.execute(
+() -> {
+try {
+LOG.debug("Renewing TGT");
+
currentUser.checkTGTAndReloginFromKeytab();

Review Comment:
   @dmvk did you have a chance to consider the situation in-depth? I've my own 
suggestion which may or may not intersect your opinion. Namely if it's not too 
horror complex upgrading to mockito 3.4.0 and choosing bullet point 1 is my 
preference. Though no idea why mockito is super old, maybe there was no 
agreement to upgrade that?! If you have some insights please share.
   
   If that would be an overkill then I would vote on bullet point 3 because all 
the other options would add hard to maintain and brittle solutions.
   
   WDYT?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-14 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r849691193


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -110,13 +139,62 @@ public void obtainDelegationTokens(Credentials 
credentials) {
  * task managers.
  */
 @Override
-public void start() {
-LOG.info("Starting renewal task");
+public void start() throws Exception {
+checkNotNull(scheduledExecutor, "Scheduled executor must not be null");
+checkNotNull(executorService, "Executor service must not be null");
+checkState(tgtRenewalFuture == null, "Manager is already started");
+
+if (!kerberosRenewalPossibleProvider.isRenewalPossible()) {
+LOG.info("Renewal is NOT possible, skipping to start renewal 
task");
+return;
+}
+
+startTGTRenewal();
+}
+
+private void startTGTRenewal() throws IOException {
+LOG.debug("Starting credential renewal task");
+
+UserGroupInformation currentUser = 
UserGroupInformation.getCurrentUser();
+if (currentUser.isFromKeytab()) {
+// In Hadoop 2.x, renewal of the keytab-based login seems to be 
automatic, but in Hadoop
+// 3.x, it is configurable (see 
hadoop.kerberos.keytab.login.autorenewal.enabled, added
+// in HADOOP-9567). This task will make sure that the user stays 
logged in regardless of
+// that configuration's value. Note that 
checkTGTAndReloginFromKeytab() is a no-op if
+// the TGT does not need to be renewed yet.
+long tgtRenewalPeriod = 
configuration.get(KERBEROS_RELOGIN_PERIOD).toMillis();
+tgtRenewalFuture =
+scheduledExecutor.scheduleAtFixedRate(
+() ->
+executorService.execute(
+() -> {
+try {
+LOG.debug("Renewing TGT");
+
currentUser.checkTGTAndReloginFromKeytab();

Review Comment:
   I've had a deeper look and let's summarize my findings:
   * No mocking framework allowed
   * `UserGroupInformation` class has no public constructor so instance 
creation is possible with reflection only which I'm pretty sure won't 
initialize the instance properly. As a result all places where 
`UserGroupInformation` is used need to be hacked around.
   * If we go to the `KerberosClient` direction we would see functions like 
`KerberosClient.hasCurrentUserKerberosCredentials()` instead of 
`UserGroupInformation.getCurrentUser().hasKerberosCredentials()` which is hacky 
but doable
   * But as soon as the condition gets complicated like this for example: 
`Option(currentUser.getRealUser()).getOrElse(currentUser).hasKerberosCredentials()`
 how should be the `KerberosClient` funtion be named?
   
   In Spark and other components I've tried to mock/reimplement/modify/make 
`UserGroupInformation` testable w/o any success. I think we have the same 
situation here unless you have a clear doable suggestion.
   
   I think realistically we have the following possibilities for this case:
   * Mock `UserGroupInformation.getCurrentUser()` static function and we give 
back a mocked `UserGroupInformation` instance -> Here powermock runner with 
junit5 is simply not working and mockito is too old to mock static functions. 
All in all here only the mockito version upgrade could be a potential solution.
   * Use reflection to call `UserGroupInformation` hidden constructor -> here I 
have no idea what will happen, I mean how well initialized the instance will be 
+ how to modify the instance behavior to give back something hardcoded
   * We don't write automated tests for places where `UserGroupInformation` is 
embedded
   * We introduce `KerberosClient` and we create functions like 
`currentUserRealUserOrElseCurrentUserHasKerberosCredentials()` from expressions 
like 
`Option(currentUser.getRealUser()).getOrElse(currentUser).hasKerberosCredentials()`
   
   Well, none of the proposals looks good but here in Flink I've not found the 
holy grail just like in other places.
   In Spark bullet point 3 has been implemented which is definitely debatable. 
The fact is that there with powermock bullet point 1 would be possible but 
nobody ever done that.
   
   There is a reason why 
[such](https://steveloughran.gitbooks.io/kerberos_and_hadoop/content/sections/ugi.html)
 writings are created:
   `If there is one class guaranteed to strike fear into anyone with experience 
in Hadoop+Kerberos code it is UserGroupInformation, abbreviated to "UGI"`
   
   Let's hear your opinion.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to 

[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-13 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r849691193


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -110,13 +139,62 @@ public void obtainDelegationTokens(Credentials 
credentials) {
  * task managers.
  */
 @Override
-public void start() {
-LOG.info("Starting renewal task");
+public void start() throws Exception {
+checkNotNull(scheduledExecutor, "Scheduled executor must not be null");
+checkNotNull(executorService, "Executor service must not be null");
+checkState(tgtRenewalFuture == null, "Manager is already started");
+
+if (!kerberosRenewalPossibleProvider.isRenewalPossible()) {
+LOG.info("Renewal is NOT possible, skipping to start renewal 
task");
+return;
+}
+
+startTGTRenewal();
+}
+
+private void startTGTRenewal() throws IOException {
+LOG.debug("Starting credential renewal task");
+
+UserGroupInformation currentUser = 
UserGroupInformation.getCurrentUser();
+if (currentUser.isFromKeytab()) {
+// In Hadoop 2.x, renewal of the keytab-based login seems to be 
automatic, but in Hadoop
+// 3.x, it is configurable (see 
hadoop.kerberos.keytab.login.autorenewal.enabled, added
+// in HADOOP-9567). This task will make sure that the user stays 
logged in regardless of
+// that configuration's value. Note that 
checkTGTAndReloginFromKeytab() is a no-op if
+// the TGT does not need to be renewed yet.
+long tgtRenewalPeriod = 
configuration.get(KERBEROS_RELOGIN_PERIOD).toMillis();
+tgtRenewalFuture =
+scheduledExecutor.scheduleAtFixedRate(
+() ->
+executorService.execute(
+() -> {
+try {
+LOG.debug("Renewing TGT");
+
currentUser.checkTGTAndReloginFromKeytab();

Review Comment:
   I've had a deeper look and let's summarize my findings:
   * No mocking framework allowed
   * `UserGroupInformation` class has no public constructor so instance 
creation is possible with reflection only which I'm pretty sure won't 
initialize the instance properly. As a result all places where 
`UserGroupInformation` is used need to be hacked around.
   * If we go to the `KerberosClient` direction we would see functions like 
`KerberosClient.hasCurrentUserKerberosCredentials()` instead of 
`UserGroupInformation.getCurrentUser().hasKerberosCredentials()` which is hacky 
but doable
   * But as soon as the condition gets complicated like this for example: 
`Option(currentUser.getRealUser()).getOrElse(currentUser).hasKerberosCredentials()`
 how should be the `KerberosClient` be named?
   
   In Spark and other components I've tried to mock/reimplement/modify/make 
`UserGroupInformation` testable w/o any success. I think we have the same 
situation here unless you have a clear doable suggestion.
   
   I think realistically we have the following possibilities for this case:
   * Mock `UserGroupInformation.getCurrentUser()` static function and we give 
back a mocked `UserGroupInformation` instance -> Here powermock runner with 
junit5 is simply not working and mockito is too old to mock static functions. 
All in all here only the mockito version upgrade could be a potential solution.
   * Use reflection to call `UserGroupInformation` hidden constructor -> here I 
have no idea what will happen, I mean how well initialized the instance will be 
+ how to modify the instance behavior to give back something hardcoded
   * We don't write automated tests for places where `UserGroupInformation` is 
embedded
   * We introduce `KerberosClient` and we create functions like 
`currentUserRealUserOrElseCurrentUserHasKerberosCredentials()` from expressions 
like 
`Option(currentUser.getRealUser()).getOrElse(currentUser).hasKerberosCredentials()`
   
   Well, none of the proposals looks good but here in Flink I've not found the 
holy grail just like in other places.
   In Spark bullet point 3 has been implemented which is definitely debatable. 
The fact is that there with powermock bullet point 1 would be possible but 
nobody ever done that.
   
   There is a reason why 
[such](https://steveloughran.gitbooks.io/kerberos_and_hadoop/content/sections/ugi.html)
 writings are created:
   `If there is one class guaranteed to strike fear into anyone with experience 
in Hadoop+Kerberos code it is UserGroupInformation, abbreviated to "UGI"`
   
   Let's hear your opinion.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the 

[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-13 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r849591137


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -110,13 +139,62 @@ public void obtainDelegationTokens(Credentials 
credentials) {
  * task managers.
  */
 @Override
-public void start() {
-LOG.info("Starting renewal task");
+public void start() throws Exception {
+checkNotNull(scheduledExecutor, "Scheduled executor must not be null");
+checkNotNull(executorService, "Executor service must not be null");
+checkState(tgtRenewalFuture == null, "Manager is already started");
+
+if (!kerberosRenewalPossibleProvider.isRenewalPossible()) {
+LOG.info("Renewal is NOT possible, skipping to start renewal 
task");
+return;
+}
+
+startTGTRenewal();
+}
+
+private void startTGTRenewal() throws IOException {
+LOG.debug("Starting credential renewal task");
+
+UserGroupInformation currentUser = 
UserGroupInformation.getCurrentUser();
+if (currentUser.isFromKeytab()) {
+// In Hadoop 2.x, renewal of the keytab-based login seems to be 
automatic, but in Hadoop
+// 3.x, it is configurable (see 
hadoop.kerberos.keytab.login.autorenewal.enabled, added
+// in HADOOP-9567). This task will make sure that the user stays 
logged in regardless of
+// that configuration's value. Note that 
checkTGTAndReloginFromKeytab() is a no-op if
+// the TGT does not need to be renewed yet.
+long tgtRenewalPeriod = 
configuration.get(KERBEROS_RELOGIN_PERIOD).toMillis();
+tgtRenewalFuture =
+scheduledExecutor.scheduleAtFixedRate(
+() ->
+executorService.execute(
+() -> {
+try {
+LOG.debug("Renewing TGT");
+
currentUser.checkTGTAndReloginFromKeytab();

Review Comment:
   I've started to have a look...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-13 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r849590001


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -110,13 +139,62 @@ public void obtainDelegationTokens(Credentials 
credentials) {
  * task managers.
  */
 @Override
-public void start() {
-LOG.info("Starting renewal task");
+public void start() throws Exception {
+checkNotNull(scheduledExecutor, "Scheduled executor must not be null");
+checkNotNull(executorService, "Executor service must not be null");
+checkState(tgtRenewalFuture == null, "Manager is already started");
+
+if (!kerberosRenewalPossibleProvider.isRenewalPossible()) {
+LOG.info("Renewal is NOT possible, skipping to start renewal 
task");
+return;
+}
+
+startTGTRenewal();
+}
+
+private void startTGTRenewal() throws IOException {
+LOG.debug("Starting credential renewal task");
+
+UserGroupInformation currentUser = 
UserGroupInformation.getCurrentUser();
+if (currentUser.isFromKeytab()) {
+// In Hadoop 2.x, renewal of the keytab-based login seems to be 
automatic, but in Hadoop
+// 3.x, it is configurable (see 
hadoop.kerberos.keytab.login.autorenewal.enabled, added
+// in HADOOP-9567). This task will make sure that the user stays 
logged in regardless of
+// that configuration's value. Note that 
checkTGTAndReloginFromKeytab() is a no-op if
+// the TGT does not need to be renewed yet.
+long tgtRenewalPeriod = 
configuration.get(KERBEROS_RELOGIN_PERIOD).toMillis();
+tgtRenewalFuture =
+scheduledExecutor.scheduleAtFixedRate(
+() ->
+executorService.execute(
+() -> {
+try {
+LOG.debug("Renewing TGT");
+
currentUser.checkTGTAndReloginFromKeytab();
+LOG.debug("TGT renewed 
successfully");
+} catch (Exception e) {
+LOG.error("Error while 
renewing TGT", e);
+}
+}),
+tgtRenewalPeriod,
+tgtRenewalPeriod,
+TimeUnit.MILLISECONDS);
+LOG.debug("Credential renewal task started and reoccur in {} ms", 
tgtRenewalPeriod);

Review Comment:
   Changed to TGT all the places. In Kerberos there are hundreds of cloudy 
things and adding them to classes and/or functions doesn't help. My intention 
is to add a complete doc like 
[this](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/security/README.md)
 which includes every conceptual/architectural explanation. Here we update 
Ticket Granting Ticket and there is a description why needed:
   ```
   // In Hadoop 2.x, renewal of the keytab-based login seems to be 
automatic, but in Hadoop
   // 3.x, it is configurable (see 
hadoop.kerberos.keytab.login.autorenewal.enabled, added
   // in HADOOP-9567). This task will make sure that the user stays 
logged in regardless of
   // that configuration's value. Note that 
checkTGTAndReloginFromKeytab() is a no-op if
   // the TGT does not need to be renewed yet.
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-13 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r849583201


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -45,10 +56,28 @@ public class KerberosDelegationTokenManager implements 
DelegationTokenManager {
 
 private final Configuration configuration;
 
+private final SecurityConfiguration securityConfiguration;

Review Comment:
   It will be needed in later PRs but for now it can be dropped.



##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -110,13 +139,62 @@ public void obtainDelegationTokens(Credentials 
credentials) {
  * task managers.
  */
 @Override
-public void start() {
-LOG.info("Starting renewal task");
+public void start() throws Exception {
+checkNotNull(scheduledExecutor, "Scheduled executor must not be null");
+checkNotNull(executorService, "Executor service must not be null");
+checkState(tgtRenewalFuture == null, "Manager is already started");
+
+if (!kerberosRenewalPossibleProvider.isRenewalPossible()) {
+LOG.info("Renewal is NOT possible, skipping to start renewal 
task");
+return;
+}
+
+startTGTRenewal();
+}
+
+private void startTGTRenewal() throws IOException {
+LOG.debug("Starting credential renewal task");
+
+UserGroupInformation currentUser = 
UserGroupInformation.getCurrentUser();
+if (currentUser.isFromKeytab()) {
+// In Hadoop 2.x, renewal of the keytab-based login seems to be 
automatic, but in Hadoop
+// 3.x, it is configurable (see 
hadoop.kerberos.keytab.login.autorenewal.enabled, added
+// in HADOOP-9567). This task will make sure that the user stays 
logged in regardless of
+// that configuration's value. Note that 
checkTGTAndReloginFromKeytab() is a no-op if
+// the TGT does not need to be renewed yet.
+long tgtRenewalPeriod = 
configuration.get(KERBEROS_RELOGIN_PERIOD).toMillis();
+tgtRenewalFuture =
+scheduledExecutor.scheduleAtFixedRate(
+() ->
+executorService.execute(
+() -> {
+try {
+LOG.debug("Renewing TGT");
+
currentUser.checkTGTAndReloginFromKeytab();
+LOG.debug("TGT renewed 
successfully");
+} catch (Exception e) {
+LOG.error("Error while 
renewing TGT", e);

Review Comment:
   Fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-13 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r849582652


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -45,10 +56,28 @@ public class KerberosDelegationTokenManager implements 
DelegationTokenManager {
 
 private final Configuration configuration;
 
+private final SecurityConfiguration securityConfiguration;
+
+private final KerberosRenewalPossibleProvider 
kerberosRenewalPossibleProvider;
+
 @VisibleForTesting final Map 
delegationTokenProviders;
 
-public KerberosDelegationTokenManager(Configuration configuration) {
+private final ScheduledExecutor scheduledExecutor;
+
+private final ExecutorService executorService;
+
+private ScheduledFuture tgtRenewalFuture;
+
+public KerberosDelegationTokenManager(
+Configuration configuration,
+@Nullable ScheduledExecutor scheduledExecutor,
+@Nullable ExecutorService executorService) {

Review Comment:
   There are basically 2 use-cases for DT creation:
   * Single obtain
   * DTM start where auto obtain + propagation happens in a re-occuring way
   
   These 2 use-cases require different internals. So all in all both belong to 
the interface.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-13 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r849579153


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -45,10 +56,28 @@ public class KerberosDelegationTokenManager implements 
DelegationTokenManager {
 
 private final Configuration configuration;
 
+private final SecurityConfiguration securityConfiguration;
+
+private final KerberosRenewalPossibleProvider 
kerberosRenewalPossibleProvider;
+
 @VisibleForTesting final Map 
delegationTokenProviders;
 
-public KerberosDelegationTokenManager(Configuration configuration) {
+private final ScheduledExecutor scheduledExecutor;
+
+private final ExecutorService executorService;

Review Comment:
   Changed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-13 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r849578739


##
flink-runtime/src/main/java/org/apache/flink/runtime/minicluster/MiniCluster.java:
##
@@ -424,12 +424,21 @@ public void start() throws Exception {
 
 heartbeatServices = 
HeartbeatServices.fromConfiguration(configuration);
 
-delegationTokenManager =
-
configuration.getBoolean(SecurityOptions.KERBEROS_FETCH_DELEGATION_TOKEN)
-&& 
HadoopDependency.isHadoopCommonOnClasspath(
-getClass().getClassLoader())
-? new 
KerberosDelegationTokenManager(configuration)
-: new NoOpDelegationTokenManager();
+if 
(configuration.getBoolean(SecurityOptions.KERBEROS_FETCH_DELEGATION_TOKEN)) {
+if 
(HadoopDependency.isHadoopCommonOnClasspath(getClass().getClassLoader())) {
+delegationTokenManager =
+new KerberosDelegationTokenManager(
+configuration,
+
commonRpcService.getScheduledExecutor(),
+ioExecutor);
+} else {
+LOG.info(
+"Cannot use kerberos delegation token manager 
because Hadoop cannot be found in the Classpath.");
+delegationTokenManager = new 
NoOpDelegationTokenManager();
+}
+} else {
+delegationTokenManager = new NoOpDelegationTokenManager();
+}

Review Comment:
   `KerberosDelegationTokenManagerFactory` added.



##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -45,10 +56,28 @@ public class KerberosDelegationTokenManager implements 
DelegationTokenManager {
 
 private final Configuration configuration;
 
+private final SecurityConfiguration securityConfiguration;
+
+private final KerberosRenewalPossibleProvider 
kerberosRenewalPossibleProvider;
+
 @VisibleForTesting final Map 
delegationTokenProviders;
 
-public KerberosDelegationTokenManager(Configuration configuration) {
+private final ScheduledExecutor scheduledExecutor;
+
+private final ExecutorService executorService;
+
+private ScheduledFuture tgtRenewalFuture;

Review Comment:
   Fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-13 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r849363213


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -110,13 +126,84 @@ public void obtainDelegationTokens(Credentials 
credentials) {
  * task managers.
  */
 @Override
-public void start() {
-LOG.info("Starting renewal task");
+public void start() throws Exception {
+checkState(renewalExecutor == null, "Manager is already started");
+
+if (!isRenewalPossible()) {
+LOG.info("Renewal is NOT possible, skipping to start renewal 
task");
+return;
+}
+
+ThreadFactory threadFactory =
+new ThreadFactoryBuilder()
+.setDaemon(true)
+.setNameFormat("Credential Renewal Thread")
+.build();
+renewalExecutor = new ScheduledThreadPoolExecutor(1, threadFactory);

Review Comment:
   I've tested it on cluster and works like charm so closing this discussion.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-13 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r849249180


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -110,13 +126,84 @@ public void obtainDelegationTokens(Credentials 
credentials) {
  * task managers.
  */
 @Override
-public void start() {
-LOG.info("Starting renewal task");
+public void start() throws Exception {
+checkState(renewalExecutor == null, "Manager is already started");
+
+if (!isRenewalPossible()) {
+LOG.info("Renewal is NOT possible, skipping to start renewal 
task");
+return;
+}
+
+ThreadFactory threadFactory =
+new ThreadFactoryBuilder()
+.setDaemon(true)
+.setNameFormat("Credential Renewal Thread")
+.build();
+renewalExecutor = new ScheduledThreadPoolExecutor(1, threadFactory);

Review Comment:
   That fully makes sense, `commonRpcService` also available in the 
`MiniCluster` area. Making the changes right now...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-12 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r847507611


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -110,13 +126,84 @@ public void obtainDelegationTokens(Credentials 
credentials) {
  * task managers.
  */
 @Override
-public void start() {
-LOG.info("Starting renewal task");
+public void start() throws Exception {
+checkState(renewalExecutor == null, "Manager is already started");
+
+if (!isRenewalPossible()) {
+LOG.info("Renewal is NOT possible, skipping to start renewal 
task");
+return;
+}
+
+ThreadFactory threadFactory =
+new ThreadFactoryBuilder()
+.setDaemon(true)
+.setNameFormat("Credential Renewal Thread")
+.build();
+renewalExecutor = new ScheduledThreadPoolExecutor(1, threadFactory);

Review Comment:
   In the latest code `ComponentMainThreadExecutor` is null. When I know from 
where it should come from it will be filled.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-12 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r848157811


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -110,13 +126,84 @@ public void obtainDelegationTokens(Credentials 
credentials) {
  * task managers.
  */
 @Override
-public void start() {
-LOG.info("Starting renewal task");
+public void start() throws Exception {
+checkState(renewalExecutor == null, "Manager is already started");
+
+if (!isRenewalPossible()) {
+LOG.info("Renewal is NOT possible, skipping to start renewal 
task");
+return;
+}
+
+ThreadFactory threadFactory =
+new ThreadFactoryBuilder()
+.setDaemon(true)
+.setNameFormat("Credential Renewal Thread")
+.build();
+renewalExecutor = new ScheduledThreadPoolExecutor(1, threadFactory);

Review Comment:
   After I've had a deep look I've found the following obstacles which blocks 
us to use the mentioned executors:
   * If you mean `ResourceManager.getMainThreadExecutor` to use in the first 
bullet point then that's a `MainThreadExecutor` instance which doesn't support 
`scheduleAtFixedRate`, please see 
[here](https://github.com/apache/flink/blob/4034d3cd6d13e88e2e5ca101510bf333e94a53fa/flink-rpc/flink-rpc-core/src/main/java/org/apache/flink/runtime/rpc/RpcEndpoint.java#L537).
 We can implement that from scratch but touching core threading can be super 
dangerous, though I'm not against it to add such functionality.
   * If we would rely on `ResourceManager.getMainThreadExecutor` then we must 
modify the interface (`start` function) and we need to put implementation 
details into the interface (the 2 executors) which would frustrate me a bit. I 
think if we want to do dependency injection then the proper place would be the 
constructor which is not touching the interface. Here if we decide to modify 
the interface I can accept that but I would think the implementation would be 
less clean.
   * If we would add the new `scheduleAtFixedRate` functionality then 
`ResourceManager` usage is covered but in 
[MiniCluster](https://github.com/apache/flink/blob/f089a99ce73ad15531a4d1b72899d8367fab662a/flink-runtime/src/main/java/org/apache/flink/runtime/minicluster/MiniCluster.java#L431)
 we must provide something too but there no such executor exists (at least I've 
not found anything) so we need to come up w/ a new executor anyway.
   
   All in all I've left the original code there for now until we make an 
agreement.
   
   I understand the direction not to pollute w/ threads but in this case the 
return of investment is low because in some cases new executor is needed but 
the code would be more complex.
   
   Would like to hear your voice on this.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-12 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r848157811


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -110,13 +126,84 @@ public void obtainDelegationTokens(Credentials 
credentials) {
  * task managers.
  */
 @Override
-public void start() {
-LOG.info("Starting renewal task");
+public void start() throws Exception {
+checkState(renewalExecutor == null, "Manager is already started");
+
+if (!isRenewalPossible()) {
+LOG.info("Renewal is NOT possible, skipping to start renewal 
task");
+return;
+}
+
+ThreadFactory threadFactory =
+new ThreadFactoryBuilder()
+.setDaemon(true)
+.setNameFormat("Credential Renewal Thread")
+.build();
+renewalExecutor = new ScheduledThreadPoolExecutor(1, threadFactory);

Review Comment:
   After I've had a deep look I've found the following obstacles which blocks 
us to use the mentioned executors:
   * If you mean `ResourceManager.getMainThreadExecutor` to use in the first 
bullet point then that's a `MainThreadExecutor` instance which doesn't support 
`scheduleAtFixedRate`, please see 
[here](https://github.com/apache/flink/blob/4034d3cd6d13e88e2e5ca101510bf333e94a53fa/flink-rpc/flink-rpc-core/src/main/java/org/apache/flink/runtime/rpc/RpcEndpoint.java#L537).
 We can implement that from scratch but touching core threading can be super 
dangerous, though I'm not against it to add such functionality.
   * If we would rely on `ResourceManager.getMainThreadExecutor` then we must 
modify the interface (`start` function) and we need to put implementation 
details into the interface (the 2 executors) which would frustrate me a bit. I 
think if we want to do dependency injection then the proper place would be the 
constructor which is not touching the interface. Here if we decide to modify 
the interface I can accept that but I would think the implementation would be 
less clean.
   * If we would add the new `scheduleAtFixedRate` functionality then 
`ResourceManager` usage is covered but in 
[MiniCluster](https://github.com/apache/flink/blob/f089a99ce73ad15531a4d1b72899d8367fab662a/flink-runtime/src/main/java/org/apache/flink/runtime/minicluster/MiniCluster.java#L431)
 we must provide something too but there no such executor exists (at least I've 
not found anything) so we need to come up w/ a new executor anyway.
   
   All in all I've left the original code there for now until we make an 
agreement.
   Would like to hear your voice on this.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-11 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r847507611


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -110,13 +126,84 @@ public void obtainDelegationTokens(Credentials 
credentials) {
  * task managers.
  */
 @Override
-public void start() {
-LOG.info("Starting renewal task");
+public void start() throws Exception {
+checkState(renewalExecutor == null, "Manager is already started");
+
+if (!isRenewalPossible()) {
+LOG.info("Renewal is NOT possible, skipping to start renewal 
task");
+return;
+}
+
+ThreadFactory threadFactory =
+new ThreadFactoryBuilder()
+.setDaemon(true)
+.setNameFormat("Credential Renewal Thread")
+.build();
+renewalExecutor = new ScheduledThreadPoolExecutor(1, threadFactory);

Review Comment:
   In the latest code `ComponentMainThreadExecutor` is null. When I know from 
where it should come from it will be filled.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-11 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r847504606


##
flink-runtime/src/test/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManagerTest.java:
##
@@ -20,13 +20,27 @@
 
 import org.apache.flink.configuration.Configuration;
 
+import org.apache.hadoop.security.UserGroupInformation;
 import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.powermock.api.mockito.PowerMockito;
+import org.powermock.core.classloader.annotations.PrepareForTest;
+import org.powermock.modules.junit4.PowerMockRunner;

Review Comment:
   I basically not agree that mocking is bad in general but this case is 
special.
   Junit5 is not supporting powermock so no other possibility than function 
override.
   Yeah, there is a possibility to add an interface but I felt it would be an 
overkill.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-11 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r847414464


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -110,13 +126,84 @@ public void obtainDelegationTokens(Credentials 
credentials) {
  * task managers.
  */
 @Override
-public void start() {
-LOG.info("Starting renewal task");
+public void start() throws Exception {
+checkState(renewalExecutor == null, "Manager is already started");
+
+if (!isRenewalPossible()) {
+LOG.info("Renewal is NOT possible, skipping to start renewal 
task");
+return;
+}
+
+ThreadFactory threadFactory =

Review Comment:
   Since we plan to use existing executors this code will be deleted.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-11 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r847413385


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -110,13 +126,84 @@ public void obtainDelegationTokens(Credentials 
credentials) {
  * task managers.
  */
 @Override
-public void start() {
-LOG.info("Starting renewal task");
+public void start() throws Exception {
+checkState(renewalExecutor == null, "Manager is already started");
+
+if (!isRenewalPossible()) {
+LOG.info("Renewal is NOT possible, skipping to start renewal 
task");
+return;
+}
+
+ThreadFactory threadFactory =
+new ThreadFactoryBuilder()
+.setDaemon(true)
+.setNameFormat("Credential Renewal Thread")
+.build();
+renewalExecutor = new ScheduledThreadPoolExecutor(1, threadFactory);

Review Comment:
   I'm fine w/ using the already existing executors but I don't see any 
`ComponentMainThreadExecutor` in `ResourceManager`. Can you point where can I 
find it to pass to DTM?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-11 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r847376435


##
flink-runtime/src/test/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManagerTest.java:
##
@@ -79,4 +93,48 @@ public void testAllProvidersLoaded() {
 assertTrue(ExceptionThrowingDelegationTokenProvider.constructed);
 assertFalse(delegationTokenManager.isProviderLoaded("throw"));
 }
+
+@Test
+public void isRenewalPossibleMustGiveBackFalseByDefault() throws 
IOException {
+UserGroupInformation ugi = 
PowerMockito.mock(UserGroupInformation.class);
+PowerMockito.mockStatic(UserGroupInformation.class);
+when(UserGroupInformation.getCurrentUser()).thenReturn(ugi);
+
+ExceptionThrowingDelegationTokenProvider.enabled = false;
+Configuration configuration = new Configuration();
+KerberosDelegationTokenManager delegationTokenManager =
+new KerberosDelegationTokenManager(configuration);
+
+assertFalse(delegationTokenManager.isRenewalPossible());
+}
+
+@Test
+public void isRenewalPossibleMustGiveBackTrueWhenKeytab() throws 
IOException {

Review Comment:
   Fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-11 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r847315707


##
flink-runtime/src/test/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManagerTest.java:
##
@@ -20,13 +20,27 @@
 
 import org.apache.flink.configuration.Configuration;
 
+import org.apache.hadoop.security.UserGroupInformation;
 import org.junit.Test;

Review Comment:
   Fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] gaborgsomogyi commented on a diff in pull request #19372: [FLINK-26043][runtime][security] Add periodic kerberos relogin to KerberosDelegationTokenManager

2022-04-11 Thread GitBox


gaborgsomogyi commented on code in PR #19372:
URL: https://github.com/apache/flink/pull/19372#discussion_r847297674


##
flink-runtime/src/main/java/org/apache/flink/runtime/security/token/KerberosDelegationTokenManager.java:
##
@@ -110,13 +126,84 @@ public void obtainDelegationTokens(Credentials 
credentials) {
  * task managers.
  */
 @Override
-public void start() {
-LOG.info("Starting renewal task");
+public void start() throws Exception {
+checkState(renewalExecutor == null, "Manager is already started");
+
+if (!isRenewalPossible()) {
+LOG.info("Renewal is NOT possible, skipping to start renewal 
task");
+return;
+}
+
+ThreadFactory threadFactory =
+new ThreadFactoryBuilder()
+.setDaemon(true)
+.setNameFormat("Credential Renewal Thread")
+.build();
+renewalExecutor = new ScheduledThreadPoolExecutor(1, threadFactory);
+// By default, a cancelled task is not automatically removed from the 
work queue until its
+// delay elapses. We have to enable it manually.
+renewalExecutor.setRemoveOnCancelPolicy(true);
+
+startTGTRenewal();
+}
+
+@VisibleForTesting
+boolean isRenewalPossible() throws IOException {
+if (!StringUtils.isBlank(securityConfiguration.getKeytab())
+&& !StringUtils.isBlank(securityConfiguration.getPrincipal())) 
{
+LOG.debug("Login from keytab is possible");
+return true;
+}
+LOG.debug("Login from keytab is NOT possible");
+
+if (securityConfiguration.useTicketCache()
+&& 
UserGroupInformation.getCurrentUser().hasKerberosCredentials()) {
+LOG.debug("Login from ticket cache is possible");
+return true;
+}
+LOG.debug("Login from ticket cache is NOT possible");
+
+return false;
+}
+
+private void startTGTRenewal() throws IOException {
+LOG.debug("Starting credential renewal task");
+
+UserGroupInformation currentUser = 
UserGroupInformation.getCurrentUser();
+if (currentUser.isFromKeytab()) {
+// In Hadoop 2.x, renewal of the keytab-based login seems to be 
automatic, but in Hadoop
+// 3.x, it is configurable (see 
hadoop.kerberos.keytab.login.autorenewal.enabled, added
+// in HADOOP-9567). This task will make sure that the user stays 
logged in regardless of
+// that configuration's value. Note that 
checkTGTAndReloginFromKeytab() is a no-op if
+// the TGT does not need to be renewed yet.
+Runnable tgtRenewalTask =
+() -> {
+try {
+LOG.debug("Renewing TGT");
+currentUser.checkTGTAndReloginFromKeytab();
+LOG.debug("TGT renewed successfully");
+} catch (Exception e) {
+LOG.error("Error while renewing TGT", e);

Review Comment:
   Since we retry it makes sense so changed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org