[ 
https://issues.apache.org/jira/browse/YARN-11358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17624125#comment-17624125
 ] 

ASF GitHub Bot commented on YARN-11358:
---------------------------------------

goiri commented on code in PR #5056:
URL: https://github.com/apache/hadoop/pull/5056#discussion_r1005133936


##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/webapp/FederationInterceptorREST.java:
##########
@@ -2024,9 +2030,15 @@ private <R> Map<SubClusterInfo, R> 
invokeConcurrent(Collection<SubClusterInfo> c
     Map<SubClusterInfo, R> results = new HashMap<>();
 
     // Send the requests in parallel
-    CompletionService<R> compSvc = new 
ExecutorCompletionService<>(this.threadpool);
+    CompletionService<Pair<R, Exception>> compSvc =

Review Comment:
   Single line?



##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml:
##########
@@ -5056,4 +5056,18 @@
     </description>
   </property>
 
+  <property>
+    <name>yarn.router.interceptor.allow-partial-result.enable</name>
+    <value>true</value>

Review Comment:
   In the RBF PR there was controversy because the default behavior should be 
to now allow this.
   I guess we can fix this in this PR.



##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/webapp/FederationInterceptorREST.java:
##########
@@ -2036,27 +2048,34 @@ private <R> Map<SubClusterInfo, R> 
invokeConcurrent(Collection<SubClusterInfo> c
               getMethod(request.getMethodName(), request.getTypes());
           Object retObj = method.invoke(interceptor, request.getParams());
           R ret = clazz.cast(retObj);
-          return ret;
+          return Pair.of(ret, null);
         } catch (Exception e) {
           LOG.error("SubCluster {} failed to call {} method.",
               info.getSubClusterId(), request.getMethodName(), e);
-          return null;
+          return Pair.of(null, e);
         }
       });
     }
 
     clusterIds.stream().forEach(clusterId -> {
       try {
-        Future<R> future = compSvc.take();
-        R response = future.get();
+        Future<Pair<R, Exception>> future = compSvc.take();
+        Pair<R, Exception> pair = future.get();
+        R response = pair.getKey();
         if (response != null) {
           results.put(clusterId, response);
         }
+        Exception exception = pair.getRight();

Review Comment:
   In the API seems to be getValue()?



##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/webapp/TestableFederationInterceptorREST.java:
##########
@@ -50,5 +50,4 @@ protected void registerBadSubCluster(SubClusterId badSC) {
             badSC);
     interceptor.setRunning(false);
   }
-

Review Comment:
   Avoid



##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/webapp/TestFederationInterceptorRESTRetry.java:
##########
@@ -80,10 +83,16 @@
   @Override
   public void setUp() {
     super.setUpConfig();
+
+    Configuration conf = this.getConf();
+    // Enable strict mode
+    
conf.setBoolean(YarnConfiguration.ROUTER_INTERCEPTOR_ALLOW_PARTIAL_RESULT_ENABLED,

Review Comment:
   Single line





> [Federation] Add FederationInterceptor#allow-partial-result config.
> -------------------------------------------------------------------
>
>                 Key: YARN-11358
>                 URL: https://issues.apache.org/jira/browse/YARN-11358
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: federation, router
>    Affects Versions: 3.4.0
>            Reporter: fanshilun
>            Assignee: fanshilun
>            Priority: Major
>              Labels: pull-request-available
>
> When we use Federation, we will have multiple sub-clusters, and some 
> interfaces need to call all clusters to get results, 
> such as getNodes, getNodeToLabels, etc.
> But if the sub-cluster is unavailable when we call the interface, should the 
> interface return the result?
> The current code is to return the result.If the result is returned, the 
> result information is inaccurate; 
> if the result is not returned, it is not user-friendly
> We'd better add a parameter control, the user chooses whether to report an 
> error or return the result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to