shameersss1 commented on a change in pull request #60:
URL: https://github.com/apache/tez/pull/60#discussion_r821868285



##########
File path: tez-api/src/main/java/org/apache/tez/dag/api/TezConfiguration.java
##########
@@ -883,6 +883,25 @@ public TezConfiguration(boolean loadDefaults) {
       + "dag.cleanup.on.completion";
   public static final boolean TEZ_AM_DAG_CLEANUP_ON_COMPLETION_DEFAULT = false;
 
+  /**
+   * Boolean value. Instructs AM to delete vertex shuffle data if a vertex and all its
+   * child vertices at a certain depth are completed.
+   */
+  @ConfigurationScope(Scope.AM)
+  @ConfigurationProperty(type="boolean")
+  public static final String TEZ_AM_VERTEX_CLEANUP_ON_COMPLETION = TEZ_AM_PREFIX
+          + "vertex.cleanup.on.completion";
+  public static final boolean TEZ_AM_VERTEX_CLEANUP_ON_COMPLETION_DEFAULT = false;
+
+  /**
+   * Int value. The height from the vertex that it can issue shuffle data deletion upon completion
+   */
+  @ConfigurationScope(Scope.AM)
+  @ConfigurationProperty(type="integer")
+  public static final String TEZ_AM_VERTEX_CLEANUP_HEIGHT = TEZ_AM_PREFIX
+          + "vertex.cleanup.height";
+  public static final int TEZ_AM_VERTEX_CLEANUP_HEIGHT_DEFAULT = 1;

Review comment:
       Just to add to my previous comment: the intention behind making the
height configurable (rather than hard-coding it to 1) is that shuffle data from
distant ancestors is requested only rarely, typically when multiple shuffle
nodes are lost. Control over which ancestors' shuffle data gets cleared
therefore gives the user more flexibility.
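
To make the role of the height concrete, here is a rough, self-contained sketch of the decision it is meant to control. This is not the code in this PR: the class `VertexCleanupSketch`, the method `canDeleteShuffleData`, and the map-based DAG are all hypothetical, and the sketch assumes that "height" means the number of descendant levels that must have completed before a vertex's shuffle data may be deleted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model, not Tez code: vertices are plain names and the DAG is
// a map from each vertex to its direct children.
class VertexCleanupSketch {

    // Returns true if `vertex` and every descendant within `height` levels
    // below it have completed. `height` plays the role assumed here for
    // tez.am.vertex.cleanup.height (default 1 in the diff above).
    static boolean canDeleteShuffleData(String vertex,
                                        Map<String, List<String>> children,
                                        Set<String> completed,
                                        int height) {
        if (!completed.contains(vertex)) {
            return false;
        }
        List<String> frontier = List.of(vertex);
        for (int depth = 1; depth <= height; depth++) {
            List<String> next = new ArrayList<>();
            for (String v : frontier) {
                for (String child : children.getOrDefault(v, List.of())) {
                    if (!completed.contains(child)) {
                        return false; // a descendant in range is still running
                    }
                    next.add(child);
                }
            }
            frontier = next;
        }
        return true;
    }

    public static void main(String[] args) {
        // Linear DAG A -> B -> C where only A and B have completed.
        Map<String, List<String>> children =
                Map.of("A", List.of("B"), "B", List.of("C"));
        Set<String> completed = Set.of("A", "B");
        System.out.println(canDeleteShuffleData("A", children, completed, 1));
        System.out.println(canDeleteShuffleData("A", children, completed, 2));
    }
}
```

Under that assumption, a height of 1 frees A's data as soon as B finishes, while a height of 2 keeps it until C finishes as well. A larger height therefore trades disk space for protection against re-running distant ancestors after shuffle node loss.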




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

