abstractdog commented on code in PR #5174:
URL: https://github.com/apache/hive/pull/5174#discussion_r1551557693


##########
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:
##########
@@ -4756,6 +4756,10 @@ public static enum ConfVars {
         "Class to use for calculating available slots during split generation"),
     HIVE_TEZ_GENERATE_CONSISTENT_SPLITS("hive.tez.input.generate.consistent.splits", true,
         "Whether to generate consistent split locations when generating splits in the AM"),
+    HIVE_TEZ_SPLIT_FS_SERIALIZATION_THRESHOLD("hive.tez.split.fs.serialization.threshold", 524288,

Review Comment:
   Right, I was thinking the same.
   
   With a global threshold, once we reach it we'll serialize every split to fs 
   regardless of its size, so we might end up serializing smaller splits too 
   (20-30 KB). That's fine, since we remain stable. We still need to figure out 
   a reasonable default, e.g. 256 MB?
   
   -1: disabled
   0: serialize everything to fs
   n: global threshold
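
   To make the three cases concrete, here is a minimal sketch of how the value 
   could be interpreted. It is not code from this PR: the class 
   `SplitFsThresholdPolicy` and the method `shouldSerializeToFs` are 
   hypothetical names, and it assumes that "global" means the running total of 
   split sizes seen so far, with the split that pushes the total past the 
   threshold being the first one written to fs.

```java
/**
 * Hypothetical illustration of the proposed threshold semantics:
 *   -1 : disabled, never serialize splits to the filesystem
 *    0 : serialize every split to the filesystem
 *    n : once the running total of split sizes exceeds n bytes,
 *        serialize that split and every later one to the filesystem
 */
final class SplitFsThresholdPolicy {
    private final long thresholdBytes;
    private long accumulatedBytes = 0;
    private boolean thresholdReached = false;

    SplitFsThresholdPolicy(long thresholdBytes) {
        this.thresholdBytes = thresholdBytes;
    }

    /** Returns true if the split of the given size should go to the filesystem. */
    boolean shouldSerializeToFs(long splitSizeBytes) {
        if (thresholdBytes < 0) {
            return false; // disabled
        }
        if (thresholdBytes == 0) {
            return true; // serialize everything
        }
        if (!thresholdReached) {
            accumulatedBytes += splitSizeBytes;
            // Assumption: the decision latches, so small splits that arrive
            // after the threshold was crossed also go to the filesystem.
            thresholdReached = accumulatedBytes > thresholdBytes;
        }
        return thresholdReached;
    }
}
```

   With a threshold of 100 bytes and splits of 40, 50, 30 and 1 bytes, the first 
   two stay in memory and the last two go to fs, including the 1-byte split, 
   which is the "smaller splits also" case mentioned above.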



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

