[GitHub] [hive] bwzheng2010 commented on a diff in pull request #4738: HIVE-25615: Fix Hive on tez will generate at least one MapContainer per 0 length file

via GitHub Mon, 25 Sep 2023 00:34:31 -0700


bwzheng2010 commented on code in PR #4738:
URL: https://github.com/apache/hive/pull/4738#discussion_r1335494025



##########
ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ColumnarSplitSizeEstimator.java:
##########
@@ -35,6 +35,9 @@ public class ColumnarSplitSizeEstimator implements SplitSizeEstimator {
   @Override
   public long getEstimatedSize(InputSplit inputSplit) throws IOException {
     long colProjSize = inputSplit.getLength();
+    if (colProjSize == 0) {

Review Comment:
   Hi, thanks for reviewing the code.
   
   The split length is checked for 0 at the beginning in order to keep the 
original logic.
   
   I think the original logic is this: if the columnar projection size or the 
inner split has 0 bytes, that does not mean the length of the split itself is 0. 
Returning Integer.MAX_VALUE in that case is the safer choice.
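
   To make the ordering of the two checks concrete, here is a minimal 
standalone sketch. It is not the actual Hive code: `SplitSizeSketch` and 
`estimate` are hypothetical names, the two `long` parameters stand in for 
`inputSplit.getLength()` and the columnar projection size, and the early 
return value of 0 is an assumption, since the body of the added branch is 
not shown in the hunk above.

```java
// Illustrative sketch only; names and the early return value are assumptions.
final class SplitSizeSketch {

    /**
     * @param rawLength   stand-in for inputSplit.getLength()
     * @param colProjSize stand-in for the columnar projection size
     */
    static long estimate(long rawLength, long colProjSize) {
        // Check the raw split length first: a 0-length split is really empty.
        if (rawLength == 0) {
            return 0L; // assumed return value for an empty split
        }
        // A projection size of 0 (or less) on a non-empty split means the
        // size is unknown, so fall back to a worst-case estimate.
        if (colProjSize <= 0) {
            return Integer.MAX_VALUE;
        }
        return colProjSize;
    }

    public static void main(String[] args) {
        System.out.println(estimate(0, 0));       // empty split
        System.out.println(estimate(1024, 0));    // non-empty, unknown projection
        System.out.println(estimate(1024, 256));  // non-empty, known projection
    }
}
```

   With the length check placed first, only a split whose own length is 0 
gets the small estimate; a non-empty split with a 0-byte projection still 
gets the worst-case value.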






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


