ayushtkn commented on code in PR #3606:
URL: https://github.com/apache/hive/pull/3606#discussion_r979098870


##########
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java:
##########
@@ -384,7 +384,6 @@ public void parse(Configuration conf, Path rootPath) throws 
IOException {
         
OrcRawRecordMerger.TransactionMetaData.findWriteIDForSynthetcRowIDs(getPath(), 
rootPath, conf);
     writeId = tmd.syntheticWriteId;
     stmtId = tmd.statementId;
-    bucketId = AcidUtils.parseBucketId(getPath());

Review Comment:
   I am not very sure about this part of code. But just going through the code. 
the ``bucketId= AcidUtils.parseBucketId(getPath());``, the Path can change also 
if someone calls ``readFields``. It does this in the super class:
   ```
     public void readFields(DataInput in) throws IOException {
       file = new Path(Text.readString(in));
   ```
   the `file` is what returns from ``getPath()``, So, I am not sure if it is a 
good idea to just have that set in just the constructor.
   
   Moreover if any other change in Hadoop comes, which bothers the path in some 
other method post the construction invocation, again we will be prone to may be 
some silent errors.
   
   Second, there is a legacy protected constructor which doesn't have path and 
expect ``readFields`` to do the magic for it, if some thirdparty 
application/project is still using that by any means, that will get screwed, 
So, we have to remove that constructor also I feel.
   
   Would require someone more experienced to tell, how safe this is



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to