ayushtkn commented on code in PR #3606:
URL: https://github.com/apache/hive/pull/3606#discussion_r979098870
##########
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java:
##########
@@ -384,7 +384,6 @@ public void parse(Configuration conf, Path rootPath) throws
IOException {
OrcRawRecordMerger.TransactionMetaData.findWriteIDForSynthetcRowIDs(getPath(),
rootPath, conf);
writeId = tmd.syntheticWriteId;
stmtId = tmd.statementId;
- bucketId = AcidUtils.parseBucketId(getPath());
Review Comment:
I am not very sure about this part of code. But just going through the code.
the ``bucketId= AcidUtils.parseBucketId(getPath());``, the Path can change also
if someone calls ``readFields``. It does this in the super class:
```
public void readFields(DataInput in) throws IOException {
file = new Path(Text.readString(in));
```
the `file` is what returns from ``getPath()``, So, I am not sure if it is a
good idea to just have that set in just the constructor.
Moreover if any other change in Hadoop comes, which bothers the path in some
other method post the construction invocation, again we will be prone to may be
some silent errors.
Second, there is a legacy protected constructor which doesn't have path and
expect ``readFields`` to do the magic for it, if some thirdparty
application/project is still using that by any means, that will get screwed,
So, we have to remove that constructor also I feel.
Would require someone more experienced to tell, how safe this is
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]