pgaref commented on a change in pull request #748:
URL: https://github.com/apache/orc/pull/748#discussion_r676525086
##########
File path: java/core/src/java/org/apache/orc/impl/StringHashTableDictionary.java
##########
@@ -173,10 +170,18 @@ public int size() {
/**
* Compute the hash value and find the corresponding index.
- *
*/
int getIndex(Text text) {
- return Math.floorMod(text.hashCode(), capacity);
+ return getIndex(text.getBytes(), 0, text.getLength());
+ }
+
+ /**
+ * Compute the hash value and find the corresponding index. Uses same
+ * implementation as {@code Text#hashCode()}.
+ */
+ int getIndex(final byte[] bytes, final int offset, final int length) {
+ return Math.floorMod(WritableComparator.hashBytes(bytes, offset, length),
Review comment:
That's exactly what I had in mind with my comment above. It seems we
are trying to get rid of the Hadoop dependency, but implicit dependencies
like this one could cause other issues down the line.
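
To make the concern concrete, here is a minimal sketch of how the index
computation could avoid the Hadoop call entirely. It assumes that
WritableComparator.hashBytes is the usual 31-based polynomial hash seeded
with 1 (my understanding of the Hadoop source, worth verifying before
relying on it). The class name LocalBytesHash and the explicit capacity
parameter are hypothetical and not part of the PR.

```java
// Hypothetical helper, not part of ORC or of this PR: a local stand-in for
// WritableComparator.hashBytes so that getIndex needs no Hadoop class.
final class LocalBytesHash {
  private LocalBytesHash() {}

  /**
   * 31-based polynomial hash over bytes[offset, offset + length), seeded
   * with 1. Assumed (not confirmed here) to match Hadoop's hashBytes.
   */
  static int hashBytes(final byte[] bytes, final int offset, final int length) {
    int hash = 1;
    for (int i = offset; i < offset + length; i++) {
      hash = (31 * hash) + (int) bytes[i];
    }
    return hash;
  }

  /**
   * Maps the hash to a bucket in [0, capacity). Math.floorMod keeps the
   * result non-negative even when the hash itself is negative.
   */
  static int getIndex(final byte[] bytes, final int offset, final int length,
                      final int capacity) {
    return Math.floorMod(hashBytes(bytes, offset, length), capacity);
  }
}
```

If the assumption holds, a unit test comparing this against
Text#hashCode() for a few inputs would guard against the two
implementations drifting apart.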