RyanSkraba commented on a change in pull request #955:
URL: https://github.com/apache/avro/pull/955#discussion_r485435963
##########
File path: lang/java/avro/src/main/java/org/apache/avro/util/Utf8.java
##########
@@ -119,16 +120,21 @@ public Utf8 setByteLength(int newLength) {
}
this.length = newLength;
this.string = null;
- this.hasHash = false;
+ this.hash = 0;
return this;
}
/** Set to the contents of a String. */
public Utf8 set(String string) {
- this.bytes = getBytesFor(string);
- this.length = bytes.length;
+ byte[] bytes = getBytesFor(string);
+ int length = bytes.length;
+ if (length > MAX_LENGTH) {
+ throw new AvroRuntimeException("String length " + length + " exceeds maximum allowed");
+ }
Review comment:
I agree, that seems reasonable to me! Do you want to make the change in
the PR directly or create a JIRA?
##########
File path: lang/java/avro/src/main/java/org/apache/avro/util/Utf8.java
##########
@@ -119,16 +120,21 @@ public Utf8 setByteLength(int newLength) {
}
this.length = newLength;
this.string = null;
- this.hasHash = false;
+ this.hash = 0;
Review comment:
Hello! For consistency with [how Schema caches the hash](https://github.com/apache/avro/blob/42d81d70e7b9409d3b17fbcc9ea102876b45b945/lang/java/avro/src/main/java/org/apache/avro/Schema.java#L111),
what do you think about using Integer.MIN_VALUE instead of 0 as the "not yet
computed" marker? Not a big deal, except that otherwise every "zeroed" byte
array will have hashCode 0, which is indistinguishable from the marker, so its
hash will be recalculated on every call.
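
To illustrate the concern, here is a minimal sketch rather than the actual Utf8 code. `CachedHashBytes`, its `sentinel` parameter, and the `computations` counter are hypothetical names introduced for this example, and the 31-based polynomial hash over the bytes is an assumption, since `Utf8.hashCode()` is not shown in this hunk.

```java
/** Hypothetical stand-in for a byte container that caches its hash; not the Avro Utf8 class. */
final class CachedHashBytes {
  private final byte[] bytes;
  private final int sentinel; // value that means "hash not computed yet"
  private int hash;
  int computations = 0; // counts how often the hash loop actually runs

  CachedHashBytes(byte[] bytes, int sentinel) {
    this.bytes = bytes;
    this.sentinel = sentinel;
    this.hash = sentinel;
  }

  @Override
  public int hashCode() {
    int h = hash;
    if (h == sentinel) { // cache miss, or a real hash that collides with the sentinel
      computations++;
      h = 0;
      for (byte b : bytes) {
        h = h * 31 + b; // assumed polynomial hash; all-zero bytes give 0
      }
      hash = h;
    }
    return h;
  }

  public static void main(String[] args) {
    byte[] zeroed = new byte[16];
    CachedHashBytes zeroSentinel = new CachedHashBytes(zeroed, 0);
    CachedHashBytes minSentinel = new CachedHashBytes(zeroed, Integer.MIN_VALUE);
    for (int i = 0; i < 3; i++) {
      zeroSentinel.hashCode();
      minSentinel.hashCode();
    }
    System.out.println("sentinel 0: " + zeroSentinel.computations + " computations");
    System.out.println("sentinel MIN_VALUE: " + minSentinel.computations + " computations");
  }
}
```

With a sentinel of 0, the zeroed array is rehashed on every call. With Integer.MIN_VALUE it is hashed once and then served from the cache. Neither choice removes the collision entirely: a value whose real hash equals Integer.MIN_VALUE would still be recomputed each time, but that case is presumably much rarer in practice than zeroed buffers.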