RyanSkraba commented on a change in pull request #955:
URL: https://github.com/apache/avro/pull/955#discussion_r485435963



##########
File path: lang/java/avro/src/main/java/org/apache/avro/util/Utf8.java
##########
@@ -119,16 +120,21 @@ public Utf8 setByteLength(int newLength) {
     }
     this.length = newLength;
     this.string = null;
-    this.hasHash = false;
+    this.hash = 0;
     return this;
   }
 
   /** Set to the contents of a String. */
   public Utf8 set(String string) {
-    this.bytes = getBytesFor(string);
-    this.length = bytes.length;
+    byte[] bytes = getBytesFor(string);
+    int length = bytes.length;
+    if (length > MAX_LENGTH) {
+      throw new AvroRuntimeException("String length " + length + " exceeds maximum allowed");
+    }

Review comment:
       I agree, that seems reasonable to me! Do you want to make the change in the PR directly or create a JIRA?

##########
File path: lang/java/avro/src/main/java/org/apache/avro/util/Utf8.java
##########
@@ -119,16 +120,21 @@ public Utf8 setByteLength(int newLength) {
     }
     this.length = newLength;
     this.string = null;
-    this.hasHash = false;
+    this.hash = 0;

Review comment:
       Hello! For consistency with [how Schema caches the hash](https://github.com/apache/avro/blob/42d81d70e7b9409d3b17fbcc9ea102876b45b945/lang/java/avro/src/main/java/org/apache/avro/Schema.java#L111), what do you think about using Integer.MIN_VALUE instead of 0 as the "not yet computed" marker? Not a big deal, except that with 0 as the marker, every "zeroed" byte array has hashCode 0, looks uncached, and has its hash recalculated on every call.
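
To illustrate the idea, here is a minimal sketch of a cached hash with a sentinel. It is not the actual Utf8 or Schema code: the class `CachedHashBytes`, the constant `NO_HASH`, and the `computeCount` field are hypothetical names used only for this example, and the hash loop is assumed to start from 0 so that an all-zero array hashes to 0.

```java
// Hypothetical stand-in for a byte-backed string; not the real Avro Utf8 class.
final class CachedHashBytes {
  // Assumed sentinel meaning "hash not computed yet".
  private static final int NO_HASH = Integer.MIN_VALUE;

  private byte[] bytes;
  private int hash = NO_HASH;
  int computeCount = 0; // illustration only: counts real hash computations

  CachedHashBytes(byte[] bytes) {
    this.bytes = bytes;
  }

  /** Replace the contents and invalidate the cached hash. */
  void set(byte[] newBytes) {
    this.bytes = newBytes;
    this.hash = NO_HASH;
  }

  @Override
  public int hashCode() {
    int h = hash;
    if (h == NO_HASH) {
      // Assumed hash function: starts at 0, so zeroed arrays hash to 0.
      h = 0;
      for (byte b : bytes) {
        h = h * 31 + b;
      }
      computeCount++;
      hash = h;
    }
    return h;
  }

  public static void main(String[] args) {
    CachedHashBytes zeroed = new CachedHashBytes(new byte[8]);
    zeroed.hashCode();
    zeroed.hashCode();
    // With 0 as the sentinel this would be 2; with MIN_VALUE it stays at 1.
    System.out.println("computations: " + zeroed.computeCount);
  }
}
```

The trade-off is that a value whose real hash happens to equal Integer.MIN_VALUE would be recomputed on each call, which should be far rarer than an all-zero buffer.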





