[ https://issues.apache.org/jira/browse/AVRO-2048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095807#comment-16095807 ]
Suraj Acharya commented on AVRO-2048:
-------------------------------------

Seems like the code is not passing checkstyle. The artifacts are present here:
https://builds.apache.org/job/PreCommit-AVRO-Build-TEST/32/artifact/component/patchprocess/test--lang_java.txt

The value is {{0l}}. Change it to {{0L}} and it should pass.
{code}
[INFO]
[INFO] --- maven-checkstyle-plugin:2.17:check (checkstyle-check) @ avro ---
[INFO] Starting audit...
/testptch/avro/lang/java/avro/src/main/java/org/apache/avro/io/BinaryDecoder.java:263:19: error: Should use uppercase 'L'.
/testptch/avro/lang/java/avro/src/main/java/org/apache/avro/io/BinaryDecoder.java:268:10: error: Should use uppercase 'L'.
Audit done.
[INFO] There are 2 errors reported by Checkstyle 6.11.2 with checkstyle.xml ruleset.
[ERROR] src/main/java/org/apache/avro/io/BinaryDecoder.java:[263,19] (misc) UpperEll: Should use uppercase 'L'.
[ERROR] src/main/java/org/apache/avro/io/BinaryDecoder.java:[268,10] (misc) UpperEll: Should use uppercase 'L'.
{code}

> Avro Binary Decoding - Gracefully Handle Long Strings
> -----------------------------------------------------
>
>                 Key: AVRO-2048
>                 URL: https://issues.apache.org/jira/browse/AVRO-2048
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.7.7, 1.8.2
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Minor
>         Attachments: AVRO-2048.1.patch
>
>
> According to the
> [specs|https://avro.apache.org/docs/1.8.2/spec.html#binary_encode_primitive]:
> bq. a string is encoded as a *long* followed by that many bytes of UTF-8
> encoded character data.
> However, that is currently not being adhered to:
> {code:title=org.apache.avro.io.BinaryDecoder}
>   @Override
>   public Utf8 readString(Utf8 old) throws IOException {
>     int length = readInt();
>     Utf8 result = (old != null ? old : new Utf8());
>     result.setByteLength(length);
>     if (0 != length) {
>       doReadBytes(result.getBytes(), 0, length);
>     }
>     return result;
>   }
> {code}
> The first thing the code does here is to load an *int* value, not a *long*.
> Because of the variable length nature of the size, this will mostly work.
> However, there may be edge-cases where the serializer is putting in large
> length values erroneously or nefariously. Let us gracefully detect such
> scenarios and more closely adhere to the spec.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
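
The check the description asks for (read the length prefix as a long, then reject values that cannot be used as a buffer size) could look roughly like the following. This is a minimal sketch, not the attached patch: the class StringLengthCheck and its methods are hypothetical, it reads from a java.nio.ByteBuffer rather than Avro's BinaryDecoder, and the exception types are assumptions. Note that the long literals use an uppercase 'L', which is what the UpperEll check above requires.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; not part of Avro.
final class StringLengthCheck {

  // Decodes one variable-length zig-zag long, the encoding the Avro
  // spec uses for int and long values.
  static long readLong(ByteBuffer in) {
    long raw = 0L;
    // A 64-bit value needs at most 10 groups of 7 bits.
    for (int shift = 0; shift < 64; shift += 7) {
      int b = in.get() & 0xFF;
      raw |= (long) (b & 0x7F) << shift;
      if ((b & 0x80) == 0) {
        // Undo the zig-zag mapping: 0, 1, 2, 3 -> 0, -1, 1, -2.
        return (raw >>> 1) ^ -(raw & 1L);
      }
    }
    throw new IllegalStateException("Malformed data: varint is longer than 10 bytes");
  }

  // Reads a string length as a long and validates it before narrowing
  // to int, instead of silently truncating oversized values.
  static int readStringLength(ByteBuffer in) {
    long length = readLong(in);
    if (length < 0L) {
      throw new IllegalStateException("Malformed data: negative string length " + length);
    }
    if (length > Integer.MAX_VALUE) {
      throw new UnsupportedOperationException(
          "Cannot read strings longer than " + Integer.MAX_VALUE + " bytes");
    }
    return (int) length;
  }
}
```

With this shape, a well-formed length is returned unchanged, while a negative or oversized length fails before any buffer is allocated.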