[ https://issues.apache.org/jira/browse/AVRO-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17842650#comment-17842650 ]
Oscar Westra van Holthe - Kind commented on AVRO-3980:
------------------------------------------------------

I'm sorry, but I have not been able to reproduce the error with this information. I'll try with some observations.

As I understand it, you're using the following schema (probably with another top-level name) with a raw encoding:
{code:json}
{
  "type": "record",
  "name": "anonymous",
  "fields": [
    {"name": "statuses", "type": {
      "type": "array",
      "items": {
        "type": "record",
        "name": "com.entity.avro.StatusAvro",
        "fields": [
          {"name": "status", "type": ["null", "string"]},
          {"name": "reason", "type": ["null", "string"]},
          {"name": "validFor", "type": {
            "type": "record",
            "name": "com.entity.avro.ValidForAvro",
            "fields": [
              {"name": "start", "type": "long"},
              {"name": "end", "type": "long"}
            ]
          }}
        ]
      }
    }}
  ]
}
{code}
Am I correct in assuming you're using the exact same schema both for writing with Avro 1.11.1 and for reading with Avro 1.11.3?

That aside, you should also have (somewhere) a conversion from the timestamps in the JSON (or {{Instant}}s in the object) to {{long}}, because there's no logical type specified to do this for you. Is this the case?

Perhaps it's easier to read & write not the raw encoding, but the single-message encoding. Although this adds a 10-byte header, it is also safer for long-lived data, such as data stored in a database: it can resolve conflicts caused by different, but compatible, schemas. Also, it'll fail fast when the correct write schema is not present, instead of giving weird errors like the one you encountered.
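(Not part of the original comment: a minimal sketch of such an {{Instant}}-to-{{long}} conversion, assuming the longs carry epoch milliseconds; the class and method names are illustrative, not Avro API.)
{code:java}
import java.time.Instant;

// Illustrative helper: the schema declares plain longs without a
// timestamp-millis logical type, so the conversion must happen in
// application code. Epoch milliseconds are assumed here.
public final class TimestampConversion {
    private TimestampConversion() {}

    // Convert an Instant (e.g. for ValidForAvro.start/end) to the long the schema expects
    public static long toEpochMillis(Instant instant) {
        return instant.toEpochMilli();
    }

    // Convert the stored long back to an Instant when reading
    public static Instant fromEpochMillis(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis);
    }
}
{code}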
Using the single-message encoding could look something like this:
{code:java}
public static <T extends SpecificRecord> ByteBuffer serialize(T obj) throws IOException {
    // Cache the encoder created in the next two lines by object type
    Schema schema = obj.getSchema();
    BinaryMessageEncoder<T> encoder = new BinaryMessageEncoder<>(SpecificData.getForSchema(schema), schema);
    return encoder.encode(obj);
}

public static <T extends SpecificRecord> T deserialize(ByteBuffer byteBuffer, Class<T> type)
        throws IOException, NoSuchMethodException, InvocationTargetException, IllegalAccessException {
    // Cache the next lines in global definitions,
    // or implement a resolver for all schemas for all types stored anywhere as Avro binary data
    SchemaStore.Cache schemaStore = new SchemaStore.Cache();
    // schemaStore.addSchema(oldSchema); // Call with all schema versions for which data is stored in the database

    // Cache the decoder created in the next 2 lines by target type
    Method createDecoder = type.getDeclaredMethod("createDecoder", SchemaStore.class);
    BinaryMessageDecoder<T> decoder = (BinaryMessageDecoder<T>) createDecoder.invoke(null, schemaStore);
    return decoder.decode(byteBuffer);
}
{code}

> Error to deserialize field of type Long after upgrade from 1.11.1 to 1.11.3
> ----------------------------------------------------------------------------
>
>                 Key: AVRO-3980
>                 URL: https://issues.apache.org/jira/browse/AVRO-3980
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java, logical types
>    Affects Versions: 1.11.3
>            Reporter: Jari Louvem
>            Priority: Critical
>
> After we upgraded the Avro library and avro-maven-plugin from version 1.11.1 to 1.11.3, we started to get the error "cannot read collections larger than 2147483639 items in java library".
>
> This error is generated by SystemLimitException.checkMaxCollectionLength.
> The data that we are trying to deserialize (using Avro 1.11.3) was serialized using Avro 1.11.1.
> The object that we are trying to deserialize is:
> {
>   "name": "statuses",
>   "type": {
>     "type": "array",
>     "items": "com.entity.avro.StatusAvro"
>   }
> }
> {
>   "name": "statuses",
>   "type": {
>     "type": "array",
>     "items": {
>       "name": "StatusAvro",
>       "type": "record",
>       "namespace": "com.entity.avro",
>       "fields": [
>         {
>           "name": "status",
>           "type": ["null", "string"]
>         },
>         {
>           "name": "reason",
>           "type": ["null", "string"]
>         },
>         {
>           "name": "validFor",
>           "type": "com.entity.avro.ValidForAvro"
>         }
>       ]
>     }
>   }
> }
> {
>   "name": "validFor",
>   "type": {
>     "name": "ValidForAvro",
>     "type": "record",
>     "namespace": "com.entity.avro",
>     "fields": [
>       {
>         "name": "start",
>         "type": "long"
>       },
>       {
>         "name": "end",
>         "type": "long"
>       }
>     ]
>   }
> }
> This is an example of the objects listed above:
> "statuses": [
>   {
>     "status": "INIT",
>     "reason": "Final_New_Reason",
>     "validFor": {
>       "start": "2020-01-30T11:45:00.839Z",
>       "end": "2030-01-23T06:58:21.563Z"
>     }
>   }
> ]
> The problem is that the array has only one item as shown above, so why is it throwing an error that the collection is too long?

-- This message was sent by Atlassian Jira (v8.20.10#820010)