[ https://issues.apache.org/jira/browse/AVRO-2438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sebastian J. updated AVRO-2438:
-------------------------------
    Description:
A schema fragment like this:
{code:java}
{
  "name": "ownerId",
  "type": [
    "null",
    {
      "type": "string",
      "java-class": "java.net.URI"
    }
  ],
  "default": null
}{code}
can be deserialized without problems into a generated POJO with
{code:java}
@org.apache.avro.specific.AvroGenerated
public class MyAvroDataObject extends org.apache.avro.specific.SpecificRecordBase implements org.apache.avro.specific.SpecificRecord {
  ...
  @Deprecated public java.net.URI ownerId;{code}
because {{GenericDatumReader.readString(Object, Schema, Decoder)}} looks up the target class in the {{stringClassCache}}, which contains
{code:java}
{"type":"string","java-class":"java.net.URI"}=class java.net.URI{code}
and uses the {{URI}} class itself to rehydrate the value via {{newInstanceFromString}}.

{{deepCopy}}, on the other hand, only considers the schema type of the field. In {{org.apache.avro.generic.GenericData.deepCopy(Schema, T)}}, the {{STRING}} case turns the {{URI}} value into an {{org.apache.avro.util.Utf8}}, which then causes a {{ClassCastException}} when the copy is assigned to the {{URI}} field:
{noformat}
java.lang.ClassCastException: org.apache.avro.util.Utf8 cannot be cast to java.net.URI
	at com.example.MyAvroDataObject.put(MyAvroDataObject.java:104)
	at org.apache.avro.generic.GenericData.setField(GenericData.java:660)
	at org.apache.avro.generic.GenericData.setField(GenericData.java:677)
	at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:1082)
	at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:1102)
	at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:1080){noformat}

The following dirty hack seems to avoid the issue, but it is not in sync with the {{stringClassCache}}, which should be consulted as well:
{code:java}
case STRING:
  // Strings are immutable
  if (value instanceof String) {
    return (T)value;
  }
  // Dirty Harry 9 3/4 start
  // URIs are immutable and are probably modeled as a URI itself
  // TODO: Check with stringClassCache & the schema
  else if ((value instanceof URI)
      && URI.class.getName().equals(schema.getProp("java-class"))) {
    return (T)value;
  }
  // Dirty Harry 9 3/4 end
  // Some CharSequence subclasses are mutable, so we still need to make
  // a copy
  else if (value instanceof Utf8) {
    // Utf8 copy constructor is more efficient than converting
    // to string and then back to Utf8
    return (T)new Utf8((Utf8)value);
  }
  return (T)new Utf8(value.toString());
{code}

I also tried Avro {{1.10-SNAPSHOT}} as of 2019-06-20 (commit {{2d3b1fe7efd865639663ba785877182e7e038c45}}) because of [https://github.com/apache/avro/pull/329], but the issue remains.
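The TODO in the hack above asks for a copy step that honours the {{java-class}} property in general instead of hard-coding {{URI}}. The following is only a standalone sketch of that idea, not Avro code: {{StringCopySketch}}, {{copyStringValue}} and {{CTOR_CACHE}} are hypothetical names, the constructor cache merely imitates the role of {{stringClassCache}}, and a {{StringBuilder}} stands in for {{Utf8}} so that the example compiles with the JDK alone. It assumes the declared class has a public constructor that takes a single {{String}}, as {{java.net.URI}} does.

```java
import java.lang.reflect.Constructor;
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, JDK-only model of a deepCopy STRING case that respects "java-class".
final class StringCopySketch {

    // Imitates the role of a string-class cache: class name -> (String) constructor.
    private static final Map<String, Constructor<?>> CTOR_CACHE = new ConcurrentHashMap<>();

    // Looks up the public single-String constructor of the named class.
    private static Constructor<?> stringConstructor(String className) {
        try {
            return Class.forName(className).getConstructor(String.class);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No (String) constructor for " + className, e);
        }
    }

    /**
     * Copies the value of a "string"-typed field.
     *
     * @param value     the field value to copy (must not be null)
     * @param javaClass the schema's "java-class" property, or null if absent
     */
    static Object copyStringValue(Object value, String javaClass) {
        if (value instanceof String) {
            return value; // Strings are immutable, no copy needed
        }
        if (javaClass != null && javaClass.equals(value.getClass().getName())) {
            // Rebuild an instance of the declared class from its string form,
            // so the copy keeps the type the generated field expects.
            Constructor<?> ctor = CTOR_CACHE.computeIfAbsent(javaClass, StringCopySketch::stringConstructor);
            try {
                return ctor.newInstance(value.toString());
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot copy value of type " + javaClass, e);
            }
        }
        // Fallback for other CharSequence values: copy into a fresh buffer.
        // (StringBuilder is a stand-in for Utf8 in this sketch.)
        return new StringBuilder(value.toString());
    }

    public static void main(String[] args) {
        Object uriCopy = copyStringValue(URI.create("urn:owner:42"), "java.net.URI");
        System.out.println(uriCopy.getClass().getName() + " -> " + uriCopy);

        Object plainCopy = copyStringValue(new StringBuilder("plain"), null);
        System.out.println(plainCopy.getClass().getName() + " -> " + plainCopy);
    }
}
```

Unlike the hack, this variant rebuilds the value rather than returning the same instance, so it does not rely on the declared class being immutable. Whether a real fix should go through {{newInstanceFromString}} and the existing cache is a decision for the Avro code base itself.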
> SpecificData.deepCopy() cannot be used with URI fields
> ------------------------------------------------------
>
>                 Key: AVRO-2438
>                 URL: https://issues.apache.org/jira/browse/AVRO-2438
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.9.0, 1.8.2
>            Reporter: Sebastian J.
>            Priority: Major