[ https://issues.apache.org/jira/browse/AVRO-656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12997115#comment-12997115 ]
Scott Carey commented on AVRO-656:
----------------------------------

I solved the unit test error. ResolvingGrammarGenerator.bestBranch() was only checking names for records; it needs to check fixed and enum too:

{code}
     int j = 0;
     for (Schema b : r.getTypes()) {
       if (vt == b.getType())
-        if (vt == Schema.Type.RECORD) {
-          String vname = w.getName();
-          if (vname == null || vname.equals(b.getName()))
+        if (vt == Schema.Type.RECORD || vt == Schema.Type.ENUM ||
+            vt == Schema.Type.FIXED) {
+          String vname = w.getFullName();
+          String bname = b.getFullName();
+          if ((vname != null && vname.equals(bname))
+              || vname == bname)
             return j;
         } else
           return j;
{code}

{quote}The intent there was to be back-compatible, to not require a schema be passed to the constructor, but perhaps that's not worth it.{quote}

Perhaps we can allow null to equal null and strings to match equal strings, but disallow null from matching a string. This will work with the old constructors if the user is consistent, but it will cause problems if people mix and match them, so maybe that is not worth it.

> writing unions with multiple records, fixed or enums can choose wrong branch
> -----------------------------------------------------------------------------
>
>                 Key: AVRO-656
>                 URL: https://issues.apache.org/jira/browse/AVRO-656
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.4.0
>            Reporter: Doug Cutting
>            Assignee: Doug Cutting
>            Priority: Blocker
>             Fix For: 1.5.0
>
>         Attachments: AVRO-656.patch, AVRO-656.patch, AVRO-656.patch, AVRO-656.patch, AVRO-656.patch
>
>
> According to the specification, a union may contain multiple instances of a
> named type, provided they have different names.
> There are several bugs in
> the Java implementation of this when writing data:
>  - for record, only the short name of the record is checked, so the branch
> for a record of the same name in a different namespace may be used by mistake
>  - for enum and fixed, the name of the type is not checked, so the first
> enum or fixed in the union will always be assumed when writing. In many
> cases this may cause the wrong data to be written, potentially corrupting
> output.
> This is not a regression. This has never been implemented correctly by Java.
> Python and Ruby never check names, but rather perform a full, recursive
> validation of content.
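
The matching rule discussed in the comment (null equals null, equal strings match, null never matches a string) can be sketched without Avro on the classpath. This is only an illustration, not the code from the attached patch: the class UnionBranchSketch, its Kind enum, the Branch holder, and the -1 "no match" return value are all hypothetical stand-ins for Schema, Schema.Type, and whatever bestBranch() does after the loop.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for the branch selection discussed in AVRO-656.
final class UnionBranchSketch {
    // Hypothetical subset of schema types; RECORD, ENUM and FIXED are the named ones.
    enum Kind { RECORD, ENUM, FIXED, STRING, INT }

    // Hypothetical holder for one union branch: its type and full name.
    static final class Branch {
        final Kind kind;
        final String fullName; // null when no name is available

        Branch(Kind kind, String fullName) {
            this.kind = kind;
            this.fullName = fullName;
        }
    }

    static boolean isNamed(Kind kind) {
        return kind == Kind.RECORD || kind == Kind.ENUM || kind == Kind.FIXED;
    }

    // null matches null, equal strings match, null never matches a string.
    static boolean namesMatch(String written, String branch) {
        return Objects.equals(written, branch);
    }

    // Returns the index of the first branch with the same type and, for named
    // types, the same full name. Returns -1 if none matches (an assumption of
    // this sketch, not the behavior of the real method).
    static int bestBranch(List<Branch> union, Kind kind, String fullName) {
        for (int j = 0; j < union.size(); j++) {
            Branch b = union.get(j);
            if (b.kind != kind) {
                continue;
            }
            if (!isNamed(kind) || namesMatch(fullName, b.fullName)) {
                return j;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Two fixed types with the same short name in different namespaces.
        List<Branch> union = Arrays.asList(
            new Branch(Kind.FIXED, "org.example.a.Md5"),
            new Branch(Kind.FIXED, "org.example.b.Md5"),
            new Branch(Kind.STRING, null));

        System.out.println(bestBranch(union, Kind.FIXED, "org.example.b.Md5")); // 1
        System.out.println(bestBranch(union, Kind.FIXED, null));                // -1
        System.out.println(bestBranch(union, Kind.STRING, null));               // 2
    }
}
```

With the old behavior described in the issue, the first lookup would have picked branch 0, because the name of a fixed type was never compared. The second lookup shows the cost of the stricter rule: a writer that supplies no name no longer matches any named branch.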