[ 
https://issues.apache.org/jira/browse/AVRO-75?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728834#action_12728834
 ] 

Doug Cutting commented on AVRO-75:
----------------------------------

> This is the only place the word "unset" is used in the doc [ ... ]

That may be a bug.  We don't currently say what happens when a reader has a 
field that a writer didn't write and where no default value is specified in the 
reader's schema.  In this case we might provide some implementation 
flexibility.  An optimized implementation might have an int field that's just 
set to zero in this case, with no means to differentiate this from an actual 
zero value for the field, or it might provide a means to tell whether this 
field was set or not.  I am not yet convinced that the spec should mandate this 
behavior; it might instead define such fields as "unset", which may or may not 
be detectable, depending on the implementation.
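
As an illustration only (this is not Avro code, and every name in it is 
hypothetical), the two implementation styles described above might be sketched 
as follows.  A map stands in for the fields the writer actually wrote; one 
reader stores a plain int and cannot tell "unset" from zero, while the other 
makes "unset" detectable.

```java
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical sketch: none of these names come from Avro.
final class UnsetFieldSketch {

    // Style A: primitive storage.  A field the writer did not write is left
    // at 0, which cannot be told apart from a field written as 0.
    static int readPrimitive(Map<String, Integer> written, String field) {
        Integer value = written.get(field);
        return value == null ? 0 : value;
    }

    // Style B: detectable "unset".  An empty OptionalInt means the writer
    // did not write the field.
    static OptionalInt readDetectable(Map<String, Integer> written, String field) {
        Integer value = written.get(field);
        return value == null ? OptionalInt.empty() : OptionalInt.of(value);
    }

    public static void main(String[] args) {
        // Assumption: the writer wrote "count" (as zero) but not "retries".
        Map<String, Integer> fromWriter = Map.of("count", 0);
        System.out.println(readPrimitive(fromWriter, "count"));
        System.out.println(readPrimitive(fromWriter, "retries"));
        System.out.println(readDetectable(fromWriter, "retries").isPresent());
    }
}
```

With style A both lookups yield the same int, so a caller cannot recover 
whether "retries" was ever written; style B pays for the extra information 
with a wrapper object.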

This gets back to the issue of optional/required fields.  I think the current 
intent is to treat all fields as optional: if a field must have a valid 
value, then one can specify a default value in the schema rather than require 
that all writers already have that field.

> I propose that we declare this case an error [ ... ]

"Unset" may make sense in this case too.  In Java's reflect and specific APIs, 
an Enum instance could be null.  This is somewhat analogous to a field that's 
been added to the writer but not yet to the reader.  A reader that requires 
this might provide a default value that would be used when either the writer 
does not provide a value or the writer provides an unknown value.

That said, I don't have a strong feeling and would be willing to make this an 
error if others can explain why that should be preferred.

> GenericDatumReader should be updated to throw an error in this case.

Or, if we decide that "unset" is useful here, we could have it use null or the 
default in this case.  We'd then need to update ReflectDatumReader too, as it 
currently throws an exception in this case.  In either case, the code does not 
conform to the spec.
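
For illustration, the three options discussed above (signal an error, treat 
the value as unset/null, or fall back to a reader-supplied default) can be 
sketched in plain Java.  This is not the GenericDatumReader or 
ReflectDatumReader code; the class, the Policy enum, and the use of null for 
"the writer provided no value" are all assumptions made for the sketch.

```java
import java.util.List;

// Hypothetical sketch: these names are illustrative and are not Avro APIs.
final class EnumResolutionSketch {

    enum Policy { ERROR, UNSET, DEFAULT }

    // Resolve the writer's symbol against the symbols the reader knows.
    // Assumption: a null writerSymbol stands for "the writer provided no value".
    static String resolve(String writerSymbol, List<String> readerSymbols,
                          Policy policy, String readerDefault) {
        if (writerSymbol != null && readerSymbols.contains(writerSymbol)) {
            return writerSymbol; // known symbol: pass it through unchanged
        }
        switch (policy) {
            case ERROR:
                // The proposal in the issue: signal an error.
                throw new IllegalArgumentException(
                        "Symbol not in reader's enum: " + writerSymbol);
            case UNSET:
                // "Unset" modeled as null, as an Enum instance can be in Java.
                return null;
            default:
                // Fall back to the default the reader supplied.
                return readerDefault;
        }
    }
}
```

For example, with reader symbols RED and GREEN, resolving the writer symbol 
BLUE throws under ERROR, returns null under UNSET, and returns the supplied 
default under DEFAULT.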


> Clarify resolution for enums (and fix code)
> -------------------------------------------
>
>                 Key: AVRO-75
>                 URL: https://issues.apache.org/jira/browse/AVRO-75
>             Project: Avro
>          Issue Type: Bug
>          Components: spec
>            Reporter: Raymie Stata
>            Assignee: Doug Cutting
>
> The current resolution rule for enums says: "if the writer's symbol is not 
> present in the reader's enum, then the enum value is unset."  This is the 
> only place the word "unset" is used in the doc, and it's not clear what you 
> mean.  The code seems to be inconsistent: GenericDatumReader will happily 
> return a symbol the reader doesn't understand; ReflectDatumReader will 
> probably throw a class-not-found exception; ResolvingDecoder throws an error.
> I propose that we declare this case an error, i.e., rewrite the spec to "if 
> the writer's symbol is not listed in the reader's enum, an error is 
> signaled."  GenericDatumReader should be updated to throw an error in this 
> case.
> If we decide to stick with the "unset" language, we need to define what 
> "unset" means (and, if necessary, update ReflectDatumReader and 
> ResolvingDecoder).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
