[ 
https://issues.apache.org/jira/browse/AVRO-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17043533#comment-17043533
 ] 

Erik Erlandson commented on AVRO-2748:
--------------------------------------

Yes, matching schemas once, during DatumReader construction, is exactly what I 
am thinking.  And I think you hit on the case I was confused about - resolving 
"union" types, where union options might or might not be compatible.

One idea I was toying with was doing once-up-front schema matching IF such 
matches are unambiguous, i.e. if no union types are in play. Possibly I am 
still missing some subtleties, but if neither the write schema nor the read 
schema has unions, then it still seems possible to either match or fail up 
front and not have to do it again. Schemas with no union types seem like a 
pretty relevant use case.
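
To make the "no unions in play" test concrete, here is a rough sketch of how 
it might be checked. It works on the JSON-decoded form of a schema (plain 
dicts, lists and strings) rather than on the avro library's schema objects, 
and the helper names contains_union and can_resolve_up_front are hypothetical, 
not part of the python avro package.

```python
def contains_union(schema):
    """Return True if a JSON-decoded Avro schema contains a union anywhere.

    Hypothetical helper: operates on plain dicts/lists/strings, not on the
    avro library's schema classes.
    """
    if isinstance(schema, list):
        # In Avro's JSON encoding, a JSON array denotes a union.
        return True
    if isinstance(schema, dict):
        kind = schema.get("type")
        if kind in ("record", "error"):
            return any(contains_union(f["type"]) for f in schema.get("fields", []))
        if kind == "array":
            return contains_union(schema["items"])
        if kind == "map":
            return contains_union(schema["values"])
        if isinstance(kind, (dict, list)):
            # Nested form such as {"type": {"type": "array", ...}}.
            return contains_union(kind)
    # A primitive name, enum, fixed, or a reference to a named type.
    return False


def can_resolve_up_front(write_schema, read_schema):
    """True when neither schema has a union, so matching is unambiguous."""
    return not (contains_union(write_schema) or contains_union(read_schema))


if __name__ == "__main__":
    plain = {"type": "record", "name": "A",
             "fields": [{"name": "x", "type": "long"}]}
    optional = {"type": "record", "name": "A",
                "fields": [{"name": "x", "type": ["null", "long"]}]}
    print(can_resolve_up_front(plain, plain))
    print(can_resolve_up_front(plain, optional))
```

If the check passes, the reader could in principle match or fail once at 
construction time; if it does not, it would fall back to the current 
per-read resolution.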

> python schema resolution occurs on every read
> ---------------------------------------------
>
>                 Key: AVRO-2748
>                 URL: https://issues.apache.org/jira/browse/AVRO-2748
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: python
>    Affects Versions: 1.9.2
>            Reporter: Erik Erlandson
>            Priority: Minor
>
> In python, the schema resolution appears to be happening on each read 
> operation. I'm not an avro expert but in my perusing through the python io 
> code I haven't yet noticed a reason that the schema resolution couldn't 
> happen once up front, during the construction of DataFileReader, when it 
> first loads the write_schema.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
