[ 
https://issues.apache.org/jira/browse/AVRO-4249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated AVRO-4249:
---------------------------------
    Labels: pull-request-available  (was: )

> Avoid parsing a schema if we have a cache
> -----------------------------------------
>
>                 Key: AVRO-4249
>                 URL: https://issues.apache.org/jira/browse/AVRO-4249
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.12.1
>            Reporter: Michael Skells
>            Priority: Minor
>              Labels: pull-request-available
>   Original Estimate: 24h
>          Time Spent: 10m
>  Remaining Estimate: 23h 50m
>
> In an environment where we have many small Avro files and large schemas, we 
> see multiple TB of schemas being generated when we open files.
> As the schema is a well-defined structure, and the transform from the 
> serialised form is expensive, it seems logical to provide a cache. If 
> the serialised form of the schema is the same, then the parsed version can be 
> cached.
>  
> I can provide a PR for discussion
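
For discussion, the idea described above could be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: the class name ParsedSchemaCache and its methods are hypothetical, and the parser is passed in as a function so the sketch compiles without Avro on the classpath. The cache is unbounded here; a real implementation would probably need a size limit or weak references.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/**
 * Hypothetical helper (not part of Avro): caches the parsed form of a
 * schema, keyed by its serialised text, so identical text is parsed once.
 */
final class ParsedSchemaCache<S> {
    // Key: serialised schema text. Value: the parsed schema object.
    private final ConcurrentMap<String, S> cache = new ConcurrentHashMap<>();
    // The expensive transform from serialised form to parsed form.
    private final Function<String, S> parser;

    ParsedSchemaCache(Function<String, S> parser) {
        this.parser = parser;
    }

    /** Returns the cached parsed schema, parsing only on the first request for this text. */
    S get(String schemaJson) {
        return cache.computeIfAbsent(schemaJson, parser);
    }

    /** Number of distinct serialised schemas seen so far. */
    int size() {
        return cache.size();
    }
}
```

With Avro, the parser function would presumably be something like `json -> new Schema.Parser().parse(json)`, creating a fresh parser per call so that named types from one schema do not collide with another. Every file whose header carries the same schema text would then share one parsed Schema instance.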



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
