[ https://issues.apache.org/jira/browse/AVRO-4249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated AVRO-4249:
---------------------------------
Labels: pull-request-available (was: )
> Avoid parsing a schema if we have a cache
> -----------------------------------------
>
> Key: AVRO-4249
> URL: https://issues.apache.org/jira/browse/AVRO-4249
> Project: Apache Avro
> Issue Type: Improvement
> Components: java
> Affects Versions: 1.12.1
> Reporter: Michael Skells
> Priority: Minor
> Labels: pull-request-available
> Original Estimate: 24h
> Time Spent: 10m
> Remaining Estimate: 23h 50m
>
> In an environment where we have many small Avro files and large schemas, we
> see multiple TB of schemas being generated when we open files.
> As the schema is a well-defined structure, and the transformation from its
> serialised form is expensive, it seems logical to provide a cache. If
> the serialised form of the schema is the same, then the parsed version can be
> cached.
>
> I can provide a PR for discussion.
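
The caching idea in the description can be sketched roughly as follows. This is only an illustration under stated assumptions, not the change proposed in the PR: the class name ParsedSchemaCache and its methods are hypothetical, and the parser is passed in as a function so the sketch does not depend on Avro itself. In real use the function could be something like `json -> new Schema.Parser().parse(json)`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper: caches parsed schemas keyed by their serialised form.
// T stands in for the parsed schema type (for Avro, org.apache.avro.Schema).
final class ParsedSchemaCache<T> {
    private final Map<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> parser;

    ParsedSchemaCache(Function<String, T> parser) {
        this.parser = parser;
    }

    // Returns the cached parsed schema for this exact serialised text,
    // invoking the (expensive) parser only the first time the text is seen.
    T get(String schemaJson) {
        return cache.computeIfAbsent(schemaJson, parser);
    }

    // Number of distinct serialised schemas currently cached.
    int size() {
        return cache.size();
    }
}
```

Two files whose headers carry byte-identical schema text would then share one parsed object instead of each producing its own. The sketch keys on exact string equality, so schemas that differ only in whitespace are cached separately, and it never evicts entries; a real implementation would likely need a size bound or weak references, and sharing is only safe if the parsed object is not mutated afterwards.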
--
This message was sent by Atlassian Jira
(v8.20.10#820010)