[
https://issues.apache.org/jira/browse/DAFFODIL-2851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Lawrence resolved DAFFODIL-2851.
--------------------------------------
Resolution: Fixed
Fixed in commit d28fb524a7187a03c662381f5e222e4086461f9c
> Excessive allocations in StringOfSpecifiedLengthMixin
> -----------------------------------------------------
>
> Key: DAFFODIL-2851
> URL: https://issues.apache.org/jira/browse/DAFFODIL-2851
> Project: Daffodil
> Issue Type: Bug
> Components: Back End, Performance
> Reporter: Steve Lawrence
> Assignee: Steve Lawrence
> Priority: Major
> Fix For: 4.1.0
>
>
> The StringOfSpecifiedLengthMixin passes the value of the
> "maximumSimpleElementSizeInCharacters" tunable to the getSomeString function:
> https://github.com/apache/daffodil/blob/main/daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/StringLengthParsers.scala#L89-L94
>
> The getSomeString function calls withLocalCharBuffer, which allocates a char
> buffer of that size into which it decodes the string. Currently, the tunable
> defaults to 1MB. That is large enough to be a noticeable contributor to
> allocations and CPU usage when profiling.
>
> Fortunately, the allocated char buffer is cached and reused during the parse
> (though each parse allocates a new one), so it's only a one-time penalty per
> parse. But most files are not going to have single strings nearly that large,
> so this large allocation is just a waste.
>
> We should consider ways to reduce this allocation. Maybe simply decrease the
> tunable? Or maybe change the logic so StringOfSpecifiedLength allocates a
> much smaller amount and grows the buffer if needed, maybe taking bitLimit
> into account? Or maybe the buffer is shared among different parses in a
> ThreadLocal, so we still allocate a large buffer, but the penalty is only
> once per thread instead of once per parse? Likely other options...
--
This message was sent by Atlassian Jira
(v8.20.10#820010)