stevedlawrence commented on code in PR #1603:
URL: https://github.com/apache/daffodil/pull/1603#discussion_r2627462497
##########
daffodil-core/src/main/scala/org/apache/daffodil/io/LocalBuffer.scala:
##########
@@ -33,7 +33,13 @@ abstract class LocalBuffer[T <: java.nio.Buffer] {
def getBuf(length: Long) = {
Assert.usage(length <= Int.MaxValue)
if (tempBuf.isEmpty || tempBuf.get.capacity < length) {
- tempBuf = Maybe(allocate(length.toInt))
+ // allocate a buffer that can store the required length, but with a minimum size. The
+ // majority of LocalBuffers should be smaller than this minimum size and so should avoid
+ // costly reallocations, while still being small enough that the JVM should have no
+ // problem quickly allocating it
+ val minBufferSize = 1024
+ val allocationSize = math.max(length.toInt, minBufferSize)
+ tempBuf = Maybe(allocate(allocationSize))
Review Comment:
I think the reuse worked (I'll double check as you suggest). The issue is
that the reuse only happens within a single parse, because the buffer is local
to the PState. So every time we call parse() we allocate a new 1MB buffer (if
there is a specified length string in the format) and reuse that buffer for
the lifetime of that single parse. When that parse ends, the buffer is garbage
collected along with the PState, and a new one is allocated the next time
parse() is called.
For files that are very small (e.g. <1KB) the 1MB allocation makes a
noticeable difference. For larger files it matters less, because 1MB isn't
very much compared to all the other allocations Daffodil does.
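To make the sizing behavior concrete, here is a minimal sketch of the logic in the diff above. It is not the actual LocalBuffer class: `SketchLocalBuffer` and its `allocations` counter are hypothetical names, `Option` and `CharBuffer` stand in for `Maybe` and the generic buffer type, and the `clear`/`limit` calls at the end are an assumption about how the buffer is handed back.

```scala
import java.nio.CharBuffer

// Hypothetical, simplified stand-in for LocalBuffer (assumption: one instance
// lives for the duration of a single parse, like the one held by the PState).
final class SketchLocalBuffer(minBufferSize: Int = 1024) {
  private var tempBuf: Option[CharBuffer] = None

  // Counts real allocations so the reuse behavior can be observed.
  var allocations: Int = 0

  def getBuf(length: Long): CharBuffer = {
    require(length <= Int.MaxValue)
    // Allocate only if there is no buffer yet or the existing one is too small.
    if (tempBuf.forall(_.capacity < length)) {
      // Never allocate less than the minimum, so small requests share one buffer.
      val allocationSize = math.max(length.toInt, minBufferSize)
      tempBuf = Some(CharBuffer.allocate(allocationSize))
      allocations += 1
    }
    val buf = tempBuf.get
    buf.clear()
    buf.limit(length.toInt)
    buf
  }
}
```

With this sketch, requests for 100 and then 1000 elements share a single 1024-element allocation, and only a request larger than the current capacity (e.g. 5000) triggers a second one. A new instance, like a new PState, starts over with no buffer.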
An alternative would be to make these thread locals so they are reused across
parse() calls on the same thread. Then we could go back to the 1MB buffer size
and would only take a single hit per thread. But I don't think the 1MB buffer
size is all that important. It's going to be very rare for a single specified
length string to be very big, likely nowhere close to 1MB.
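For reference, the thread-local alternative might look roughly like the sketch below. This is only an illustration under assumptions, not something in the PR: `ThreadLocalCharBuffer` is a hypothetical name, and the initial size is counted in buffer elements as a stand-in for the "1MB" figure.

```scala
import java.nio.CharBuffer

// Hypothetical sketch: one buffer per thread, reused across parse() calls
// that run on that thread.
object ThreadLocalCharBuffer {
  // Assumption: 1024 * 1024 elements stands in for the 1MB size discussed.
  private val InitialSize = 1024 * 1024

  private val holder = new ThreadLocal[CharBuffer] {
    // Called lazily, once per thread, the first time that thread calls get().
    override def initialValue(): CharBuffer = CharBuffer.allocate(InitialSize)
  }

  def getBuf(length: Int): CharBuffer = {
    var buf = holder.get()
    if (buf.capacity < length) {
      // Rare case: the request exceeds the initial size, so grow and keep it.
      buf = CharBuffer.allocate(length)
      holder.set(buf)
    }
    buf.clear()
    buf.limit(length)
    buf
  }
}
```

The tradeoff is that each thread pays the large allocation once and then holds on to that memory for as long as the thread lives, instead of releasing it when each parse ends.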