stevedlawrence commented on code in PR #1603:
URL: https://github.com/apache/daffodil/pull/1603#discussion_r2628235072


##########
daffodil-core/src/main/scala/org/apache/daffodil/io/LocalBuffer.scala:
##########
@@ -33,7 +33,13 @@ abstract class LocalBuffer[T <: java.nio.Buffer] {
   def getBuf(length: Long) = {
     Assert.usage(length <= Int.MaxValue)
     if (tempBuf.isEmpty || tempBuf.get.capacity < length) {
-      tempBuf = Maybe(allocate(length.toInt))
+      // allocate a buffer that can store the required length, but with a minimum size. The
+      // majority of LocalBuffers should be smaller than this minimum size and so should avoid
+      // costly reallocations, while still being small enough that the JVM should have no
+      // problem quickly allocating it
+      val minBufferSize = 1024
+      val allocationSize = math.max(length.toInt, minBufferSize)
+      tempBuf = Maybe(allocate(allocationSize))

Review Comment:
   I tested allocating the exact size needed vs. allocating one large buffer and 
reusing it, and I really didn't see much of a difference. Allocate once + reuse 
was about the same or maybe 3-5% faster, but that could be within the JVM noise.
   
   It's also possible it depends on the specific file and format. For very 
small formats with little reuse it's probably faster to just allocate exact 
sizes as needed. But if there are a lot of strings, maybe avoiding the 
allocations and GC overhead makes a difference. I can't really say without 
more testing, but I don't think there's a clear winner.
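
   For reference, the two strategies being compared can be sketched roughly as 
below. This is a minimal sketch, not the actual `LocalBuffer` code: 
`ReusableCharBuffer`, `BufferStrategies`, and the `allocations` counter are 
hypothetical names for illustration, and the `clear()`/`limit()` handling is an 
assumption about how a reused buffer would be handed out. It only counts 
allocations, which says nothing by itself about wall-clock time or GC cost.

```scala
import java.nio.CharBuffer

// Hypothetical stand-in for LocalBuffer[CharBuffer]; not Daffodil's actual class.
final class ReusableCharBuffer(minBufferSize: Int = 1024) {
  private var tempBuf: Option[CharBuffer] = None

  // Number of times a fresh buffer had to be allocated (illustration only).
  var allocations: Int = 0

  def getBuf(length: Long): CharBuffer = {
    require(length <= Int.MaxValue, "length must fit in an Int")
    if (tempBuf.isEmpty || tempBuf.get.capacity < length) {
      // Grow to the requested length, but never allocate below the minimum size.
      tempBuf = Some(CharBuffer.allocate(math.max(length.toInt, minBufferSize)))
      allocations += 1
    }
    val buf = tempBuf.get
    // Assumption: a reused buffer is reset and limited to the requested length.
    buf.clear()
    buf.limit(length.toInt)
    buf
  }
}

object BufferStrategies {
  // Strategy 1: allocate exactly what each request needs; one allocation per request.
  def exactSizes(lengths: Seq[Int]): Int = {
    lengths.foreach(n => CharBuffer.allocate(n))
    lengths.size
  }

  // Strategy 2: reuse a single buffer, reallocating only when a request outgrows it.
  def reuse(lengths: Seq[Int]): Int = {
    val local = new ReusableCharBuffer()
    lengths.foreach(n => local.getBuf(n.toLong))
    local.allocations
  }
}
```

   With request sizes like `Seq(10, 200, 50, 2000, 1500, 10)`, the exact-size 
strategy allocates once per request, while the reuse strategy allocates only 
for the first request and again when a request exceeds the current capacity. 
Whether that translates into a measurable speedup is exactly what the timings 
above leave open.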



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
