dungba88 commented on code in PR #12624:
URL: https://github.com/apache/lucene/pull/12624#discussion_r1415869383


##########
lucene/core/src/java/org/apache/lucene/util/fst/FSTCompiler.java:
##########
@@ -120,22 +125,44 @@ public class FSTCompiler<T> {
   final float directAddressingMaxOversizingFactor;
   long directAddressingExpansionCredit;
 
-  final BytesStore bytes;
+  // the DataOutput to stream the FST bytes to
+  final DataOutput dataOutput;
+
+  // buffer to store bytes for the one node we are currently writing
+  final GrowableByteArrayDataOutput scratchBytes = new GrowableByteArrayDataOutput();
+
+  private long numBytesWritten;
+
+  /**
+   * Get an on-heap DataOutput that allows the FST to be read immediately after writing.

Review Comment:
   I added a new Test2BFSTOffHeap test. Here is the output from running it:
   
   ```
   100000: 424 RAM bytes used; 39257811 FST bytes; 19189176 nodes; took 23 seconds
   200000: 424 RAM bytes used; 78522623 FST bytes; 38378071 nodes; took 49 seconds
   300000: 424 RAM bytes used; 117788163 FST bytes; 57567190 nodes; took 80 seconds
   400000: 424 RAM bytes used; 157053095 FST bytes; 76756389 nodes; took 107 seconds
   500000: 424 RAM bytes used; 196318494 FST bytes; 95945639 nodes; took 133 seconds
   600000: 424 RAM bytes used; 235583412 FST bytes; 115134691 nodes; took 170 seconds
   700000: 480 RAM bytes used; 274866378 FST bytes; 134324199 nodes; took 198 seconds
   800000: 480 RAM bytes used; 314246540 FST bytes; 153513668 nodes; took 222 seconds
   900000: 480 RAM bytes used; 353626848 FST bytes; 172703151 nodes; took 245 seconds
   1000000: 480 RAM bytes used; 393006717 FST bytes; 191892620 nodes; took 277 seconds
   1100000: 480 RAM bytes used; 432387052 FST bytes; 211082115 nodes; took 311 seconds
   1200000: 480 RAM bytes used; 471766692 FST bytes; 230271461 nodes; took 334 seconds
   1300000: 480 RAM bytes used; 511147081 FST bytes; 249461034 nodes; took 357 seconds
   ```
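
For readers following the numbers above: the reported RAM stays in the hundreds of bytes while the FST bytes grow into the hundreds of megabytes, presumably because each node is serialized into a small scratch buffer and then streamed straight to the output. Below is a minimal sketch of that buffering pattern, using only `java.io`. It is not the code from this PR: the class `StreamingNodeWriterSketch`, its methods, and the use of `OutputStream` in place of Lucene's `DataOutput` are all hypothetical stand-ins chosen for illustration, and real node encoding is far more involved.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical illustration only; this is not Lucene's FSTCompiler. */
public class StreamingNodeWriterSketch {
  // Stand-in for the DataOutput that the FST bytes are streamed to.
  private final OutputStream out;

  // Stand-in for scratchBytes: holds the bytes of the one node being written.
  private final ByteArrayOutputStream scratch = new ByteArrayOutputStream();

  // Running total of bytes pushed to the output so far.
  private long numBytesWritten;

  public StreamingNodeWriterSketch(OutputStream out) {
    this.out = out;
  }

  /**
   * Buffers one already-encoded node, flushes it to the output, and returns
   * the offset at which the node starts (a simplification for this sketch).
   */
  public long writeNode(byte[] nodeBytes) {
    long startOffset = numBytesWritten;
    scratch.reset(); // reuse the same small buffer for every node
    scratch.write(nodeBytes, 0, nodeBytes.length);
    try {
      scratch.writeTo(out); // stream the node out; nothing accumulates on heap
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    numBytesWritten += scratch.size();
    return startOffset;
  }

  public long numBytesWritten() {
    return numBytesWritten;
  }

  public static void main(String[] args) {
    // A sink that discards everything, standing in for an off-heap destination.
    OutputStream discard = new OutputStream() {
      @Override
      public void write(int b) {}

      @Override
      public void write(byte[] b, int off, int len) {}
    };
    StreamingNodeWriterSketch writer = new StreamingNodeWriterSketch(discard);
    for (int i = 0; i < 1000; i++) {
      writer.writeNode(new byte[] {(byte) i, (byte) (i >> 8)});
    }
    System.out.println(writer.numBytesWritten() + " bytes streamed");
  }
}
```

In this sketch the only state that grows with the input is the `long` counter, so the writer's heap footprint is bounded by the largest single node rather than by the total output size.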



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

