dungba88 commented on code in PR #12624:
URL: https://github.com/apache/lucene/pull/12624#discussion_r1415869383
##########
lucene/core/src/java/org/apache/lucene/util/fst/FSTCompiler.java:
##########
@@ -120,22 +125,44 @@ public class FSTCompiler<T> {
final float directAddressingMaxOversizingFactor;
long directAddressingExpansionCredit;
- final BytesStore bytes;
+ // the DataOutput to stream the FST bytes to
+ final DataOutput dataOutput;
+
+ // buffer to store bytes for the one node we are currently writing
+ final GrowableByteArrayDataOutput scratchBytes = new GrowableByteArrayDataOutput();
+
+ private long numBytesWritten;
+
+ /**
+ * Get an on-heap DataOutput that allows the FST to be read immediately after writing.
Review Comment:
I added a new test, Test2BFSTOffHeap, and ran it:
```
100000: 424 RAM bytes used; 39257811 FST bytes; 19189176 nodes; took 23 seconds
200000: 424 RAM bytes used; 78522623 FST bytes; 38378071 nodes; took 49 seconds
300000: 424 RAM bytes used; 117788163 FST bytes; 57567190 nodes; took 80 seconds
400000: 424 RAM bytes used; 157053095 FST bytes; 76756389 nodes; took 107 seconds
500000: 424 RAM bytes used; 196318494 FST bytes; 95945639 nodes; took 133 seconds
600000: 424 RAM bytes used; 235583412 FST bytes; 115134691 nodes; took 170 seconds
700000: 480 RAM bytes used; 274866378 FST bytes; 134324199 nodes; took 198 seconds
800000: 480 RAM bytes used; 314246540 FST bytes; 153513668 nodes; took 222 seconds
900000: 480 RAM bytes used; 353626848 FST bytes; 172703151 nodes; took 245 seconds
1000000: 480 RAM bytes used; 393006717 FST bytes; 191892620 nodes; took 277 seconds
1100000: 480 RAM bytes used; 432387052 FST bytes; 211082115 nodes; took 311 seconds
1200000: 480 RAM bytes used; 471766692 FST bytes; 230271461 nodes; took 334 seconds
1300000: 480 RAM bytes used; 511147081 FST bytes; 249461034 nodes; took 357 seconds
```
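For reading the log above, a small sketch that derives bytes per node and nodes per second from a single line may help. `FstLogStats` and `summarize` are hypothetical names used only for illustration; they are not part of Lucene or of this PR, and the regex simply assumes the line format printed above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading the log output; not part of Lucene or this PR.
class FstLogStats {
  // Assumes the line format shown in the log, e.g.
  // "100000: 424 RAM bytes used; 39257811 FST bytes; 19189176 nodes; took 23 seconds"
  private static final Pattern LINE =
      Pattern.compile(
          "(\\d+):\\s+(\\d+) RAM bytes used;\\s+(\\d+) FST bytes;"
              + "\\s+(\\d+) nodes;\\s+took\\s+(\\d+)\\s+seconds");

  /** Returns {bytesPerNode, nodesPerSecond} for one log line. */
  static double[] summarize(String line) {
    Matcher m = LINE.matcher(line);
    if (!m.find()) {
      throw new IllegalArgumentException("Unrecognized log line: " + line);
    }
    long fstBytes = Long.parseLong(m.group(3));
    long nodes = Long.parseLong(m.group(4));
    long seconds = Long.parseLong(m.group(5));
    return new double[] {(double) fstBytes / nodes, (double) nodes / seconds};
  }

  public static void main(String[] args) {
    // First and last lines copied from the log above.
    String[] lines = {
      "100000: 424 RAM bytes used; 39257811 FST bytes; 19189176 nodes; took 23 seconds",
      "1300000: 480 RAM bytes used; 511147081 FST bytes; 249461034 nodes; took 357 seconds"
    };
    for (String line : lines) {
      double[] s = summarize(line);
      System.out.printf("%.3f bytes/node, %.0f nodes/sec%n", s[0], s[1]);
    }
  }
}
```

On these two lines the ratio works out to roughly 2 bytes per node at both ends of the run, while the reported RAM usage stays between 424 and 480 bytes.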
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]