iprithv commented on code in PR #15982:
URL: https://github.com/apache/lucene/pull/15982#discussion_r3197561777
##########
lucene/backward-codecs/src/test/org/apache/lucene/backward_codecs/lucene102/Lucene102BinaryQuantizedVectorsWriter.java:
##########
@@ -714,11 +718,22 @@ public float[] copyValue(float[] vectorValue) {
throw new UnsupportedOperationException();
}
+ /**
+ * Returns the RAM usage of quantization-specific state only (magnitudes, dimensionSums, shallow
+ * object overhead). The underlying flat vector data is tracked separately by the
+ * rawVectorDelegate at the writer level to avoid double-counting.
+ */
+ long quantizationOverheadBytesUsed() {
+ long size = SHALLOW_SIZE;
+ size += magnitudes.ramBytesUsed();
+ size += RamUsageEstimator.sizeOf(dimensionSums);
+ return size;
+ }
+
@Override
public long ramBytesUsed() {
- long size = SHALLOW_SIZE;
+ long size = quantizationOverheadBytesUsed();
size += flatFieldVectorsWriter.ramBytesUsed();
Review Comment:
Yes, rawVectorDelegate is now the single source of truth for all flat vector
data (both byte and float32).
No double counting happens. FieldWriter.flatFieldVectorsWriter is the same
Java object that rawVectorDelegate holds internally as its per-field writer:
it is the object that this.rawVectorDelegate.addField(fieldInfo) returns,
which is then passed into new FieldWriter(fieldInfo, rawVectorDelegate). So
rawVectorDelegate.ramBytesUsed() already accounts for those float vectors.
The writer-level loop then calls field.quantizationOverheadBytesUsed(),
which counts only the FieldWriter shell + magnitudes + dimensionSums, NOT
flatFieldVectorsWriter. FieldWriter.ramBytesUsed() (which does include
flatFieldVectorsWriter.ramBytesUsed()) is never called from the writer-level
accounting; it exists solely to satisfy the Accountable interface. So each
byte of flat float data is counted exactly once, through rawVectorDelegate.
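
To make the accounting above concrete, here is a minimal, self-contained
sketch. It is not the Lucene code: every class in it (Accountable,
FlatFieldWriter, FlatDelegate, FieldWriter, Writer) is a simplified,
hypothetical stand-in, SHALLOW_SIZE is a made-up constant, and sizes are
approximated as 4 bytes per float rather than computed with
RamUsageEstimator. It only illustrates why summing the delegate plus the
per-field quantization overhead counts the flat data once.

```java
import java.util.ArrayList;
import java.util.List;

final class RamAccountingSketch {
  /** Hypothetical stand-in for Lucene's Accountable interface. */
  interface Accountable {
    long ramBytesUsed();
  }

  /** Stand-in for the per-field flat writer that buffers raw float vectors. */
  static final class FlatFieldWriter implements Accountable {
    private final List<float[]> vectors = new ArrayList<>();

    void add(float[] v) {
      vectors.add(v);
    }

    @Override
    public long ramBytesUsed() {
      long size = 0;
      for (float[] v : vectors) {
        size += (long) v.length * Float.BYTES; // approximation: payload only
      }
      return size;
    }
  }

  /** Stand-in for rawVectorDelegate: it owns every per-field flat writer. */
  static final class FlatDelegate implements Accountable {
    private final List<FlatFieldWriter> fields = new ArrayList<>();

    FlatFieldWriter addField() {
      FlatFieldWriter w = new FlatFieldWriter();
      fields.add(w);
      return w;
    }

    @Override
    public long ramBytesUsed() {
      long size = 0;
      for (FlatFieldWriter w : fields) {
        size += w.ramBytesUsed();
      }
      return size;
    }
  }

  /** Stand-in for the quantized FieldWriter; shares the delegate's flat writer. */
  static final class FieldWriter implements Accountable {
    static final long SHALLOW_SIZE = 32; // made-up constant for illustration
    private final FlatFieldWriter flat; // same object the delegate holds
    private final float[] dimensionSums;
    private int magnitudeCount;

    FieldWriter(FlatFieldWriter flat, int dim) {
      this.flat = flat;
      this.dimensionSums = new float[dim];
    }

    void addValue(float[] v) {
      flat.add(v);
      magnitudeCount++;
      for (int i = 0; i < dimensionSums.length; i++) {
        dimensionSums[i] += v[i];
      }
    }

    /** Quantization-specific state only; excludes the flat vector data. */
    long quantizationOverheadBytesUsed() {
      return SHALLOW_SIZE
          + (long) magnitudeCount * Float.BYTES
          + (long) dimensionSums.length * Float.BYTES;
    }

    /** Full per-field view; not used by the writer-level total below. */
    @Override
    public long ramBytesUsed() {
      return quantizationOverheadBytesUsed() + flat.ramBytesUsed();
    }
  }

  /** Writer-level accounting: delegate once, plus per-field overhead. */
  static final class Writer implements Accountable {
    final FlatDelegate delegate = new FlatDelegate();
    final List<FieldWriter> fields = new ArrayList<>();

    FieldWriter addField(int dim) {
      FieldWriter f = new FieldWriter(delegate.addField(), dim);
      fields.add(f);
      return f;
    }

    @Override
    public long ramBytesUsed() {
      long size = delegate.ramBytesUsed();
      for (FieldWriter f : fields) {
        size += f.quantizationOverheadBytesUsed();
      }
      return size;
    }
  }
}
```

In this sketch, replacing f.quantizationOverheadBytesUsed() with
f.ramBytesUsed() in the Writer loop would add each field's flat bytes a
second time, which is the double counting the change avoids.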
Thanks @shubhamvishu!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]