mayya-sharipova commented on code in PR #12436:
URL: https://github.com/apache/lucene/pull/12436#discussion_r1275409313
##########
lucene/core/src/java/org/apache/lucene/index/IndexingChain.java:
##########
@@ -621,6 +621,12 @@ private void initializeFieldInfo(PerField pf) throws IOException {
final Sort indexSort = indexWriterConfig.getIndexSort();
validateIndexSortDVType(indexSort, pf.fieldName, s.docValuesType);
}
+ if (s.vectorDimension != 0) {
+ validateMaxVectorDimension(
+ pf.fieldName,
+ s.vectorDimension,
+ indexWriterConfig.getCodec().knnVectorsFormat().getMaxDimensions());
+ }
Review Comment:
@jpountz Thank you for the additional feedback.
> I worry that this adds a hashtable lookup on a hot code path. Maybe it's
> not that bad for vectors, which are slow to index anyway, but I'd rather avoid
> it.
This is not really a hot code path. We call
`getCodec().knnVectorsFormat().getMaxDimensions` in the `initializeFieldInfo`
function, which runs only once per new field per segment.
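To illustrate the "once per new field per segment" point, here is a minimal toy sketch. It is not Lucene code: `ToySegment`, `lookUpMaxDimensions`, and the limit of 1024 are hypothetical stand-ins for the per-segment field state and the codec lookup. It only counts how often the limit is fetched when many documents share one vector field.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model, not Lucene code: counts how often the per-field limit lookup runs. */
final class OncePerFieldDemo {

  /** Hypothetical stand-in for a segment's per-field state. */
  static final class ToySegment {
    private final Map<String, Integer> dimsByField = new HashMap<>();
    private final int maxDimensions; // assumed limit, stands in for the codec's value
    int limitLookups = 0;

    ToySegment(int maxDimensions) {
      this.maxDimensions = maxDimensions;
    }

    /** Stands in for the codec lookup whose cost is being discussed. */
    private int lookUpMaxDimensions() {
      limitLookups++;
      return maxDimensions;
    }

    void addVector(String field, float[] vector) {
      Integer known = dimsByField.get(field);
      if (known == null) {
        // First time this field is seen in the segment: fetch the limit and validate once.
        int max = lookUpMaxDimensions();
        if (vector.length > max) {
          throw new IllegalArgumentException(
              "Field [" + field + "] has " + vector.length + " dimensions; max is " + max);
        }
        dimsByField.put(field, vector.length);
      } else if (known != vector.length) {
        // Later documents only compare against the remembered dimension.
        throw new IllegalArgumentException(
            "Field [" + field + "] expects " + known + " dimensions; got " + vector.length);
      }
    }
  }

  /** Indexes {@code docs} vectors into one field and returns the number of limit lookups. */
  static int lookupsAfterIndexing(int docs) {
    ToySegment segment = new ToySegment(1024);
    for (int i = 0; i < docs; i++) {
      segment.addVector("embedding", new float[128]);
    }
    return segment.limitLookups;
  }

  public static void main(String[] args) {
    System.out.println("limit lookups: " + lookupsAfterIndexing(10_000));
  }
}
```

In this model the lookup count depends on the number of distinct fields in the segment, not on the number of documents.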
> What about making the codec responsible for checking the limit?
Thanks for the suggestion. I experimented with this idea and ran into
the following difficulty:
- We need to create a new `FieldInfo` before passing it to
`KnnFieldVectorsWriter<?> addField(FieldInfo fieldInfo)`.
- We create it with `FieldInfo fi = fieldInfos.add(`, which adds it to
the global fieldInfos. This means that if the `FieldInfo` contains an incorrect
number of dimensions, it is stored that way in the global fieldInfos, and we
can't change it (for example, when a second document arrives with the correct
number of dims).
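The ordering problem described above can be sketched with a toy registry. This is an assumption-laden model, not Lucene's `FieldInfos`: `ToyFieldInfos`, `MAX_DIMS`, and both `add...` helpers are hypothetical. It only shows why a check that runs after the global registration leaves the bad dimension behind.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model, not Lucene code: compares checking the limit before vs. after registration. */
final class GlobalFieldInfosDemo {

  /** Hypothetical registry: the first dimension registered for a field name is permanent. */
  static final class ToyFieldInfos {
    private final Map<String, Integer> dims = new HashMap<>();

    void add(String name, int dim) {
      Integer existing = dims.putIfAbsent(name, dim);
      if (existing != null && existing != dim) {
        throw new IllegalArgumentException(
            "Field [" + name + "] already registered with " + existing + " dimensions");
      }
    }
  }

  static final int MAX_DIMS = 1024; // assumed limit, for illustration only

  static void checkLimit(String name, int dim) {
    if (dim > MAX_DIMS) {
      throw new IllegalArgumentException(
          "Field [" + name + "] has " + dim + " dimensions; max is " + MAX_DIMS);
    }
  }

  /** Models a codec-side check: the field is registered globally before validation. */
  static void addCheckingInCodec(ToyFieldInfos infos, String name, int dim) {
    infos.add(name, dim);
    checkLimit(name, dim);
  }

  /** Models a check in the indexing chain: validation happens before registration. */
  static void addCheckingFirst(ToyFieldInfos infos, String name, int dim) {
    checkLimit(name, dim);
    infos.add(name, dim);
  }

  /** Returns true if a valid document is accepted after an oversized one was rejected. */
  static boolean recoversAfterBadDoc(boolean checkInCodec) {
    ToyFieldInfos infos = new ToyFieldInfos();
    try {
      if (checkInCodec) addCheckingInCodec(infos, "v", 4096);
      else addCheckingFirst(infos, "v", 4096);
    } catch (IllegalArgumentException expected) {
      // The oversized document is rejected in both variants.
    }
    try {
      if (checkInCodec) addCheckingInCodec(infos, "v", 128);
      else addCheckingFirst(infos, "v", 128);
      return true;
    } catch (IllegalArgumentException stuck) {
      return false; // the registry still holds the bad dimension
    }
  }

  public static void main(String[] args) {
    System.out.println("check in codec recovers: " + recoversAfterBadDoc(true));
    System.out.println("check first recovers:    " + recoversAfterBadDoc(false));
  }
}
```

In this model, only the variant that validates before registering accepts a later document with a valid dimension.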
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]