[
https://issues.apache.org/jira/browse/SOLR-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817824#comment-17817824
]
Chris M. Hostetter commented on SOLR-17164:
-------------------------------------------
FWIW: I don't have the bandwidth to fully dig into this right now,
but I hope to in the next few weeks or so.
If someone else wants to tackle this in the meantime, go right ahead.
Here's a rough pseudo-code outline of how I think this would be doable in a
backcompat way (w/o needing a new function name)...
{noformat}
final String arg1Str = fp.parseArg();
throw error if arg1Str is null or !fp.hasMoreArguments()
final boolean constVec = '[' == fp.sp.peek();
final String arg2Str = constVec ? null : fp.parseArg();
if (fp.hasMoreArguments() && null != arg2Str) {
  parse existing arg1Str/arg2Str to get vectorEncoding/similarityFunction
  then continue with existing vectorSimilarity() valuesource parser logic & return
}
final SchemaField field1 = ... lookup arg1Str in schema ...
throw error if field1.getType() is not DenseVectorField
final VectorEncoding vectorEncoding = field1.getFieldType().get...
final VectorSimilarityFunction similarityFunction = field1.getFieldType().get...
final ValueSource v1 = field1.getType().getValueSource(...)
ValueSource v2 = null;
if (constVec) {
  v2 = fp.parseValueSource( ... use vectorEncoding for flags ... )
} else {
  final SchemaField field2 = ... lookup arg2Str in schema ...
  throw error if field2.getType() is not DenseVectorField
  throw error if vectorEncoding or similarityFunction don't match field2.getType()
  v2 = field2.getType().getValueSource(...)
}
return new $(vectorEncoding)VectorSimilarityFunction(similarityFunction, v1, v2)
{noformat}
...but I may be overlooking some nuances, and we'd obviously want to ensure we
add a lot of good tests (particularly of the error cases).
(Worst-case scenario, a new "function name" could be picked for this two-arg
version ... a la: {{fieldVectorSimilarity(vecField,vecField|constantVec)}} ...
or something like that.)
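
To make the branching in the outline above easier to test in isolation, here is a minimal, Solr-free sketch of the same decision logic. Everything in it is hypothetical: the class {{VectorSimilarityArgs}}, the {{VecType}} and {{Plan}} holders, and the map that stands in for the schema are illustration-only names, and the encoding/similarity values are plain strings. It also assumes the arguments have already been split into a list, which the real streaming parser would not do, so treat it as a model of the rules rather than parser code.

```java
import java.util.List;
import java.util.Map;

// Hypothetical, Solr-free model of the argument handling outlined above.
class VectorSimilarityArgs {

  /** Stand-in for the two properties a DenseVectorField type is assumed to expose. */
  static final class VecType {
    final String encoding;
    final String similarity;

    VecType(String encoding, String similarity) {
      this.encoding = encoding;
      this.similarity = similarity;
    }
  }

  /** Outcome of argument resolution: the settings to use plus the two operands. */
  static final class Plan {
    final String encoding;
    final String similarity;
    final String vec1;
    final String vec2;

    Plan(String encoding, String similarity, String vec1, String vec2) {
      this.encoding = encoding;
      this.similarity = similarity;
      this.vec1 = vec1;
      this.vec2 = vec2;
    }

    @Override
    public String toString() {
      return similarity + "/" + encoding + "(" + vec1 + ", " + vec2 + ")";
    }
  }

  /** A constant vector is recognized by its leading '[' (mirrors the peek() check). */
  static boolean isConstVec(String arg) {
    return arg.startsWith("[");
  }

  /** Looks a name up in the stand-in schema; unknown names are rejected. */
  static VecType lookup(Map<String, VecType> schema, String name) {
    VecType type = schema.get(name);
    if (type == null) {
      throw new IllegalArgumentException("not a dense vector field: " + name);
    }
    return type;
  }

  static Plan resolve(List<String> args, Map<String, VecType> schema) {
    if (args.size() == 4) {
      // Existing form: encoding and similarity are given explicitly.
      return new Plan(args.get(0), args.get(1), args.get(2), args.get(3));
    }
    if (args.size() != 2) {
      throw new IllegalArgumentException("expected 2 or 4 arguments, got " + args.size());
    }
    String arg1 = args.get(0);
    String arg2 = args.get(1);
    VecType type1 = lookup(schema, arg1); // first argument must be a vector field
    if (!isConstVec(arg2)) {
      // Second argument is a field name: both field types must agree.
      VecType type2 = lookup(schema, arg2);
      if (!type1.encoding.equals(type2.encoding)
          || !type1.similarity.equals(type2.similarity)) {
        throw new IllegalArgumentException(
            "field types of " + arg1 + " and " + arg2 + " do not match");
      }
    }
    // Settings are always inferred from the first field.
    return new Plan(type1.encoding, type1.similarity, arg1, arg2);
  }

  public static void main(String[] argv) {
    // Illustrative schema; the encoding and similarity values are assumptions.
    Map<String, VecType> schema = Map.of(
        "title_float_vec_dim4", new VecType("FLOAT32", "COSINE"),
        "body_float_vec_dim4", new VecType("FLOAT32", "COSINE"));
    System.out.println(
        resolve(List.of("title_float_vec_dim4", "[1.0,2.0,3.0,4.0]"), schema));
    System.out.println(
        resolve(List.of("title_float_vec_dim4", "body_float_vec_dim4"), schema));
  }
}
```

The error cases worth covering in real tests map directly onto the three {{IllegalArgumentException}} branches here: a wrong argument count, a first or second argument that is not a dense vector field, and two fields whose encoding or similarity function differ.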
> Add 2 arg variant of vectorSimilarity() function
> ------------------------------------------------
>
> Key: SOLR-17164
> URL: https://issues.apache.org/jira/browse/SOLR-17164
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Chris M. Hostetter
> Priority: Major
>
> Solr's current 4 argument
> {{vectorSimilarity(vectorEncoding,similarityFunction,vec1,vec2)}} function
> is really awkward to use for the (seemingly common) situation where you just
> want to know the similarity between a field and a constant vector, or
> (probably less common) between two fields of the same type.
> The first two (currently) mandatory arguments to {{vectorSimilarity()}}
> ({{vectorEncoding}} and {{similarityFunction}}) are already mandatory
> properties of {{DenseVectorField}}. IIUC the only reason these arguments
> are required is in the (seemingly uncommon?) situation where you might want
> to compute the similarity of two vector constants, so the function needs to
> know what {{vectorEncoding}} and {{similarityFunction}} to use.
>
> ----
>
> It would be really nice to support a simplified 2 argument variant of
> {{vectorSimilarity()}} such that:
> * the first argument must be the name of a {{DenseVectorField}} field
> * the second argument must be either:
> ** A vector constant
> *** in which case the {{vectorEncoding}} used to parse the constant is
> inferred from the fieldType properties of the first argument
> ** Or the name of a second {{DenseVectorField}} field
> *** in which case the {{vectorEncoding}} and {{similarityFunction}} of the
> two fields must match
> * The ValueSource returned should be based on the configured
> {{vectorEncoding}} & {{similarityFunction}} of the field(s)
> Examples...
> {noformat}
> vectorSimilarity(title_float_vec_dim4, [1.0,2.0,3.0,4.0])
> ...or...
> vectorSimilarity(title_float_vec_dim4, body_float_vec_dim4)
> {noformat}