That would work for me, though this is something that I would like to be
documented as not recommended.
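
For illustration, here is a minimal sketch of what combining the two
parameters could mean. It is a brute-force scan, not the HNSW traversal, and
everything in it is an assumption made for the example: the class
ThresholdKnnSketch, its search method, and the choice of squared Euclidean
distance are hypothetical and not part of Lucene.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical helper (not a Lucene class): search bounded by both k and a distance threshold. */
final class ThresholdKnnSketch {

  /** Squared Euclidean distance; assumed here only for illustration. */
  static float squaredDistance(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      float d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }

  /**
   * Returns the ordinals of at most k vectors whose distance to the query is
   * strictly less than the threshold, nearest first.
   */
  static List<Integer> search(float[][] vectors, float[] query, int k, float threshold) {
    List<Integer> hits = new ArrayList<>();
    if (k <= 0) {
      return hits;
    }
    float[] dist = new float[vectors.length];
    // Max-heap on distance: the head is the farthest hit kept so far.
    PriorityQueue<Integer> heap =
        new PriorityQueue<>((a, b) -> Float.compare(dist[b], dist[a]));
    for (int ord = 0; ord < vectors.length; ord++) {
      dist[ord] = squaredDistance(vectors[ord], query);
      if (dist[ord] >= threshold) {
        continue; // outside the radius, never a candidate
      }
      if (heap.size() < k) {
        heap.add(ord);
      } else if (dist[ord] < dist[heap.peek()]) {
        heap.poll(); // evict the farthest hit to keep the k bound
        heap.add(ord);
      }
    }
    hits.addAll(heap);
    hits.sort((a, b) -> Float.compare(dist[a], dist[b]));
    return hits;
  }
}
```

With k = Integer.MAX_VALUE the k bound never triggers, so every vector under
the threshold is returned and the result size depends only on the radius.
That is the unbounded case I would want documented as not recommended.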

On Thu, Nov 10, 2022 at 2:33 PM Alexey Gorlenko <agorlen...@gmail.com>
wrote:

> I think we can support both parameters: k and threshold. If we need to
> get all docs within the threshold, we can simply set k = Integer.MAX_VALUE.
>
> On Thu, Nov 10, 2022 at 12:43, Adrien Grand <jpou...@gmail.com> wrote:
>
>> I wonder whether it would actually be a good idea to support filtering
>> _only_ based on distance. In the worst case, this may require traversing
>> the whole HNSW graph, which would run in time linear in the number of
>> vectors, with a high constant factor since we'd need to compute a distance
>> for every vector. I imagine that this would only make sense for low values
>> of the radius, so that few vectors match, but it looks hard to predict
>> whether a given radius would actually match a small set of vectors. Should
>> the query still require a `k` value in addition to the radius to make sure
>> it doesn't go wild?
>>
>> On Tue, Nov 8, 2022 at 7:26 AM Alexey Gorlenko <agorlen...@gmail.com>
>> wrote:
>>
>>> Thanks, Michael!
>>> Yes, I will try.
>>>
>>> On Tue, Nov 8, 2022 at 03:31, Michael Sokolov <msoko...@gmail.com> wrote:
>>>
>>>> +1 to adding a scoring threshold. I think it could be another
>>>> parameter to KnnVectorQuery. Do you want to have a try at adding this?
>>>> If so, please feel free to open a PR and I will be happy to guide you.
>>>>
>>>> On Mon, Nov 7, 2022 at 6:38 AM Alexey Gorlenko <agorlen...@gmail.com>
>>>> wrote:
>>>> >
>>>> > Hi!
>>>> >
>>>> > There are some use cases where we need to find vectors whose distance
>>>> > (by some metric) to a given vector V is less than a given threshold T.
>>>> > That task is very similar to the knn problem, but in this case we don't
>>>> > have a number of nearest neighbours k.
>>>> >
>>>> > As far as I can see, the current implementation of knn doesn't provide
>>>> > such functionality. But at first glance it is not very difficult to
>>>> > modify the search method of HnswGraph to implement that feature (do not
>>>> > limit the result size, and discard candidates that exceed the threshold).
>>>> >
>>>> > But maybe that idea has some non-obvious problems which I haven't
>>>> > noticed, and in reality implementing it would run into fundamental
>>>> > difficulties?
>>>> >
>>>>
>>
>> --
>> Adrien
>>
>

-- 
Adrien
