Re: Custom Query Implementation

2025-01-03 Thread Viacheslav Dobrynin
Hi, Thank you! On Fri, Jan 3, 2025 at 14:15, Uwe Schindler wrote: > Hi, > > the expressions query should not be slower. Of course, if you also include > the compilation in the query time measurement, it may be a little slower > due to compilation and optimizing. In general, queries should be warmed > before

Re: Custom Query Implementation

2025-01-03 Thread Uwe Schindler
Hi, the expressions query should not be slower. Of course, if you also include the compilation in the query time measurement, it may be a little slower due to compilation and optimizing. In general, queries should be warmed before measuring them + expressions should only be compiled once and reuse
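
The two points in this reply (compile once and reuse, warm up before timing) can be sketched in plain Python. This is only an illustration of the measurement pattern, not the Lucene Expressions API: Python's built-in `compile()` stands in for expression compilation, and `compile_expression`, `evaluate`, and `timed_after_warmup` are hypothetical helper names.

```python
import time
from functools import lru_cache


@lru_cache(maxsize=None)
def compile_expression(source):
    # Stand-in for an expensive compilation step; cached so that each
    # distinct expression string is compiled only once.
    return compile(source, "<expression>", "eval")


def evaluate(source, variables):
    # Reuses the cached code object instead of recompiling per call.
    return eval(compile_expression(source), {"__builtins__": {}}, dict(variables))


def timed_after_warmup(fn, warmup=100, runs=1000):
    # Run the function a few times first so one-off costs (compilation,
    # cache fills) are not counted, then return the mean time per call.
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs
```

For example, `timed_after_warmup(lambda: evaluate("q * d", {"q": 2.0, "d": 3.0}))` times only the evaluation, since the compilation happened during the first warm-up call.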

Re: Custom Query Implementation

2024-12-03 Thread Viacheslav Dobrynin
Hi, Thanks for the answers! Yes, my task is to store only non-zero values from a sparse vector of large dimension, where most of the elements are zero. On Tue, Dec 3, 2024 at 19:17, Mikhail Khludnev wrote: > Thanks for the clarification, Michael! > > On Tue, Dec 3, 2024 at 1:56 PM Michael Sokolov wrote: >

Re: Custom Query Implementation

2024-12-03 Thread Mikhail Khludnev
Thanks for the clarification, Michael! On Tue, Dec 3, 2024 at 1:56 PM Michael Sokolov wrote: > Sparse means two different things here. In the case you found, Mikhail, > it means not every document has a value for some vector field. I think the > question here is about very high dimensional vector

Re: Custom Query Implementation

2024-12-03 Thread Michael Sokolov
Sparse means two different things here. In the case you found, Mikhail, it means not every document has a value for some vector field. I think the question here is about very high dimensional vectors where most documents have zeroes in most dimensions of the vector. On Tue, Dec 3, 2024, 2:01 A

Re: Custom Query Implementation

2024-12-02 Thread Mikhail Khludnev
Morning. I noticed a condition that chooses between the sparse and dense formats underneath https://github.com/apache/lucene/blob/6053e1e31378378f6d310a05ea6d7dcdfc45f48b/lucene/core/src/java/org/apache/lucene/codecs/lucene95/OffHeapByteVectorValues.java#L108; perhaps it may meet your performance requirements.

Re: Custom Query Implementation

2024-12-02 Thread Viacheslav Dobrynin
Hi, Thanks for the answer! I think this is similar to my initial implementation, where I built the query as follows (PyLucene): def build_query(query): builder = BooleanQuery.Builder() for term in torch.nonzero(query): field_name = to_field_name(term.item()) value = query[

Re: Custom Query Implementation

2024-12-02 Thread Michael Sokolov
Another way is using postings - you can represent each dimension as a term (`dim0`, `dim1`, etc) and index those that occur in a document. To encode a value for a dimension you can either provide a custom term frequency, or index the term multiple times. Then when searching you can form a BooleanQu
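
The postings idea described above can be sketched in plain Python. This is a toy inverted index, not Lucene code: each dimension becomes a term such as `dim7`, the value is stored as an integer frequency, and a query sums weighted frequencies over the postings of its non-zero dimensions. The `SCALE` quantization factor and the class and function names are assumptions made for the illustration.

```python
from collections import defaultdict

SCALE = 100  # hypothetical factor for turning float values into integer frequencies


def quantize(value, scale=SCALE):
    # Term frequencies must be positive integers, so round and clamp to 1.
    return max(1, round(value * scale))


class SparsePostingsIndex:
    def __init__(self):
        # term -> {doc_id: frequency}
        self.postings = defaultdict(dict)

    def add(self, doc_id, vector):
        # Index only dimensions with positive values; zeros are simply absent.
        for dim, value in vector.items():
            if value > 0:
                self.postings[f"dim{dim}"][doc_id] = quantize(value)

    def search(self, query, top_k=10):
        # Dot product: visit only the postings of the query's dimensions.
        scores = defaultdict(float)
        for dim, weight in query.items():
            for doc_id, freq in self.postings.get(f"dim{dim}", {}).items():
                scores[doc_id] += weight * freq / SCALE
        # Highest score first; ties broken by document id.
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
```

Only documents that share at least one non-zero dimension with the query are touched, which is the property that makes the inverted layout attractive for very sparse, high-dimensional vectors. Quantization loses precision, so the scale factor is a trade-off.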

Re: Custom Query Implementation

2024-12-02 Thread Viacheslav Dobrynin
Hi, Thanks for the reply. I haven't tried to do that. However, I do not fully understand how, in this case, an inverted index would be constructed for an efficient search by terms (O(1) for each term as a key)? On Mon, Dec 2, 2024 at 21:55, Patrick Zhai wrote: > Hi, have you tried to encode the sparse v

Re: Custom Query Implementation

2024-12-02 Thread Patrick Zhai
Hi, have you tried to encode the sparse vector yourself using the BinaryDocValuesField? One way I can think of is to encode it as (size, index_array, value_array) per doc. Intuitively, I feel like this should be more efficient than one dimension per field if your dimension is high enough. Patrick On
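
A minimal sketch of the (size, index_array, value_array) layout suggested above, using only Python's `struct` module. The byte layout (little-endian, 32-bit unsigned indices, 32-bit floats) and the function names are assumptions for illustration; the resulting bytes are the kind of payload one could store in a binary doc-values field.

```python
import struct


def encode_sparse(vector):
    # Layout: uint32 size, then `size` uint32 indices, then `size` float32 values.
    items = sorted(vector.items())
    n = len(items)
    indices = [i for i, _ in items]
    values = [v for _, v in items]
    return struct.pack(f"<I{n}I{n}f", n, *indices, *values)


def decode_sparse(payload):
    # Read the size first, then the two parallel arrays that follow it.
    (n,) = struct.unpack_from("<I", payload, 0)
    indices = struct.unpack_from(f"<{n}I", payload, 4)
    values = struct.unpack_from(f"<{n}f", payload, 4 + 4 * n)
    return dict(zip(indices, values))


def dot(query, payload):
    # Score one document: sum over the query's non-zero dimensions.
    doc = decode_sparse(payload)
    return sum(weight * doc.get(dim, 0.0) for dim, weight in query.items())
```

Each document costs 4 + 8n bytes for n non-zero entries, regardless of the nominal dimension. Note that doc values are stored per document, so scoring this way decodes the payload of every candidate document rather than looking up documents by dimension.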

Re: Custom Query Implementation

2024-12-02 Thread Viacheslav Dobrynin
Hi! I need to index sparse vectors, whereas, as I understand it, KnnFloatVectorField is designed for dense vectors. Therefore, it seems that this approach will not work. On Sun, Dec 1, 2024 at 18:36, Mikhail Khludnev wrote: > Hi, > Could this be done with KnnFloatVectorField(... DOT_PRODUCT) > and KnnFloatVect

Re: Custom Query Implementation

2024-12-01 Thread Mikhail Khludnev
Hi, Could this be done with KnnFloatVectorField(... DOT_PRODUCT) and KnnFloatVectorQuery?

Re: Custom Query Implementation

2024-12-01 Thread Viacheslav Dobrynin
Hi! Thank you for your reply! I tried the recommendations, and below is example code for implementing the queries. The query with the expression runs a little slower; I think this is due to the need for compilation. I have one more question: please tell me which type of field is best suited f

Re: Custom Query Implementation

2024-11-30 Thread Mikhail Khludnev
Hi, Couldn't this be done better with FunctionQuery and proper ValueSources? Please also check Lucene Expressions. On Sat, Nov 30, 2024 at 9:00 PM Viacheslav Dobrynin wrote: > Hello! > > I have implemented a custom scoring mechanism. It looks like a dot product. > I would like to ask you how accurate

Custom Query Implementation

2024-11-30 Thread Viacheslav Dobrynin
Hello! I have implemented a custom scoring mechanism. It looks like a dot product. I would like to ask you how accurate and effective my implementation is. Could you give me recommendations on how to improve it? Here are a couple of examples that I want to use this mechanism with. Example 1: A do