If you need them for scoring, then the natural choice would be to
encode them in a BinaryDocValuesField. How do you plan to filter on
these feature vectors? This is too many dimensions for points, and doc
values are not good at filtering.
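
To illustrate the doc-values route, here is a minimal sketch, not
anything that ships with Lucene. The VectorCodec class and its method
names are hypothetical, and cosine similarity is only an assumed choice
of scoring function. It packs a float[] into bytes that could be wrapped
in a BytesRef and stored in a BinaryDocValuesField, then decodes them
again at scoring time.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Lucene: packs a float vector into bytes
// suitable for a binary doc-values field and reads it back for scoring.
final class VectorCodec {

    private VectorCodec() {}

    // Encode each float as 4 big-endian bytes (ByteBuffer's default order).
    // At index time the result could be stored with something like
    // new BinaryDocValuesField("vector", new BytesRef(bytes)).
    static byte[] encode(float[] vector) {
        ByteBuffer buffer = ByteBuffer.allocate(vector.length * Float.BYTES);
        for (float value : vector) {
            buffer.putFloat(value);
        }
        return buffer.array();
    }

    // Decode the bytes read back from doc values into the original floats.
    static float[] decode(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        float[] vector = new float[bytes.length / Float.BYTES];
        for (int i = 0; i < vector.length; i++) {
            vector[i] = buffer.getFloat();
        }
        return vector;
    }

    // Cosine similarity between a query vector and a decoded document vector.
    // Returns 0 when either vector has zero length (an assumed convention).
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vector lengths differ");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

Scoring this way decodes the vector of every document that matches the
rest of the query, so it covers scoring only and does nothing for the
filtering question.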

On Thu, Oct 18, 2018 at 2:32 AM Ken Krugler <[email protected]> wrote:
>
> I’ve been looking at directly storing feature vectors and providing 
> scoring/filtering support.
>
> This is for vectors consisting of (typically 300 - 2048) floats or doubles.
>
> It’s following the same pattern as geospatial support - so a new field type 
> and query/parser, plus plumbing to hook it into Solr.
>
> Before I go much further, is there anything like this already done, or in the 
> works?
>
> Thanks,
>
> — Ken
>
>
> > On Feb 26, 2018, at 4:24 PM, Luís Filipe Nassif <[email protected]> wrote:
> >
> > Thank you, Adrien.
> >
> > On Feb 26, 2018 at 21:19, "Adrien Grand" <[email protected]> wrote:
> >
> >> Yes it is.
> >>
> >> On Tue, Feb 27, 2018 at 00:03, Luís Filipe Nassif <[email protected]>
> >> wrote:
> >>
> >>> Hi Lucene community,
> >>>
> >>> Is BinaryPoint limited to 8 dimensions?
> >>>
> >>> Thanks,
> >>> Luis
> >>>
> >>> On Feb 6, 2018 at 16:07, "Luís Filipe Nassif" <[email protected]>
> >>> wrote:
> >>>
> >>> Is it limited to 8 dimensions, as described at
> >>> https://www.elastic.co/blog/lucene-points-6.0?
> >>>
> >>> 2018-02-06 15:35 GMT-02:00 Luís Filipe Nassif <[email protected]>:
> >>>
> >>>> Sorry, I was looking at the wrong place. Should I use BinaryPoint
> >>>> (https://lucene.apache.org/core/6_0_0/core/org/apache/lucene/document/BinaryPoint.html)?
> >>>>
> >>>> 2018-02-06 14:17 GMT-02:00 Luís Filipe Nassif <[email protected]>:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> Is Lucene able to index generic n-dimensional points for efficient
> >>>>> similarity or nearest-neighbor search? I have looked at the spatial
> >>>>> package in the past, but it seems to be specific to geo points. The
> >>>>> use case is to index image feature vectors to search for similar
> >>>>> images in a corpus.
> >>>>>
> >>>>> Currently we are using Lucene for text search, and we would like to
> >>>>> avoid having to manage two different index structures, synchronize
> >>>>> commits, and so on.
> >>>>>
> >>>>> Thank you,
> >>>>> Luis Nassif
>
> --------------------------
> Ken Krugler
> +1 530-210-6378
> http://www.scaleunlimited.com
> Custom big data solutions & training
> Flink, Solr, Hadoop, Cascading & Cassandra
>


-- 
Adrien
