This is a pretty big hole in Lucene-based search right now, and one that many practitioners have struggled with.
I know a couple of people who have worked on solutions, and I've used a couple of hacks:

- You can hack together something that does cosine similarity using term frequency and query boosts with DelimitedTermFreqFilterFactory. Basically, the term frequency becomes a feature weight on the document, and boosts become the query weight. If you massage things correctly with the similarity, the resulting boolean similarity is a dot product...
- Erik Hatcher has done some great work with payloads, which you might want to check out. See the delimited payload filter factory and the payload score function queries.
- Simon Hughes's Activate talk (slides/video not yet posted) covers this topic in some depth.
- Rene Kriegler's Haystack talk discusses encoding Inception model vectorizations of images:
  https://opensourceconnections.com/events/haystack-single/haystack-relevance-scoring/

If this is hugely important to you, I might also suggest looking at Vespa, which makes tensors a first-class citizen and makes matrix math pretty seamless: http://vespa.ai

Hope that helps
-Doug

On Fri, Oct 19, 2018 at 12:50 PM Ken Krugler <kkrugler_li...@transpac.com> wrote:

> Hi all,
>
> [I posted on the Lucene list two days ago, but didn’t see any response -
> checking here for completeness]
>
> I’ve been looking at directly storing feature vectors and providing
> scoring/filtering support.
>
> This is for vectors consisting of (typically 300 - 2048) floats or doubles.
>
> It’s following the same pattern as geospatial support - so a new field
> type and query/parser, plus plumbing to hook it into Solr.
>
> Before I go much further, is there anything like this already done, or in
> the works?
>
> Thanks,
>
> — Ken
>
> --------------------------
> Ken Krugler
> +1 530-210-6378
> http://www.scaleunlimited.com
> Custom big data solutions & training
> Flink, Solr, Hadoop, Cascading & Cassandra

--
CTO, OpenSource Connections
Author, Relevant Search
http://o19s.com/doug
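To make the term-frequency and query-boost hack from the reply above a bit more concrete, here is a minimal Python sketch of the arithmetic only. It does not call Lucene or Solr. The assumptions are mine, not from the thread: each vector dimension becomes a hypothetical term name such as `d0`, weights are quantized to positive integers with a hypothetical `SCALE` factor (on the assumption that the delimited term frequency filter expects integer frequencies), feature values are non-negative, and the similarity has been configured so that each matching term contributes exactly boost times term frequency, with no length normalization or IDF.

```python
from typing import Dict, List

SCALE = 100  # hypothetical quantization factor; not from the original thread


def encode_document(vector: List[float], scale: int = SCALE) -> str:
    """Encode a non-negative vector as 'term|tf' tokens, skipping zero weights."""
    tokens = []
    for dim, value in enumerate(vector):
        tf = int(round(value * scale))  # term frequency acts as the feature weight
        if tf > 0:
            tokens.append(f"d{dim}|{tf}")
    return " ".join(tokens)


def encode_query(vector: List[float]) -> Dict[str, float]:
    """Map each non-zero query dimension to a per-term boost."""
    return {f"d{dim}": value for dim, value in enumerate(vector) if value != 0.0}


def parse_field(field: str) -> Dict[str, int]:
    """Recover term frequencies from the delimited field text."""
    tfs = {}
    for token in field.split():
        term, tf = token.split("|")
        tfs[term] = int(tf)
    return tfs


def score(field: str, boosts: Dict[str, float], scale: int = SCALE) -> float:
    """Sum boost * tf over matching terms, which is a (quantized) dot product."""
    tfs = parse_field(field)
    return sum(boost * tfs.get(term, 0) for term, boost in boosts.items()) / scale


doc_field = encode_document([0.5, 0.0, 0.25])
query_boosts = encode_query([1.0, 2.0, 4.0])
print(doc_field)
print(score(doc_field, query_boosts))
```

With a stock similarity such as BM25 the scores would not reduce to this sum, which is presumably why the reply mentions massaging the similarity. Getting from a dot product to cosine similarity would additionally require normalizing both vectors to unit length before encoding.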