Hi Mikhail,

I was merely posing a thought in an effort to keep learning and educating myself. Your point about Weight.scorer() being called per segment helps my understanding. I am in the middle of building a POC for a customer of mine, which I mentioned in this thread on Dec 5th (shortly after noon). I spent countless hours over the weekend trying to learn the internals of Solr and Lucene.
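To make the per-segment point concrete, here is a minimal plain-Java sketch. It uses no Lucene classes: FromSide, collectTerms(), and the two join methods are hypothetical stand-ins, and each "segment" is just a list of join keys. It only counts how often the "from" side gets collected when that work happens inside a per-segment call versus once up front.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PerSegmentJoinSketch {

    /** Hypothetical stand-in for the "from" query; counts how often it is run. */
    static final class FromSide {
        private final List<String> joinKeys;
        int runs = 0;

        FromSide(List<String> joinKeys) {
            this.joinKeys = joinKeys;
        }

        /** Simulates searching all segments and collecting the join terms. */
        Set<String> collectTerms() {
            runs++;
            return new HashSet<>(joinKeys);
        }
    }

    /** Collects the "from" terms inside every per-segment call (the costly variant). */
    static int joinCollectingPerSegment(List<List<String>> toSegments, FromSide from) {
        int hits = 0;
        for (List<String> segment : toSegments) {
            Set<String> terms = from.collectTerms(); // repeated for each segment
            hits += countMatches(segment, terms);
        }
        return hits;
    }

    /** Collects the "from" terms once and reuses the set for every segment. */
    static int joinCollectingOnce(List<List<String>> toSegments, FromSide from) {
        Set<String> terms = from.collectTerms(); // done a single time
        int hits = 0;
        for (List<String> segment : toSegments) {
            hits += countMatches(segment, terms);
        }
        return hits;
    }

    private static int countMatches(List<String> segment, Set<String> terms) {
        int n = 0;
        for (String key : segment) {
            if (terms.contains(key)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        List<List<String>> toSegments = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c", "a"),
                Arrays.asList("d"));

        FromSide perSegment = new FromSide(Arrays.asList("a", "c"));
        FromSide once = new FromSide(Arrays.asList("a", "c"));

        System.out.println(joinCollectingPerSegment(toSegments, perSegment)
                + " hits, " + perSegment.runs + " from-query runs");
        System.out.println(joinCollectingOnce(toSegments, once)
                + " hits, " + once.runs + " from-query runs");
    }
}
```

Both variants find the same hits, but the first one runs the simulated "from" query once per segment while the second runs it once in total. That is only the shape of the cost, not a model of how JoinUtil is actually implemented.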
Thanks
Darin

> On Dec 8, 2014, at 4:57 AM, Mikhail Khludnev <mkhlud...@griddynamics.com> wrote:
>
> On Fri, Dec 5, 2014 at 10:44 PM, Darin Amos <dari...@gmail.com> wrote:
>
>> public Scorer scorer(){
>>     TermsWithScoreCollector collector = new TermsWithScoreCollector();
>>     JoinQuery.this.s.search(JoinQuery.this.q, collector);
>>
>>     //do the rest..
>>
>> }
>
> Darin,
> I hardly follow, but this approach either is not efficient or even doesn't
> work. Generally join is O(n^2) operation, which is most impls try to
> reduce. weight.scorer() is invoked per segment, and scorer yields results
> only from a particular segment. However, fromQuery should run across all
> segments. Hence, TermsWithScoreCollector will collect IDs globally again
> and again.
> As you can see, the current JoinUtil design is much more efficient, it
> reuses global IDs hash across all "to" segments searches.
>
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
>
> <http://www.griddynamics.com>
> <mkhlud...@griddynamics.com>