
Thanks for the guidance! This indeed sounds challenging, especially given the
bonus of fighting with Solr 3.x over disjunction queries. That said, moving
to Solr 4.0 should be OK if it makes life easier.

But even before getting one's hands dirty, it would be good to know whether
this is going to fly performance-wise. Has your span-based implementation
been fast enough? Did it come close to Solr's native faceting in terms of
performance?
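
For context, the per-document sentence arithmetic I mentioned earlier in the
thread could look roughly like the sketch below once matched term positions
are available. It is plain Java with no Solr or Lucene calls; the class name,
the method name, and the idea of keeping a sorted array of sentence start
positions per document are my assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SentenceHitCounter {

    /**
     * Counts the distinct sentences of one document that contain a hit.
     *
     * @param sentenceStarts   sorted token positions at which each sentence
     *                         begins (hypothetical per-document data)
     * @param matchedPositions token positions of the matched terms
     */
    public static int countSentencesWithHits(int[] sentenceStarts, int[] matchedPositions) {
        Set<Integer> hitSentences = new HashSet<>();
        for (int pos : matchedPositions) {
            int idx = Arrays.binarySearch(sentenceStarts, pos);
            // When the position is not itself a sentence start, binarySearch
            // returns -(insertionPoint) - 1; the enclosing sentence is the
            // one just before the insertion point.
            int sentence = idx >= 0 ? idx : -idx - 2;
            if (sentence >= 0) {
                hitSentences.add(sentence);
            }
        }
        return hitSentences.size();
    }

    public static void main(String[] args) {
        int[] sentenceStarts = {0, 7, 15};  // three sentences
        int[] matched = {2, 5, 16};         // two hits in sentence 0, one in sentence 2
        System.out.println(countSentencesWithHits(sentenceStarts, matched)); // prints 2
    }
}
```

The open question remains how to get the matched positions cheaply in the
first place; the counting itself is a binary search per hit.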

On Mon, Jan 21, 2013 at 2:33 PM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:

> Dmitry,
> First of all, FacetComponent is Solr's out-of-the-box functionality. It
> runs after the search is done and accesses the bitSet of the found documents,
> i.e. there are no spans (matched term positions) there at all.
> StandardFacetsAccumulator sounds like the "brand new" Lucene faceting
> library; see http://shaierera.blogspot.com/. I don't think spans are
> accessible there either, but I don't know for sure.
> Some time ago my team successfully prototyped a facet component backed by
> spans
> blog.griddynamics.com/2011/10/solr-experience-search-parent-child.html but
> I don't suggest you go this way.
> I suggest you start with the following:
> - supply PostFilter/DelegatingCollector
> http://yonik.com/posts/advanced-filter-caching-in-solr/
> - the DelegatingCollector will accept the scorer instance
> - if this scorer is BooleanScorer2 (but not BooleanScorer!), you can access
> the SpanQueryScorer in one of the legs and try to access the matched spans
> - if you are on 3.x, you'll have a problem with disjunction queries.
> It seems challenging, doesn't it?
> On 18.01.2013 at 17:40, "Dmitry Kan" <solrexp...@gmail.com> wrote:
> > Mikhail,
> >
> > Are you saying that it is not possible to access the matched term positions
> > in the FacetComponent? If that were possible (somewhere in the
> > StandardFacetsAccumulator class, where doc ids are available), then by
> > knowing the matched term positions I could do some simple school math to
> > calculate the sentence counts per doc id.
> >
> > Dmitry
> >
> > On Fri, Jan 18, 2013 at 2:45 PM, Mikhail Khludnev <
> > mkhlud...@griddynamics.com> wrote:
> >
> > > Dmitry,
> > >
> > > It definitely seems like postprocessing the highlighter's output. Another
> > > approach is:
> > > - limit the number of occurrences of a word in a sentence to 1
> > > - play with the facet-by-function patch
> > > https://issues.apache.org/jira/browse/SOLR-1581 accompanied by the tf()
> > > function.
> > >
> > > It doesn't seem like much help.
> > >
> > > On Fri, Jan 18, 2013 at 12:42 PM, Dmitry Kan <solrexp...@gmail.com>
> > > wrote:
> > >
> > > > that we actually require the count of the sentences inside
> > > > each document where the hits were found.
> > > >
> > >
> > >
> > >
> > >
> > > --
> > > Sincerely yours
> > > Mikhail Khludnev
> > > Principal Engineer,
> > > Grid Dynamics
> > >
> > > <http://www.griddynamics.com>
> > >  <mkhlud...@griddynamics.com>
> > >
> >
