Re: indexing comments with Apache Solr

Umesh Prasad Wed, 06 Aug 2014 07:55:26 -0700

The Grid Dynamics blog is useful. It has 4 parts which cover block join quite
well.


http://blog.griddynamics.com/2012/08/block-join-query-performs.html
http://blog.griddynamics.com/2013/09/solr-block-join-support.html
http://blog.griddynamics.com/2013/12/grandchildren-and-siblings-with-block.html
http://blog.griddynamics.com/2014/01/segmented-filter-cache-in-solr.html

The GitHub repo is https://gist.github.com/mkhludnev
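
Since the question was how to define and index nested documents, here is a
minimal sketch of what a block of one page with two comments could look like
in Solr's XML update format, where child documents are nested <doc> elements
inside the parent <doc>. All field names (type_s, author_s, date_dt, text_t,
content_t) and the ids are hypothetical and assume dynamic fields like those
in the stock example schema. As far as I know, that schema also declares the
_root_ field that nested documents rely on. The query strings at the end are
only illustrations of the block join syntax, not something tested against a
live index.

```python
import xml.etree.ElementTree as ET


def make_doc(fields, children=()):
    """Build a Solr <doc> element; child docs become nested <doc> elements."""
    doc = ET.Element("doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    for child in children:
        doc.append(child)
    return doc


# Hypothetical comments parsed from one crawled page.
comments = [
    make_doc({"id": "page1-c1", "type_s": "comment", "author_s": "alice",
              "date_dt": "2014-06-30T10:15:00Z", "text_t": "First comment"}),
    make_doc({"id": "page1-c2", "type_s": "comment", "author_s": "bob",
              "date_dt": "2014-07-28T08:00:00Z", "text_t": "Second comment"}),
]

# The parent page carries its own fields plus the nested comment docs.
page = make_doc({"id": "page1", "type_s": "page",
                 "content_t": "News article body"}, comments)

# Parent and children are sent in one <add> so they are indexed as one block.
add = ET.Element("add")
add.append(page)
update_xml = ET.tostring(add, encoding="unicode")

# Comments are ordinary documents, so they can be queried directly.
comments_in_range = ("type_s:comment AND "
                     "date_dt:[2014-06-24T00:00:00Z TO 2014-07-24T23:59:59Z]")
comments_by_author = "type_s:comment AND author_s:alice"

# Block join: pages that have at least one comment by a given author.
pages_with_author = '{!parent which="type_s:page"}author_s:alice'

# Block join: all comments belonging to a given page.
comments_of_page = '{!child of="type_s:page"}id:page1'

print(update_xml)
```

The string in update_xml would be posted to the /update handler. Note that
the filter given in which= and of= has to match all parent documents and
nothing else, which is why every document here carries a type_s marker.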


On 6 August 2014 19:05, Ali Nazemian <alinazem...@gmail.com> wrote:

> Dear Alexandre,
> Hi,
> Thank you very much. I think nested documents are what I need. Do you have
> more information about how I can define such a thing in the Solr schema? The
> blog post you mentioned was all about retrieving nested docs.
> Best regards.
>
>
> On Wed, Aug 6, 2014 at 5:16 PM, Alexandre Rafalovitch <arafa...@gmail.com>
> wrote:
>
> > You can index comments as child records. The structure of the Solr
> > document should be able to incorporate both parent and child
> > fields, and you need to index them all together. Then, just search for
> > the JOIN syntax for nested documents. Also, the latest Solr (4.9) has some
> > extra functionality that allows you to find all parent pages and then
> > expand children pages to match.
> >
> > E.g.: http://heliosearch.org/expand-block-join/ seems relevant
> >
> > Regards,
> >    Alex.
> > Personal: http://www.outerthoughts.com/ and @arafalov
> > Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
> > Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
> >
> >
> > On Wed, Aug 6, 2014 at 11:18 AM, Ali Nazemian <alinazem...@gmail.com>
> > wrote:
> > > Dear Gora,
> > > I think you misunderstood my problem. Actually, I used Nutch for crawling
> > > websites, and my problem is on the indexing side, not the crawling side.
> > > Suppose a page is fetched and parsed by Nutch, and all comments, along
> > > with their dates and sources, are identified by parsing. Now, how should
> > > I index these comments? What is the document granularity?
> > > Best regards.
> > >
> > >
> > > On Wed, Aug 6, 2014 at 1:29 PM, Gora Mohanty <g...@mimirtech.com>
> > > wrote:
> > >
> > >> On 6 August 2014 14:13, Ali Nazemian <alinazem...@gmail.com> wrote:
> > >> >
> > >> > Dear all,
> > >> > Hi,
> > >> > I was wondering how I can manage to index comments in Solr. Suppose I
> > >> > am going to index a web page that contains a news article and some
> > >> > comments posted by people at the end of the page. How can I index
> > >> > these comments in Solr? Consider that I am going to do some analysis
> > >> > on these comments. For example, I want the query flexibility to
> > >> > retrieve all comments posted between 24 June 2014 and 24 July 2014,
> > >> > or all comments posted by a specific person. Therefore, defining
> > >> > these comments as a multi-valued field would not be a solution, since
> > >> > such query flexibility is not feasible in that case. So what is your
> > >> > suggestion about document granularity here? Can I treat all of
> > >> > these comments as new documents inside the main document (a
> > >> > tree-based structure)? What is your suggestion for this case? I think
> > >> > it is a common case when indexing web pages these days, so I am
> > >> > probably not the only one thinking about this situation. Please share
> > >> > your thoughts, and perhaps your experiences in this situation, with
> > >> > me. Thank you very much.
> > >>
> > >> Parsing a web page and breaking it up into different fields for
> > >> indexing is outside the scope of Solr. You might want to look at Apache
> > >> Nutch, which can index into Solr, and/or other web crawlers/scrapers.
> > >>
> > >> Regards,
> > >> Gora
> > >>
> > >
> > >
> > >
> > > --
> > > A.Nazemian
> >
>
>
>
> --
> A.Nazemian
>



-- 
Thanks & Regards
Umesh Prasad
Search l...@flipkart.com

 in.linkedin.com/pub/umesh-prasad/6/5bb/580/
