Dear Alexandre,
Hi,
Thank you very much. I think nested documents are what I need. Do you have
more information about how I can define such a thing in the Solr schema? The
blog post you mentioned was all about retrieving nested docs.
Best regards.


On Wed, Aug 6, 2014 at 5:16 PM, Alexandre Rafalovitch <arafa...@gmail.com>
wrote:

> You can index comments as child records. The structure of the Solr
> document should be able to incorporate both parent and child
> fields, and you need to index them all together. Then, just search for
> the JOIN syntax for nested documents. Also, the latest Solr (4.9) has some
> extra functionality that allows you to find all parent pages and then
> expand children pages to match.
>
> E.g.: http://heliosearch.org/expand-block-join/ seems relevant
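
To make the parent/child idea concrete, here is a minimal sketch of what such a block could look like. It only builds the XML update message (a page <doc> with comment <doc> elements nested inside it) and the query strings; it does not talk to a Solr server. All field names (doc_type, comment_author, comment_date, comment_text, content) are assumptions made for illustration and would have to be declared in the schema next to the page fields. The {!parent which=...} and {!child of=...} parsers are Solr's block join query parsers.

```python
import xml.etree.ElementTree as ET


def add_field(doc, name, value):
    """Append <field name="...">value</field> to a <doc> element."""
    el = ET.SubElement(doc, "field", name=name)
    el.text = value
    return el


def build_page_block(page, comments):
    """Build one <add> message: a parent page with its comments as children.

    Parent and children are sent together so they are indexed as one block.
    Field names are hypothetical, not taken from any real schema.
    """
    add = ET.Element("add")
    parent = ET.SubElement(add, "doc")
    add_field(parent, "id", page["id"])
    add_field(parent, "doc_type", "page")
    add_field(parent, "content", page["content"])
    for c in comments:
        # A <doc> nested inside the parent <doc> is a child document.
        child = ET.SubElement(parent, "doc")
        add_field(child, "id", c["id"])
        add_field(child, "doc_type", "comment")
        add_field(child, "comment_author", c["author"])
        add_field(child, "comment_date", c["date"])
        add_field(child, "comment_text", c["text"])
    return ET.tostring(add, encoding="unicode")


def comments_between(start, end):
    """Query string for comments whose date falls in [start, end]."""
    return "doc_type:comment AND comment_date:[%s TO %s]" % (start, end)


def pages_with_comment_by(author):
    """Block join: parent pages that have a comment by the given author."""
    return '{!parent which="doc_type:page"}comment_author:"%s"' % author


def comments_of_pages_matching(page_query):
    """Block join: comments whose parent page matches page_query."""
    return '{!child of="doc_type:page"}%s' % page_query


page = {"id": "page-1", "content": "News article text"}
comments = [
    {"id": "page-1-c1", "author": "alice",
     "date": "2014-06-30T10:00:00Z", "text": "First comment"},
    {"id": "page-1-c2", "author": "bob",
     "date": "2014-07-20T08:30:00Z", "text": "Second comment"},
]
update_xml = build_page_block(page, comments)
date_query = comments_between("2014-06-24T00:00:00Z", "2014-07-24T23:59:59Z")
```

The resulting update_xml would be posted to the update handler, and date_query covers the "all comments between 24 June 2014 and 24 July 2014" case from the original question. Each comment gets its own unique id so that it can be returned as a separate document.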
>
> Regards,
>    Alex.
> Personal: http://www.outerthoughts.com/ and @arafalov
> Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
> Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
>
>
> On Wed, Aug 6, 2014 at 11:18 AM, Ali Nazemian <alinazem...@gmail.com>
> wrote:
> > Dear Gora,
> > I think you misunderstood my problem. Actually, I used Nutch for crawling
> > websites, and my problem is on the indexing side, not the crawling side.
> > Suppose a page is fetched and parsed by Nutch, and all comments, along with
> > their dates and sources, are identified by parsing. Now how should I index
> > these comments? What is the document granularity?
> > Best regards.
> >
> >
> > On Wed, Aug 6, 2014 at 1:29 PM, Gora Mohanty <g...@mimirtech.com> wrote:
> >
> >> On 6 August 2014 14:13, Ali Nazemian <alinazem...@gmail.com> wrote:
> >> >
> >> > Dear all,
> >> > Hi,
> >> > I was wondering how I can manage to index comments in Solr. Suppose I
> >> > am going to index a web page that contains a news article and some
> >> > comments posted by readers at the end of the page. How can I index
> >> > these comments in Solr, given that I am going to do some analysis on
> >> > them? For example, I want enough query flexibility to retrieve all
> >> > comments posted between 24 June 2014 and 24 July 2014, or all the
> >> > comments posted by a specific person. Therefore, defining these
> >> > comments as a multi-valued field would not be a solution, since such
> >> > query flexibility is not feasible in that case. So what is your
> >> > suggestion about document granularity here? Can I consider all of
> >> > these comments as new documents inside the main document (a
> >> > tree-based structure)? I think this is a common case when indexing
> >> > web pages these days, so I am probably not the only one thinking
> >> > about this situation. Please share your thoughts, and perhaps your
> >> > experiences with this situation, with me. Thank you very much.
> >>
> >> Parsing a web page and breaking parts of it up for indexing into
> >> different fields is outside the scope of Solr. You might want to look
> >> at Apache Nutch, which can index into Solr, and/or other web
> >> crawlers/scrapers.
> >>
> >> Regards,
> >> Gora
> >>
> >
> >
> >
> > --
> > A.Nazemian
>



-- 
A.Nazemian
