Re: How to do ? Articles and Its Associated Comments Indexing , One to Many relationship
See below...

On Thu, Aug 26, 2010 at 4:31 AM, Sumit Arora wrote:

> Thanks Ephraim for your response.
>
> If I use multiValued for the comments field, then while picking data for
> Solr, should I use the following logic?
>
> /* Sample pseudocode */
>
> Get rows from Article and Article-Comments tables;
> // It will retrieve 1 article and 20 comments
>
> Begin;
>
> Include 'Article Fields Value' in 'Solr Fields Value' defined in schema.xml
> /* One article in this case, so it will generate one document id for Solr */
>
> Comments = 0;
>
> While (Comments != 20)
> {
>     Include this Comment;
>     ++Comments;
> }
>
> End;
>
> Result: one article with multiple comments as multiValued indexed in Solr.
> Finally, will Solr have only one document or multiple documents?

A multi-valued field is just what it says: a field that holds several values within a single document. So you'd have one document with 20 values for your comment field.

However, note that Solr doesn't have partial updates of a document; it deletes and re-adds a document when you update. This is handled automatically for you if you have a uniqueKey defined. That is, if you add a new document with the SAME unique key as a previous document, the previous one will be removed and the new one will replace it (with a new internal document id).

> If I want to use highlighted text in this case, and the search keyword
> exists in more than one comment, how can I achieve the result below, where
> the keyword 'web' was found in two comments?
>
> ... 1. The *web* portal will connect a lot of people for some specific
> domain, and then people can post their interesting story, upload files
>
> ... 2.1 accessing multiple sites will slow down the user experience - try
> not to do it. *web* hosting is not too expensive as compared to the other
> components ...

I believe this is controlled by hl.fragsize, see:
http://wiki.apache.org/solr/HighlightingParameters#hl.fragsize

The other thing you should be aware of is the "increment gap". This is useful if you want, say, phrase queries to NOT work across two comments. I.e.

comment 1: comments are very nice
comment 2: day in and day out

If you don't want the phrase query "nice day" to match the enclosing document, you probably want to work with positionIncrementGap. See:
http://lucene.472066.n3.nabble.com/positionIncrementGap-in-schema-xml-td488338.html

Best
Erick
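To make the increment-gap point above concrete, here is a small pure-Python model of how token positions run across the values of one multi-valued field. It is only a sketch, not Solr or Lucene code: the whitespace tokenizer and the helper names `index_positions` and `phrase_matches` are assumptions made up for illustration. The two comments and the phrase "nice day" are the ones from Erick's reply.

```python
def index_positions(values, position_increment_gap=0):
    """Assign token positions across the values of one multi-valued field.

    Simplified model (an assumption, not Lucene's analyzer chain):
    lowercase + whitespace tokenization, with an extra gap added to the
    position counter between consecutive values.
    """
    positions = []
    pos = 0
    for i, value in enumerate(values):
        if i > 0:
            pos += position_increment_gap
        for token in value.lower().split():
            positions.append((token, pos))
            pos += 1
    return positions


def phrase_matches(positions, phrase):
    """Return True if the phrase's tokens occur at consecutive positions."""
    words = phrase.lower().split()
    lookup = {}
    for token, pos in positions:
        lookup.setdefault(token, set()).add(pos)
    return any(
        all(start + offset in lookup.get(word, ()) for offset, word in enumerate(words))
        for start in lookup.get(words[0], ())
    )


comments = ["comments are very nice", "day in and day out"]

# With no gap, "nice" and "day" sit at adjacent positions, so the phrase matches.
print(phrase_matches(index_positions(comments, 0), "nice day"))    # True
# With a gap of 100, the phrase no longer matches across the two comments.
print(phrase_matches(index_positions(comments, 100), "nice day"))  # False
# Phrases inside a single comment still match.
print(phrase_matches(index_positions(comments, 100), "day out"))   # True
```

In a real schema the gap is set with the positionIncrementGap attribute on the field type; the value 100 here is just an illustrative choice.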
Re: How to do ? Articles and Its Associated Comments Indexing , One to Many relationship
Thanks Ephraim for your response.

If I use multiValued for the comments field, then while picking data for Solr, should I use the following logic?

/* Sample pseudocode */

Get rows from Article and Article-Comments tables;
// It will retrieve 1 article and 20 comments

Begin;

Include 'Article Fields Value' in 'Solr Fields Value' defined in schema.xml
/* One article in this case, so it will generate one document id for Solr */

Comments = 0;

While (Comments != 20)
{
    Include this Comment;
    ++Comments;
}

End;

Result: one article with multiple comments as multiValued indexed in Solr. Finally, will Solr have only one document or multiple documents?

If I want to use highlighted text in this case, and the search keyword exists in more than one comment, how can I achieve the result below, where the keyword 'web' was found in two comments?

... 1. The *web* portal will connect a lot of people for some specific domain, and then people can post their interesting story, upload files

... 2.1 accessing multiple sites will slow down the user experience - try not to do it. *web* hosting is not too expensive as compared to the other components ...

On Thu, Aug 26, 2010 at 4:32 PM, Ephraim Ofir wrote:

> Why not define the comment field as multiValued? That way you only index
> each document once and you don't need to collapse anything...
>
> Ephraim Ofir
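On the highlighting question in this message, one possible request is sketched below. `hl`, `hl.fl`, `hl.snippets` and `hl.fragsize` are standard Solr highlighting parameters, and `hl.snippets` (default 1) is the one that caps how many snippets come back per field, so it likely needs raising to see hits from two comments. The base URL, the field name `comments` and the helper `build_highlight_query` are assumptions for illustration only.

```python
from urllib.parse import urlencode


def build_highlight_query(keyword, field="comments", snippets=2, fragsize=100,
                          base_url="http://localhost:8983/solr/select"):
    """Build a Solr select URL that asks for several highlighted snippets.

    base_url and field are hypothetical; adjust them to the actual core
    and schema.
    """
    params = {
        "q": f"{field}:{keyword}",
        "hl": "true",             # turn highlighting on
        "hl.fl": field,           # field(s) to highlight
        "hl.snippets": snippets,  # max snippets returned per field
        "hl.fragsize": fragsize,  # approximate snippet size in characters
    }
    return f"{base_url}?{urlencode(params)}"


print(build_highlight_query("web"))
```

With a multiValued comments field, each returned snippet would then come from whichever comment value contained the keyword, up to the `hl.snippets` limit.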
RE: How to do ? Articles and Its Associated Comments Indexing , One to Many relationship
Why not define the comment field as multiValued? That way you only index each document once and you don't need to collapse anything...

Ephraim Ofir

-----Original Message-----
From: Sumit Arora [mailto:sumit1...@gmail.com]
Sent: Thursday, August 26, 2010 12:54 PM
To: solr-user@lucene.apache.org
Subject: How to do ? Articles and Its Associated Comments Indexing , One to Many relationship

I have set of Articles and then Comments on it, so in database I have two major tables one for Articles and one for Comments, but each Article could have many comments (One to Many).

If One Article will have 20 Comments, then on DB to SOLR - Index - Sync : Solr will index 20 Similar Documents with a difference of each Comment.

Use Case :

On Search: If keyword would be a fit to more than one comment, then it will return duplicate documents.

One Possible solution I thought to Apply:

I should go for Indexing 20 Similar Documents with a difference of each Comment.

While retrieving results from Query: I could use: collapse.field = By Article Id

Am I following right approach?
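The multiValued suggestion above could be prepared on the indexing side roughly as follows. This is a sketch under assumptions: the row layout (`article_id`, `title`, `comment`), the document field names and the helper `build_documents` are hypothetical, and the step that actually posts the documents to Solr is left out.

```python
def build_documents(rows):
    """Group joined (article, comment) rows into one document per article.

    Each row is assumed to be a dict with 'article_id', 'title' and
    'comment' keys, as a join of the two tables might return.
    """
    docs = {}
    for row in rows:
        doc = docs.setdefault(row["article_id"], {
            "id": row["article_id"],   # would map to the uniqueKey field
            "title": row["title"],
            "comments": [],            # would map to a multiValued field
        })
        if row["comment"] is not None:  # article without comments (outer join)
            doc["comments"].append(row["comment"])
    return list(docs.values())


# One article with 20 comments, as in the thread.
rows = [
    {"article_id": 1, "title": "Web portal", "comment": f"comment {n}"}
    for n in range(1, 21)
]
docs = build_documents(rows)
print(len(docs), len(docs[0]["comments"]))  # 1 20
```

Because the whole document is rebuilt each time, re-sending it with the same id after a new comment arrives fits the replace-on-same-key behaviour discussed in this thread.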
How to do ? Articles and Its Associated Comments Indexing , One to Many relationship
I have a set of articles and comments on them, so in the database I have two major tables, one for articles and one for comments, and each article can have many comments (one to many).

If one article has 20 comments, then on the DB-to-Solr index sync, Solr will index 20 similar documents that differ only in the comment.

Use case:

On search: if the keyword matches more than one comment, the search will return duplicate documents.

One possible solution I thought of applying:

I should go ahead and index 20 similar documents that differ only in the comment. While retrieving results from a query, I could use collapse.field set to the article id.

Am I following the right approach?
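To illustrate the duplicate-results problem and what collapsing on the article id is meant to achieve, here is a plain-Python model. It is not how Solr implements field collapsing; the helpers `search` and `collapse`, the substring matching and the sample documents are assumptions for illustration.

```python
def search(docs, keyword):
    """Naive keyword match over one-document-per-comment records."""
    return [d for d in docs if keyword.lower() in d["comment"].lower()]


def collapse(hits, field):
    """Keep only the first hit for each distinct value of `field`."""
    seen = set()
    collapsed = []
    for hit in hits:
        key = hit[field]
        if key not in seen:
            seen.add(key)
            collapsed.append(hit)
    return collapsed


# One document per (article, comment) pair, as in the approach described above.
docs = [
    {"article_id": 1, "comment": "The web portal will connect a lot of people"},
    {"article_id": 1, "comment": "web hosting is not too expensive"},
    {"article_id": 1, "comment": "try not to access multiple sites"},
    {"article_id": 2, "comment": "a web crawler would help here"},
]

hits = search(docs, "web")
print(len(hits))                            # 3: article 1 appears twice
print(len(collapse(hits, "article_id")))    # 2: one hit per article
```

Note that collapsing keeps a single hit per article, so the second matching comment for article 1 is dropped from the result list.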