Thanks, I will try it.

gosia
On Fri, Jun 14, 2013 at 10:33 AM, Jack Krupansky <j...@basetechnology.com> wrote:

> First, start with Solr and use the edismax query parser with the default
> query operator as "OR" and set pf, pf2, and pf3, and then simply query by
> the raw text of the paragraph. This will order the results by how closely
> the indexed paragraphs match the query paragraph.
>
> This is also a good technique for detecting plagiarism where a lot of the
> text is similar if not identical.
>
> Once you get experience using this technique in Solr, then simply look at
> the parsed query that edismax generates and do the same in your Lucene Java
> code.
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Malgorzata Urbanska
> Sent: Friday, June 14, 2013 12:23 PM
> To: java-user@lucene.apache.org
> Subject: compare paragraphs of text - which Query Class to use?
>
> Hello,
>
> I've just started using Lucene and I'm not sure which Query classes I
> should use in my project.
>
> My goal is to compare paragraphs of text. Paragraph A is a query and
> paragraph B is a document for which I would like to calculate a similarity
> score.
>
> Paragraphs A and B may in some cases be exactly the same, or not.
> Generally, I would like to check whether they talk about the same topic.
>
> In my project I have a set of paragraphs A and a set of paragraphs B, so I'm
> looking for a universal solution that allows me to compute a similarity
> score for each paragraph A against all paragraphs B.
>
> Do you have any suggestions? I would really appreciate any ideas.
>
> --
> gosia
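For reference, here is a rough sketch of how the request Jack describes might be assembled from Java. It is not from the thread itself. The field name `text`, the class and method names, and the base URL `http://localhost:8983/solr/collection1/select` are all assumptions for illustration. `defType`, `q.op`, `qf`, `pf`, `pf2`, `pf3` and `debugQuery` are standard Solr request parameters, and `debugQuery=true` is included on the assumption that it is the easiest way to see the parsed query he mentions. The code only builds the parameter string and does not contact Solr.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class EdismaxParagraphQuery {

    /**
     * Builds a URL-encoded Solr parameter string that queries by the raw
     * paragraph text. The field name is supplied by the caller (hypothetical here).
     */
    public static String buildQueryString(String paragraph, String field) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("defType", "edismax");  // select the edismax query parser
        params.put("q.op", "OR");          // default operator "OR"
        params.put("qf", field);           // field matched against individual terms
        params.put("pf", field);           // phrase boost over the whole query text
        params.put("pf2", field);          // phrase boost over word pairs
        params.put("pf3", field);          // phrase boost over word triples
        params.put("debugQuery", "true");  // ask Solr to return the parsed query
        params.put("q", paragraph);        // raw paragraph text as the query

        StringJoiner joined = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            joined.add(encode(e.getKey()) + "=" + encode(e.getValue()));
        }
        return joined.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // Hypothetical Solr endpoint; adjust host and core name to your setup.
        String base = "http://localhost:8983/solr/collection1/select";
        String paragraph = "Lucene is a search library written in Java";
        System.out.println(base + "?" + buildQueryString(paragraph, "text"));
    }
}
```

Real paragraphs may contain characters that edismax treats as query syntax (quotes, `+`, `-` and so on). Escaping them is left out of this sketch.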