Hello Ian,

The code is below; please let us know where we are mistaken.


----------------------------
public  void FindSimilarDocExe()
 throws IOException, ParseException, ClassNotFoundException {

// ResultSet rs=null;
// Specify the analyzer for tokenizing text.
 // The same analyzer should be used for indexing and searching

// return array list of ranked result
 // ArrayList<ScoredDocumentBean> retval = new
// ArrayList<ScoredDocumentBean>();

 StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_41);

// 1. create the index
 Directory index = new RAMDirectory();
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_41,
 analyzer);

IndexWriter w = new IndexWriter(index, config);
int counter = 0;

// define array of field names which are active
int arrIndex = 0;

String queryStr = "Airgas is the United States largest distributor of industrial, "
    + "medical, and specialty gases and related equipment, "
    + "safety supplies and MRO products and services to industrial and commercial markets. "
    + "The client had the need to build a production planning module in order to have one "
    + "comprehensive system for specialty gas production and to consolidate the other multiple "
    + "production related systems.";

Document doc1 = new Document();

 doc1.add(new TextField("DocID", "1", Field.Store.YES));
doc1.add(new TextField(
    "discription",
    "Airgas is the United States largest distributor of industrial, "
    + "medical, and specialty gases and related equipment, "
    + "safety supplies and MRO products and services to industrial and commercial markets. "
    + "The client had the need to build a production planning module in order to have one "
    + "comprehensive system for specialty gas production and to consolidate the other multiple "
    + "production related systems.", Field.Store.YES));

w.addDocument(doc1);

doc1 = new Document();

doc1.add(new TextField("DocID", "2", Field.Store.YES));
doc1.add(new TextField(
    "discription",
    "Form based document retrieval addresses the exact "
    + "syntactic properties of a text, comparable to substring matching in string searches. "
    + "The text is generally unstructured and not necessarily in a natural language, "
    + "the system could for example be used to process large sets of chemical representations "
    + "in molecular biology. A suffix tree algorithm is an example for "
    + "form based indexing.", Field.Store.YES));

w.addDocument(doc1);

 doc1.add(new TextField("DocID", "3", Field.Store.YES));
doc1.add(new TextField(
    "discription",
    "A signature file is a technique that creates a quick "
    + "and dirty filter, for example a Bloom filter, that will keep all the documents "
    + "that match to the query and hopefully a few ones that do not. The way this is done is "
    + "by creating for each file a signature, typically a hash coded version. One method is "
    + "superimposed coding. A post-processing step is done to discard the false alarms. Since in most "
    + "cases this structure is inferior to inverted files in terms of speed, size and functionality, it "
    + "is not used widely. However, with proper parameters it can beat the inverted files in certain "
    + "environments.", Field.Store.YES));

w.addDocument(doc1);

doc1.add(new TextField("DocID", "4", Field.Store.YES));
doc1.add(new TextField(
    "discription",
    "Novell is the leading global provider of security information "
    + "management and compliance monitoring solutions. This module includes development of components "
    + "called Collectors. Collectors are used to collect and normalize events from security devices "
    + "and programs. These normalized events are then sent to sentinel for use in correlation, "
    + "reporting, and incident response. Different types of collectors are developed to read "
    + "events/logs from different types of network devices using different types of connection "
    + "methods.  Documentation is also provided with each collector which describes how to configure "
    + "the collector and the corresponding network device/product. Reports are also developed "
    + "specific to each collector which can be used to analyze the data pumped by a collector. "
    + "And correlation rules are also shipped with each collector to correlate the events pumped by "
    + "a collector.", Field.Store.YES));

w.addDocument(doc1);

doc1.add(new TextField("DocID", "5", Field.Store.YES));
doc1.add(new TextField(
    "discription",
    "Airgas is the United States largest distributor of industrial, "
    + "medical, and specialty gases and related equipment, "
    + "safety supplies and MRO products and services to industrial and commercial markets. "
    + "The client had the need to build a production planning module in order to have one "
    + "comprehensive system for specialty gas production and to consolidate the other multiple "
    + "production related systems.", Field.Store.YES));

w.addDocument(doc1);

w.close();

// here add weight in query

// here write file write code

int hitsperpage = 10;
IndexReader reader = IndexReader.open(index);

IndexSearcher searcher = new IndexSearcher(reader);

// get similar doc
// Reader sReader = new StringReader(query);
MoreLikeThis mlt = new MoreLikeThis(reader);
 // ------------------------
mlt.setAnalyzer(analyzer);
String[] fieldName = new String[1];
 fieldName[0] = "discription";

mlt.setFieldNames(fieldName);

 Reader reader1 = new StringReader(queryStr);
Query Searchedquery = mlt.like(reader1, null);

 // ------------------

TopDocs results = searcher.search(Searchedquery, 10);

 ScoreDoc[] hits = results.scoreDocs;

for (int i = 0; i < hits.length; ++i) {
int docId = hits[i].doc; //
 // if(hits[i].score<.4) // { // continue; // }
Document d = searcher.doc(docId);

 System.out
.println("similar/more like document to your query with Docid "
+ d.get("docid") + "is:" + d.get("Discription"));

}

}
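
Two things may be worth checking in the listing above; both are assumptions, not verified against this exact setup. First, stored-field lookups are case-sensitive, so d.get("docid") and d.get("Discription") will not find fields added as "DocID" and "discription". Second, MoreLikeThis discards candidate terms below its frequency thresholds (by default a term frequency of 2 in the input text and a document frequency of 5 in the index; these can be changed with setMinTermFreq and setMinDocFreq), which on a five-document index can leave the query with no clauses. The plain-Java sketch below is a simplified model of that kind of threshold filter, not Lucene's implementation; the class name, the helper selectTerms and the counts are hypothetical and only illustrative.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TermThresholdSketch {

    // Hypothetical helper: keep a term only if it occurs at least minTermFreq
    // times in the query text AND appears in at least minDocFreq indexed documents.
    static List<String> selectTerms(Map<String, Integer> termFreqInQuery,
                                    Map<String, Integer> docFreqInIndex,
                                    int minTermFreq, int minDocFreq) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> e : termFreqInQuery.entrySet()) {
            int docFreq = docFreqInIndex.getOrDefault(e.getKey(), 0);
            if (e.getValue() >= minTermFreq && docFreq >= minDocFreq) {
                kept.add(e.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Illustrative counts only (assumed, not measured from a real index).
        Map<String, Integer> termFreq = new LinkedHashMap<>();
        termFreq.put("production", 3);
        termFreq.put("airgas", 1);

        Map<String, Integer> docFreq = new LinkedHashMap<>();
        docFreq.put("production", 2);
        docFreq.put("airgas", 2);

        // Strict thresholds (2 and 5): no term survives.
        System.out.println(selectTerms(termFreq, docFreq, 2, 5));
        // Relaxed thresholds (1 and 1): both terms survive.
        System.out.println(selectTerms(termFreq, docFreq, 1, 1));
    }
}
```

If this is what is happening here, calling mlt.setMinTermFreq(1) and mlt.setMinDocFreq(1) before mlt.like(...) would be the corresponding experiment on the real index.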



On Mon, Jul 21, 2014 at 5:27 PM, Ian Lea <ian....@gmail.com> wrote:

> That's not completely self-contained which means that we can't compile
> and test it ourselves.
>
> This looks very dodgy:
>
> doc.add(new TextField("searchedText", ...
>
> mlt.setFieldNames("SearchedText");
>
>
> --
> Ian.
>
>
> On Mon, Jul 21, 2014 at 12:41 PM, Rajendra Rao
> <rajendra....@launchship.com> wrote:
> > Hello Ian,
> >
> > I am using version 4.1
> >
> >
> > I am expecting the ID of the SearchedText that is similar to QueryStr,
> > but I am getting a ScoreDoc[] hits array of size 0.
> >
> > Below is the code I am using.
> >
> > ___________________________
> >
> > My parameters are:
> > an ArrayList which contains DocID and criteria text (the collection of documents
> > to be passed for indexing), and
> > the query text in QueryStr.
> >
> >
> > StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_41);
> >
> > // 1. create the index
> >  Directory index = new RAMDirectory();
> > IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_41,
> >  analyzer);
> >
> > IndexWriter w = new IndexWriter(index, config);
> > int counter = 0;
> >
> >  while (counter < lstDocBean.size()) {
> >
> > String searchedText=lstDocBean.getText();
> >                           String docID=lstDocBean.getDocID();
> >
> >
> > Document doc = new Document();
> >  doc.add(new TextField("DocID", docID, shouldStore.YES));
> > doc.add(new TextField("searchedText", searchedText, shouldStore.YES));
> >
> >
> >
> > w.addDocument(doc);
> >
> > counter++;
> >
> > }
> >
> >
> >
> > w.close();
> >
> >
> > int hitsperpage = 10;
> > IndexReader reader = IndexReader.open(index);
> >
> >  IndexSearcher searcher = new IndexSearcher(reader);
> >
> > // get similar doc
> > // Reader sReader = new StringReader();
> >  MoreLikeThis mlt = new MoreLikeThis(reader);
> > // ------------------------
> > mlt.setAnalyzer(analyzer);
> >
> > mlt.setFieldNames("SearchedText");
> >
> > Reader reader1 = new StringReader(queryStr);
> >  Query Searchedquery = mlt.like(reader1, null);
> >
> >
> >
> > // ------------------
> >
> > TopDocs results = searcher.search(Searchedquery, 10);
> >  ScoreDoc[] hits = results.scoreDocs;
> >
> >
> > for (int i = 0; i < hits.length; ++i) {
> > int docId = hits[i].doc; //
> >
> >  Document d = searcher.doc(docId);
> >  int sys_DocID=d.get("DocID");
> >                                double Score           = hits[i].score
> >
> >
> >
> > }
> >
> > On Fri, Jul 18, 2014 at 7:34 PM, Ian Lea <ian....@gmail.com> wrote:
> >
> >> You need to supply more info.  Tell us what version of lucene you are
> >> using and provide a very small completely self-contained example or
> >> test case showing exactly what you expect to happen and what is
> >> happening instead.
> >>
> >>
> >> --
> >> Ian.
> >>
> >>
> >> On Fri, Jul 18, 2014 at 11:50 AM, Rajendra Rao
> >> <rajendra....@launchship.com> wrote:
> >> > Hello
> >> >
> >> >
> >> > I am using a MoreLikeThis query, but the size of the ScoreDocs I am getting is 0.
> >> > I found that in
> >> > Query Searchedquery = mlt.like(reader1, "criteria");
> >> >
> >> > the query object contains the following values:
> >> > boost 1.0
> >> > all clauses element data is null
> >> >
> >> >
> >> > I used the following code:
> >> > MoreLikeThis mlt = new MoreLikeThis(reader);
> >> >  // ------------------------
> >> > mlt.setAnalyzer(analyzer);
> >> >
> >> >  Reader reader1 = new StringReader(Requirement);
> >> > Query Searchedquery = mlt.like(reader1, "criteria");
> >> >
> >> >
> >> > please guide me.
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> >> For additional commands, e-mail: java-user-h...@lucene.apache.org
> >>
> >>
>
