Re: Doubt Regarding Lucene matchscore
Priyanka Tufchi

Hello Kumaran,

We are using version 4.1. We indexed 1900 documents and set the page size to 1900, so we expected 1900 scores, but we get only 1400 records in ScoreDoc[] hits. Where are the remaining 500 records? Should we consider them not matched? Also, 900 records have matchIndex 0; what should we consider those? No match?

I hope this makes it clearer. Sample code is below.

    String QueryStr = query;
    // return array list of ranked result
    StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_41);

    // 1. create the index
    Directory index = new RAMDirectory();
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_41, analyzer);
    IndexWriter w = new IndexWriter(index, config);
    int counter = 0;
    // define array of field names which are active
    int arrIndex = 0;

    // add nearly 1900 documents using while
    while (counter < lstDocBean.size()) {
        String criteria = lstDocBean.getCriteria();
        String docID = lstDocBean.getDocID();
        Document doc = new Document();
        doc.add(new TextField(DocID, docID, Field.Store.YES));
        doc.add(new TextField(Criteria, criteria, Field.Store.YES));
        w.addDocument(doc);
        counter++;
    }
    w.close();

    // here add weight in query
    // here write file write code
    // Query queryparser = new QueryParser(Version.LUCENE_41, "",
    //         analyzer).parse(QueryParser.escape(QueryStr));
    Query queryparser = new MultiFieldQueryParser(Version.LUCENE_41, Criteria,
            analyzer).parse(QueryParser.escape(QueryStr));

    int hitsperpage = 1900;
    IndexReader reader = IndexReader.open(index);
    IndexSearcher searcher = new IndexSearcher(reader);
    TopScoreDocCollector collector = TopScoreDocCollector.create(hitsperpage, true);
    searcher.search(queryparser, collector);
    ScoreDoc[] hits = collector.topDocs().scoreDocs;
    // refineLuceneTextSearch(Requirement, hits, searcher, ReqId); // old code

    // Display
    for (int i = 0; i < hits.length; ++i) {
        int docId = hits[i].doc;
        Document d = searcher.doc(docId);
        String name1 = d.get(DocID + "");
        System.out.println(DocID + "");
    }

On Fri, Jul 18, 2014 at 11:45 PM, Kumaran R kums@gmail.com wrote:

Provide some more information, such as the Lucene version, sample code, and the parameters involved in indexing and searching.

-- Kumaran R

On 18-Jul-2014, at 6:52 pm, Priyanka Tufchi priyanka.tuf...@launchship.com wrote:

Hi All,

I am matching and ranking two sets of docs using Apache Lucene, and I pass 1000 as the page hits. But the result shows only 200. Why? Does it mean that the remaining 800 are not matched? And if so, what should we conclude when we get a 0.00 score for a match?

Waiting for reply.

Thanks,
Priyanka

To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
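A note on the hit count in this thread, offered as a sketch rather than a description of Lucene internals: the number passed to TopScoreDocCollector.create is an upper bound on the hits returned, and a collector only sees documents that match the query. With 1900 documents indexed and 1400 hits, the likely reading is that 500 documents share no term with the query. The toy model below uses plain Java and no Lucene; the class TopNSketch and its term-overlap score are hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Toy model of top-N retrieval: only documents sharing a term with the query become hits. */
final class TopNSketch {

    /** Lower-cases and splits on non-alphanumerics; a crude stand-in for an analyzer. */
    static Set<String> terms(String text) {
        Set<String> out = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    /** Returns the indices of matching documents, best overlap first, at most n of them. */
    static List<Integer> search(List<String> docs, String query, int n) {
        Set<String> queryTerms = terms(query);
        List<int[]> scored = new ArrayList<>(); // each entry is {docIndex, overlap}
        for (int i = 0; i < docs.size(); i++) {
            Set<String> shared = terms(docs.get(i));
            shared.retainAll(queryTerms);
            if (!shared.isEmpty()) { // non-matching documents are never collected
                scored.add(new int[] {i, shared.size()});
            }
        }
        // Higher overlap first; ties broken by document index.
        scored.sort((a, b) -> a[1] != b[1]
                ? Integer.compare(b[1], a[1])
                : Integer.compare(a[0], b[0]));
        List<Integer> hits = new ArrayList<>();
        for (int k = 0; k < scored.size() && k < n; k++) {
            hits.add(scored.get(k)[0]);
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "specialty gas production planning",
                "suffix tree indexing",
                "gas distribution");
        // Asking for up to 1900 hits still returns only the matching documents.
        List<Integer> hits = search(docs, "gas production", 1900);
        System.out.println(hits.size() + " hits out of " + docs.size() + " documents: " + hits);
    }
}
```

In Lucene itself, TopDocs.totalHits reports how many documents matched the query, which can be compared with the number of documents indexed to confirm this.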
Re: More Like This query is not working.
Hello Ian,

I am using version 4.1. I expect to get the IDs of the SearchedText documents that are similar to QueryStr, but ScoreDoc[] hits has size 0. Below is the code I am using.

My parameter is an ArrayList which contains DocID and Criteria Text (the collection of documents to be passed for indexing). The query text is in QueryStr.

    StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_41);

    // 1. create the index
    Directory index = new RAMDirectory();
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_41, analyzer);
    IndexWriter w = new IndexWriter(index, config);
    int counter = 0;
    while (counter < lstDocBean.size()) {
        String searchedText = lstDocBean.getText();
        String docID = lstDocBean.getDocID();
        Document doc = new Document();
        doc.add(new TextField(DocID, docID, Field.Store.YES));
        doc.add(new TextField(searchedText, searchedText, Field.Store.YES));
        w.addDocument(doc);
        counter++;
    }
    w.close();

    int hitsperpage = 10;
    IndexReader reader = IndexReader.open(index);
    IndexSearcher searcher = new IndexSearcher(reader);

    // get similar doc
    // Reader sReader = new StringReader();
    MoreLikeThis mlt = new MoreLikeThis(reader);
    // mlt.setAnalyzer(analyzer);
    mlt.setFieldNames(SearchedText);
    Reader reader1 = new StringReader(queryStr);
    Query Searchedquery = mlt.like(reader1, null);

    TopDocs results = searcher.search(Searchedquery, 10);
    ScoreDoc[] hits = results.scoreDocs;
    for (int i = 0; i < hits.length; ++i) {
        int docId = hits[i].doc;
        Document d = searcher.doc(docId);
        int sys_DocID = d.get(DocID);
        double Score = hits[i].score;
    }

On Fri, Jul 18, 2014 at 7:34 PM, Ian Lea ian@gmail.com wrote:

You need to supply more info. Tell us what version of Lucene you are using and provide a very small, completely self-contained example or test case showing exactly what you expect to happen and what is happening instead.

-- Ian.

On Fri, Jul 18, 2014 at 11:50 AM, Rajendra Rao rajendra@launchship.com wrote:

Hello,

I am using a More Like This query, but the size of the ScoreDocs I am getting is 0. I found that after

    Query Searchedquery = mlt.like(reader1, criteria);

the query object contains the following values: boost is 1.0, and the element data of all clauses is null.

I used the following code:

    MoreLikeThis mlt = new MoreLikeThis(reader);
    // mlt.setAnalyzer(analyzer);
    Reader reader1 = new StringReader(Requirement);
    Query Searchedquery = mlt.like(reader1, criteria);

Please guide me.
Re: More Like This query is not working.
That's not completely self-contained, which means that we can't compile and test it ourselves.

This looks very dodgy:

    doc.add(new TextField(searchedText, ...
    mlt.setFieldNames(SearchedText);

-- Ian.

On Mon, Jul 21, 2014 at 12:41 PM, Rajendra Rao rajendra@launchship.com wrote:

Hello Ian, I am using version 4.1 [...]
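To illustrate the kind of mismatch Ian points at, here is a sketch in plain Java. It is not Lucene's MoreLikeThis, and the class FieldNameSketch and its methods are hypothetical. The assumption it models is that field names are exact, case-sensitive keys: if text is indexed under one name and the similarity query is built for a different name (for example one differing only in case), no term statistics are found, the query ends up with no usable terms, and the search returns zero hits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

/** Toy per-field index: term statistics are keyed by the exact field name. */
final class FieldNameSketch {
    // field name -> term -> number of documents containing the term in that field
    private final Map<String, Map<String, Integer>> docFreq = new HashMap<>();

    /** Lower-cases and splits on non-alphanumerics; a crude stand-in for an analyzer. */
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    /** Indexes one document's text under the given field name. */
    void addDocument(String field, String text) {
        Map<String, Integer> perField = docFreq.computeIfAbsent(field, f -> new HashMap<>());
        for (String term : new TreeSet<>(tokens(text))) { // count each term once per document
            perField.merge(term, 1, Integer::sum);
        }
    }

    /** Terms of likeText found in at least minDocFreq documents under the given field, sorted. */
    List<String> queryTerms(String field, String likeText, int minDocFreq) {
        Map<String, Integer> perField = docFreq.getOrDefault(field, new HashMap<>());
        List<String> out = new ArrayList<>();
        for (String term : new TreeSet<>(tokens(likeText))) {
            if (perField.getOrDefault(term, 0) >= minDocFreq) {
                out.add(term);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        FieldNameSketch index = new FieldNameSketch();
        index.addDocument("searchedText", "specialty gas production");
        index.addDocument("searchedText", "gas distribution");

        // Same field name as at index time: terms are found.
        System.out.println(index.queryTerms("searchedText", "gas production", 1));
        // Field name differs only in case: nothing is found, so the query is empty.
        System.out.println(index.queryTerms("SearchedText", "gas production", 1));
    }
}
```

Separately, and worth checking against the 4.1 javadoc: MoreLikeThis by default ignores terms that occur fewer than 2 times in the source text or in fewer than 5 documents, so a small test index can produce an empty query even when the field names agree. setMinTermFreq and setMinDocFreq lower those thresholds.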
Re: More Like This query is not working.
Hello Ian,

The code is below. Please let us know where we are mistaken.

    public void FindSimilarDocExe() throws IOException, ParseException,
            ClassNotFoundException {
        // ResultSet rs=null;
        // Specify the analyzer for tokenizing text.
        // The same analyzer should be used for indexing and searching
        // return array list of ranked result
        // ArrayList<ScoredDocumentBean> retval = new
        // ArrayList<ScoredDocumentBean>();
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_41);

        // 1. create the index
        Directory index = new RAMDirectory();
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_41, analyzer);
        IndexWriter w = new IndexWriter(index, config);
        int counter = 0;
        // define array of field names which are active
        int arrIndex = 0;

        String queryStr = "Airgas is the United States largest distributor of industrial, "
                + "medical, and specialty gases and related equipment, "
                + "safety supplies and MRO products and services to industrial and commercial markets. "
                + "The client had the need to build a production planning module in order to have one "
                + "comprehensive system for specialty gas production and to consolidate the other multiple "
                + "production related systems.";

        Document doc1 = new Document();
        doc1.add(new TextField(DocID, "1", Field.Store.YES));
        doc1.add(new TextField("discription",
                "Airgas is the United States largest distributor of industrial, "
                + "medical, and specialty gases and related equipment, "
                + "safety supplies and MRO products and services to industrial and commercial markets. "
                + "The client had the need to build a production planning module in order to have one "
                + "comprehensive system for specialty gas production and to consolidate the other multiple "
                + "production related systems.", Field.Store.YES));
        w.addDocument(doc1);

        doc1 = new Document();
        doc1.add(new TextField(DocID, "2", Field.Store.YES));
        doc1.add(new TextField("discription",
                "Form based document retrieval addresses the exact "
                + "syntactic properties of a text, comparable to substring matching in string searches. "
                + "The text is generally unstructured and not necessarily in a natural language, "
                + "the system could for example be used to process large sets of chemical representations "
                + "in molecular biology. A suffix tree algorithm is an example for "
                + "form based indexing.", Field.Store.YES));
        w.addDocument(doc1);

        doc1.add(new TextField(DocID, "3", Field.Store.YES));
        doc1.add(new TextField("discription",
                "A signature file is a technique that creates a quick "
                + "and dirty filter, for example a Bloom filter, that will keep all the documents "
                + "that match to the query and hopefully a few ones that do not. The way this is done is "
                + "by creating for each file a signature, typically a hash coded version. One method is "
                + "superimposed coding. A post-processing step is done to discard the false alarms. Since in most "
                + "cases this structure is inferior to inverted files in terms of speed, size and functionality, it "
                + "is not used widely. However, with proper parameters it can beat the inverted files in certain "
                + "environments.", Field.Store.YES));
        w.addDocument(doc1);

        doc1.add(new TextField(DocID, "4", Field.Store.YES));
        doc1.add(new TextField("discription",
                "Novell is the leading global provider of security information "
                + "management and compliance monitoring solutions. This module includes development of components "
                + "called Collectors. Collectors are used to collect and normalize events from security devices "
                + "and programs. These normalized events are then sent to sentinel for use in correlation, "
                + "reporting, and incident response. Different types of collectors are developed to read "
                + "events/logs from different types of network devices using different types of connection "
                + "methods. Documentation is also provided with each collector which describes how to configure "
                + "the collector and the corresponding network device/product. Reports are also developed "
                + "specific to each collector which can be used to analyze the data pumped by a collector. "
                + "And correlation rules are also shipped with each collector to correlate the events pumped by "
                + "a collector.", Field.Store.YES));
        w.addDocument(doc1);

        doc1.add(new TextField(DocID, "5", Field.Store.YES));
        doc1.add(new TextField("discription",
                "Airgas is the United States largest distributor of industrial, "
                + "medical, and specialty gases and related equipment, "
                + "safety supplies and MRO products and services to industrial and commercial markets. "
                + "The client had the need to build a production planning module in order to have one "
                + "comprehensive system for specialty gas production and to consolidate the other multiple "
                + "production related systems.", Field.Store.YES));
        w.addDocument(doc1);
        w.close();

        // here add weight in query
        // here write file write code
        int hitsperpage = 10;
        IndexReader reader = IndexReader.open(index);
        IndexSearcher searcher = new IndexSearcher(reader);

        // get similar doc
        // Reader sReader = new StringReader(query);