You were right.
Thanks for the help.
dziadgba
2007/3/11, Doron Cohen <[EMAIL PROTECTED]>:
Is "Text" the only field in the index?
Note that the search only looks at the field "Text", while the terms()
iteration in that code may hit a term with the same text but in another
field. Since the loop returns after the first matching term, it may
never reach the term that belongs to "Text". A better comparison would
be to create a Term("Text", <your-word>) and compare the results of
TermQuery(thatTerm) with those of termDocs(thatTerm). Btw, if you ever
must iterate over all terms, note TermEnum.skipTo(Term).
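
To see why the field matters, here is a minimal, self-contained Java
sketch. It is not Lucene code: it only models a term dictionary as a
sorted map keyed by field and then text, which is the order in which
Lucene enumerates terms. The class FieldTermDemo, its methods, the
"Author" field and the doc ids are all made up for illustration.
Scanning for the first term whose text matches, whatever its field, can
return a different set of documents than a lookup restricted to "Text":

```java
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model of an inverted index whose terms are (field, text) pairs. Not Lucene. */
public class FieldTermDemo {
    // Postings keyed by field + '\u0000' + text; the sorted map keeps all
    // terms of one field together, ordered by field first and text second.
    private final TreeMap<String, TreeSet<Integer>> postings = new TreeMap<>();

    private static String key(String field, String text) {
        return field + '\u0000' + text;
    }

    public void add(int docId, String field, String text) {
        postings.computeIfAbsent(key(field, text), k -> new TreeSet<>()).add(docId);
    }

    /** Field-specific lookup: goes straight to the (field, text) entry. */
    public SortedSet<Integer> docsFor(String field, String text) {
        TreeSet<Integer> docs = postings.get(key(field, text));
        return docs == null ? new TreeSet<>() : docs;
    }

    /** Scans all terms and stops at the first one whose text matches, ignoring the field. */
    public SortedSet<Integer> docsForFirstTextMatch(String text) {
        for (Map.Entry<String, TreeSet<Integer>> e : postings.entrySet()) {
            String k = e.getKey();
            String termText = k.substring(k.indexOf('\u0000') + 1);
            if (termText.equals(text)) {
                return e.getValue();
            }
        }
        return new TreeSet<>();
    }

    /** Hypothetical sample data: the same word indexed in two fields. */
    public static FieldTermDemo sample() {
        FieldTermDemo idx = new FieldTermDemo();
        idx.add(0, "Text", "zucca");
        idx.add(1, "Text", "zucca");
        idx.add(2, "Text", "zucca");
        idx.add(7, "Author", "zucca"); // "Author" sorts before "Text"
        return idx;
    }

    public static void main(String[] args) {
        FieldTermDemo idx = sample();
        System.out.println("by text only: " + idx.docsForFirstTextMatch("zucca"));
        System.out.println("by field+text: " + idx.docsFor("Text", "zucca"));
    }
}
```

Running main prints "by text only: [7]" and "by field+text: [0, 1, 2]".
The scan stops at the "Author" term because that field sorts first, so
the two result sets have nothing in common even though the word is the
same.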
Hope this helps,
Doron
dziadgba <[EMAIL PROTECTED]> wrote on 10/03/2007 05:33:03:
> Hi,
> I want to extract the documents which contain a specific term.
> I tried to do it in two different ways:
>
> 1 Using the 'iterator' termdocs = reader.termDocs(term);
> 2 Using search and examining the Hits
>
> It turns out that the results are sometimes equal, sometimes the first
> is a subset of the second, and sometimes there is no connection
> between the two results.
>
> can somebody give me a hint?
>
> bye
>
>
> public void addDocumentsToTerm(int debug, String myterm) throws Exception
> {
>   TermEnum terms=reader.terms();
>
>   MyDocument doc;
>   Document docLucene;
>   int count=0;
>   boolean b=false;
>   if(debug==1) System.out.print("Checking (MyTerm) "+myterm+" ... ");
>
>   while (terms.next())
>   {
>     Term t = terms.term();
>     if(t.text().compareTo(myterm)==0)
>     {
>       TermDocs termDocs = reader.termDocs(t);
>       while(termDocs.next())
>       {
>         if(debug==1 && count==0)
>           System.out.println("equal to (Term) "+t.text());
>         count++;
>         docLucene = reader.document(termDocs.doc());
>         if(debug==1)
>           System.out.println(" docLucene: ["+termDocs.doc()+"-"
>               +docLucene.getField("Code").stringValue()+"] ");
>         b=true;
>       }
>       System.out.println();
>
>       QueryParser pars = new QueryParser("Text",new StandardAnalyzer());
>       Query q= pars.parse(t.text());
>       Hits hits = searcher.search(q);
>
>       System.out.println("Found "+hits.length()+" matches for query "+q);
>       for(int i=0;i<hits.length();i++)
>       {
>         Document d = hits.doc(i);
>         System.out.println(" doc: ["+d.+"-"+d.getField("Code").stringValue()+"]");
>       }
>
>       System.out.println();
>       if(b==false)
>         System.out.println("No Document found for term: "+myterm.getTerm());
>       if(debug==1)if(count < myterm.getDocFreq()&&count>0)
>         System.out.println(" Error term: "+myterm.getTerm()
>             +", documents found: "+count
>             +", docFreq: "+myterm.getDocFreq());
>
>       return;
>     }
>   }
> }
>
> OUTPUT
>
> Checking (Myterm) zucca ... equal to (Term) zucca
> docLucene: [9963-356 U.S. 256]
>
> Found 8 matches for query Text:zucca
> doc: [0-356 U.S. 256]
> doc: [1-365 U.S. 290]
> doc: [2-351 U.S. 91]
> doc: [3-356 U.S. 660]
> doc: [4-365 U.S. 265]
> doc: [5-377 U.S. 235]
> doc: [6-435 U.S. 519]
> doc: [7-441 U.S. 281]
>
> Error term: zucca, documents found: 1, docFreq: 8
>
> Checking (Myterm) zimroth ... equal to (Term) zimroth
> doc: [16478-476 U.S. 467]
> doc: [17142-492 U.S. 257]
> doc: [17208-488 U.S. 235]
> doc: [17911-484 U.S. 1]
> doc: [17920-487 U.S. 1]
> doc: [18010-484 U.S. 301]
>
> Found 8 matches for query Text:zimroth
> doc: [0-389 U.S. 143]
> doc: [1-468 U.S. 981]
> doc: [2-489 U.S. 688]
> doc: [3-491 U.S. 781]
> doc: [4-436 U.S. 412]
> doc: [5-445 U.S. 573]
> doc: [6-462 U.S. 213]
> doc: [7-468 U.S. 897]
>
> Error term: zimroth, documents found: 6, docFreq: 8
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>