To begin with, thanks for Yours quick response.
I am still trying write this code and I've already written this one:
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.store.*;
import org.apache.lucene.index.*;
import org.apache.lucene.document.*;
import org.apache.lucene.search.*;
import org.apache.lucene.search.regex.*;
public class TmpMain {
public static void main(String[] args) throws Exception {
RAMDirectory directory = new RAMDirectory();
IndexWriter writer = new IndexWriter(directory, new
StandardAnalyzer(),
true);
Document doc = new Document();
doc.add(new Field("field", "Cat Cot Cit Cet", Field.Store.YES,
Field.Index.TOKENIZED));
writer.addDocument(doc);
writer.optimize();
writer.close();
IndexReader reader=IndexReader.open(directory);
RegexTermEnum regexTermEnum=new RegexTermEnum(reader, new
Term("field",
"C.t"), new JavaUtilRegexCapabilities());
/*
//Second Version:
WildcardTermEnum regexTermEnum=new WildcardTermEnum(reader, new
Term("field", "C?t"));
*/
while(regexTermEnum.next()) {
System.out.println("Next:");
System.out.println(regexTermEnum.term().text());
}
System.out.println("End.");
}
}
But the output of this code is:
End.
As you can see above I have tried change line:
RegexTermEnum regexTermEnum=new RegexTermEnum(reader, new Term("field",
"C.t"), new JavaUtilRegexCapabilities());
on
WildcardTermEnum regexTermEnum=new WildcardTermEnum(reader, new
Term("field",
"C?t"));
but the result is the same :/
What am I doing wrong ?
> Erick - you're not missing anything, except that the original poster
> is after RegexQuery, not WildcardQuery. Both work basically the same
> way, except in the pattern matching capabilities.
>
> Erik
>
> On Jul 20, 2007, at 5:45 PM, Erick Erickson wrote:
> > Erik:
> >
> > Well, you wrote the book <G>. But I thought something like this
> > would work
> >
> > TermDocs td = reader.termDocs();
> > WildcardTermEnum we = new WildcardTermEnum(reader, new term("field",
> > "c*t"));
> > while (we.next()) {
> > td.seek(we);
> > while (td.next()) {
> > report document contains term;
> > }
> > }
> >
> > Although I admit I haven't tried it, so I could be totally off
> > base. What
> > am I missing?
> >
> > Erick
> >
> > On 7/20/07, Erik Hatcher <[EMAIL PROTECTED]> wrote:
> >>
> >> Erick - I think you're mixing things up with WildcardQuery.
> >> RegexQuery does support all regex capabilities (depending on the
> >> underlying regex matcher used).
> >>
> >> A couple of techniques you could use to achieve the goal:
> >>
> >> * Use RegexTermEnum, though that'll give you the terms
> >> across the
> >> entire index, so maybe in your use case you could index a single
> >> document into a RAMDirectory and RegexTermEnum on it.
> >>
> >> * Try out SpanRegexQuery and use getSpans() to get the exact
> >> matches.
> >>
> >> Erik
> >>
> >>
> >>
> >> On Jul 20, 2007, at 4:10 PM, Erick Erickson wrote:
> >>
> >> > First, the period (.) isn't part of the syntax, so make sure you
> >> look
> >> > more carefully at the Lucene syntax...
> >> >
> >> > Then, you might be able to use WildcardTermEnum to find
> >> > the terms that match and TermDocs to find the documents
> >> > that contain those terms.
> >> >
> >> > There's nothing built into Lucene to do this out of the box, you
> >> > have to roll your own.
> >> >
> >> > Best
> >> > Erick
> >> >
> >> > On 20 Jul 2007 21:27:40 +0200, [EMAIL PROTECTED]
> >> <[EMAIL PROTECTED]>
> >> > wrote:
> >> >>
> >> >> Hello.
> >> >>
> >> >> Let assume that I have this code in my application:
> >> >>
> >> >> (...)
> >> >> Query query = new RegexQuery(new Term("field", "C.T"));;
> >> >> // searching...
> >> >> (...)
> >> >>
> >> >> And now, I would like to know if my application founded "cat" or
> >> >> "cot" or
> >> >> something else. How can I check what was founded by my
> >> application ?
> >> >>
> >> >> I would like to write application like this:
> >> >> INPUT -> regular expression
> >> >> OUTPUT -> file ---> word
> >> >>
> >> >> example: INPUT = "C.T"
> >> >> OUTPUT =
> >> >> a.txt --> CAT
> >> >> a.txt --> COT
> >> >> b.txt --> CAT
> >> >> b.txt --> CAT
> >> >> b.txt --> COT
> >> >> (...)
> >> >>
> >> >> So, how to check what words were founded in particulary Documents
> >> >> after
> >> >> searching? I see that Hits class contains only founded
> >> documents and
> >> >> nothing more (I am new in this technology so I can be wrong...)
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> ---------------------------------------------------------------------
> >> >> -
> >> >> Dowiedz sie, co naprawde podnieca kobiety. Wiecej wiesz,
> >> latwiej je
> >> >> oczarujesz
> >> >>
> >> >> >>>http://link.interia.pl/f1b17
> >> >>
> >> >>
> >> >>
> >> ---------------------------------------------------------------------
> >> >> To unsubscribe, e-mail: [EMAIL PROTECTED]
> >> >> For additional commands, e-mail: [EMAIL PROTECTED]
> >> >>
> >> >>
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: [EMAIL PROTECTED]
> >> For additional commands, e-mail: [EMAIL PROTECTED]
> >>
> >>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>
>
----------------------------------------------------------------------
Najbogatsza Polka
Zobacz >>> http://link.interia.pl/f1ae2
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]