Re: WilcardQuery and memory

2007-03-09 Thread Erick Erickson
it also contains the terms yahoo.de and yahoo and de). Having whole e-mail addresses in terms and using prefix/wildcard queries inevitably result in too many clauses. -Original Message- From: Joe [mailto:[EMAIL PROTECTED] Sent: 09 March 2007 12:08 To: java-user@lucene.apache.org Sub

Re: WilcardQuery and memory

2007-03-09 Thread Joe
Hi Rob, For indexing e-mail, I recommend that you tokenise the e-mail addresses into fragments and query on the fragments as whole terms rather than using wildcards. [example] Hm, for e-mail addresses this isn't a big problem here. The real problem is the query on the body part of an e-mail, wh
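The fragment approach recommended above can be sketched in plain Java, without any Lucene dependency. This is only an illustration under assumptions: the class `EmailFragments` and the method `fragments` are hypothetical names, and the splitting rules (whole address, local part, full domain, each domain label) are one plausible reading of the "yahoo.de and yahoo and de" example in the thread, not the poster's actual analyzer.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene: breaks an e-mail address into
// fragments that could each be indexed as a whole term, so that a search
// for "yahoo" becomes an exact term match instead of a wildcard query.
class EmailFragments {

    static Set<String> fragments(String address) {
        Set<String> out = new LinkedHashSet<>();
        String lower = address.toLowerCase(Locale.ROOT);
        out.add(lower);                      // the whole address
        int at = lower.indexOf('@');
        if (at < 0) {
            return out;                      // no '@': nothing more to split
        }
        String local = lower.substring(0, at);
        String domain = lower.substring(at + 1);
        if (!local.isEmpty()) {
            out.add(local);                  // e.g. "joe"
        }
        if (!domain.isEmpty()) {
            out.add(domain);                 // e.g. "yahoo.de"
            for (String label : domain.split("\\.")) {
                if (!label.isEmpty()) {
                    out.add(label);          // e.g. "yahoo", "de"
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [joe@yahoo.de, joe, yahoo.de, yahoo, de]
        System.out.println(fragments("joe@yahoo.de"));
    }
}
```

Each fragment would then be added to the index as its own term, which trades a somewhat larger index for queries that need only a single clause.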

RE: WilcardQuery and memory

2007-03-09 Thread Rob Staveley (Tom)
-user@lucene.apache.org Subject: WilcardQuery and memory Hi, Here we use Lucene to index our e-mails, currently 500,000 documents. When searching the body with a WildcardQuery, problems arise. I did some profiling with JProfiler. I see that the more BooleanClause instances are used, the more memory is

WilcardQuery and memory

2007-03-09 Thread Joe
Hi, Here we use Lucene to index our e-mails, currently 500,000 documents. When searching the body with a WildcardQuery, problems arise. I did some profiling with JProfiler. I see that the more BooleanClause instances are used, the more memory is required during search. Most memory is used by instances
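To make the link between wildcard patterns and BooleanClause counts concrete, here is a rough Java sketch that does not use Lucene at all. It assumes, as the replies in this thread suggest, that a prefix or wildcard query is expanded into one clause per matching index term; the class `ClauseCountSketch`, the method `clausesForPrefix`, and the tiny term list are all made up for illustration.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical model of query expansion: the sorted set stands in for the
// index's term dictionary, and each term matching the prefix stands in for
// one clause in the expanded query.
class ClauseCountSketch {

    static int clausesForPrefix(SortedSet<String> terms, String prefix) {
        // Every term starting with the prefix sorts between the prefix
        // itself and the prefix followed by the largest char value.
        return terms.subSet(prefix, prefix + Character.MAX_VALUE).size();
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>(Arrays.asList(
                "mail", "mailbox", "mailer", "main", "map", "memo"));
        System.out.println(clausesForPrefix(terms, "mail")); // prints 3
        System.out.println(clausesForPrefix(terms, "ma"));   // prints 5
        System.out.println(clausesForPrefix(terms, "m"));    // prints 6
    }
}
```

The shorter the prefix, the more terms match and the more clauses the expanded query would carry. On the body field of a large e-mail index the term dictionary is far bigger than this toy list, which is consistent with the memory growth seen in the profiler.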