it also contains the terms yahoo.de and yahoo
and de).
Having whole e-mail addresses as terms and using prefix/wildcard queries
inevitably results in too many clauses, since each such query is rewritten
into one boolean clause per matching term.
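
To illustrate why the clause count grows, here is a rough sketch in plain
Java that imitates the expansion of a prefix query against a sorted term
dictionary. It does not use Lucene itself: the class PrefixExpansion, the
helper sampleTerms and the IllegalStateException are made up for this
illustration, and the limit of 1024 is assumed to match the default maximum
clause count of BooleanQuery.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical stand-in for a prefix query expansion; not Lucene code.
class PrefixExpansion {
    // Assumed to mirror the default clause limit of Lucene's BooleanQuery.
    static final int MAX_CLAUSES = 1024;

    // Collects every indexed term that starts with the prefix, one "clause" each.
    static List<String> expand(SortedSet<String> terms, String prefix) {
        List<String> clauses = new ArrayList<>();
        // In sorted order, all terms sharing a prefix are contiguous.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            clauses.add(term);
            if (clauses.size() > MAX_CLAUSES) {
                throw new IllegalStateException("too many clauses for prefix " + prefix);
            }
        }
        return clauses;
    }

    // Builds n whole-address terms plus the fragments "yahoo" and "de".
    static SortedSet<String> sampleTerms(int n) {
        SortedSet<String> terms = new TreeSet<>();
        for (int i = 0; i < n; i++) {
            terms.add("user" + i + "@yahoo.de");
        }
        terms.add("yahoo");
        terms.add("de");
        return terms;
    }

    public static void main(String[] args) {
        // A whole-term lookup on a fragment touches exactly one term.
        System.out.println(expand(sampleTerms(2000), "yahoo").size()); // prints 1
        try {
            // The prefix matches all 2000 whole addresses and exceeds the limit.
            expand(sampleTerms(2000), "user");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The number of clauses depends on how many distinct terms share the prefix,
not on how many documents match, which is why whole addresses as terms make
the problem worse.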
-----Original Message-----
From: Joe [mailto:[EMAIL PROTECTED]
Sent: 09 March 2007 12:08
To: java-user@lucene.apache.org
Sub
Hi Rob,
For indexing e-mail, I recommend that you tokenise the e-mail addresses into
fragments and query on the fragments as whole terms rather than using
wildcards.
[example]
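
The original example is not preserved here. As a rough sketch of the
fragment idea, assuming the fragments wanted are the whole address, the
local part, the domain and each dot-separated domain label, the splitting
could look like this in plain Java. The class EmailFragments and its method
are hypothetical names and not part of Lucene; in a real index each fragment
would be emitted as a token for the address field and looked up with a
TermQuery.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; not part of Lucene.
class EmailFragments {
    // Returns the address plus its fragments, lower-cased and without duplicates.
    static List<String> fragments(String address) {
        String addr = address.trim().toLowerCase(Locale.ROOT);
        Set<String> out = new LinkedHashSet<>();
        out.add(addr); // the whole address stays searchable as one term
        int at = addr.lastIndexOf('@');
        if (at < 0) {
            return new ArrayList<>(out); // no '@': nothing further to split
        }
        String local = addr.substring(0, at);
        String domain = addr.substring(at + 1);
        if (!local.isEmpty()) {
            out.add(local);
        }
        if (!domain.isEmpty()) {
            out.add(domain);
            // Each domain label becomes its own term, e.g. "yahoo" and "de".
            for (String label : domain.split("\\.")) {
                if (!label.isEmpty()) {
                    out.add(label);
                }
            }
        }
        return new ArrayList<>(out);
    }

    public static void main(String[] args) {
        System.out.println(fragments("joe@yahoo.de"));
        // prints [joe@yahoo.de, joe, yahoo.de, yahoo, de]
    }
}
```

A search for everything at yahoo then becomes a single whole-term lookup on
"yahoo" instead of a wildcard such as *@yahoo.*.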
Hm, for e-mail addresses this isn't a big problem here.
The real problem is the query on the body part of an email, wh
-user@lucene.apache.org
Subject: WildcardQuery and memory
Hi,
Here we use Lucene to index our e-mails, currently 500,000 documents.
When searching the body with a WildcardQuery, problems arise.
I did some profiling with JProfiler. I see that the more BooleanClause
instances are used, the more memory is required during search.
Most memory is used by instances