Try your query like ((ducted^1000 duct~2) +tape),
or maybe (duct* +tape).
Or, even better, you could try some stemming (a Porter stemmer should get rid
of these -ed suffixes) combined with some of the above.
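The suffix-stripping idea can be sketched in a few lines. This is only a toy rule for illustration; a real Porter stemmer applies several ordered rule steps with measure conditions, and `stripEd` is a hypothetical helper, not part of any library:

```java
public class SuffixDemo {
    // Toy rule: strip a trailing "ed" when enough of the stem remains.
    // A real Porter stemmer is considerably more careful than this.
    static String stripEd(String word) {
        if (word.endsWith("ed") && word.length() > 4) {
            return word.substring(0, word.length() - 2);
        }
        return word;
    }

    public static void main(String[] args) {
        System.out.println(stripEd("ducted")); // duct
    }
}
```

With both the query term and the indexed term stemmed this way, "ducted" and "duct" land on the same token.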
If this does not help, have a look at the LingPipe spell-checker class, as this looks
like exactly what
(1)... threaddump
It hadn't occurred to me that I'd be able to do that, Chris. It took me a
little while to figure out how, because I'm running the application as a
daemon (i.e. using nohup, teeing standard output to a log file and
redirecting stdout and stderr to /dev/null), which counts out
I think the problem in your particular example is that the
suggestion software has no notion of context.
I've been playing with context-sensitive suggestions
recently, which take a bunch of validated (i.e. existing)
words (e.g. tape) and use them to help shortlist
alternatives for an unknown or
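The shortlisting part of that idea can be sketched with plain edit-distance ranking. This is a minimal sketch: it ranks a vocabulary by Levenshtein distance to the unknown term, and leaves out the context step described above (re-ranking candidates by how often they co-occur with validated words like "tape"):

```java
import java.util.*;

public class Shortlist {
    // Classic dynamic-programming Levenshtein edit distance.
    static int distance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++)
            for (int j = 1; j <= b.length(); j++)
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1));
        return d[a.length()][b.length()];
    }

    // Rank known vocabulary words by closeness to the unknown term.
    static List<String> suggest(String unknown, Collection<String> vocab, int max) {
        List<String> out = new ArrayList<>(vocab);
        out.sort(Comparator.comparingInt(w -> distance(unknown, w)));
        return out.subList(0, Math.min(max, out.size()));
    }

    public static void main(String[] args) {
        List<String> vocab = Arrays.asList("duct", "ducked", "ducat", "tape");
        System.out.println(suggest("ducted", vocab, 2)); // [ducked, duct]
    }
}
```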
On Tue, 2006-06-06 at 22:23 +, michael turner wrote:
Working on a project that requires a search query similar to what is
seen on amazon.com, in that after searching for and displaying an item,
the system shows:
Users that have searched for A AND B have also searched
for .
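This "also searched for" feature lives outside Lucene itself; it only needs per-session query tracking. A minimal sketch, with hypothetical class and method names, recording which queries co-occur in the same session:

```java
import java.util.*;

public class CoSearch {
    // session id -> set of queries issued in that session
    private final Map<String, Set<String>> sessions = new HashMap<>();

    void record(String sessionId, String query) {
        sessions.computeIfAbsent(sessionId, k -> new HashSet<>()).add(query);
    }

    // Queries that co-occurred with `query` in at least one session.
    Set<String> alsoSearched(String query) {
        Set<String> out = new TreeSet<>();
        for (Set<String> qs : sessions.values())
            if (qs.contains(query)) out.addAll(qs);
        out.remove(query);
        return out;
    }

    public static void main(String[] args) {
        CoSearch c = new CoSearch();
        c.record("s1", "duct tape");
        c.record("s1", "gaffer tape");
        c.record("s2", "duct tape");
        c.record("s2", "wd40");
        System.out.println(c.alsoSearched("duct tape")); // [gaffer tape, wd40]
    }
}
```

In production you would also want counts per co-query so you can rank by popularity instead of listing everything.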
Hi,
I am working on a Struts application using Lucene for indexing a MySQL
database. Whenever I rebuild the application, deploy it in Tomcat, and try to
rebuild the index from scratch, I have to shut down Tomcat and then restart it
again. If I don't do this, I get an IOException while creating
Hi,
I am trying to implement an alternative scoring mechanism in Lucene.
A query of multiple terms is represented as a BooleanQuery with one or more
Occur.SHOULD clauses.
The scoring for a document is very simple:
- Assign a score to each query term.
- If a document does not contain a
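A per-term scheme like that can be prototyped outside Lucene's Similarity machinery before wiring it in. A minimal sketch in plain Java (hypothetical names, not Lucene's scoring API), summing a fixed score for each query term the document contains:

```java
import java.util.*;

public class SimpleScorer {
    // Sum a fixed per-term score for every query term the document contains,
    // mirroring a BooleanQuery of SHOULD clauses where each match adds score.
    static double score(Set<String> docTerms, List<String> queryTerms, double perTerm) {
        double s = 0;
        for (String t : queryTerms)
            if (docTerms.contains(t)) s += perTerm;
        return s;
    }

    public static void main(String[] args) {
        Set<String> doc = new HashSet<>(Arrays.asList("duct", "tape", "silver"));
        System.out.println(score(doc, Arrays.asList("duct", "tape"), 1.0)); // 2.0
        System.out.println(score(doc, Arrays.asList("duct", "glue"), 1.0)); // 1.0
    }
}
```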
Hi folks,
Has anyone done, or do you know of, an API library that will take SQL
statements and convert them to Lucene Queries? I know not every SQL statement
can become a Lucene Query, but that's OK as long as the library will
highlight them.
Thanks!
-- George
I was able to improve the behavior by setting the mapped ByteBuffer to null
in the close method of MMapIndexInput. This seems to be a strong enough
'suggestion' to the gc, as I can see the references go away with process
explorer, and the index files can be deleted, usually. Occasionally, a
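The idea, roughly, is that close() drops the only strong reference to the mapped buffer, so the GC can collect (and thereby unmap) it sooner, releasing the file. A sketch of that pattern, with a stand-in class, not Lucene's actual MMapIndexInput source:

```java
import java.nio.ByteBuffer;

// Sketch: once close() nulls the field, nothing reachable pins the mapped
// buffer, so the GC can unmap it, which is what lets the OS delete the file.
public class MappedInput {
    private ByteBuffer buffer;

    MappedInput(ByteBuffer buffer) {
        this.buffer = buffer;
    }

    byte readByte() {
        return buffer.get();
    }

    void close() {
        buffer = null; // strong 'suggestion' to the GC, as described above
    }
}
```

As the message notes, this is only a hint: there is no portable way to force an unmap, so the files are deleted "usually," not always.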
You could take a look at Apache's Jackrabbit - it does this sort of thing.
It's not exactly a library, but it might give you some pointers. My
understanding is that it uses an SQL-like syntax for defining queries that
are converted into an abstract syntax tree, which it can then convert into
any
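To make the translation idea concrete, here is a deliberately tiny sketch that converts a WHERE clause of ANDed equality predicates straight into Lucene query syntax. All names are hypothetical; a real translator would, as described above, parse to an AST first rather than string-split:

```java
public class SqlToLucene {
    // Very small illustration: "a = 'x' AND b = 'y'"  ->  "+a:x +b:y".
    // No AST, no OR/NOT/LIKE handling - just the shape of the mapping.
    static String convert(String where) {
        StringBuilder out = new StringBuilder();
        for (String pred : where.split("(?i)\\s+AND\\s+")) {
            String[] kv = pred.split("=");
            String field = kv[0].trim();
            String value = kv[1].trim().replaceAll("^'|'$", "");
            if (out.length() > 0) out.append(' ');
            out.append('+').append(field).append(':').append(value);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(convert("title = 'tape' AND author = 'turner'"));
        // +title:tape +author:turner
    }
}
```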
Not closing explicitly can mean (especially when the JVM is allowed a lot of
memory but only a small amount is used) that old index files will stay on disk on
Linux.
The solution is to use a ReentrantReadWriteLock,
where the re-open method opens the new IndexReader:
acquire the write lock,
save the old reference
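The locking pattern described above can be sketched with the JDK's ReentrantReadWriteLock and a stand-in Reader type instead of Lucene's IndexReader. Searches take the read lock; re-open takes the write lock, swaps in the fresh reader, and explicitly closes the old one so its deleted files are actually released:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReopenManager {
    interface Reader {
        void close();
    }

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private Reader current;

    ReopenManager(Reader initial) {
        current = initial;
    }

    // Searches run under the read lock, so many can proceed concurrently.
    <T> T search(java.util.function.Function<Reader, T> op) {
        lock.readLock().lock();
        try {
            return op.apply(current);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Re-open runs under the write lock, so no search sees a closed reader.
    void reopen(Reader fresh) {
        lock.writeLock().lock();
        try {
            Reader old = current;  // save the old reference
            current = fresh;       // publish the new reader
            old.close();           // explicit close frees the old files
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```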
: However, I'm not sure what to make of:
: 8
: Thread 3740: (state = BLOCKED)
: - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
: - java.lang.Object.wait() @bci=2, line=474 (Compiled frame)
: Error occurred during stack walking:
: java.lang.NullPointerException
: at
I have a very large corpus that I am storing in many indexes: 200 indexes
* ~500MB each, with 10^6 very tiny documents in each. (I could look into
optimizing this later, of course, but seems ok for now)
During indexing, I have been using a RAMDirectory to store many thousands of
documents in
On 6/7/06, Benjamin Stein [EMAIL PROTECTED] wrote:
During indexing, I have been using a RAMDirectory to store many thousands
of documents in memory before flushing the buffer to disk using
IndexWriter.addIndexes.
For the most part this works very well, except that performance degrades
My understanding of the IndexWriter code is that it more or less manages
this for you. It has an internal RAMDirectory which it uses to index in
memory and then periodically flushes to disk based on your merge factor
settings (amongst other settings). So I am not sure if the extra work
you
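The buffer-then-flush behavior being described can be illustrated generically. This is a sketch of the pattern only, with hypothetical names, not IndexWriter's real internals: documents accumulate in memory and are written out in a batch once a threshold (analogous to the merge-factor/buffering settings mentioned above) is reached:

```java
import java.util.*;

public class BufferedIndexer {
    private final List<String> buffer = new ArrayList<>();
    private final int maxBuffered;
    int flushes = 0; // how many batches were written "to disk"

    BufferedIndexer(int maxBuffered) {
        this.maxBuffered = maxBuffered;
    }

    void addDocument(String doc) {
        buffer.add(doc);
        if (buffer.size() >= maxBuffered) flush();
    }

    void flush() {
        if (buffer.isEmpty()) return;
        // A real indexer would write the buffered docs to an FSDirectory here.
        buffer.clear();
        flushes++;
    }
}
```

The point of the message above is that since the writer already batches like this internally, maintaining your own RAMDirectory on top may be redundant work.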
Benjamin Stein wrote:
I could probably store the little RAMDirectories to disk as many
FSDirectories, and then addIndexes() of *all* the FSDirectories at the end
instead of every time. That would probably be smart.
Glad I asked myself!
That was what I was going to suggest - you may also
Yeah, you are right. I was talking about going through the large index and
discovering where else the problem has occurred, and how.
But thanks for your tips. I have used Luke before, and it certainly helped me this
time as well. I found the problem. It's not Lucene, it's me... an error in
I'm not sure what exactly your process method is doing
In essence it gets text from the content's input stream and writes it to the
PipedWriter and hence to the PipedReader passed to the Field constructor.
The process method for a plain text content handler simply copies from the
input stream
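The copy described above uses only java.io piped streams, so it is easy to show end to end. A minimal sketch: a producer thread plays the role of the content handler's process method, and the consumer side reads the PipedReader the way the Field's analyzer would:

```java
import java.io.*;

public class PipeDemo {
    // Stream text through a PipedWriter/PipedReader pair, so the consumer
    // sees the text as it is produced instead of buffering it all first.
    static String copyThroughPipe(String text) throws IOException, InterruptedException {
        PipedWriter writer = new PipedWriter();
        PipedReader reader = new PipedReader(writer); // connected pair

        Thread producer = new Thread(() -> {
            try (PipedWriter w = writer) {
                w.write(text); // the content handler's process() side
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        producer.start();

        // Consumer side: whoever holds the Reader drains it to end-of-stream,
        // which arrives when the producer closes the writer.
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) sb.append((char) c);
        producer.join();
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(copyThroughPipe("some extracted text"));
    }
}
```

Note the producer must run in its own thread: piped streams deadlock if one thread tries to both write and read.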
On Tue, 2006-06-06 at 22:23 +, michael turner wrote:
Users that have searched for A AND B have also searched
for .
Something just hit me.
Perhaps it would be interesting for you to track sessions that search
for the same thing but don't seem to find what they are looking