Thanks Kannan
Rob
- Original Message -
From: Chellappa, Kannan [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Wednesday, July 21, 2004 12:19 PM
Subject: RE: Slightly off topic, I need to have luke use my Analyzer
Sorry typo in the version date in my previous mail -- I
Hi,
I have been looking at how Lucene operates with queries where all terms are
required. I expected that the algorithm would step through each term to
confirm that it did exist in the index and as soon as a clause is found that
does not occur, the search would be aborted. As far as I can tell
Hi all,
I have a question about reindexing documents with Lucene.
We want to implement the functionality of rebuilding a Lucene index.
That means I want to delete all documents in the index and to add newer
versions.
All information I need to reindex is kept in the database so that I have
a
Why don't you just build a new index in a different location, add the
missing documents from the old index to the new one at the end, and then
delete the old index?
Aviran
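The build-elsewhere-then-swap idea above can be sketched roughly as follows. This is only an illustration in Python, not Lucene code: `rebuild_index` and the `build_index` callback are hypothetical names, and the callback is assumed to write a complete index (including any documents carried over from the old one) into the directory it receives.

```python
import os
import shutil
import tempfile


def rebuild_index(index_dir, build_index):
    """Build a fresh index next to the live one, then swap it into place.

    build_index is a hypothetical callback: it receives an empty directory
    and must write a complete index into it.
    """
    index_dir = os.path.abspath(index_dir)
    parent = os.path.dirname(index_dir)

    # Build in a sibling directory so the final rename stays on one filesystem.
    new_dir = tempfile.mkdtemp(prefix="index-new-", dir=parent)
    try:
        build_index(new_dir)
    except Exception:
        # The live index is untouched if the rebuild fails.
        shutil.rmtree(new_dir, ignore_errors=True)
        raise

    old_dir = index_dir + ".old"
    shutil.rmtree(old_dir, ignore_errors=True)  # leftover from an earlier run
    if os.path.exists(index_dir):
        os.rename(index_dir, old_dir)
    os.rename(new_dir, index_dir)
    shutil.rmtree(old_dir, ignore_errors=True)
```

The old index is only removed after the new one is in place, so a failed rebuild leaves the existing data intact. Searchers holding the old directory open would still need to be reopened; that part is outside this sketch.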
-Original Message-
From: Sergiu Gordea [mailto:[EMAIL PROTECTED]
Sent: Thursday, July 22, 2004 10:49 AM
To:
Because on the other hand I want to have a clean index, without any kind
of garbage.
This is the requested functionality of the rebuild index function.
Clean index and don't lose data.
I was also thinking that I can delete the index location and create a
new index, this may have the same effect
Hi, Lucene Guru:
I wonder if the information in termPositions or termVector can be used
to restore token position from indices?
Thanks!
Roy
On Wed, 21 Jul 2004 21:32:10 +0100, [EMAIL PROTECTED]
[EMAIL PROTECTED] wrote:
I need these values for highlighting. I've already looked to
Thanks for the pointer to Luke. It's a very useful tool.
I did more research, but I don't think Lucene stores the token position
in indices. Token position is different from term position. So, for
highlighting, the original text has to be retokenized.
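As a rough illustration of retokenizing the original text for highlighting, here is a minimal Python sketch. It is not Lucene's highlighter: the `highlight` function and the simple `\w+` tokenizer are assumptions made for the example, and a real setup would need to tokenize the same way the index-time Analyzer did.

```python
import re

# Assumed tokenizer: runs of word characters. A real one must match the
# Analyzer that was used when the document was indexed.
TOKEN = re.compile(r"\w+")


def highlight(text, query_terms, pre="<b>", post="</b>"):
    """Re-scan the stored text and wrap tokens that match a query term."""
    wanted = {term.lower() for term in query_terms}
    pieces = []
    last = 0
    for match in TOKEN.finditer(text):
        if match.group().lower() in wanted:
            # Character offsets come from the re-scan, not from the index.
            pieces.append(text[last:match.start()])
            pieces.append(pre + match.group() + post)
            last = match.end()
    pieces.append(text[last:])
    return "".join(pieces)
```

For example, `highlight("Lucene stores term positions.", ["term"])` wraps only the word "term" and leaves the rest of the text unchanged.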
On Thu, 22 Jul 2004 19:44:14 +0200,
I also question whether it could handle extreme volume with such good query
speed.
Has anyone done numbers with 1+ million documents?
-Original Message-
From: Daniel Naber [mailto:[EMAIL PROTECTED]
Sent: Tuesday, July 20, 2004 5:44 PM
To: Lucene Users List
Subject: Re: Lucene vs. MySQL
I used the MySQL full text search to index about 70K business directory
records. It became impossibly slow and I ended up creating my own text
search engine similar in concept to Lucene but database driven. It worked
much faster than the native MySQL full text search.
Other limitations of MySQL
I wonder if the information in termPositions or termVector can be used
to restore token position from indices?
TermFreqVector gives you term frequencies (not positions). This can be of
use in computing document similarities.
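One common way to turn term frequencies into a document similarity is cosine similarity. The Python sketch below assumes the frequencies have already been read out into plain term-to-count mappings; `cosine_similarity` is a hypothetical helper, not part of the Lucene API.

```python
import math
from collections import Counter


def cosine_similarity(freqs_a, freqs_b):
    """Cosine of the angle between two term-frequency vectors."""
    # Only terms present in both documents contribute to the dot product.
    dot = sum(count * freqs_b.get(term, 0) for term, count in freqs_a.items())
    norm_a = math.sqrt(sum(c * c for c in freqs_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freqs_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Usage: build the mappings from already-tokenized text.
doc_a = Counter("lucene index index".split())
doc_b = Counter("index search".split())
score = cosine_similarity(doc_a, doc_b)
```

The score is 0.0 for documents with no terms in common and approaches 1.0 as the term distributions align.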
TermPositions gives you the sequence number, e.g. in the last
I am sensing a common theme throughout a variety of threads here: namely, a need for
a pluggable set of Readers and Writers (think Interface) that can write metadata
about an Index/Document/Field/Term (which I see the TermVector stuff as being an
instance of) and can be given to Lucene from
On Thu, 22 Jul 2004 14:19:21 -0400, [EMAIL PROTECTED]
[EMAIL PROTECTED] wrote:
It could also be that your disk space is filling up and the OS runs out of
swap room.
If you run Fedora you will also need to upgrade your kernel. There is
a severe bug with Java crashing on the default kernels. If
On Wed, 21 Jul 2004 22:13:32 +1000, Anson Lau [EMAIL PROTECTED] wrote:
Has anyone tried splitting up an index into smaller chunks, without putting
the different indices on a different physical disk/box? What sort of
performance gain do you get from it?
The best advantage to this would be
Hi Byron,
I am planning on benchmarking Nutch on Opteron box ( 2 CPU, 2 TB, 2 Gig RAM)
using Fedora Core rc2 and jdk 1.5 beta 2. Are there any issues I should be
aware of?
Thanks for the help,
ram
On 7/22/04 4:56 PM, Byron Miller [EMAIL PROTECTED] wrote:
On Thu, 22 Jul 2004 14:19:21 -0400,