You probably need to increase the amount of RAM available to your JVM.
See the parameters:
-Xmx: Maximum memory usable by the JVM
-Xms: Initial memory allocated to the JVM
My params are: -Xmx2048m -Xms128m (2G max, 128M initial)
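To check that the settings took effect, the running JVM can report its own heap limits. This is a minimal sketch using only the standard Runtime API; the class name HeapReport is made up for illustration.

```java
// Hypothetical helper class (the name is illustrative, not from the thread).
final class HeapReport {
    private static final long MIB = 1024L * 1024L;

    // Upper bound the JVM will try to use for the heap; governed by -Xmx.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    // Heap currently reserved by the JVM; starts near -Xms and can grow toward -Xmx.
    static long committedHeapMiB() {
        return Runtime.getRuntime().totalMemory() / MIB;
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB):       " + maxHeapMiB());
        System.out.println("committed heap (MiB): " + committedHeapMiB());
    }
}
```

Run it with the same flags as the application, for example java -Xmx2048m -Xms128m HeapReport. Depending on the JVM, the reported maximum may come out somewhat below the value passed to -Xmx.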
On Fri, 10 Dec 2004 11:17:29 -0600, Sildy Augustine
<[EMAIL PR
Your unstored fields were not stored in the index, only their terms
were stored. When you get the document from the index and modify it,
those terms are lost when you add the document again.
You can either simply create a new document and populate all the
fields and add that document to the index,
Somehow today one of my indexes became corrupted.
I get the following IO exception when trying to open the index:
Exception in thread "main" java.io.IOException: read past EOF
at org.apache.lucene.store.InputStream.refill(InputStream.java:154)
at org.apache.lucene.store.InputStream.readB
You can only have one open writer at a time. A writer is either an
IndexWriter object, or an IndexReader object that has modified the
index, by deleting documents for instance.
You must close your existing writer before you open a new one.
You should not get lock exceptions with IndexSearchers.
As a generalisation, SuSE itself is not a lot slower than Windows XP.
I also very much doubt that filesystem is a factor. If you want to
test w/out filesystem involvement, simply load your index into a
RAMDirectory instead of using FSDirectory. That precludes filesystem
overhead in searches.
Th
On Tue, 30 Nov 2004 12:07:46 -, Pete Lewis <[EMAIL PROTECTED]> wrote:
> Also, unless you take your hyperthreading off, with just one index you are
> searching with just one half of the CPU - so your desktop is actually using
> a 1.5GHz CPU for the search. So, taking account of this its not too
My indexes are stored on a NetApp filer via NFS. The indexer process
updates the indexes over NFS. I have multiple indexes. My search
process determines if the NFS indexes have been updated, and if they
have, then loads the index into a RAMDirectory. RAMDirectory is of
course much faster than
BoundsException ioobe)
> {
> logger.error("INDEX OUT OF BOUNDS!" + ioobe.getMessage());
> ioobe.printStackTrace();
> }
> }
> reader.close();
> //logger.debug("done, about the optimize");
>
I have two index processes. One is an index server, the other is a
search server. The processes run on different machines.
The index server is a single-threaded process that reads from the
database and adds unindexed rows to the index as needed. It sleeps for
a couple minutes between each batch
Split the filename into "basefilename" and "version" and make each a keyword.
Sort your query by version descending, and only use the first
"basefile" you encounter.
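The "only use the first basefile you encounter" step can be sketched in plain Java. FileHit is a hypothetical stand-in for a search hit, and treating the version as an integer is an assumption; in practice both values would come from the two keyword fields.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a search hit; the two values would come from
// the "basefilename" and "version" keyword fields.
final class FileHit {
    final String baseFilename;
    final int version; // assumed numeric for this sketch

    FileHit(String baseFilename, int version) {
        this.baseFilename = baseFilename;
        this.version = version;
    }
}

final class LatestVersionFilter {
    // Sort by version descending, then keep only the first hit seen per base filename.
    static List<FileHit> latestOnly(List<FileHit> hits) {
        List<FileHit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingInt((FileHit h) -> h.version).reversed());

        Set<String> seen = new HashSet<>();
        List<FileHit> latest = new ArrayList<>();
        for (FileHit hit : sorted) {
            if (seen.add(hit.baseFilename)) { // add() returns false for repeats
                latest.add(hit);
            }
        }
        return latest;
    }

    public static void main(String[] args) {
        List<FileHit> hits = new ArrayList<>();
        hits.add(new FileHit("report", 1));
        hits.add(new FileHit("report", 3));
        hits.add(new FileHit("notes", 2));
        for (FileHit hit : latestOnly(hits)) {
            System.out.println(hit.baseFilename + " v" + hit.version);
        }
    }
}
```

If the sort is done by the search engine instead, the loop over the sorted hits stays the same; only the explicit sort call goes away.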
On Wed, 17 Nov 2004 15:05:19 -0500, Luke Shannon
<[EMAIL PROTECTED]> wrote:
> Hey all;
>
> I have ran into an interesting case.
>
The HEAD version of CVS supports gz compression. You will need to
check it out using cvs if you want to use it.
On Wed, 17 Nov 2004 21:43:36 +0200, abdulrahman galal <[EMAIL PROTECTED]> wrote:
> i noticed in the last period that alot of people disscus with each others
> about the bugs of lucene
You could lock your index for writes, then copy the file using
operating system copy commands.
Another way would be to lock your index, make a filesystem snapshot,
then unlock your index. You can then safely copy the snapshot without
interrupting further index operations.
On Wed, 17 Nov 2004 11:2
Try using 1.4.2. The change file says that
ArrayIndexOutOfBoundsExceptions have been fixed in the queryparser.
On Fri, 12 Nov 2004 12:04:31 -0500, Will Allen <[EMAIL PROTECTED]> wrote:
> Holy cow! This does happen!
>
>
>
> -Original Message-
> From: Peter Pimley [mailto:[EMAIL PROTEC
You can add the category keyword multiple times to a document.
Instead of separating your categories with a delimiter, just add the
keyword multiple times.
doc.add(Field.Keyword("category", "ABC"));
doc.add(Field.Keyword("category", "DEF GHI"));
On Tue, 9 Nov 2004 17:18:19 +0100, Thierry Ferrero (
You can write to the index and read from it at the same time. You can
only have one IndexWriter open at any one time.
IndexSearchers will only see documents that were created before they
were instantiated, so you need to create new ones periodically to see
new documents.
On Mon, 8 Nov 2004 14:26
This is failing because you are trying to create a new
index in the directory. It works on *nix file systems because you can
delete an open file on those operating systems, something you can't do
under Windows.
If you change the create parameter to false on your second call
everythi
You may also want to investigate the CVSIGNORE environment variable.
You can tell CVS to ignore any files and directories specified in this
variable (it is space separated).
So you could tell CVS to ignore all directories named lucene with:
export CVSIGNORE=lucene
On Fri, 5 Nov 2004 09:03:00 -0
You should exclude your lucene index from the CVS repository. This is
the same thing you would do if you had a process that generated files
in your source tree from other files. The generated files wouldn't
have any meaning in the repository, and can be regenerated at any
time, so you would want
I'm thinking about making a separate field in my index for prefix
wildcard searches.
I would chop off x characters from the front to create "subtokens" for
the prefix matches.
For the term: republican
terms created: republican epublican publican ublican blican
My query parser would then intellige
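The subtoken generation described here can be sketched in plain Java. The class and method names are hypothetical, and the maxChop parameter plays the role of "x"; the example term reproduces the five terms listed for "republican".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names are illustrative, not from the thread.
final class SuffixTokens {
    // Returns the term itself plus the variants made by dropping
    // 1..maxChop leading characters.
    static List<String> subtokens(String term, int maxChop) {
        List<String> tokens = new ArrayList<>();
        int limit = Math.min(maxChop, term.length() - 1); // never emit an empty token
        for (int i = 0; i <= limit; i++) {
            tokens.add(term.substring(i));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [republican, epublican, publican, ublican, blican]
        System.out.println(subtokens("republican", 4));
    }
}
```

Each extra chopped character adds one more term per token to the index, so the value of maxChop trades index size against how short a trailing fragment can still be matched.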
First off, I think you should make a decision about what you want to
store in your index and how you go about searching it.
The less information you store in your index, the better, for
performance reasons. If you can store the messages in an external
database you probably should. I would create
If you know all the phrases you are going to search for, you could
modify an analyzer to make those phrases into whole terms when you are
analyzing.
Other than that, you can test the speed of breaking the phrase query
up into term queries. You would have to do an AND on all the words in
the phra
Given an FSDirectory based index A.
Documents are added to A with an IndexWriter
minMergeDocs = 2
mergeFactor = 3
Documents are never deleted.
Once the RAMDirectory merges documents to the index:
a) will the documentID values for index A ever change?
b) can a mapping between a term in th
absolutely correct. sorry about that. shouldn't code before coffee :)
On Thu, 28 Oct 2004 20:16:16 +0200, Daniel Naber
<[EMAIL PROTECTED]> wrote:
> On Thursday 28 October 2004 19:03, Justin Swanhart wrote:
>
> > Have you tried making a term query by hand and testing
Have you tried making a term query by hand and testing to see if it works?
Term t = new Term("field", "this is a \"test\"");
PhraseQuery pq = new PhraseQuery(t);
...
On Thu, 28 Oct 2004 12:02:48 -0400, Will Allen <[EMAIL PROTECTED]> wrote:
>
> I am having this same problem, but cannot find a
Your analyzer will have removed the stopword when you indexed your documents, so
Lucene won't be able to do this for you.
You will need to implement a second pass over the results returned by Lucene and
check to see if the stopword is included, perhaps with String.indexOf().
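A second pass of this kind might look like the sketch below. It assumes the original text of each hit is available as a String (from a stored field or an external store); the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical post-filter; assumes the original text of each hit is at hand.
final class StopwordSecondPass {
    // Keeps only the texts that literally contain the phrase, ignoring case.
    static List<String> containing(List<String> texts, String phrase) {
        String needle = phrase.toLowerCase(Locale.ROOT);
        List<String> kept = new ArrayList<>();
        for (String text : texts) {
            if (text.toLowerCase(Locale.ROOT).indexOf(needle) >= 0) {
                kept.add(text);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> hits = Arrays.asList("To be or not to be", "Be here now");
        System.out.println(containing(hits, "to be"));
    }
}
```

Note that indexOf() matches substrings, so a stopword such as "the" would also match inside "other"; a stricter check would need to look at word boundaries.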
On Wed, 27 Oct 2004 1
You could always modify your own local copy if you want to change the
behavior of the parameter.
Or just do:
IndexWriter w = new IndexWriter(indexDirectory,
new StandardAnalyzer(),
!(IndexReader.indexEx
I would suggest that you create a lock file for your index writing
process. If the lock file is encountered, close the IndexWriter until
the lock file is removed. After you create the lock file, wait a few
seconds to make sure the writer process has quiesced, then create a
snapshot of the filesystem
You could just request all the messages from the list bot, then index
them with Lucene :)
On Thu, 14 Oct 2004 16:50:19 +, sam s <[EMAIL PROTECTED]> wrote:
> Hi Folks,
> Is there any place where I can do a better search on lucene mailing
> archives?
> I tried JGuru and looks like their search
The overhead of creating that many searcher objects is going to far
outweigh any performance benefit you could possibly hope to gain by
splitting your index up.
On Thu, 14 Oct 2004 04:42:27 -0700 (PDT), Otis Gospodnetic
<[EMAIL PROTECTED]> wrote:
> Search a single merged index.
>
> Otis
>
>
>
It depends on a lot of factors. I myself use multiple indexes for
about 10M documents.
My documents are transient. Each day I get about 400K and I remove
about 400K. I always remove an entire day's documents at one time. It
is much faster/easier to delete the Lucene index for the day that I am r
Yes you can reuse analyzers. The only performance gain will come from
not having to create the objects and not having garbage collection
overhead. I create one for each of my index reading threads.
On Thu, 07 Oct 2004 16:59:38 +, sam s <[EMAIL PROTECTED]> wrote:
> Hi,
> Can instance of an an
As I understand it, if two writers try to access the same index for
writing, then one of the writers should block waiting for a lock until
the lock timeout period expires, and then throw a "Lock wait timeout"
exception.
I have a multithreaded indexing application that writes into one of
32 matches