Hi all,
I have an index directory that is growing pretty fast, and is now at 138GB.
A while ago, this index got corrupted. It was rebuilt, but the
engineer cannot remember whether he deleted the corrupt directory
before the rebuild. Is there a way to know if any files are not being
used or
Optimize the index once; after that, all unused segment files are *for sure*
removed.
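
Before optimizing, it can help to see how the 138GB is spread across segments. The following is only a rough diagnostic sketch, not a way to tell which files Lucene still references: it assumes the usual naming scheme in which per-segment files start with an underscore followed by the segment name (for example `_a.fdt` or `_a_1.del`), and the class and method names are made up for illustration. It uses only the JDK and does not read the index itself.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

public class SegmentFileSummary {

    /** Returns the segment prefix of a file name ("_a" for "_a.fdt"), or "(other)". */
    static String segmentOf(String fileName) {
        if (!fileName.startsWith("_")) {
            return "(other)"; // e.g. segments_N, segments.gen, or unrelated files
        }
        int end = 1;
        // Assumption: the segment name is the alphanumeric run right after the underscore.
        while (end < fileName.length() && Character.isLetterOrDigit(fileName.charAt(end))) {
            end++;
        }
        return fileName.substring(0, end);
    }

    /** Sums file sizes (in bytes) per segment prefix. */
    static Map<String, Long> bytesPerSegment(Map<String, Long> fileSizes) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map.Entry<String, Long> e : fileSizes.entrySet()) {
            totals.merge(segmentOf(e.getKey()), e.getValue(), Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) throws IOException {
        // Directory to inspect; defaults to the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        Map<String, Long> sizes = new TreeMap<>();
        try (Stream<Path> files = Files.list(dir)) {
            for (Path p : (Iterable<Path>) files::iterator) {
                if (Files.isRegularFile(p)) {
                    sizes.put(p.getFileName().toString(), Files.size(p));
                }
            }
        }
        bytesPerSegment(sizes).forEach((seg, bytes) -> System.out.println(seg + "\t" + bytes));
    }
}
```

Run it against a copy or a quiescent index directory. After a successful optimize and commit, most of the bytes would be expected to sit under a single segment prefix; large groups under other prefixes are candidates for a closer look, not proof that the files are unused.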
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
-Original Message-
From: Andrew Bruno [mailto:andrew.br...@gmail.com]
Sent: Saturday, August 14,
Hello Everyone,
Can anyone point me to a publicly available question answering system built
using Lucene on TREC or non-TREC data?
Regards,
Ramneek
You can also call deleteUnusedFiles(), and all unreferenced files will be
deleted as well. Make sure to set the index DeletionPolicy to
KeepOnlyLastCommit (which is the default) before you do that. That's
only relevant, though, if you've built the index using either 3x or 4.0 code.
If not, you can
As asked, that's really an unanswerable question. The math is pretty easy
in terms of running out of document IDs, but whether the index can be
searched quickly depends on too many variables.
I suspect, though, that long before you ran out of document IDs, you'd need
to shard your index. Have you looked at SOLR?
Best
Hi Erick,
My documents are roughly 0.5 to 1 million chars each, divided into normal
words and into 50 chapters, with each chapter streamed into a docid unit. So a
search hit is a chapter.
How do I find out more about sharding and SOLR?
Andy
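
The "easy math" on document IDs can be sketched as follows. This is a back-of-the-envelope calculation, assuming doc IDs are signed 32-bit Java ints (so a single index tops out at roughly Integer.MAX_VALUE documents) and using the figures from this thread: 50 chapters per book, one doc ID per chapter, 0.5 to 1 million chars per book. The class and method names are made up for illustration.

```java
public class DocIdBudget {

    // Assumption: one doc ID per chapter, doc IDs are signed 32-bit ints.
    static final long MAX_DOC_IDS = Integer.MAX_VALUE;

    /** Whole books that fit in one index if each book uses chaptersPerBook doc IDs. */
    static long maxBooks(int chaptersPerBook) {
        if (chaptersPerBook <= 0) {
            throw new IllegalArgumentException("chaptersPerBook must be positive");
        }
        return MAX_DOC_IDS / chaptersPerBook;
    }

    /** Raw characters of text indexed by the time the doc ID space is exhausted. */
    static long totalChars(int chaptersPerBook, long charsPerBook) {
        return maxBooks(chaptersPerBook) * charsPerBook;
    }

    public static void main(String[] args) {
        int chaptersPerBook = 50; // figure quoted in the message above
        System.out.println("Doc ID ceiling:        " + MAX_DOC_IDS);
        System.out.println("Books before ceiling:  " + maxBooks(chaptersPerBook));
        System.out.println("Chars at 0.5M/book:    " + totalChars(chaptersPerBook, 500_000L));
        System.out.println("Chars at 1M/book:      " + totalChars(chaptersPerBook, 1_000_000L));
    }
}
```

Under these assumptions the ceiling is about 43 million books, or tens of trillions of characters of raw text, which is why index size and query speed are likely to force sharding well before the ID space runs out.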
Hi,
Can anyone explain to me how exactly the Tokenizers and TokenAttributes
interact with each other?
Or perhaps point me to a link that has an interaction/sequence diagram
for them?
I want to extend the Token class to allow use of some more types of Token
Attributes.
Thanks
-Devshree.
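
As a rough picture of the interaction: in the attribute-based API (Lucene 2.9 and later), a stream registers one shared instance per attribute type, each call to incrementToken() overwrites those instances with the next token's values, and the consumer reads them after every call that returns true. The sketch below is a simplified, JDK-only model of that pattern and is not Lucene code; MiniStream, TermAttr, and TagAttr are hypothetical names, and the real classes (TokenStream, AttributeSource, and the attribute interfaces) have more machinery.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

public class AttributeSketch {

    /** Hypothetical stand-in for an attribute carrying the current term text. */
    static class TermAttr { String term = ""; }

    /** Hypothetical custom attribute, e.g. a per-token tag. */
    static class TagAttr { String tag = ""; }

    /** Simplified model of a token stream: one shared instance per attribute type. */
    static class MiniStream {
        private final Map<Class<?>, Object> attrs = new LinkedHashMap<>();
        private final String[] words;
        private int pos = 0;
        private final TermAttr termAttr;
        private final TagAttr tagAttr;

        MiniStream(String text) {
            String trimmed = text.trim();
            words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
            termAttr = addAttribute(TermAttr.class, TermAttr::new);
            tagAttr = addAttribute(TagAttr.class, TagAttr::new);
        }

        /** Returns the single shared instance for this type, creating it on first use. */
        <T> T addAttribute(Class<T> type, Supplier<T> factory) {
            return type.cast(attrs.computeIfAbsent(type, k -> factory.get()));
        }

        /** Advances to the next token by overwriting the shared attribute instances. */
        boolean incrementToken() {
            if (pos >= words.length) {
                return false;
            }
            termAttr.term = words[pos++];
            tagAttr.tag = Character.isUpperCase(termAttr.term.charAt(0)) ? "CAP" : "word";
            return true;
        }
    }

    public static void main(String[] args) {
        MiniStream stream = new MiniStream("Hello attribute based Lucene");
        // The consumer asks once for the attributes it cares about...
        TermAttr term = stream.addAttribute(TermAttr.class, TermAttr::new);
        TagAttr tag = stream.addAttribute(TagAttr.class, TagAttr::new);
        // ...and re-reads the same instances after each incrementToken().
        while (stream.incrementToken()) {
            System.out.println(term.term + " [" + tag.tag + "]");
        }
    }
}
```

In this model a new kind of per-token data is added as a new attribute type rather than by subclassing a token class. As far as I know, the real API works the same way: a custom attribute is declared as an interface extending Attribute with a matching implementation class, so extending Token is usually not needed.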
I was setting up a new instance of my program on a new computer. I got
this error:
2010-08-14 10:05:21,951 ERROR Thread LuceneThread:
java.lang.IndexOutOfBoundsException: Not a valid hit number: 0
Java stacktrace:
java.lang.IndexOutOfBoundsException: Not a valid hit number: 0
at
You might want to look at the "What's New in Apache Lucene 2.9" whitepaper
from Lucid Imagination:
http://www.lucidimagination.com/developer/whitepaper/Whats-New-in-Apache-Lucene-2-9
On page 7 you'll find an introduction to this API. This should get you started :)
simon
On Sat, Aug 14, 2010 at 4:19 PM,
Lucene has been used - usually as a starting base that has been
modified for specific tasks - by a number of IR researchers for
various TREC challenges. Here are some (there are many more):
IBM Haifa:
http://wiki.apache.org/lucene-java/TREC_2007_Million_Queries_Track_-_IBM_Haifa_Team