Hi,
I've been researching clustering with Lucene. Here is what
I've found so far,
1) Lucene clustering with Carrot2 -
http://download.carrot2.org/head/manual/#section.getting-started.lucene
- but this seems suitable only for a smaller index (a few hundred
documents).
With the sizes you report Carrot2 won't work for you, I'm afraid, but
Mahout may. Still, there are plenty of algorithms and preprocessing
options to consider, so if you provide more background somebody may
push you in the right direction.
Dawid
On Tue, Apr 26, 2011 at 1:49 PM, vivek sar wrote:
Thanks Dawid. I was trying to give an example, but this is not
exactly our text. Our fields include things like user name, IP
Address, Application Name, Port 3, Byte Count - all network
related stuff. So, if a user searches on a certain IP address then we
would need to group the results by user,
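The post-search grouping step described above can be sketched in plain Java, assuming each hit's stored fields (user, IP) have already been read from the results (in the 2.x API, via `hits.doc(i).get("user")` and so on); the class name and field layout here are illustrative, not from the thread:

```java
import java.util.*;

// Sketch: group search hits by a stored field value (e.g. "user").
// In the real application each String[] row would come from reading the
// hit's stored fields, e.g. hits.doc(i).get("user") / .get("ip").
public class GroupByUser {
    static Map<String, List<String>> groupByUser(List<String[]> hits) {
        // LinkedHashMap keeps the relevance order of each user's first hit.
        Map<String, List<String>> groups = new LinkedHashMap<String, List<String>>();
        for (String[] hit : hits) {
            String user = hit[0], ip = hit[1];
            List<String> bucket = groups.get(user);
            if (bucket == null) {
                bucket = new ArrayList<String>();
                groups.put(user, bucket);
            }
            bucket.add(ip);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String[]> hits = Arrays.asList(
            new String[]{"alice", "10.0.0.1"},
            new String[]{"bob",   "10.0.0.1"},
            new String[]{"alice", "10.0.0.2"});
        System.out.println(groupByUser(hits));
    }
}
```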
Hi,
We ran into the same issue (corrupted index) using Lucene 2.4.0.
There was no outage or system reboot - not sure how it could get
corrupted. Here is the exception,
Caused by: java.io.IOException: background merge hit exception:
_io5:c66777491 _nh9:c10656736 _taq:c2021563 _s8m:c1421051
(java.lang.Thread.UncaughtExceptionHandler)
Mike
On Sep 17, 2008, at 4:24 PM, vivek sar wrote:
Hi,
We have been running Lucene 2.3 for the last few months with our
application and all of a sudden we have hit the following exception,
java.lang.RuntimeException: java.io.IOException: background
merge hit exception: _2uxy:c11345949 _2uxz:c150 _2uy0:c150 _2uy1:c150
_2uy2:c150 _2uy3:c150
Hi,
I'm using 2.3.0 Lucene build and have following merge parameters,
mergeFactor = 100
maxMergeDocs = 9
maxBufferedDocs = 1
maxRAMBufferSizeMB = 200
After running with these settings for a month without problems, all of
a sudden I'm getting the following exception,
This is running your latest IndexAccessor-021508 code. Any ideas
(it's kind of urgent for us)?
Thanks,
-vivek
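For reference, the merge parameters quoted above map onto the Lucene 2.3-era IndexWriter setters roughly as follows; this is a sketch, not the poster's actual code, and since the maxMergeDocs/maxBufferedDocs values in the mail look truncated in the archive they are left out rather than guessed:

```java
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class WriterSetupSketch {
    static IndexWriter openWriter(String path) throws IOException {
        // Setter names per the Lucene 2.3 generation of the IndexWriter API.
        IndexWriter writer = new IndexWriter(path, new StandardAnalyzer(), true);
        writer.setMergeFactor(100);      // merge up to 100 segments at a time
        writer.setRAMBufferSizeMB(200);  // flush after ~200 MB of buffered docs
        // maxMergeDocs and maxBufferedDocs would be applied the same way via
        // setMaxMergeDocs(...) / setMaxBufferedDocs(...); the exact values in
        // the quoted mail appear truncated in the archive, so none are shown.
        return writer;
    }
}
```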
On Fri, Feb 15, 2008 at 6:50 PM, vivek sar [EMAIL PROTECTED] wrote:
Mark,
Thanks for the quick fix. Actually, it is possible that there might
have been simultaneous queries using
can check?
Thanks,
-vivek
On Thu, Feb 28, 2008 at 1:26 PM, vivek sar [EMAIL PROTECTED] wrote:
Mark,
We deployed our indexer (using defaultIndexAccessor) on one of the
production site and getting this error,
Caused by: java.util.concurrent.RejectedExecutionException
this to the test cases.
Just as a personal interest question, what has led you to setup your
index this way? Adding partitions as it grows that is.
- Mark
vivek sar wrote:
Mark,
Yes, I think that's precisely what is happening. I call
accessor.close, which shuts down all
[EMAIL PROTECTED] wrote:
vivek sar wrote:
Mark,
Just for my clarification,
1) Would you have indexStop and indexStart methods? If that's the case
then I don't have to call close() at all. These new methods would just
clean up the caches without closing
Mark,
There seems to be some issue with DefaultMultiIndexAccessor.java. I
got the following NPE exception,
2008-02-13 07:10:28,021 ERROR [http-7501-Processor6] ReportServiceImpl -
java.lang.NullPointerException
at
a foreign MultiSearcher somehow.
I will keep looking and keep you posted. In the mean time, do you have
any other data or code snippets to share?
[EMAIL PROTECTED] wrote:
Here is the fix: https://issues.apache.org/jira/browse/LUCENE-1026
Hi,
Has anyone tried Luke v0.7.1 with the latest Lucene build, v2.3? I'm
getting an "Unknown format version: -4" error when opening a Lucene 2.3
index with Luke 0.7.1. Is there an upgraded version of Luke anywhere?
I also read something about a web-based Luke, but can't find it in the
contrib in 2.3,
I have a field indexed as NO_NORMS; does it have to be untokenized to
be able to sort on it?
On Jan 21, 2008 12:47 PM, Antony Bowesman [EMAIL PROTECTED] wrote:
vivek sar wrote:
I need to be able to sort on optime as well, thus need to store it.
Lucene's default sorting does not need the field
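For reference, a sortable field in the 2.3-era API would typically be indexed untokenized, since sorting needs exactly one indexed term per document; a sketch (the field name and value format are illustrative):

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;

public class SortableFieldSketch {
    // Sorting needs exactly one indexed term per document, so the field is
    // indexed untokenized; NO_NORMS means untokenized with norms omitted.
    static void addOptime(Document doc, String optime) {
        doc.add(new Field("optime", optime,
                          Field.Store.YES, Field.Index.NO_NORMS));
    }

    static Sort byOptime() {
        // Used at query time as: searcher.search(query, byOptime())
        return new Sort(new SortField("optime", SortField.STRING));
    }
}
```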
a single Document, if that is what you are asking. But you can
create multiple smaller indices (e.g. weekly) instead of one large one,
and then every 2 weeks archive the one that is 2 weeks old.
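The weekly-partition scheme Otis describes can be sketched with plain Java date math; the directory-name format ("index-YYYYwWW") is an assumption, not from the thread:

```java
import java.util.*;

// Sketch: one index directory per week, named so that "older than 2 weeks"
// reduces to a lexicographic comparison of directory names.
public class WeeklyPartitions {
    static String weekKey(Date d) {
        Calendar c = Calendar.getInstance(Locale.US);
        c.setTime(d);
        // Zero-padded year + week so names sort chronologically as strings.
        return String.format("index-%04dw%02d",
                c.get(Calendar.YEAR), c.get(Calendar.WEEK_OF_YEAR));
    }

    // Directories whose week key sorts before the cutoff (two weeks before
    // 'now') are candidates for archiving.
    static List<String> archivable(List<String> dirs, Date now) {
        Calendar c = Calendar.getInstance(Locale.US);
        c.setTime(now);
        c.add(Calendar.WEEK_OF_YEAR, -2);
        String cutoff = weekKey(c.getTime());
        List<String> old = new ArrayList<String>();
        for (String dir : dirs) {
            if (dir.compareTo(cutoff) < 0) old.add(dir);
        }
        return old;
    }
}
```

Searches would then run over the live partitions together (e.g. a MultiSearcher in the 2.x API), while archiving is just moving a whole directory aside.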
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: vivek sar
Hi,
As a requirement I need to be able to archive any indexes older than
2 weeks (due to space and performance reasons). That means I would
need to maintain weekly indexes. Here are my questions,
1) What's the best way to partition indexes using Lucene?
2) Is there a way I can partition
Did you really mean maxMergeDocs and not maxBufferedDocs?
A larger maxBufferedDocs will speed up indexing.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: vivek sar [EMAIL PROTECTED]
To: java-user@lucene.apache.org
Sent: Friday, January 18, 2008
Hi,
I have a requirement to filter out documents by date range. I'm using
RangeFilter (in combination with FilteredQuery) to do this. I was under
the impression the filtering is done on documents, thus I'm just
storing the date values but not indexing them. As every new document
would have a new
and not
store them if index size is a concern?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: vivek sar [EMAIL PROTECTED]
To: java-user@lucene.apache.org
Sent: Saturday, January 19, 2008 8:06:25 PM
Subject: Using RangeFilter
Hi,
I have
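The RangeFilter/FilteredQuery combination discussed in this thread can be sketched as below (the field name and date format are illustrative assumptions); note that in the 2.x API RangeFilter matches on indexed terms, so the date field must be indexed, not merely stored:

```java
import org.apache.lucene.search.FilteredQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.RangeFilter;

public class DateRangeSketch {
    // RangeFilter works on *indexed* terms, so the "date" field must be
    // indexed (untokenized); storing the value alone is not enough.
    static Query withDateRange(Query base) {
        RangeFilter dates = new RangeFilter("date", "20080101", "20080119",
                                            true, true);  // inclusive bounds
        return new FilteredQuery(base, dates);
    }
}
```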
Hi,
We are using Lucene 2.2. We have an index of size 70G (within 3-4
days) and growing. We run optimize pretty frequently (once every hour,
due to the large number of index updates every minute - it can be up to
100K new documents every minute). Every now and then I have seen the
optimize take 3-4 hours,
-vivek
On Jan 18, 2008 2:37 AM, Michael McCandless [EMAIL PROTECTED] wrote:
vivek sar wrote:
Hi,
We are using Lucene 2.2. We have an index of size 70G (within 3-4
days) and growing. We run optimize pretty frequently (once every hour,
due to the large number of index updates every minute
We have seen similar exceptions (with Lucene 2.2) when we were making
the following mistakes,
1) Not closing the old searchers and re-creating a new one for every
new search (we fixed it by closing the searcher every time; if you want,
you could also use just one searcher instance)
2) Not having any jvm
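Point 1 above (searcher lifecycle) can be sketched as a small holder that closes the previous searcher before opening a fresh one; the class is illustrative, not from the thread:

```java
import java.io.IOException;
import org.apache.lucene.search.IndexSearcher;

// Sketch: always close the previous searcher before (or instead of)
// opening a new one, otherwise file handles and memory leak over time.
public class SearcherHolder {
    private IndexSearcher current;
    private final String indexPath;

    SearcherHolder(String indexPath) { this.indexPath = indexPath; }

    synchronized IndexSearcher refresh() throws IOException {
        if (current != null) current.close();   // release the old searcher
        current = new IndexSearcher(indexPath); // 2.x String-path constructor
        return current;
    }
}
```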
McCandless [EMAIL PROTECTED] wrote:
vivek sar [EMAIL PROTECTED] wrote:
We are using Lucene 2.3.
Do you mean Lucene 2.2? Your stack trace seems to line up with 2.2,
and 2.3 isn't quite released yet.
The problem we are facing is quite a few times if our application is
stopped (killed
Hi,
We are using Lucene 2.3. The problem we are facing is that quite a few
times, if our application is stopped (killed or crashed) while the
Indexer is doing its job, the next time we bring up the application the
Indexer fails to run with the following exception,
2007-10-04 12:29:53,089 ERROR [PS