Re: Time taken for CheckIndex on a core

2017-08-17 Thread Shashank Pedamallu
Thanks a lot for the reply Shawn! That makes sense.

Thanks,
Shashank

On 8/17/17, 11:37 AM, "Shawn Heisey" <apa...@elyograg.org> wrote:

On 8/14/2017 8:19 PM, Shashank Pedamallu wrote:
> I was looking for a reliable way to check whether my core is in a
> consistent, readable state and came across the
> org.apache.lucene.index.CheckIndex class. However, when I ran it
> several times from the command line via its main method, the first
> run took a long time, while subsequent runs were much faster. Can
> someone explain why this is? Does Solr/Lucene cache the results of
> CheckIndex? Note that all runs were on the same core with the same
> segments.

This is indeed a Lucene question, but that doesn't mean it will get
ignored.  If you want more detail than this, try the Lucene java-user
mailing list.

This is probably the OS caching the index files.  Lucene itself does not
cache the data, especially from separate invocations of CheckIndex, but
the operating system does.

The first time you run it, your server may need to actually read all the
index data from the disk, so you're dealing with the speed of the disk
as a bottleneck.  The second time, part or possibly all of the index
data will have been cached by the operating system in memory, so it
loads MUCH faster, because it's pulled from memory, not the actual disk.
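
The effect described above can be observed without Lucene at all. Below is a minimal sketch that times two back-to-back reads of the same file; the class name ReadTwice and its methods are made up for illustration. It assumes the file you pass in is large and has not been read recently, otherwise both reads may already come from memory and the timings will look alike.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class ReadTwice {
    /** Reads the whole file sequentially and returns the number of bytes read. */
    static long readAll(Path file) {
        byte[] buf = new byte[1 << 16];
        long total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    /** Times one full read of the file, in milliseconds. */
    static long timeReadMillis(Path file) {
        long start = System.nanoTime();
        readAll(file);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java ReadTwice <file>");
            return;
        }
        Path file = Paths.get(args[0]);
        // The first read may have to go to disk; the second is usually
        // served from the OS file cache if there is enough free memory.
        System.out.println("first read:  " + timeReadMillis(file) + " ms");
        System.out.println("second read: " + timeReadMillis(file) + " ms");
    }
}
```

After compiling, run it as "java ReadTwice <file>" against one of the larger files in the index directory. The same reasoning applies to CheckIndex, which has to read every index file to verify it.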

Thanks,
Shawn





Time taken for CheckIndex on a core

2017-08-14 Thread Shashank Pedamallu
Hi,

I was looking for a reliable way to check whether my core is in a consistent, 
readable state and came across the org.apache.lucene.index.CheckIndex class. 
However, when I ran it several times from the command line via its main 
method, the first run took a long time, while subsequent runs were much 
faster. Can someone explain why this is? Does Solr/Lucene cache the results of 
CheckIndex? Note that all runs were on the same core with the same segments.

Here are the results from attempt 1:
 grep "check integrity" checkFastIndex70e0a5core7_a1.out
test: check integrity.OK [took 37.747 sec]
test: check integrity.OK [took 37.572 sec]
test: check integrity.OK [took 36.763 sec]
test: check integrity.OK [took 35.927 sec]
test: check integrity.OK [took 35.714 sec]
test: check integrity.OK [took 37.975 sec]
test: check integrity.OK [took 37.685 sec]
test: check integrity.OK [took 35.516 sec]
test: check integrity.OK [took 9.359 sec]
test: check integrity.OK [took 2.016 sec]
test: check integrity.OK [took 5.014 sec]
test: check integrity.OK [took 3.276 sec]
test: check integrity.OK [took 2.134 sec]
test: check integrity.OK [took 0.525 sec]
test: check integrity.OK [took 0.304 sec]
test: check integrity.OK [took 0.099 sec]
test: check integrity.OK [took 0.409 sec]
test: check integrity.OK [took 0.160 sec]
test: check integrity.OK [took 0.130 sec]
test: check integrity.OK [took 0.449 sec]
test: check integrity.OK [took 0.229 sec]
test: check integrity.OK [took 0.093 sec]
test: check integrity.OK [took 0.019 sec]
test: check integrity.OK [took 0.083 sec]
test: check integrity.OK [took 0.177 sec]
test: check integrity.OK [took 0.001 sec]
test: check integrity.OK [took 0.000 sec]
test: check integrity.OK [took 0.023 sec]
test: check integrity.OK [took 0.002 sec]
test: check integrity.OK [took 0.006 sec]
test: check integrity.OK [took 0.000 sec]

Results from attempt 2:
grep "check integrity" checkFastIndex70e0a5core7_a2.out
test: check integrity.OK [took 1.904 sec]
test: check integrity.OK [took 1.785 sec]
test: check integrity.OK [took 1.704 sec]
test: check integrity.OK [took 1.594 sec]
test: check integrity.OK [took 1.338 sec]
test: check integrity.OK [took 1.414 sec]
test: check integrity.OK [took 1.490 sec]
test: check integrity.OK [took 1.291 sec]
test: check integrity.OK [took 0.366 sec]
test: check integrity.OK [took 0.072 sec]
test: check integrity.OK [took 0.183 sec]
test: check integrity.OK [took 0.175 sec]
test: check integrity.OK [took 0.072 sec]
test: check integrity.OK [took 0.020 sec]
test: check integrity.OK [took 0.012 sec]
test: check integrity.OK [took 0.004 sec]
test: check integrity.OK [took 0.028 sec]
test: check integrity.OK [took 0.009 sec]
test: check integrity.OK [took 0.007 sec]
test: check integrity.OK [took 0.021 sec]
test: check integrity.OK [took 0.013 sec]
test: check integrity.OK [took 0.005 sec]
test: check integrity.OK [took 0.001 sec]
test: check integrity.OK [took 0.004 sec]
test: check integrity.OK [took 0.006 sec]
test: check integrity.OK [took 0.000 sec]
test: check integrity.OK [took 0.000 sec]
test: check integrity.OK [took 0.001 sec]
test: check integrity.OK [took 0.000 sec]
test: check integrity.OK [took 0.000 sec]
test: check integrity.OK [took 0.000 sec]
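
To compare the two attempts as totals rather than line by line, the "took N sec" values can be summed. A minimal sketch, assuming the output files look like the grep results above; the class IntegrityTimes and its method names are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IntegrityTimes {
    // Matches lines such as: test: check integrity.OK [took 37.747 sec]
    private static final Pattern TOOK =
        Pattern.compile("check integrity.*\\[took ([0-9.]+) sec\\]");

    /** Sums the "took" values of all check-integrity lines; other lines are ignored. */
    static double totalSeconds(List<String> lines) {
        double total = 0;
        for (String line : lines) {
            Matcher m = TOOK.matcher(line);
            if (m.find()) {
                total += Double.parseDouble(m.group(1));
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Pass one or more CheckIndex output files on the command line.
        for (String name : args) {
            List<String> lines = Files.readAllLines(Paths.get(name));
            System.out.printf("%s: %.3f sec%n", name, totalSeconds(lines));
        }
    }
}
```

For the two runs quoted above, the integrity checks add up to roughly 319 seconds for attempt 1 and roughly 13.5 seconds for attempt 2.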

This probably falls under Lucene code, so please let me know if I should post 
it to the Lucene list instead.

Thanks,
Shashank


CheckIndex failed for Solr 4.7.2 index

2015-06-09 Thread Guy Moshkowich
We are using Solr 4.7.2 and we found that when we run 
CheckIndex.checkIndex on one of the Solr shards we are getting the error 
below.
Both replicas of the shard had the same error.
The shard index looked healthy:
1) It appeared active in the Solr admin page.
2) We could run searches against it.
3) No relevant errors were found in the Solr logs.
4) After we optimized the index in LUKE, CheckIndex did not report any 
error.

My questions:
1) Is this a real issue, or a known bug in the CheckIndex code that causes a 
false negative?
2) Is there a known fix for this issue?

Here is the error we got:
validateIndex Segments file=segments_bhe numSegments=15 version=4.7 format= userData={commitTimeMSec=1432689607801}
  1 of 15: name=_6cth docCount=248744
    codec=Lucene46
    compound=false
    numFiles=11
    size (MB)=86.542
    diagnostics = {timestamp=1428883354605, os=Linux, os.version=2.6.32-431.23.3.el6.x86_64, mergeFactor=10, source=merge, lucene.version=4.7.2 1586229 - rmuir - 2014-04-10 09:00:35, os.arch=amd64, mergeMaxNumSegments=-1, java.version=1.7.0, java.vendor=IBM Corporation}
    has deletions [delGen=3174]
    test: open reader.FAILED
    WARNING: fixIndex() would remove reference to this segment; full exception:
java.lang.RuntimeException: liveDocs count mismatch: info=156867, vs bits=156872
    at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:581)
    at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:372)
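
For readers unfamiliar with the message: it appears to compare two counts of non-deleted documents in the segment, one derived from the segment metadata ("info") and one obtained by counting the set bits in the live-docs bitset ("bits"). The sketch below is a toy model of that kind of check using java.util.BitSet. It is not Lucene's code, and the class LiveDocsCheck and its parameters are made up for illustration.

```java
import java.util.BitSet;

class LiveDocsCheck {
    /** Throws if the live-doc count from the metadata disagrees with the bitset. */
    static void verify(int docCount, int delCount, BitSet liveDocs) {
        int fromInfo = docCount - delCount;     // what the metadata claims is live
        int fromBits = liveDocs.cardinality();  // what the bitset actually marks as live
        if (fromInfo != fromBits) {
            throw new RuntimeException(
                "liveDocs count mismatch: info=" + fromInfo + ", vs bits=" + fromBits);
        }
    }

    public static void main(String[] args) {
        BitSet live = new BitSet(10);
        live.set(0, 10);      // ten documents, all live
        live.clear(3);        // one deletion recorded in the bitset
        verify(10, 1, live);  // consistent: both sides say 9
        verify(10, 2, live);  // metadata claims two deletions, so this throws
    }
}
```

In the report above the two sides differ by five documents (156867 vs 156872).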

Appreciate your help,
Guy.


Re: CheckIndex failed for Solr 4.7.2 index

2015-06-09 Thread Michael McCandless
IBM's J9 JVM unfortunately still has a number of nasty bugs affecting
Lucene; most likely you are hitting one of these.  We used to test J9
in our continuous Jenkins jobs, but there were just too many
J9-specific failures and we couldn't get IBM's attention to resolve
them, so we stopped.  For now you should switch to Oracle JDK, or
OpenJDK.

But there's some good news!  Recently, a member from the IBM JDK team
replied to this Elasticsearch thread:
https://discuss.elastic.co/t/need-help-with-ibm-jdk-issues-with-es-1-4-5/1748/3

And then Robert Muir ran Lucene's tests with the latest J9 and opened
several issues; see the 2nd bullet under Apache Lucene at
https://www.elastic.co/blog/this-week-in-elasticsearch-and-apache-lucene-2015-06-09
and at least one of the issues seems to be making progress
(https://issues.apache.org/jira/browse/LUCENE-6522).

So there is hope for the future, but for today it's too dangerous to
use J9 with Lucene/Solr/Elasticsearch.

Mike McCandless

http://blog.mikemccandless.com




CheckIndex question

2012-10-17 Thread Jie Sun
Hi -

With a corrupted core:

1. If I run CheckIndex with -fix, it drops the reference to the corrupted
segment, but the segment files are still there. When we have a lot of
corrupted segments, we have to pick them out and remove them manually. Is
there a way for the tool to add a suffix or prefix to them so they are
easier to clean out?

2. We know the doc count in the corrupted segment; is it easy to also output
the doc IDs of those docs?
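
For the manual cleanup in item 1, one option is to list the files by segment name. A minimal sketch, assuming Lucene's usual convention that a segment's files are named <segment>.<ext> or <segment>_<suffix>.<ext> (for example _12.fdt or _12_1.del). SegmentFiles and its methods are hypothetical helpers, not part of Lucene or CheckIndex, and the sketch only lists files; it never deletes anything.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SegmentFiles {
    /** True if fileName looks like it belongs to the given segment, e.g. "_12". */
    static boolean belongsTo(String segment, String fileName) {
        // Assumed naming convention: "_12.fdt", "_12_1.del", and so on.
        return fileName.startsWith(segment + ".") || fileName.startsWith(segment + "_");
    }

    /** Returns the sorted names of all files in indexDir that match the segment. */
    static List<String> list(Path indexDir, String segment) {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path p : files) {
                String name = p.getFileName().toString();
                if (belongsTo(segment, name)) {
                    names.add(name);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java SegmentFiles <indexDir> <segmentName>");
            return;
        }
        for (String name : list(Paths.get(args[0]), args[1])) {
            System.out.println(name);
        }
    }
}
```

The segment names to pass in are the ones CheckIndex prints as name=... for each segment it reports as broken. Review the list and take a backup before removing anything.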

thanks
Jie





checkindex

2010-01-08 Thread Giovanni Fernandez-Kincade
I've seen many mentions of the Lucene CheckIndex tool, but where can I find it? 
Is there any documentation on how to use it?

I noticed Luke has it built in, but I can't get Luke to open my index with the 
"Don't open IndexReader (when opening corrupted index)" option checked. Even 
opening an index I know is valid doesn't work with this option:
[attached screenshots: image001.png, image002.png]


Re: checkindex

2010-01-08 Thread Ian Kallen

When I needed to use it, I couldn't find docs for it either, but it's 
straightforward. Here's what I did: un-jar the Solr war file to find the 
Lucene jar that Solr was using, and run CheckIndex like this:

  java -cp lucene-core-2.9-dev.jar org.apache.lucene.index.CheckIndex /path/to/solr/data/index/

To actually *fix* the index, add the -fix argument:

  java -cp lucene-core-2.9-dev.jar org.apache.lucene.index.CheckIndex -fix /path/to/solr/data/index/
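
If you would rather not unpack the whole war just to learn which Lucene jar it bundles, the archive can be inspected in place, since a war file is an ordinary zip. A minimal sketch; the class FindLuceneJar is made up for illustration, and it simply assumes the jar's file name starts with "lucene-core".

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

class FindLuceneJar {
    /** True for entries such as WEB-INF/lib/lucene-core-2.9-dev.jar. */
    static boolean isLuceneCoreJar(String entryName) {
        String base = entryName.substring(entryName.lastIndexOf('/') + 1);
        return base.startsWith("lucene-core") && base.endsWith(".jar");
    }

    /** Returns the names of all lucene-core jars found inside the war file. */
    static List<String> find(String warPath) throws IOException {
        List<String> hits = new ArrayList<>();
        try (ZipFile war = new ZipFile(warPath)) {
            Enumeration<? extends ZipEntry> entries = war.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (isLuceneCoreJar(name)) {
                    hits.add(name);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java FindLuceneJar <path-to-war>");
            return;
        }
        for (String name : find(args[0])) {
            System.out.println(name);
        }
    }
}
```

The jar still has to be extracted before it can go on the classpath for the commands above; this only tells you which entry to pull out and which Lucene version it is.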

hope that helps,
-Ian


--
Ian Kallen
blog: http://www.arachna.com/roller/spidaman
tweetz: http://twitter.com/spidaman
vox: 925.385.8426




RE: checkindex

2010-01-08 Thread Giovanni Fernandez-Kincade
Yeah that worked. Thanks!
