Re: Time taken for CheckIndex on a core
Thanks a lot for the reply, Shawn! That makes sense.

Thanks,
Shashank

On 8/17/17, 11:37 AM, "Shawn Heisey" <apa...@elyograg.org> wrote:

> On 8/14/2017 8:19 PM, Shashank Pedamallu wrote:
> > I was looking for a reliable method to check if my core was in a
> > consistent readable state and came across
> > org.apache.lucene.index.CheckIndex class. However, when I tried to run
> > this command multiple times from the command-line via the main utility
> > method provided, it takes long time to run for the first time, while
> > subsequent runs are much much faster. Can someone explain why this is?
> > Does Solr/Lucene cache the results of CheckIndex? Note that the
> > attempt runs are on the same core with same segments.
>
> This is indeed a Lucene question, but that doesn't mean it will get
> ignored. If you want more detail than this, try the Lucene java-user
> mailing list.
>
> This is probably the OS caching the index files. Lucene itself does
> not cache the data, especially from separate invocations of
> CheckIndex, but the operating system does.
>
> The first time you run it, your server may need to actually read all
> the index data from the disk, so you're dealing with the speed of the
> disk as a bottleneck. The second time, part or possibly all of the
> index data will have been cached by the operating system in memory,
> so it loads MUCH faster, because it's pulled from memory, not the
> actual disk.
>
> Thanks,
> Shawn
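The caching effect described in the reply can be observed without Lucene at all. The following is only a minimal sketch, assuming a large file that has not been read recently; the class name PageCacheTiming and its helper methods are made up for illustration and are not part of Lucene or Solr. It reads the same file twice and prints how long each pass took. On a typical system the second pass is expected to be faster because the operating system serves it from memory, but the actual numbers depend on the machine and on what is already cached.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Lucene: times two sequential reads of one file.
public class PageCacheTiming {

    /** Reads the whole file sequentially and returns the number of bytes read. */
    public static long readFully(Path file) {
        byte[] buffer = new byte[1 << 16];
        long total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    /** Returns the wall-clock time, in milliseconds, of one full read of the file. */
    public static long timeReadMillis(Path file) {
        long start = System.nanoTime();
        readFully(file);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java PageCacheTiming <file>");
            return;
        }
        Path file = Paths.get(args[0]);
        // The first pass may have to go to disk; the second can often be served from memory.
        System.out.println("first read:  " + timeReadMillis(file) + " ms");
        System.out.println("second read: " + timeReadMillis(file) + " ms");
    }
}
```

Pointing it at one of the larger files in the index directory, before running CheckIndex, should show the same kind of gap as the two CheckIndex attempts if the OS cache is indeed the cause.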
Time taken for CheckIndex on a core
Hi,

I was looking for a reliable method to check whether my core was in a consistent, readable state and came across the org.apache.lucene.index.CheckIndex class. However, when I ran it multiple times from the command line via the main utility method provided, the first run took a long time, while subsequent runs were much faster. Can someone explain why this is? Does Solr/Lucene cache the results of CheckIndex? Note that all runs were on the same core with the same segments.

Here are the results from attempt 1:

grep "check integrity" checkFastIndex70e0a5core7_a1.out
test: check integrity.OK [took 37.747 sec]
test: check integrity.OK [took 37.572 sec]
test: check integrity.OK [took 36.763 sec]
test: check integrity.OK [took 35.927 sec]
test: check integrity.OK [took 35.714 sec]
test: check integrity.OK [took 37.975 sec]
test: check integrity.OK [took 37.685 sec]
test: check integrity.OK [took 35.516 sec]
test: check integrity.OK [took 9.359 sec]
test: check integrity.OK [took 2.016 sec]
test: check integrity.OK [took 5.014 sec]
test: check integrity.OK [took 3.276 sec]
test: check integrity.OK [took 2.134 sec]
test: check integrity.OK [took 0.525 sec]
test: check integrity.OK [took 0.304 sec]
test: check integrity.OK [took 0.099 sec]
test: check integrity.OK [took 0.409 sec]
test: check integrity.OK [took 0.160 sec]
test: check integrity.OK [took 0.130 sec]
test: check integrity.OK [took 0.449 sec]
test: check integrity.OK [took 0.229 sec]
test: check integrity.OK [took 0.093 sec]
test: check integrity.OK [took 0.019 sec]
test: check integrity.OK [took 0.083 sec]
test: check integrity.OK [took 0.177 sec]
test: check integrity.OK [took 0.001 sec]
test: check integrity.OK [took 0.000 sec]
test: check integrity.OK [took 0.023 sec]
test: check integrity.OK [took 0.002 sec]
test: check integrity.OK [took 0.006 sec]
test: check integrity.OK [took 0.000 sec]

Results from attempt 2:

grep "check integrity" checkFastIndex70e0a5core7_a2.out
test: check integrity.OK [took 1.904 sec]
test: check integrity.OK [took 1.785 sec]
test: check integrity.OK [took 1.704 sec]
test: check integrity.OK [took 1.594 sec]
test: check integrity.OK [took 1.338 sec]
test: check integrity.OK [took 1.414 sec]
test: check integrity.OK [took 1.490 sec]
test: check integrity.OK [took 1.291 sec]
test: check integrity.OK [took 0.366 sec]
test: check integrity.OK [took 0.072 sec]
test: check integrity.OK [took 0.183 sec]
test: check integrity.OK [took 0.175 sec]
test: check integrity.OK [took 0.072 sec]
test: check integrity.OK [took 0.020 sec]
test: check integrity.OK [took 0.012 sec]
test: check integrity.OK [took 0.004 sec]
test: check integrity.OK [took 0.028 sec]
test: check integrity.OK [took 0.009 sec]
test: check integrity.OK [took 0.007 sec]
test: check integrity.OK [took 0.021 sec]
test: check integrity.OK [took 0.013 sec]
test: check integrity.OK [took 0.005 sec]
test: check integrity.OK [took 0.001 sec]
test: check integrity.OK [took 0.004 sec]
test: check integrity.OK [took 0.006 sec]
test: check integrity.OK [took 0.000 sec]
test: check integrity.OK [took 0.000 sec]
test: check integrity.OK [took 0.001 sec]
test: check integrity.OK [took 0.000 sec]
test: check integrity.OK [took 0.000 sec]
test: check integrity.OK [took 0.000 sec]

This probably falls under Lucene code, so please let me know if I should post it to the Lucene group for an answer.

Thanks,
Shashank
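To compare the two attempts as a whole rather than line by line, the per-segment timings can be added up. The following is a rough sketch, assuming the saved output files contain lines in the "[took N sec]" form shown above; the class name CheckIntegrityTimes and the method sumTookSeconds are illustrative names, not anything provided by Lucene.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: totals the "[took N sec]" figures found in saved CheckIndex output.
public class CheckIntegrityTimes {

    // Matches e.g. "[took 37.747 sec]" and captures the number of seconds.
    private static final Pattern TOOK =
            Pattern.compile("\\[took (\\d+(?:\\.\\d+)?) sec\\]");

    /** Returns the sum of all "took" timings found in the given text. */
    public static double sumTookSeconds(String output) {
        double total = 0.0;
        Matcher m = TOOK.matcher(output);
        while (m.find()) {
            total += Double.parseDouble(m.group(1));
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Each argument is a saved output file, such as the *_a1.out and *_a2.out files above.
        for (String name : args) {
            String text = new String(Files.readAllBytes(Paths.get(name)), StandardCharsets.UTF_8);
            System.out.printf("%s: %.3f sec total%n", name, sumTookSeconds(text));
        }
    }
}
```

Note that this sums every "took" figure in the file, not only the "check integrity" lines, so it should be fed the grep output if only that test is of interest.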
CheckIndex failed for Solr 4.7.2 index
We are using Solr 4.7.2, and when we run CheckIndex.checkIndex on one of the Solr shards we get the error below. Both replicas of the shard had the same error.

The shard index looked healthy:
1) It appeared active in the Solr admin page.
2) We could run searches against it.
3) No relevant errors were found in the Solr logs.
4) After we optimized the index in Luke, CheckIndex did not report any error.

My questions:
1) Is this a real issue, or a known bug in the CheckIndex code that causes a false negative?
2) Is there a known fix for this issue?

Here is the error we got:

validateIndex Segments file=segments_bhe numSegments=15 version=4.7 format= userData={commitTimeMSec=1432689607801}
  1 of 15: name=_6cth docCount=248744
    codec=Lucene46
    compound=false
    numFiles=11
    size (MB)=86.542
    diagnostics = {timestamp=1428883354605, os=Linux, os.version=2.6.32-431.23.3.el6.x86_64, mergeFactor=10, source=merge, lucene.version=4.7.2 1586229 - rmuir - 2014-04-10 09:00:35, os.arch=amd64, mergeMaxNumSegments=-1, java.version=1.7.0, java.vendor=IBM Corporation}
    has deletions [delGen=3174]
    test: open reader.FAILED
    WARNING: fixIndex() would remove reference to this segment; full exception:
java.lang.RuntimeException: liveDocs count mismatch: info=156867, vs bits=156872
    at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:581)
    at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:372)

Appreciate your help,
Guy.
Re: CheckIndex failed for Solr 4.7.2 index
IBM's J9 JVM unfortunately still has a number of nasty bugs affecting Lucene; most likely you are hitting one of these.

We used to test J9 in our continuous Jenkins jobs, but there were just too many J9-specific failures and we couldn't get IBM's attention to resolve them, so we stopped. For now you should switch to Oracle JDK or OpenJDK.

But there's some good news! Recently, a member of the IBM JDK team replied to this Elasticsearch thread: https://discuss.elastic.co/t/need-help-with-ibm-jdk-issues-with-es-1-4-5/1748/3

And then Robert Muir ran Lucene's tests with the latest J9 and opened several issues; see the 2nd bullet under Apache Lucene at https://www.elastic.co/blog/this-week-in-elasticsearch-and-apache-lucene-2015-06-09. At least one of the issues seems to be making progress (https://issues.apache.org/jira/browse/LUCENE-6522).

So there is hope for the future, but for today it's too dangerous to use J9 with Lucene/Solr/Elasticsearch.

Mike McCandless
http://blog.mikemccandless.com

On Tue, Jun 9, 2015 at 12:23 PM, Guy Moshkowich g...@il.ibm.com wrote:

> We are using Solr 4.7.2 and we found that when we run
> CheckIndex.checkIndex on one of the Solr shards we are getting the
> error below. Both replicas of the shard had the same error.
>
> The shard index looked healthy:
> 1) It appeared active in the Solr admin page.
> 2) We could run searches against it.
> 3) No relevant errors where found in Solr logs.
> 4) After we optimized the index in LUKE, CheckIndex did not report any error.
>
> My questions:
> 1) Is this is a real issue or a known bug in CheckIndex code that cause false negative ?
> 2) Is there a known fix for this issue?
>
> Here is the error we got:
>
> validateIndex Segments file=segments_bhe numSegments=15 version=4.7 format= userData={commitTimeMSec=1432689607801}
>   1 of 15: name=_6cth docCount=248744
>     codec=Lucene46
>     compound=false
>     numFiles=11
>     size (MB)=86.542
>     diagnostics = {timestamp=1428883354605, os=Linux, os.version=2.6.32-431.23.3.el6.x86_64, mergeFactor=10, source=merge, lucene.version=4.7.2 1586229 - rmuir - 2014-04-10 09:00:35, os.arch=amd64, mergeMaxNumSegments=-1, java.version=1.7.0, java.vendor=IBM Corporation}
>     has deletions [delGen=3174]
>     test: open reader.FAILED
>     WARNING: fixIndex() would remove reference to this segment; full exception:
> java.lang.RuntimeException: liveDocs count mismatch: info=156867, vs bits=156872
>     at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:581)
>     at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:372)
>
> Appreciate yout help,
> Guy.
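The segment diagnostics above record java.vendor=IBM Corporation, which is what points to J9. To see which vendor the JVM of a given process reports, the standard java.vendor system property can be read at startup. This is only a sketch: the class name JvmVendorCheck and the method isIbmVendor are invented for illustration, and matching on the substring "ibm" is an assumption based on the vendor string in the log, not an official way to detect J9.

```java
import java.util.Locale;

// Hypothetical startup check: warns when the running JVM reports IBM as its vendor.
public class JvmVendorCheck {

    /**
     * Returns true if the vendor string looks like an IBM JVM.
     * Assumption: a case-insensitive match on "ibm", as in "IBM Corporation" from the log.
     */
    public static boolean isIbmVendor(String vendor) {
        return vendor != null && vendor.toLowerCase(Locale.ROOT).contains("ibm");
    }

    public static void main(String[] args) {
        // java.vendor and java.version are standard system properties.
        String vendor = System.getProperty("java.vendor");
        String version = System.getProperty("java.version");
        System.out.println("java.vendor=" + vendor + ", java.version=" + version);
        if (isIbmVendor(vendor)) {
            System.err.println("WARNING: this process is running on an IBM JVM.");
        }
    }
}
```

This reports the JVM running the check itself, whereas the diagnostics in the CheckIndex output describe the JVM that wrote the segment, so both the indexing nodes and the machine running CheckIndex would need to be looked at.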
CheckIndex question
Hi,

With a corrupted core:

1. If I run CheckIndex with -fix, it drops the reference to the corrupted segment, but the segment files are still there. When we have a lot of corrupted segments, we have to pick them out and remove them manually. Is there a way for the tool to add a suffix or prefix to them so they are easier to clean out?
2. We know the doc count in the corrupted segment; is it easy to also output the doc IDs of those docs?

Thanks,
Jie
checkindex
I've seen many mentions of the Lucene CheckIndex tool, but where can I find it? Is there any documentation on how to use it?

I noticed Luke has it built in, but I can't get Luke to open my index with the "Don't open IndexReader (when opening corrupted index)" option checked. Even opening an index I know is valid doesn't work with this option.
Re: checkindex
When I needed to use it, I couldn't find docs for it either, but it's straightforward. Here's what I did: un-jar the Solr war file to find the Lucene jar that Solr was using, and run CheckIndex like this:

    java -cp lucene-core-2.9-dev.jar org.apache.lucene.index.CheckIndex /path/to/solr/data/index/

To actually *fix* the index, add the -fix argument:

    java -cp lucene-core-2.9-dev.jar org.apache.lucene.index.CheckIndex -fix /path/to/solr/data/index/

Hope that helps,
-Ian

On 1/8/10 2:09 PM, Giovanni Fernandez-Kincade wrote:

> I've seen many mentions of the Lucene CheckIndex tool, but where can
> I find it? Is there any documentation on how to use it?
>
> I noticed Luke has it built-in, but I can't get Luke to open my index
> with the Don't open IndexReader(when opening corrupted index) option
> check. Opening even an index I know is valid doesn't work using this
> option:

--
Ian Kallen
blog: http://www.arachna.com/roller/spidaman
tweetz: http://twitter.com/spidaman
vox: 925.385.8426
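The "find the Lucene jar inside the war" step can also be done without unpacking anything, since a war file is a zip archive. Below is a minimal sketch, assuming the jar sits under the conventional WEB-INF/lib/ directory and is named lucene-core-<version>.jar as in the commands above; the class FindLuceneCoreJar and its methods are made-up names for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: lists lucene-core jars bundled inside a Solr war file.
public class FindLuceneCoreJar {

    /** Returns true for entries such as WEB-INF/lib/lucene-core-2.9-dev.jar. */
    public static boolean isLuceneCoreJar(String entryName) {
        String fileName = entryName.substring(entryName.lastIndexOf('/') + 1);
        return entryName.startsWith("WEB-INF/lib/")
                && fileName.startsWith("lucene-core-")
                && fileName.endsWith(".jar");
    }

    /** Scans the war (a zip archive) and returns the names of matching entries. */
    public static List<String> findLuceneCoreJars(String warPath) throws IOException {
        List<String> matches = new ArrayList<>();
        try (ZipFile war = new ZipFile(warPath)) {
            Enumeration<? extends ZipEntry> entries = war.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (isLuceneCoreJar(name)) {
                    matches.add(name);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java FindLuceneCoreJar <path-to-solr.war>");
            return;
        }
        for (String name : findLuceneCoreJars(args[0])) {
            System.out.println(name);
        }
    }
}
```

The entry it prints still has to be extracted from the war before it can be put on the classpath for the java -cp command shown above.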
RE: checkindex
Yeah that worked. Thanks!

-----Original Message-----
From: Ian Kallen [mailto:spidaman.l...@gmail.com]
Sent: Friday, January 08, 2010 5:32 PM
To: solr-user@lucene.apache.org
Subject: Re: checkindex

When I needed to use it, I couldn't find docs for it either but it's straight forward. Here's what I did: un-jar the solr war file to find the lucene jar that solr was using and run CheckIndex like this

    java -cp lucene-core-2.9-dev.jar org.apache.lucene.index.CheckIndex /path/to/solr/data/index/

to actually *fix* the index, add the -fix argument

    java -cp lucene-core-2.9-dev.jar org.apache.lucene.index.CheckIndex -fix /path/to/solr/data/index/

hope that helps,
-Ian

On 1/8/10 2:09 PM, Giovanni Fernandez-Kincade wrote:

> I've seen many mentions of the Lucene CheckIndex tool, but where can
> I find it? Is there any documentation on how to use it?
>
> I noticed Luke has it built-in, but I can't get Luke to open my index
> with the Don't open IndexReader(when opening corrupted index) option
> check. Opening even an index I know is valid doesn't work using this
> option:

--
Ian Kallen
blog: http://www.arachna.com/roller/spidaman
tweetz: http://twitter.com/spidaman
vox: 925.385.8426