[jira] [Commented] (LUCENE-9428) merge index failed with checksum failed (hardware problem?)

2020-07-15 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17158614#comment-17158614
 ] 

Robert Muir commented on LUCENE-9428:
-

I agree with Adrien, and when testing the RAM i recommend booting into 
something like memtest86 to do it. It is pretty thorough and will find problems 
that other checkers miss.

> merge index failed with checksum failed (hardware problem?)
> ---
>
> Key: LUCENE-9428
> URL: https://issues.apache.org/jira/browse/LUCENE-9428
> Project: Lucene - Core
>  Issue Type: Bug
> Environment: lucene version: 5.5.4
> jdk version : jdk1.8-1.8.0_231-fcs
> kernel version: kernel-3.10.0-957.21.3.el7.x86_64
>Reporter: AllenL
>Priority: Major
>
> Recently, a procedure using ElasticSearch appeared merge Index Failed with 
> the following exception information
>  
> {code:java}
> [2020-07-03 13:37:34,113][ERROR][index.engine             ] [Deathbird] 
> [st-sess][4] failed to merge
> [2020-07-03 13:37:34,113][ERROR][index.engine             ] [Deathbird] 
> [st-sess][4] failed to mergeorg.apache.lucene.index.CorruptIndexException: 
> checksum failed (hardware problem?) : expected=31f090d9 actual=d9697caa 
> (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/var/lib/elasticsearch/17412c54-f974-11e9-9eef-80615f029e06/nodes/0/indices/st-sess/4/index/_3jm_Lucene50_0.tim")))
>  
> at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:334) at 
> org.apache.lucene.codecs.CodecUtil.checksumEntireFile(CodecUtil.java:451) 
> at 
> org.apache.lucene.codecs.blocktree.BlockTreeTermsReader.checkIntegrity(BlockTreeTermsReader.java:333)
>  
> at 
> org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.checkIntegrity(PerFieldPostingsFormat.java:317)
>  
> at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:96) 
> at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:193) 
> at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:95) 
> at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4086) 
> at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3666) 
> at 
> org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:588)
>  
> at 
> org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:94)
>  
> at 
> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:626)
> [2020-07-03 13:37:34,203][WARN ][index.engine             ] [Deathbird] 
> [st-sess][4] failed engine [merge 
> failed]org.apache.lucene.index.MergePolicy$MergeException: 
> org.apache.lucene.index.CorruptIndexException: checksum failed (hardware 
> problem?) : expected=31f090d9 actual=d9697caa 
> (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/var/lib/elasticsearch/shterm-17412c54-f974-11e9-9eef-80615f029e06/nodes/0/indices/st-sess/4/index/_3jm_Lucene50_0.tim")))
>  
> at 
> org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$1.doRun(InternalEngine.java:1237)
>  
> at 
> org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  
> at java.lang.Thread.run(Thread.java:748){code}
>  
> The exception shows that it may be a hardware problem. Try to check the 
> hardware and find no exception. Check the command as follows:
>  # check device /dev/sda, /dev/sdb; but finds no hardware errors
>      using command: smartctl --xall /dev/sdx
>  # check message log /var/log/messages, no hardware problem happend
>  # The system has a state detection script, i get the system load recorded is 
> normal, IOwait is very low
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9428) merge index failed with checksum failed (hardware problem?)

2020-07-14 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157948#comment-17157948
 ] 

Adrien Grand commented on LUCENE-9428:
--

[~Lai_Ding] This exception occurs when the content of a file differs from the 
data that had been written. Unfortunately this is something that is very 
expensive to check, so Lucene doesn't keep verifying checksums continuously, it 
only does it at merge time since files need to be fully read anyway.

> merge index failed with checksum failed (hardware problem?)
> ---
>
> Key: LUCENE-9428
> URL: https://issues.apache.org/jira/browse/LUCENE-9428
> Project: Lucene - Core
>  Issue Type: Bug
> Environment: lucene version:5.5.4
> jdk version :jdk1.8-1.8.0_231-fcs
>Reporter: AllenL
>Priority: Major
>
> Recently, a procedure using ElasticSearch appeared merge Index Failed with 
> the following exception information
>  
> {code:java}
> [2020-07-03 13:37:34,113][ERROR][index.engine             ] [Deathbird] 
> [st-sess][4] failed to merge
> [2020-07-03 13:37:34,113][ERROR][index.engine             ] [Deathbird] 
> [st-sess][4] failed to mergeorg.apache.lucene.index.CorruptIndexException: 
> checksum failed (hardware problem?) : expected=31f090d9 actual=d9697caa 
> (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/var/lib/elasticsearch/17412c54-f974-11e9-9eef-80615f029e06/nodes/0/indices/st-sess/4/index/_3jm_Lucene50_0.tim")))
>  
> at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:334) at 
> org.apache.lucene.codecs.CodecUtil.checksumEntireFile(CodecUtil.java:451) 
> at 
> org.apache.lucene.codecs.blocktree.BlockTreeTermsReader.checkIntegrity(BlockTreeTermsReader.java:333)
>  
> at 
> org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.checkIntegrity(PerFieldPostingsFormat.java:317)
>  
> at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:96) 
> at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:193) 
> at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:95) 
> at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4086) 
> at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3666) 
> at 
> org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:588)
>  
> at 
> org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:94)
>  
> at 
> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:626)
> [2020-07-03 13:37:34,203][WARN ][index.engine             ] [Deathbird] 
> [st-sess][4] failed engine [merge 
> failed]org.apache.lucene.index.MergePolicy$MergeException: 
> org.apache.lucene.index.CorruptIndexException: checksum failed (hardware 
> problem?) : expected=31f090d9 actual=d9697caa 
> (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/var/lib/elasticsearch/shterm-17412c54-f974-11e9-9eef-80615f029e06/nodes/0/indices/st-sess/4/index/_3jm_Lucene50_0.tim")))
>  
> at 
> org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$1.doRun(InternalEngine.java:1237)
>  
> at 
> org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  
> at java.lang.Thread.run(Thread.java:748){code}
>  
> The exception shows that it may be a hardware problem. Try to check the 
> hardware and find no exception. Check the command as follows:
>  # check device /dev/sda, /dev/sdb; but finds no hardware errors
>      using command: smartctl --xall /dev/sdx
>  # check message log /var/log/messages, no hardware problem happend
>  # The system has a state detection script, i get the system load recorded is 
> normal, IOwait is very low
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9428) merge index failed with checksum failed (hardware problem?)

2020-07-14 Thread AllenL (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157942#comment-17157942
 ] 

AllenL commented on LUCENE-9428:


When will this exception occur

> merge index failed with checksum failed (hardware problem?)
> ---
>
> Key: LUCENE-9428
> URL: https://issues.apache.org/jira/browse/LUCENE-9428
> Project: Lucene - Core
>  Issue Type: Bug
> Environment: lucene version:5.5.4
> jdk version :jdk1.8-1.8.0_231-fcs
>Reporter: AllenL
>Priority: Major
>
> Recently, a procedure using ElasticSearch appeared merge Index Failed with 
> the following exception information
>  
> {code:java}
> [2020-07-03 13:37:34,113][ERROR][index.engine             ] [Deathbird] 
> [st-sess][4] failed to merge
> [2020-07-03 13:37:34,113][ERROR][index.engine             ] [Deathbird] 
> [st-sess][4] failed to mergeorg.apache.lucene.index.CorruptIndexException: 
> checksum failed (hardware problem?) : expected=31f090d9 actual=d9697caa 
> (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/var/lib/elasticsearch/17412c54-f974-11e9-9eef-80615f029e06/nodes/0/indices/st-sess/4/index/_3jm_Lucene50_0.tim")))
>  
> at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:334) at 
> org.apache.lucene.codecs.CodecUtil.checksumEntireFile(CodecUtil.java:451) 
> at 
> org.apache.lucene.codecs.blocktree.BlockTreeTermsReader.checkIntegrity(BlockTreeTermsReader.java:333)
>  
> at 
> org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.checkIntegrity(PerFieldPostingsFormat.java:317)
>  
> at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:96) 
> at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:193) 
> at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:95) 
> at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4086) 
> at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3666) 
> at 
> org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:588)
>  
> at 
> org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:94)
>  
> at 
> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:626)
> [2020-07-03 13:37:34,203][WARN ][index.engine             ] [Deathbird] 
> [st-sess][4] failed engine [merge 
> failed]org.apache.lucene.index.MergePolicy$MergeException: 
> org.apache.lucene.index.CorruptIndexException: checksum failed (hardware 
> problem?) : expected=31f090d9 actual=d9697caa 
> (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/var/lib/elasticsearch/shterm-17412c54-f974-11e9-9eef-80615f029e06/nodes/0/indices/st-sess/4/index/_3jm_Lucene50_0.tim")))
>  
> at 
> org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$1.doRun(InternalEngine.java:1237)
>  
> at 
> org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  
> at java.lang.Thread.run(Thread.java:748){code}
>  
> The exception shows that it may be a hardware problem. Try to check the 
> hardware and find no exception. Check the command as follows:
>  # check device /dev/sda, /dev/sdb; but finds no hardware errors
>      using command: smartctl --xall /dev/sdx
>  # check message log /var/log/messages, no hardware problem happend
>  # The system has a state detection script, i get the system load recorded is 
> normal, IOwait is very low
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org