Hey, is it possible to look at this index / shard? Do you still have it / can you save it for further investigation? You can ping me directly at simon AT elasticsearch DOT com
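(For reference, the CheckIndex repair mentioned further down in this thread can be invoked directly from the Lucene core jar that ships with Elasticsearch. A rough sketch; the jar version and the shard index directory are assumptions to adapt to your installation, and the node must be stopped first:)

```shell
# Stop the node first, and back up the shard directory: -fix drops any
# segments it cannot recover, permanently losing the documents in them
# (which matches the ~20 million documents lost below).
java -cp /usr/share/elasticsearch/lib/lucene-core-4.6.1.jar \
  org.apache.lucene.index.CheckIndex \
  /opt/elasticsearch/data/naprawa/nodes/0/indices/site_production/1/index \
  -fix
```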
On Wednesday, April 2, 2014 11:23:38 AM UTC+2, Paweł Chabierski wrote:
>
> A few days ago we found that we get the same error when searching for data:
>
> reason: "FetchPhaseExecutionException[[site_production][1]:
> query[ConstantScore(cache(_type:ademail))],from[0],size[648]: Fetch Failed
> [Failed to fetch doc id [9615533]]]; nested: EOFException[seek past EOF:
> MMapIndexInput(path="/opt/elasticsearch/data/naprawa/nodes/0/indices/site_production/1/index/_573oa.fdt")];
>
> After this error, Elasticsearch stops the search and doesn't return all
> results, even though the missing documents are returned correctly by other
> queries. Is there any way to fix this index? We tried CheckIndex from the
> Lucene core library; after that we lost ~20 million documents, but the
> error still occurs. We also tried restoring the index from a snapshot, but
> the error still occurs :/.
>
> On Saturday, 22 March 2014 14:04:56 UTC+1, Andrey Perminov wrote:
>>
>> We are using a small Elasticsearch cluster of three nodes, version 1.0.1.
>> Each node has 7 GB RAM. Our software creates daily indexes for storing its
>> data. A daily index is around 5 GB. Unfortunately, for some reason,
>> Elasticsearch eats up all RAM and hangs the node, even though the heap
>> size is set to 6 GB max. So we decided to use monit to restart it on
>> reaching a memory limit of 90%.
>> It works, but sometimes we get errors like this:
>>
>> [2014-03-22 16:56:04,943][DEBUG][action.search.type       ] [es-00]
>> [product-22-03-2014][0], node[jbUDVzuvS5GTM7iOG8iwzQ], [P], s[STARTED]:
>> Failed to execute [org.elasticsearch.action.search.SearchRequest@687dc039]
>> org.elasticsearch.search.fetch.FetchPhaseExecutionException:
>> [product-22-03-2014][0]: query[filtered(ToParentBlockJoinQuery
>> (filtered(history.created:[1392574921000 TO
>> *])->cache(_type:__history)))->cache(_type:product)],from[0],size[1000],sort[<custom:"history.created":
>> org.elasticsearch.index.search.nested.NestedFieldComparatorSource@15e4ece9>]:
>> Fetch Failed [Failed to fetch doc id [7263214]]
>>     at org.elasticsearch.search.fetch.FetchPhase.loadStoredFields(FetchPhase.java:230)
>>     at org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:156)
>>     at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:332)
>>     at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteFetch(SearchServiceTransportAction.java:304)
>>     at org.elasticsearch.action.search.type.TransportSearchQueryAndFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryAndFetchAction.java:71)
>>     at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
>>     at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$4.run(TransportSearchTypeAction.java:292)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>>     at java.lang.Thread.run(Unknown Source)
>> Caused by: java.io.EOFException: seek past EOF:
>> MMapIndexInput(path="/opt/elasticsearch/main/nodes/0/indices/product-22-03-2014/0/index/_9lz.fdt")
>>     at org.apache.lucene.store.ByteBufferIndexInput.seek(ByteBufferIndexInput.java:174)
>>     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:229)
>>     at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:276)
>>     at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
>>     at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:196)
>>     at org.elasticsearch.search.fetch.FetchPhase.loadStoredFields(FetchPhase.java:228)
>>     ... 9 more
>> [2014-03-22 16:56:04,944][DEBUG][action.search.type       ] [es-00] All
>> shards failed for phase: [query_fetch]
>>
>> According to our logs, this might happen when one or two nodes get
>> restarted. More strangely, the same shard got corrupted on all nodes of
>> the cluster. Why could this happen? How can we fix it? Can you suggest
>> how we can fix the memory usage?
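(On the memory question above: the 6 GB heap limit only caps the JVM heap; Lucene's mmap-ed index files and other off-heap allocations come on top of it, so a 6 GB heap on a 7 GB box leaves almost nothing for the OS page cache. A sketch of the usual 1.x settings; the file path assumes the Debian/RPM packaging and the sizes are illustrative, not a verified fix for this cluster:)

```shell
# /etc/default/elasticsearch (path assumed from the Debian/RPM packages)
# Common guidance: give the heap at most ~50% of physical RAM and leave
# the rest to the OS page cache that mmap-ed Lucene files rely on.
ES_HEAP_SIZE=3500m            # rather than 6g on a 7 GB node
MAX_LOCKED_MEMORY=unlimited   # required for mlockall to succeed

# config/elasticsearch.yml
# bootstrap.mlockall: true    # lock the heap in RAM, prevent swapping
```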