On 23.07.2014 13:38, Robert Muir wrote:
On Wed, Jul 23, 2014 at 7:29 AM, Harald Kirsch
<harald.kir...@raytion.com> wrote:


(As a side note: after truncating the file to the expected size + 16, at least
the core starts up again. I have not tested anything else yet.)
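
A minimal sketch of that repair step, for reference -- the file name and the
target length below are made up, not the real values:

    import java.nio.channels.FileChannel;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class TruncateToExpected {
        public static void main(String[] args) throws Exception {
            long expectedLength = 123_456_789L;       // hypothetical "expected size + 16"
            try (FileChannel ch = FileChannel.open(
                    Paths.get("_0.fdt"),              // hypothetical file name
                    StandardOpenOption.WRITE)) {
                ch.truncate(expectedLength);          // cut off the trailing excess bytes
            }
        }
    }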

After applying your truncation fix, is it possible for you to run the
checkindex tool (and show the output)?
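
For reference, a minimal sketch of running the tool programmatically against
the core's index directory (a recent Lucene release is assumed, the path is a
placeholder, and the equivalent command line is shown in the comment):

    import java.nio.file.Paths;
    import org.apache.lucene.index.CheckIndex;
    import org.apache.lucene.store.FSDirectory;

    // Roughly equivalent to:
    //   java -cp lucene-core.jar org.apache.lucene.index.CheckIndex /path/to/core/data/index
    public class RunCheckIndex {
        public static void main(String[] args) throws Exception {
            try (FSDirectory dir = FSDirectory.open(Paths.get("/path/to/core/data/index"));
                 CheckIndex checker = new CheckIndex(dir)) {
                checker.setInfoStream(System.out);        // print per-segment details
                CheckIndex.Status status = checker.checkIndex();
                System.out.println(status.clean ? "index is clean" : "index has problems");
            }
        }
    }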


The corrupted files (we found a second one) are *.fdt. They contain only
zero-bytes in their excess portion.

This is strange; I really want to know where these excess zeros are
coming from. It really doesn't make a lot of sense, because these
zeros would have to be after the CRC checksum written at the very end
of every file... implying something crazy is happening beneath Lucene.
But we cannot rule out the possibility of a Lucene bug.
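
To illustrate that footer: a small sketch (placeholder paths, recent Lucene
release assumed) that re-reads a single index file and verifies it against the
CRC stored in its codec footer; with extra bytes appended after the footer,
this check should fail with a CorruptIndexException.

    import java.nio.file.Paths;
    import org.apache.lucene.codecs.CodecUtil;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.store.IOContext;
    import org.apache.lucene.store.IndexInput;

    public class VerifyFileFooter {
        public static void main(String[] args) throws Exception {
            try (Directory dir = FSDirectory.open(Paths.get("/path/to/core/data/index"));
                 IndexInput in = dir.openInput("_0.fdt", IOContext.READONCE)) {
                // Reads the whole file and compares it against the CRC in the codec
                // footer; throws CorruptIndexException if footer or checksum mismatch.
                long crc = CodecUtil.checksumEntireFile(in);
                System.out.println("footer checksum OK: " + Long.toHexString(crc));
            }
        }
    }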

Can you explain anything more about your use case (e.g. do you have
something interesting going on, like really massive documents) or
anything about your configuration of Solr (e.g. are you using
MMapDirectory or NRTCachingDirectory)?

Oh my, this is now embarrassing, but I have to tell the truth. This is not a Lucene issue, but a home-made problem.

After indexing and before the load tests, I wanted to estimate the latency distribution of major page faults. To map files larger than 2 GB I used the code from http://nyeggen.com/blog/2014/05/18/memory-mapping-%3E2gb-of-data-in-java/, but did not watch out for two things in it: a) mmap is invoked read/write, and b) the file size is rounded up to a full page. Since that was already last week, I realized only too late that the corrupted files are exactly the ones I had used in that test.
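
To make the failure mode concrete, here is a small self-contained sketch. It
uses the plain FileChannel API on a small, made-up file rather than the >2 GB
reflection trick from the blog post, and assumes a 4 KB page size: a read/write
mapping whose length is rounded up to a page boundary silently grows the file
with trailing zero bytes (at least on the stock JDK), while a read-only mapping
of the exact length leaves the file untouched.

    import java.nio.channels.FileChannel;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class MmapZeroPadding {
        public static void main(String[] args) throws Exception {
            Path p = Paths.get("demo.bin");                    // hypothetical scratch file
            Files.write(p, new byte[12345]);                   // length not page-aligned
            long realSize = Files.size(p);
            long pageSize = 4096;                              // assumed page size
            long rounded  = ((realSize + pageSize - 1) / pageSize) * pageSize;

            // Mapping READ_WRITE past the current end of file makes the JDK grow the
            // file to the mapped length, so the tail gets filled with zero bytes.
            try (FileChannel ch = FileChannel.open(p,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                ch.map(FileChannel.MapMode.READ_WRITE, 0, rounded);
            }
            System.out.println(realSize + " -> " + Files.size(p));  // 12345 -> 16384

            // A read-only mapping of the exact length does not touch the file at all.
            try (FileChannel ch = FileChannel.open(p, StandardOpenOption.READ)) {
                ch.map(FileChannel.MapMode.READ_ONLY, 0, realSize);
            }
        }
    }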

Sorry for the noise I generated, and thanks for listening.

Harald.
