Hi,
Sometimes we get:
Caused by: java.lang.OutOfMemoryError: Map failed
at sun.nio.ch.FileChannelImpl.map0(Native Method)
at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:885)
... 25 more
(Sorry, we lost the full stack trace. It starts with a java.io.IOException,
which is caused by this OOM.)
As far as I understand, it happens because the ‘mmap’ call fails while an index
file is being mapped:
OffsetIndex.scala:
val idx = raf.getChannel.map(FileChannel.MapMode.READ_WRITE, 0, len)
It probably happens because we hit a memory limit.
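
For reference, a minimal Java sketch of how this failure could be recognized
by walking the cause chain. It assumes, as the trace above suggests, that
FileChannel.map reports the problem as an IOException whose cause is an
OutOfMemoryError with the message "Map failed". MapFailedCheck, isMapFailed
and mapIndex are made-up names for illustration, not Kafka code.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

class MapFailedCheck {
    // Returns true if the exception chain contains OutOfMemoryError("Map failed").
    static boolean isMapFailed(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof OutOfMemoryError && "Map failed".equals(c.getMessage())) {
                return true;
            }
        }
        return false;
    }

    // Maps the first len bytes of the file read-write, like the OffsetIndex line above.
    // The mapping stays valid after the file is closed.
    static MappedByteBuffer mapIndex(String path, long len) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "rw")) {
            return raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, len);
        }
    }
}
```

A caller could wrap mapIndex in a try/catch and use isMapFailed to tell this
case apart from ordinary I/O errors.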
But after a restart (and installing more memory, adjusting ulimits, etc.) we
end up with a long startup time. This IOException is not handled, so active log
files are never flushed and some .index files may be corrupted. This leads to
long startup times (40 minutes and up for us).
Is it possible to handle this exception, flush the active log files and
indexes, and exit properly? As it is, recovery can take an ‘infinite’ amount of
time when there is a large number of topics/partitions.
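
Roughly what I have in mind, as a Java sketch only. It assumes the broker can
still reach its open log channels and mapped indexes when the exception is
caught. FlushOnFailure, OpenSegment and flushAll are hypothetical names and do
not correspond to actual Kafka classes.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.util.List;

final class FlushOnFailure {
    // Hypothetical holder for one partition's active log file and its mapped index.
    static final class OpenSegment {
        final FileChannel log;
        final MappedByteBuffer index;

        OpenSegment(FileChannel log, MappedByteBuffer index) {
            this.log = log;
            this.index = index;
        }
    }

    // Forces every open log and index to disk; returns how many segments were flushed.
    static int flushAll(List<OpenSegment> segments) {
        int flushed = 0;
        for (OpenSegment s : segments) {
            try {
                s.log.force(true); // flush log data and file metadata
                s.index.force();   // write dirty pages of the mapped index
                flushed++;
            } catch (IOException | RuntimeException e) {
                // Keep going: one bad file should not block flushing the rest.
                System.err.println("flush failed: " + e);
            }
        }
        return flushed;
    }
}
```

The idea would be to call something like flushAll from wherever the
IOException is caught and only then exit, so that the next startup does not
have to recover every partition.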
Thanks,
Pavel.