One thing to remember is that the .index files are memory-mapped [1], which in Java means that the underlying file descriptors may not be released even after the program is done using them. A garbage collection is expected to clean up such resources, but System.gc() is only a hint, so there is no guarantee that a collection will actually run and release them. More details in [2]. To confirm that this is what you are running into, can you print the output of the following:

lsof -p <broker-pid> | grep .deleted

Before running that command, wait for at least the retention timeout plus the file deletion delay to have actually passed. The interesting part of that output is the "type" column; for example, in my case it shows DEL as the value for that column:

java 7518 jaikiran DEL REG 8,6 1453306 /tmp/kafka-logs/hey-0/00000000000000000000.index.deleted

when the index file has really been deleted but the JVM hasn't yet let go of the file descriptor resource. I did try to run a gc() from the jconsole MBean for that process, but that too didn't free this resource. I didn't really expect it to, since there isn't much control over GC itself.
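To see this behaviour outside Kafka, here is a minimal sketch in plain Java (the file path and size are made up for illustration): it maps a file, unlinks it, drops the reference and hints a GC. On Linux the unlinked file typically keeps showing up in lsof until the JVM actually reclaims the mapping:

import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class MmapHeldDemo {
    public static void main(String[] args) throws Exception {
        // Scratch file standing in for a Kafka .index segment (hypothetical path).
        Path path = Paths.get("/tmp/mmap-demo.index");
        MappedByteBuffer buf;
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "rw");
             FileChannel channel = raf.getChannel()) {
            raf.setLength(10 * 1024 * 1024);
            buf = channel.map(FileChannel.MapMode.READ_WRITE, 0, raf.length());
        }

        // Unlink the file while the mapping is still alive: the inode (and its
        // disk space) stays pinned, and "lsof -p <pid>" reports it as deleted/DEL.
        Files.delete(path);

        buf = null;   // drop the only reference to the mapping
        System.gc();  // only a hint; the JVM may or may not unmap right away

        // Window to inspect the process with: lsof -p <pid> | grep mmap-demo
        Thread.sleep(30_000);
    }
}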

By the way, which exact vendor and version of Java do you use? [3] suggests that an update in Oracle Java 1.6 has a workaround for this problem, but that workaround only comes into the picture when a subsequent FileChannel.map call results in an OutOfMemoryError.
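For what it's worth, the workaround in [3] essentially catches the OutOfMemoryError from map(), forces a GC, waits briefly and retries once. A rough sketch of that pattern at the application level (class and method names here are mine, not from the JDK or Kafka):

import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

final class MapWithRetry {
    static MappedByteBuffer map(FileChannel ch, long pos, long size) throws IOException {
        try {
            return ch.map(FileChannel.MapMode.READ_ONLY, pos, size);
        } catch (OutOfMemoryError e) {
            System.gc();            // encourage cleanup of unreferenced mappings
            try {
                Thread.sleep(100);  // give the reference handling a moment to run
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
            }
            return ch.map(FileChannel.MapMode.READ_ONLY, pos, size);  // second and last attempt
        }
    }
}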

[1] http://docs.oracle.com/javase/7/docs/api/java/nio/MappedByteBuffer.html
[2] http://bugs.java.com/view_bug.do?bug_id=4724038
[3] http://bugs.java.com/view_bug.do?bug_id=6417205

-Jaikiran

On Monday 02 March 2015 10:30 AM, Guangle Fan wrote:
Slightly different from what I observed.

The broker box has 800 GB of disk space. With the appropriate log retention settings, that should be enough to hold the logs. But disk usage hit 90%, and doing nothing but restarting the broker freed about 40% of the disk space. The incoming traffic certainly couldn't have filled 40% of the disk within the one-minute deletion delay (log.delete.delay.ms).

The obvious change before and after restarting the broker is that the broker releases a large number of file descriptors for .index files. Most of those file descriptors are very old.

lsof -p  <broker_pid> | grep .deleted

I don't know why Kafka didn't release those file descriptors, or what setting controls this.

On Sun, Mar 1, 2015 at 12:50 PM, Mayuresh Gharat <gharatmayures...@gmail.com> wrote:
Also, I suppose that when the broker starts up it removes the files marked with the .deleted suffix, which is why you see the disk space freed on restart. Guozhang can correct me if I am wrong.

Thanks,

Mayuresh

On Sat, Feb 28, 2015 at 9:27 PM, Guozhang Wang <wangg...@gmail.com> wrote:

Guangle,

The deletion of the segment log / index files is async, i.e. when Kafka decides to clean the logs, it only adds a ".deleted" suffix to the files so that they will not be accessed any more by other Kafka threads. The actual file deletion is executed later, after a delay controlled by "file.delete.delay.ms" (default 1 minute).
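A minimal sketch of that rename-then-delete-later pattern (class and method names are illustrative, not Kafka's actual internals):

import java.io.File;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class AsyncSegmentDeleter {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final long deleteDelayMs;  // analogous to file.delete.delay.ms

    AsyncSegmentDeleter(long deleteDelayMs) {
        this.deleteDelayMs = deleteDelayMs;
    }

    void scheduleDelete(File segmentFile) {
        // Step 1: rename right away so no other thread opens the file again.
        File marked = new File(segmentFile.getPath() + ".deleted");
        if (!segmentFile.renameTo(marked)) {
            throw new IllegalStateException("Could not rename " + segmentFile);
        }
        // Step 2: physically remove it only after the configured delay.
        scheduler.schedule(() -> {
            if (!marked.delete()) {
                System.err.println("Could not delete " + marked);
            }
        }, deleteDelayMs, TimeUnit.MILLISECONDS);
    }
}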

On Fri, Feb 27, 2015 at 9:49 PM, Guangle Fan <fanguan...@gmail.com> wrote:
Hi,

After Kafka cleaned up .log / .index files based on topic retention, I can still lsof a lot of .index.deleted files, and df shows disk usage accumulating until the disk is full.

When this happens, just restarting the broker immediately frees that disk space. It seems to me that after cleaning expired files, Kafka still holds their file descriptors, which means the disk space is still held.

How do you configure Kafka so that it releases the file descriptors in this case?

Using kafka 0.8.1.1

Regards,

Guangle



--
-- Guozhang



--
-Regards,
Mayuresh R. Gharat
(862) 250-7125

