[ https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827705#comment-13827705 ]
J. Ryan Earl commented on CASSANDRA-6275:
-----------------------------------------

So we've been running this all night and have written a few hundred GB of data with some products we're developing, all the while OpsCenter 4.0.0 was doing TTL'd rollups and whatnot. The deleted file count remained at 1 the entire time, never increasing, and the total file count remained below 1000.

FYI, the single deleted-but-still-open file looks like a temporary file randomly generated on Cassandra startup that gets deleted but not closed for the process's lifetime, for example:

{noformat}
java 1925 cassandra 44u REG 253,4 4096 13 /tmp/ffi441Hpl (deleted)
{noformat}

> 2.0.x leaks file handles
> ------------------------
>
>                 Key: CASSANDRA-6275
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: java version "1.7.0_25"
> Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
> Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
> Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
>            Reporter: Mikhail Mazursky
>            Assignee: graham sanderson
>             Fix For: 2.0.3
>
>         Attachments: 6275.txt, c_file-descriptors_strace.tbz, cassandra_jstack.txt, leak.log, position_hints.tgz, slog.gz
>
>
> Looks like C* is leaking file descriptors when doing lots of CAS operations.
> {noformat}
> $ sudo cat /proc/15455/limits
> Limit                     Soft Limit           Hard Limit           Units
> Max cpu time              unlimited            unlimited            seconds
> Max file size             unlimited            unlimited            bytes
> Max data size             unlimited            unlimited            bytes
> Max stack size            10485760             unlimited            bytes
> Max core file size        0                    0                    bytes
> Max resident set          unlimited            unlimited            bytes
> Max processes             1024                 unlimited            processes
> Max open files            4096                 4096                 files
> Max locked memory         unlimited            unlimited            bytes
> Max address space         unlimited            unlimited            bytes
> Max file locks            unlimited            unlimited            locks
> Max pending signals       14633                14633                signals
> Max msgqueue size         819200               819200               bytes
> Max nice priority         0                    0
> Max realtime priority     0                    0
> Max realtime timeout      unlimited            unlimited            us
> {noformat}
> Looks like the problem is not in limits.
> Before load test:
> {noformat}
> cassandra-test0 ~]$ lsof -n | grep java | wc -l
> 166
> cassandra-test1 ~]$ lsof -n | grep java | wc -l
> 164
> cassandra-test2 ~]$ lsof -n | grep java | wc -l
> 180
> {noformat}
> After load test:
> {noformat}
> cassandra-test0 ~]$ lsof -n | grep java | wc -l
> 967
> cassandra-test1 ~]$ lsof -n | grep java | wc -l
> 1766
> cassandra-test2 ~]$ lsof -n | grep java | wc -l
> 2578
> {noformat}
> Most opened files have names like:
> {noformat}
> java 16890 cassandra 1636r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1637r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1638r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1639r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1640r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1641r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1642r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1643r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1644r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1645r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1646r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1647r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1648r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1649r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1650r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1651r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1652r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1653r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1654r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1655r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1656r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> {noformat}
> Also, when that happens it's not always possible to shut down the server process via SIGTERM. Have to use SIGKILL.
> P.S. See the mailing list thread for more context:
> https://www.mail-archive.com/user@cassandra.apache.org/msg33035.html

--
This message was sent by Atlassian JIRA (v6.1#6144)
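
For anyone who wants to repeat the check described above (many descriptors open on the same paxos SSTable, versus a single deleted-but-still-open temp file), here is a rough Python sketch that summarizes saved `lsof` output. It is not part of the ticket or of Cassandra; the function name `summarize_lsof` is made up for illustration, and it assumes the nine-column layout seen in the listings above (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME). Other `lsof` versions or options may print extra or missing columns, in which case the parsing would need adjusting.

```python
from collections import Counter


def summarize_lsof(lines):
    """Count open descriptors per regular-file path.

    Returns (per_path, deleted): Counters keyed by path. Entries whose
    name ends in " (deleted)" are counted separately in `deleted`.
    """
    per_path = Counter()
    deleted = Counter()
    marker = " (deleted)"
    for line in lines:
        # Assumed layout: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
        fields = line.split(None, 8)
        if len(fields) < 9 or fields[4] != "REG":
            continue  # skip the header, sockets, pipes, and short lines
        name = fields[8].strip()
        if name.endswith(marker):
            deleted[name[: -len(marker)]] += 1
        else:
            per_path[name] += 1
    return per_path, deleted


# A few lines copied from the listings in this ticket.
sample = [
    "java 16890 cassandra 1636r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db",
    "java 16890 cassandra 1637r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db",
    "java 16890 cassandra 1638r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db",
    "java 1925 cassandra 44u REG 253,4 4096 13 /tmp/ffi441Hpl (deleted)",
]

per_path, deleted = summarize_lsof(sample)
for path, count in per_path.most_common():
    if count > 1:
        # The same file held open more than once is the pattern reported here.
        print(f"{count:5d} descriptors -> {path}")
print("deleted but still open:", sum(deleted.values()))
```

Running this against snapshots taken before and after a load test should show whether the per-path counts keep growing (the leak pattern in the description) or stay flat (as in the overnight run in the comment).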