I have a pre-production cluster with very little data and a similar problem...

PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
12916  0.0 80.0 5972756 3231120 ?     Sl   Oct18  15:49 /usr/bin/java -ea
-Xms1G -Xmx1G -XX:+UseParNewGC ...

Data dir:
2.2M    data/
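
In case it helps to compare, this is roughly what I checked on this node
(pid 12916 is taken from the ps output above):

# which IcedTea build is this node running?
java -version
# look for oversized anonymous mappings outside the 1G heap
sudo pmap 12916 | grep anon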


Att,

Daniel Korndorfer
Telecom South America S/A
+55 (11) 4302-0188




On Sat, Dec 18, 2010 at 1:15 AM, Zhu Han <schumi....@gmail.com> wrote:

> It seems like the problem is still there after I upgraded to "OpenJDK
> Runtime Environment (IcedTea6 1.9.2)". So it is not related to the bug I
> reported two days ago.
>
> Can somebody else share some info with us? What Java environment do you
> use? Is it stable for long-lived Cassandra instances?
>
> best regards,
> hanzhu
>
>
> On Thu, Dec 16, 2010 at 9:28 PM, Zhu Han <schumi....@gmail.com> wrote:
>
>> I've tried it, but it did not work for me this afternoon.
>>
>> Thank you!
>>
>> best regards,
>> hanzhu
>>
>>
>>
>> On Thu, Dec 16, 2010 at 8:59 PM, Matthew Conway <m...@backupify.com> wrote:
>>
>>> Thanks for debugging this; I'm running into the same problem.
>>> BTW, if you can ssh into your nodes, you can use jconsole over ssh:
>>> http://simplygenius.com/2010/08/jconsole-via-socks-ssh-tunnel.html
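>>>
>>> Roughly what I do, for what it's worth (the host name and the SOCKS port
>>> 7777 are just placeholders; 8080 is the JMX port Cassandra 0.6 listens on
>>> by default):
>>>
>>> # open a local SOCKS proxy to the node over ssh
>>> ssh -f -N -D 7777 user@cassandra-node
>>> # point jconsole's own JVM at that proxy
>>> jconsole -J-DsocksProxyHost=localhost -J-DsocksProxyPort=7777 \
>>>     service:jmx:rmi:///jndi/rmi://cassandra-node:8080/jmxrmi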
>>>
>>> Matt
>>>
>>>
>>>
>>> On Dec 16, 2010, at Thu Dec 16, 2:39 AM, Zhu Han wrote:
>>>
>>> > Sorry for the spam again. :-)
>>> >
>>> > I think I found the root cause. Here is a bug report[1] on a memory
>>> > leak in ParNewGC. It is fixed in OpenJDK 1.6.0_20 (IcedTea6 1.9.2)[2].
>>> >
>>> > So the suggestion is: if you run Cassandra on Ubuntu 10.04, please
>>> > upgrade OpenJDK to the latest version.
>>> >
>>> > [1] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6824570
>>> > [2] http://blog.fuseyism.com/index.php/2010/09/10/icedtea6-19-released/
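>>> >
>>> > On Ubuntu 10.04 something along these lines should pull in the fixed
>>> > build, assuming the updated packages have already reached your apt
>>> > mirror:
>>> >
>>> > # check which IcedTea build you are on
>>> > java -version
>>> > # then upgrade the OpenJDK packages
>>> > sudo apt-get update
>>> > sudo apt-get install --only-upgrade openjdk-6-jre-headless openjdk-6-jre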
>>> >
>>> > best regards,
>>> > hanzhu
>>> >
>>> >
>>> > On Thu, Dec 16, 2010 at 3:10 PM, Zhu Han <schumi....@gmail.com> wrote:
>>> >
>>> >> The test node is behind a firewall. So I took some time to find a way
>>> >> to get JMX diagnostic information from it.
>>> >>
>>> >> What's interesting is that both the HeapMemoryUsage and the
>>> >> NonHeapMemoryUsage reported by the JVM are quite reasonable. So it's a
>>> >> mystery why the JVM process maps such a big anonymous memory region...
>>> >>
>>> >> $ java -Xmx128m -jar /tmp/cmdline-jmxclient-0.10.3.jar - localhost:8080
>>> >> java.lang:type=Memory HeapMemoryUsage
>>> >> 12/16/2010 15:07:45 +0800 org.archive.jmx.Client HeapMemoryUsage:
>>> >> committed: 1065025536
>>> >> init: 1073741824
>>> >> max: 1065025536
>>> >> used: 18295328
>>> >>
>>> >> $java -Xmx128m -jar /tmp/cmdline-jmxclient-0.10.3.jar - localhost:8080
>>> >> java.lang:type=Memory NonHeapMemoryUsage
>>> >> 12/16/2010 15:01:51 +0800 org.archive.jmx.Client NonHeapMemoryUsage:
>>> >> committed: 34308096
>>> >> init: 24313856
>>> >> max: 226492416
>>> >> used: 21475376
>>> >>
>>> >> If anybody is interested in it, I can provide more diagnostic
>>> >> information before I restart the instance.
>>> >>
>>> >> best regards,
>>> >> hanzhu
>>> >>
>>> >>
>>> >>
>>> >> On Thu, Dec 16, 2010 at 1:00 PM, Zhu Han <schumi....@gmail.com> wrote:
>>> >>
>>> >>> After investigating it more deeply, I suspect it's a native memory
>>> >>> leak in the JVM. The large anonymous map in the lower address space
>>> >>> should be the native heap of the JVM, not the Java object heap. Has
>>> >>> anybody run into this before?
>>> >>>
>>> >>> I'll try to upgrade the JVM tonight.
>>> >>>
>>> >>> best regards,
>>> >>> hanzhu
>>> >>>
>>> >>>
>>> >>>
>>> >>> On Thu, Dec 16, 2010 at 10:50 AM, Zhu Han <schumi....@gmail.com> wrote:
>>> >>>
>>> >>>> Hi,
>>> >>>>
>>> >>>> I have a test node with apache-cassandra-0.6.8 on Ubuntu 10.04. The
>>> >>>> hardware environment is an OpenVZ container. The JVM version is:
>>> >>>>
>>> >>>> # java -Xmx128m -version
>>> >>>> java version "1.6.0_18"
>>> >>>> OpenJDK Runtime Environment (IcedTea6 1.8.2) (6b18-1.8.2-4ubuntu2)
>>> >>>> OpenJDK 64-Bit Server VM (build 16.0-b13, mixed mode)
>>> >>>>
>>> >>>> These are the memory settings:
>>> >>>>
>>> >>>> /usr/bin/java -ea -Xms1G -Xmx1G ...
>>> >>>>
>>> >>>> And the on-disk footprint of the sstables is very small:
>>> >>>>
>>> >>>> # du -sh data/
>>> >>>> 9.8M    data/
>>> >>>>
>>> >>>> The node was infrequently accessed in the last three weeks. After
>>> >>>> that, I observed abnormal memory utilization in top:
>>> >>>>
>>> >>>>  PID USER    PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>> >>>> 7836 root    15   0 3300m 2.4g  13m S    0 26.0  2:58.51  java
>>> >>>>
>>> >>>> The JVM heap utilization is quite normal:
>>> >>>>
>>> >>>> # sudo jstat -gc -J"-Xmx128m" 7836
>>> >>>>  S0C    S1C    S0U   S1U     EC       EU        OC        OU        PC      PU     YGC   YGCT  FGC   FGCT    GCT
>>> >>>> 8512.0 8512.0 372.8  0.0  68160.0  5225.7  963392.0  508200.7  30604.0 18373.4  480  3.979   2   0.005   3.984
>>> >>>>
>>> >>>> And then I tried "pmap" to see the native memory mappings. There
>>> >>>> are two large anonymous mmap regions:
>>> >>>>
>>> >>>> 00000000080dc000 1573568K rw---    [ anon ]
>>> >>>> 00002b2afc900000 1079180K rw---    [ anon ]
>>> >>>>
>>> >>>> The second one should be the JVM heap. What is the first one? An
>>> >>>> mmap of an sstable should never be an anonymous mmap, only a
>>> >>>> file-backed mmap. Is it a native memory leak? Does Cassandra
>>> >>>> allocate any DirectByteBuffers?
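>>> >>>>
>>> >>>> If it is DirectByteBuffers, they should show up in a class histogram.
>>> >>>> A rough way to check, using pid 7836 from the top output above:
>>> >>>>
>>> >>>> # count live DirectByteBuffer instances in the running process
>>> >>>> sudo jmap -histo 7836 | grep -i DirectByteBuffer
>>> >>>> # and keep an eye on the largest mappings over time
>>> >>>> sudo pmap -x 7836 | sort -n -k2 | tail -5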
>>> >>>>
>>> >>>> best regards,
>>> >>>> hanzhu
>>> >>>>
>>> >>>
>>> >>>
>>> >>
>>>
>>>
>>
>
