The thing is that I'm not close to running out of memory. The output of nodetool info shows that only about half of the available heap is in use, and running free from the command line shows that I have plenty of RAM available (most of it held as cache), plus some usage of the 1G swap space, which is always on.
nodetool info:

Load             : 73.24 GB
Generation No    : 1271626230
Uptime (seconds) : 839414
Heap Memory (MB) : 2584.36 / 5461.38

free -m:

             total       used       free     shared    buffers     cached
Mem:          7680       7640         39          0          8       2364
-/+ buffers/cache:       5266       2413
Swap:         1023        388        635

Lee Parker

On Wed, Apr 28, 2010 at 9:18 AM, Jonathan Ellis <jbel...@gmail.com> wrote:

> If you're running so close to the edge of running out of memory that
> creating a ln process pushes you over the edge, you should fix the
> broader cause instead of the specific symptom. :)
>
> On Tue, Apr 27, 2010 at 10:09 PM, Lee Parker <l...@socialagency.com> wrote:
> > So, after reading the thread which Eric posted earlier, I have created a
> > workaround for the issue. In my backup script, I add a swapfile with
> > swapon, tell cassandra to create the snapshots, then remove the swapfile
> > with swapoff. Then I continue with the rest of the work the backup script
> > needs to do in gathering up the snapshots into a tarball and pushing it to
> > S3.
> >
> > Lee Parker
> >
> > On Tue, Apr 27, 2010 at 9:01 PM, Lee Parker <l...@socialagency.com> wrote:
> >>
> >> The system is a ubuntu server running 8.04 LTS. Now, I'm getting the
> >> problem again this evening even with the addition of the swap space.
> >>
> >> Lee Parker
> >>
> >> On Tue, Apr 27, 2010 at 1:13 PM, Jonathan Shook <jsh...@gmail.com> wrote:
> >>>
> >>> The allocation of memory may have failed depending on the available
> >>> virtual memory, whether or not the memory would have been subsequently
> >>> accessed by the process. Some systems do the work of allocating physical
> >>> pages only when they are accessed for the first time. I'm not sure if
> >>> yours is one of them.
> >>>
> >>> On Tue, Apr 27, 2010 at 10:45 AM, Lee Parker <l...@socialagency.com>
> >>> wrote:
> >>>>
> >>>> Adding a swapfile fixed the error, but it doesn't look as though the
> >>>> process is even using the swap file at all.
> >>>>
> >>>> Lee Parker
> >>>>
> >>>> On Tue, Apr 27, 2010 at 9:49 AM, Eric Hauser <ewhau...@gmail.com> wrote:
> >>>>>
> >>>>> Have you read this?
> >>>>> http://forums.sun.com/thread.jspa?messageID=9734530
> >>>>> I don't think EC2 instances have any swap.
> >>>>>
> >>>>> On Tue, Apr 27, 2010 at 10:16 AM, Lee Parker <l...@socialagency.com>
> >>>>> wrote:
> >>>>>>
> >>>>>> Can anyone help with this? It is preventing me from getting backups
> >>>>>> of our cluster.
> >>>>>>
> >>>>>> Lee Parker
> >>>>>>
> >>>>>> On Mon, Apr 26, 2010 at 10:02 PM, Lee Parker <l...@socialagency.com>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> I was attempting to get a snapshot on our cassandra nodes. I get the
> >>>>>>> following error every time I run nodetool ... snapshot.
> >>>>>>>
> >>>>>>> Exception in thread "main" java.io.IOException: Cannot run program "ln": java.io.IOException: error=12, Cannot allocate memory
> >>>>>>>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:459)
> >>>>>>>     at org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:221)
> >>>>>>>     at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1060)
> >>>>>>>     at org.apache.cassandra.db.Table.snapshot(Table.java:256)
> >>>>>>>     at org.apache.cassandra.service.StorageService.takeAllSnapshot(StorageService.java:1005)
> >>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
> >>>>>>>     at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
> >>>>>>>     at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
> >>>>>>>     at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
> >>>>>>>     at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
> >>>>>>>     at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
> >>>>>>>     at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
> >>>>>>>     at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
> >>>>>>>     at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1426)
> >>>>>>>     at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
> >>>>>>>     at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1264)
> >>>>>>>     at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1359)
> >>>>>>>     at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
> >>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
> >>>>>>>     at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
> >>>>>>>     at sun.rmi.transport.Transport$1.run(Transport.java:159)
> >>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
> >>>>>>>     at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
> >>>>>>>     at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
> >>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
> >>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
> >>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> >>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> >>>>>>>     at java.lang.Thread.run(Thread.java:619)
> >>>>>>> Caused by: java.io.IOException: java.io.IOException: error=12, Cannot allocate memory
> >>>>>>>     at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
> >>>>>>>     at java.lang.ProcessImpl.start(ProcessImpl.java:65)
> >>>>>>>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
> >>>>>>>     ... 34 more
> >>>>>>>
> >>>>>>> The nodes are both Amazon EC2 Large instances with 7.5G RAM (6
> >>>>>>> allocated for Java heap) with two cores and only 70G of data in
> >>>>>>> cassandra. They have plenty of available RAM and HD space. Has
> >>>>>>> anyone else run into this error?
> >>>>>>>
> >>>>>>> Lee Parker
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
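
As a rough sanity check on the nodetool info and free -m figures at the top of this message, here is a small Python sketch. It rests on an assumption that nothing in this thread confirms: that the JVM starts ln by forking, and that the kernel wants to account for a copy of the JVM's heap mapping before it lets the fork proceed. Under that assumption, what matters is not how much of the heap is in use but how the maximum heap compares with free RAM plus reclaimable cache plus free swap. The function names and the comparison are illustrative only; the kernel's real overcommit accounting is more involved.

```python
# Figures copied from the nodetool info and free -m output (all in MB).
HEAP_MAX_MB = 5461.38   # right-hand value of "Heap Memory (MB)"
FREE_MB = 39            # Mem: free
BUFFERS_MB = 8          # Mem: buffers
CACHED_MB = 2364        # Mem: cached
SWAP_FREE_MB = 635      # Swap: free


def headroom_mb(free, buffers, cached, swap_free):
    """Approximate memory the kernel could still hand out:
    free RAM, plus buffers and cache it can reclaim, plus free swap."""
    return free + buffers + cached + swap_free


def fork_may_be_refused(mapping_mb, headroom):
    """ASSUMPTION (hypothetical rule of thumb): a fork is at risk when a
    single mapping of the parent is larger than the available headroom."""
    return mapping_mb > headroom


if __name__ == "__main__":
    headroom = headroom_mb(FREE_MB, BUFFERS_MB, CACHED_MB, SWAP_FREE_MB)
    print("headroom (MB):", headroom)
    print("max heap (MB):", HEAP_MAX_MB)
    if fork_may_be_refused(HEAP_MAX_MB, headroom):
        # Extra swap that would close the gap under this rough model.
        print("shortfall (MB):", round(HEAP_MAX_MB - headroom, 2))
```

With these figures the headroom comes to about 3 GB against a maximum heap of about 5.3 GB. If the assumption holds, that would be consistent with a temporary swapfile making the error go away even though nothing is ever written to it.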