[jira] [Commented] (CASSANDRA-5699) Streaming (2.0) can deadlock
[ https://issues.apache.org/jira/browse/CASSANDRA-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696170#comment-13696170 ]

Jonathan Ellis commented on CASSANDRA-5699:
-------------------------------------------

Is the stream lifecycle documented anywhere the way we did it in 1.2 StreamOut? NB: this patches StreamingRepairTask, which does not exist in current trunk.

Streaming (2.0) can deadlock

Key: CASSANDRA-5699
URL: https://issues.apache.org/jira/browse/CASSANDRA-5699
Project: Cassandra
Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Fix For: 2.0 beta 1
Attachments: 5699.txt

The new streaming implementation (CASSANDRA-5286) creates two threads per host for streaming: one for the incoming stream and one for the outgoing. Both currently share the same socket, but since we use synchronous I/O, a read can block a write. This can result in a deadlock if two nodes are both blocking on a read at the same time, thereby blocking their respective writes (this is actually fairly easy to reproduce with a simple repair). So instead, attaching a patch that uses one socket per thread. The patch also corrects the stream throughput throttling calculation, which was 8000 times lower than it should have been.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
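On the throttling fix: a factor of exactly 8000 is consistent with two mixed-up unit conversions (8 bits per byte, times 1000), though the attached patch is the authority on what actually went wrong. A minimal sketch of the conversion such a throttle needs, with hypothetical names (assuming the configured rate is in megabits per second and one megabit is treated as 1024 * 1024 bits; the decimal convention would differ by about 5%):

```java
// Hypothetical illustration, not the actual 5699 patch: converting a
// configured stream throughput from megabits/s to the bytes/s a rate
// limiter over a byte stream actually needs.
public class StreamThrottle {
    static final int BITS_PER_BYTE = 8;

    // Mb/s -> bytes/s, using the 1 Mb = 1024 * 1024 bits convention.
    static long megabitsToBytesPerSec(int megabitsPerSec) {
        return megabitsPerSec * 1024L * 1024L / BITS_PER_BYTE;
    }

    public static void main(String[] args) {
        // e.g. a 400 Mb/s cap corresponds to 52,428,800 bytes/s
        System.out.println(megabitsToBytesPerSec(400));
    }
}
```

Forgetting the division by 8 (bits vs. bytes) and a stray factor of 1000 (e.g. per-millisecond vs. per-second accounting) compound to the reported 8000x error.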
[3/3] git commit: Merge branch 'cassandra-1.2' into trunk
Merge branch 'cassandra-1.2' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0888c2e1
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0888c2e1
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0888c2e1
Branch: refs/heads/trunk
Commit: 0888c2e18722aacbd53c509655515a3f5bb6601a
Parents: 66f3014 8c2a280
Author: Jonathan Ellis <jbel...@apache.org>
Authored: Sat Jun 29 11:37:30 2013 -0700
Committer: Jonathan Ellis <jbel...@apache.org>
Committed: Sat Jun 29 11:37:30 2013 -0700

 src/java/org/apache/cassandra/cli/CliMain.java | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0888c2e1/src/java/org/apache/cassandra/cli/CliMain.java
[1/3] git commit: exit(1) on IOException trying to read from a file patch by Steve Peters; reviewed by jbellis for CASSANDRA-5247
Updated Branches:
  refs/heads/cassandra-1.2 d265fded8 -> 8c2a28050
  refs/heads/trunk 66f301457 -> 0888c2e18

exit(1) on IOException trying to read from a file
patch by Steve Peters; reviewed by jbellis for CASSANDRA-5247

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8c2a2805
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8c2a2805
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8c2a2805
Branch: refs/heads/cassandra-1.2
Commit: 8c2a28050f67dbd9949d28a9e68a256d0658d036
Parents: d265fde
Author: Jonathan Ellis <jbel...@apache.org>
Authored: Sat Jun 29 11:36:41 2013 -0700
Committer: Jonathan Ellis <jbel...@apache.org>
Committed: Sat Jun 29 11:36:47 2013 -0700

 src/java/org/apache/cassandra/cli/CliMain.java | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/8c2a2805/src/java/org/apache/cassandra/cli/CliMain.java

diff --git a/src/java/org/apache/cassandra/cli/CliMain.java b/src/java/org/apache/cassandra/cli/CliMain.java
index a347747..547f642 100644
--- a/src/java/org/apache/cassandra/cli/CliMain.java
+++ b/src/java/org/apache/cassandra/cli/CliMain.java
@@ -266,14 +266,14 @@ public class CliMain
             try
             {
                 fileReader = new FileReader(sessionState.filename);
+                evaluateFileStatements(new BufferedReader(fileReader));
             }
             catch (IOException e)
             {
                 sessionState.err.println(e.getMessage());
-                return;
+                System.exit(1);
             }
-            evaluateFileStatements(new BufferedReader(fileReader));
             return;
         }
[2/3] git commit: exit(1) on IOException trying to read from a file patch by Steve Peters; reviewed by jbellis for CASSANDRA-5247
exit(1) on IOException trying to read from a file
patch by Steve Peters; reviewed by jbellis for CASSANDRA-5247

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8c2a2805
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8c2a2805
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8c2a2805
Branch: refs/heads/trunk
Commit: 8c2a28050f67dbd9949d28a9e68a256d0658d036
Parents: d265fde
Author: Jonathan Ellis <jbel...@apache.org>
Authored: Sat Jun 29 11:36:41 2013 -0700
Committer: Jonathan Ellis <jbel...@apache.org>
Committed: Sat Jun 29 11:36:47 2013 -0700

 src/java/org/apache/cassandra/cli/CliMain.java | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/8c2a2805/src/java/org/apache/cassandra/cli/CliMain.java

diff --git a/src/java/org/apache/cassandra/cli/CliMain.java b/src/java/org/apache/cassandra/cli/CliMain.java
index a347747..547f642 100644
--- a/src/java/org/apache/cassandra/cli/CliMain.java
+++ b/src/java/org/apache/cassandra/cli/CliMain.java
@@ -266,14 +266,14 @@ public class CliMain
             try
             {
                 fileReader = new FileReader(sessionState.filename);
+                evaluateFileStatements(new BufferedReader(fileReader));
             }
             catch (IOException e)
             {
                 sessionState.err.println(e.getMessage());
-                return;
+                System.exit(1);
             }
-            evaluateFileStatements(new BufferedReader(fileReader));
             return;
         }
[jira] [Assigned] (CASSANDRA-5689) NPE shutting down Cassandra trunk (cassandra-1.2.5-989-g70dfb70)
[ https://issues.apache.org/jira/browse/CASSANDRA-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reassigned CASSANDRA-5689:
-----------------------------------------

Assignee: Ryan McGuire

I don't see how this can happen, since stop should only call close once:
{code}
public void stop()
{
    if (isRunning.compareAndSet(true, false))
        close();
}
{code}
Ryan, can one of your team reproduce?

NPE shutting down Cassandra trunk (cassandra-1.2.5-989-g70dfb70)

Key: CASSANDRA-5689
URL: https://issues.apache.org/jira/browse/CASSANDRA-5689
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 2.0 beta 1
Environment: Ubuntu Precise with Oracle Java 7u25.
Reporter: Blair Zajac
Assignee: Ryan McGuire
Priority: Trivial

I built Cassandra from git trunk at cassandra-1.2.5-989-g70dfb70 using the debian/ package. I have a shell script to shut down Cassandra:
{code}
$nodetool disablegossip
sleep 5
$nodetool disablebinary
$nodetool disablethrift
$nodetool drain
/etc/init.d/cassandra stop
{code}
Shutting it down I get this exception on all three nodes:
{code}
Exception in thread "main" java.lang.NullPointerException
        at org.apache.cassandra.transport.Server.close(Server.java:156)
        at org.apache.cassandra.transport.Server.stop(Server.java:107)
        at org.apache.cassandra.service.StorageService.stopNativeTransport(StorageService.java:347)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
        at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
        at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
        at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
        at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
        at com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:468)
        at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
        at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
        at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1427)
        at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
        at sun.rmi.transport.Transport$1.run(Transport.java:177)
        at sun.rmi.transport.Transport$1.run(Transport.java:174)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
        at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:553)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:808)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:667)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
{code}
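The {{compareAndSet}} guard quoted in the comment is the standard idiom for making a stop method idempotent under concurrent callers. A minimal self-contained sketch of that idiom (class and field names here are illustrative, not Cassandra's actual transport Server):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of an idempotent stop(): even if several JMX calls race,
// compareAndSet guarantees close() runs at most once.
public class Server {
    private final AtomicBoolean isRunning = new AtomicBoolean(true);
    public int closeCalls = 0; // exposed only so the sketch is observable

    public void stop() {
        // Only one caller ever wins the true -> false transition.
        if (isRunning.compareAndSet(true, false))
            close();
    }

    private void close() { closeCalls++; }

    public static void main(String[] args) {
        Server s = new Server();
        s.stop();
        s.stop(); // second (and any later) call is a no-op
        System.out.println(s.closeCalls); // prints 1
    }
}
```

Which is why the reported NPE points at state inside close() itself (something already null at Server.java:156) rather than at close() being entered twice.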
[jira] [Updated] (CASSANDRA-5689) NPE shutting down Cassandra trunk (cassandra-1.2.5-989-g70dfb70)
[ https://issues.apache.org/jira/browse/CASSANDRA-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-5689:
--------------------------------------

Component/s: API (was: Core)
Affects Version/s: 1.2.0 (was: 2.0 beta 1)
[jira] [Commented] (CASSANDRA-5661) Discard pooled readers for cold data
[ https://issues.apache.org/jira/browse/CASSANDRA-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696188#comment-13696188 ]

Pavel Yaskevich commented on CASSANDRA-5661:
--------------------------------------------

I tried that before going with cached instances, which is a per-CF Map<int, Queue<ByteBuffer>>, since one CF could have different chunk sizes. It actually performs worse because of queue contention, and we would still have to pay the price of an open call on each read of the file.

Discard pooled readers for cold data

Key: CASSANDRA-5661
URL: https://issues.apache.org/jira/browse/CASSANDRA-5661
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.2.1
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
Fix For: 1.2.7
Attachments: DominatorTree.png, Histogram.png

Reader pooling was introduced in CASSANDRA-4942, but pooled RandomAccessReaders are never cleaned up until the SSTableReader is closed. So memory use is the worst-case number of simultaneous RARs we ever had open for this file, forever. We should introduce a global limit on how much memory to use for RARs, and evict old ones.
[jira] [Commented] (CASSANDRA-5661) Discard pooled readers for cold data
[ https://issues.apache.org/jira/browse/CASSANDRA-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696192#comment-13696192 ]

Jonathan Ellis commented on CASSANDRA-5661:
-------------------------------------------

We don't need exact chunk size matches though -- i.e., we can use a larger buffer. So if we just pool max-chunk-size buffers we'll probably come out ahead.
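The pooling scheme under debate can be sketched as a single global queue of maximum-size buffers, handed to any reader regardless of its chunk size. This is an illustration of the idea only (names, sizes, and the missing eviction policy are all hypothetical, not Cassandra code):

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch: one global pool of max-chunk-size buffers. A reader needing a
// smaller chunk still takes a full-size buffer and just limits it, so no
// per-chunk-size queues (and their contention) are needed.
public class BufferPool {
    static final int MAX_CHUNK_SIZE = 64 * 1024; // illustrative, not Cassandra's

    private final ConcurrentLinkedQueue<ByteBuffer> pool = new ConcurrentLinkedQueue<>();

    public ByteBuffer acquire(int chunkSize) {
        assert chunkSize <= MAX_CHUNK_SIZE;
        ByteBuffer buf = pool.poll();
        if (buf == null)
            buf = ByteBuffer.allocate(MAX_CHUNK_SIZE); // pool miss: allocate full size
        buf.clear().limit(chunkSize); // oversized buffer, limited to this chunk
        return buf;
    }

    public void release(ByteBuffer buf) {
        pool.offer(buf); // return for reuse; global memory cap/eviction omitted
    }

    public static void main(String[] args) {
        BufferPool p = new BufferPool();
        ByteBuffer b = p.acquire(4096);
        p.release(b);
        System.out.println(p.acquire(8192) == b); // same buffer reused
    }
}
```

The trade-off being argued: every buffer costs MAX_CHUNK_SIZE bytes even for small chunks (Pavel's memory objection), but there is only one queue to contend on and no open() per read.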
[jira] [Commented] (CASSANDRA-5504) Eternal iteration when using older hadoop version due to next() call and empty key value
[ https://issues.apache.org/jira/browse/CASSANDRA-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696196#comment-13696196 ]

Oleksandr Petrov commented on CASSANDRA-5504:
---------------------------------------------

Can anyone confirm if 1.2.4 contains the fix, too? It seems to work, it's just not clear whether the fix made it there or it's just a coincidence...

Eternal iteration when using older hadoop version due to next() call and empty key value

Key: CASSANDRA-5504
URL: https://issues.apache.org/jira/browse/CASSANDRA-5504
Project: Cassandra
Issue Type: Bug
Components: Hadoop
Affects Versions: 1.2.0
Reporter: Oleksandr Petrov
Assignee: Oleksandr Petrov
Priority: Minor
Fix For: 1.2.5
Attachments: 5504-v3.txt, patch2.diff, patch.diff

Currently, when using newer hadoop versions, due to the call to next(ByteBuffer key, SortedMap<ByteBuffer, IColumn> value) within ColumnFamilyRecordReader, because `key.clear();` is called, the key is emptied. That causes the StaticRowIterator and WideRowIterator to glitch: namely, when Iterables.getLast(rows).key is called, the key is already empty. This will cause Hadoop to request the same range again and again, forever. Please see the attached patch/diff; it simply adds lastRowKey (ByteBuffer) and saves it for the next iteration along with all the rows, which allows the query for the next range to be fully correct. This patch is branched from the 1.2.3 version. Tested against Cassandra 1.2.3, with Hadoop 1.0.3, 1.0.4 and 0.20.2.
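The fix described, saving a copy of the last row key before the shared key buffer gets cleared, can be sketched as follows. This is an illustration of the idea only; the names are hypothetical and this is not the actual ColumnFamilyRecordReader code:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustration: if next() clears the caller-supplied key buffer, the
// iterator must keep its own deep copy of the last key, or the "start of
// next range" computation sees an empty key and re-requests the same range.
public class RowIteratorSketch {
    private ByteBuffer lastRowKey; // saved copy, immune to key.clear()

    public void onNext(ByteBuffer key) {
        // Deep-copy the key *before* anything clears the shared buffer.
        ByteBuffer copy = ByteBuffer.allocate(key.remaining());
        copy.put(key.duplicate()).flip();
        lastRowKey = copy;
        key.clear(); // simulates next() emptying the buffer afterwards
    }

    public ByteBuffer startOfNextRange() {
        return lastRowKey; // still valid even though `key` was cleared
    }

    public static void main(String[] args) {
        RowIteratorSketch it = new RowIteratorSketch();
        it.onNext(ByteBuffer.wrap("row-42".getBytes(StandardCharsets.UTF_8)));
        ByteBuffer last = it.startOfNextRange();
        byte[] out = new byte[last.remaining()];
        last.duplicate().get(out);
        System.out.println(new String(out, StandardCharsets.UTF_8)); // prints row-42
    }
}
```

Without the copy, startOfNextRange() would depend on a buffer whose position/limit were reset by clear(), which is exactly the eternal-iteration glitch in the report.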
[jira] [Comment Edited] (CASSANDRA-5504) Eternal iteration when using older hadoop version due to next() call and empty key value
[ https://issues.apache.org/jira/browse/CASSANDRA-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696196#comment-13696196 ]

Oleksandr Petrov edited comment on CASSANDRA-5504 at 6/29/13 8:55 PM:
----------------------------------------------------------------------

Can anyone confirm if 1.2.4 contains the fix, too? It seems to work, it's just not clear whether the fix made it there or it's just a coincidence...

UPDATE: sorry, I've tested against 1.2.5, so nevermind :)

was (Author: ifesdjeen):
Can anyone confirm if 1.2.4 contains the fix, too? It seems to work, it's just not clear whether the fix made it there or it's just a coincidence...
[jira] [Commented] (CASSANDRA-5661) Discard pooled readers for cold data
[ https://issues.apache.org/jira/browse/CASSANDRA-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696197#comment-13696197 ]

Pavel Yaskevich commented on CASSANDRA-5661:
--------------------------------------------

It's a waste of memory and doesn't solve the contention problem.
[jira] [Commented] (CASSANDRA-5661) Discard pooled readers for cold data
[ https://issues.apache.org/jira/browse/CASSANDRA-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696219#comment-13696219 ]

Jonathan Ellis commented on CASSANDRA-5661:
-------------------------------------------

It's a lot less memory used than the status quo. I'd take a little contention over OOMing people.
[jira] [Commented] (CASSANDRA-5661) Discard pooled readers for cold data
[ https://issues.apache.org/jira/browse/CASSANDRA-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696225#comment-13696225 ]

Pavel Yaskevich commented on CASSANDRA-5661:
--------------------------------------------

I think we are trying to solve the consequence instead of the actual problem, which is adjusting the max sstable size, as Jake pointed out; I think [~vijay2...@yahoo.com] was also doing so in production. Caching in its current state does the job for STCS, and for LCS with bigger files; expiring would be a good addition though.
[jira] [Commented] (CASSANDRA-5661) Discard pooled readers for cold data
[ https://issues.apache.org/jira/browse/CASSANDRA-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696233#comment-13696233 ]

Jonathan Ellis commented on CASSANDRA-5661:
-------------------------------------------

As I explained, increasing sstable size helps but does not solve the problem; we're supposed to be supporting up to 5-10TB of data in 1.2.
[jira] [Commented] (CASSANDRA-5661) Discard pooled readers for cold data
[ https://issues.apache.org/jira/browse/CASSANDRA-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696239#comment-13696239 ]

Pavel Yaskevich commented on CASSANDRA-5661:
--------------------------------------------

What I am trying to say is that expiring with a global limit is good enough even for LCS with bigger files, but useless with 5MB, all the other problems aside. And as I pointed out in one of the comments, for 5-10 terabytes even 128MB files are too small and affect system performance, even without taking the indexing/bf overhead into account.
[jira] [Updated] (CASSANDRA-5466) Compaction task eats 100% CPU for a long time for tables with collection typed columns
[ https://issues.apache.org/jira/browse/CASSANDRA-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Zarutin updated CASSANDRA-5466:
------------------------------------

Assignee: Alex Zarutin (was: Ryan McGuire)

Compaction task eats 100% CPU for a long time for tables with collection typed columns

Key: CASSANDRA-5466
URL: https://issues.apache.org/jira/browse/CASSANDRA-5466
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.2.4
Environment: ubuntu 12.10, sun-6-java 1.6.0.37, Core-i7, 8GB RAM
Reporter: Alexey Tereschenko
Assignee: Alex Zarutin

For the table:
{code:sql}
create table test (
    user_id bigint,
    first_list list<bigint>,
    second_list list<bigint>,
    third_list list<bigint>,
    PRIMARY KEY (user_id)
);
{code}
I do thousands of updates like the following:
{code:sql}
UPDATE test SET first_list = [1], second_list = [2], third_list = [3] WHERE user_id = ?;
{code}
In several minutes a compaction task starts running. {{nodetool compactionstats}} shows that the remaining time is 2 seconds, but in fact it can take hours to really complete the compaction tasks. During that time Cassandra consumes 100% of CPU and slows down so significantly that it gives connection timeout exceptions to any client code trying to establish a connection with Cassandra. This happens only with tables with collection typed columns.
[jira] [Updated] (CASSANDRA-5667) Change timestamps used in CAS ballot proposals to be more resilient to clock skew
[ https://issues.apache.org/jira/browse/CASSANDRA-5667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-5667:
--------------------------------------

Attachment: (was: 5667.txt)

Change timestamps used in CAS ballot proposals to be more resilient to clock skew

Key: CASSANDRA-5667
URL: https://issues.apache.org/jira/browse/CASSANDRA-5667
Project: Cassandra
Issue Type: Improvement
Components: Core
Affects Versions: 2.0 beta 1
Environment: n/a
Reporter: Nick Puz
Assignee: Jonathan Ellis
Priority: Minor
Fix For: 2.0 beta 1

The current time is used to generate the timeuuid used for CAS ballot proposals, with the logic that if a newer proposal exists then the current one needs to complete it and re-propose. The problem is that if a machine has clock skew and drifts into the future, it will propose with a large timestamp (which will get accepted), but then subsequent proposals with lower (but correct) timestamps will not be able to proceed. This will prevent CAS write operations, and also reads at serializable consistency level. The workaround is to initially propose with the current time (current behavior), but if the proposal fails due to a larger existing one, re-propose (after completing the existing one if necessary) with the max of (currentTime, mostRecent+1, proposed+1). Since small drift is normal between different nodes in the same datacenter, this can happen even if NTP is working properly and a write hits one node while a subsequent serialized read hits another. In the case of NTP config issues (or OS bugs with time, especially around DST) the unavailability window could be much larger.
[jira] [Updated] (CASSANDRA-5667) Change timestamps used in CAS ballot proposals to be more resilient to clock skew
[ https://issues.apache.org/jira/browse/CASSANDRA-5667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-5667: -- Attachment: 5667.txt Patch attached to move contention retry into {{beginAndRepairPaxos}} and use max(current time from system clock, inProgress + 1) as the ballot. Also updates in_progress_ballot on commit if necessary to preserve the guarantee that we won't issue a promise for any ballot less than we've seen before. Change timestamps used in CAS ballot proposals to be more resilient to clock skew - Key: CASSANDRA-5667 URL: https://issues.apache.org/jira/browse/CASSANDRA-5667 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 2.0 beta 1 Environment: n/a Reporter: Nick Puz Assignee: Jonathan Ellis Priority: Minor Fix For: 2.0 beta 1 Attachments: 5667.txt The current time is used to generate the timeuuid used for CAS ballots proposals with the logic that if a newer proposal exists then the current one needs to complete that and re-propose. The problem is that if a machine has clock skew and drifts into the future it will propose with a large timestamp (which will get accepted) but then subsequent proposals with lower (but correct) timestamps will not be able to proceed. This will prevent CAS write operations and also reads at serializable consistency level. The work around is to initially propose with current time (current behavior) but if the proposal fails due to a larger existing one re-propose (after completing the existing if necessary) with the max of (currentTime, mostRecent+1, proposed+1). Since small drift is normal between different nodes in the same datacenter this can happen even if NTP is working properly and a write hits one node and a subsequent serialized read hits another. In the case of NTP config issues (or OS bugs with time esp around DST) the unavailability window could be much larger. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
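The ballot choice described in the patch comment above — max(current time from system clock, inProgress + 1) — can be sketched as follows. This is a hypothetical illustration in Python, not the actual PaxosState code; the function name and the use of microsecond timestamps are assumptions.

```python
import time

def next_ballot_micros(in_progress_micros, now_micros=None):
    # Hypothetical sketch of the CASSANDRA-5667 ballot choice: prefer the
    # system clock, but never pick a ballot at or below one we have already
    # promised, so a node whose clock drifted into the future cannot block
    # correctly-clocked proposers indefinitely.
    if now_micros is None:
        now_micros = int(time.time() * 1_000_000)
    return max(now_micros, in_progress_micros + 1)
```

With a healthy clock this returns the current time; only when the clock lags behind an already-promised ballot does it bump past it by one.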
[jira] [Reopened] (CASSANDRA-5619) CAS UPDATE for a lost race: save round trip by returning column values
[ https://issues.apache.org/jira/browse/CASSANDRA-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reopened CASSANDRA-5619:
---------------------------------------

This actually breaks the Thrift API; you can't return null from Thrift. Here's the generated code on the Python side; the Java code is similar:

{code}
if result.success is not None:
    return result.success
if result.ire is not None:
    raise result.ire
if result.ue is not None:
    raise result.ue
if result.te is not None:
    raise result.te
raise TApplicationException(TApplicationException.MISSING_RESULT, "cas failed: unknown result")
{code}

Thus, a null/None result will result in the exception at the bottom being thrown.

> CAS UPDATE for a lost race: save round trip by returning column values
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-5619
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5619
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 2.0 beta 1
>            Reporter: Blair Zajac
>            Assignee: Sylvain Lebresne
>             Fix For: 2.0 beta 1
>
>         Attachments: 5619.txt
>
> Looking at the new CAS CQL3 support examples [1]: if one loses a race for an UPDATE, could the columns that were used in the IF clause also be returned to the caller, to save a round trip to fetch the current values and decide whether the work still needs to be done? Maybe the column values in the SET part could also be returned; I don't know if that is generally useful, though. In the case of creating a new user account with a given username as the partition key, if one loses the race to another person creating an account with the same username, it doesn't matter to the loser what the column values are, just that they lost. I'm new to Cassandra, so maybe there are other use cases, such as doing incremental work on a row. In pure Java projects I've done while loops around AtomicReference.compareAndSet() until the work was done on the referenced object, so that multiple threads each make forward progress in updating the referenced object.
[1] https://github.com/riptano/cassandra-dtest/blob/master/cql_tests.py#L3044
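The compare-and-set retry loop the reporter describes can be sketched like this. The AtomicReference class here is a minimal hypothetical stand-in for java.util.concurrent.atomic.AtomicReference, written in Python for brevity; real implementations use hardware CAS rather than a lock.

```python
import threading

class AtomicReference:
    # Minimal stand-in for java.util.concurrent.atomic.AtomicReference:
    # compare_and_set succeeds only if the stored value is still the one
    # the caller observed.
    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def get(self):
        with self._lock:
            return self._value

    def compare_and_set(self, expect, update):
        with self._lock:
            if self._value is expect:
                self._value = update
                return True
            return False

def update_retrying(ref, fn):
    # Classic CAS retry loop: recompute from the freshly-read current value
    # until compare_and_set succeeds, so every thread makes progress even
    # when it loses individual races.
    while True:
        current = ref.get()
        updated = fn(current)
        if ref.compare_and_set(current, updated):
            return updated
```

Losing a race here costs only a re-read and a recompute, which is the round trip the ticket hopes to save at the CQL level by returning the winning column values directly.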
[jira] [Commented] (CASSANDRA-5661) Discard pooled readers for cold data
[ https://issues.apache.org/jira/browse/CASSANDRA-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696259#comment-13696259 ]

Jonathan Ellis commented on CASSANDRA-5661:
-------------------------------------------

bq. What I am just trying to say is that expiring with global limit as good enough even for LCS with bigger files

I don't see how that follows at all. The expiring approach is broken at any dataset size where memory pressure is a problem, since in the worst case it will not evict quickly enough.

> Discard pooled readers for cold data
> ------------------------------------
>
>                 Key: CASSANDRA-5661
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5661
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.2.1
>            Reporter: Jonathan Ellis
>            Assignee: Pavel Yaskevich
>             Fix For: 1.2.7
>
>         Attachments: DominatorTree.png, Histogram.png
>
> Reader pooling was introduced in CASSANDRA-4942, but pooled RandomAccessReaders are never cleaned up until the SSTableReader is closed. So memory use is the worst-case number of simultaneous RARs we had open for this file, forever. We should introduce a global limit on how much memory to use for RARs, and evict old ones.
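The "global limit plus eviction" idea from the comment above can be sketched as a pool with a byte cap that evicts least-recently-used readers first. This is a hypothetical illustration, not the actual Cassandra implementation; the class name, keys, and LRU policy are all assumptions.

```python
from collections import OrderedDict

class BoundedReaderPool:
    # Hypothetical sketch of a globally capped reader pool: track the buffer
    # bytes held per pooled reader, and when the total exceeds max_bytes,
    # discard the least-recently-returned readers until back under the cap.
    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._readers = OrderedDict()  # reader key -> buffer size in bytes

    def put(self, key, buffer_size):
        # Returning a reader to the pool; re-inserting moves it to the
        # most-recently-used end.
        if key in self._readers:
            self.used_bytes -= self._readers.pop(key)
        self._readers[key] = buffer_size
        self.used_bytes += buffer_size
        self._evict()

    def _evict(self):
        # Evict oldest (coldest) readers until under the global cap.
        while self.used_bytes > self.max_bytes and self._readers:
            _, size = self._readers.popitem(last=False)
            self.used_bytes -= size
```

Unlike a pure time-based expiry, the cap bounds memory directly: a burst of pooled readers is trimmed immediately rather than whenever the expiry timer fires, which is the failure mode the comment objects to.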
[jira] [Commented] (CASSANDRA-5661) Discard pooled readers for cold data
[ https://issues.apache.org/jira/browse/CASSANDRA-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696270#comment-13696270 ]

Pavel Yaskevich commented on CASSANDRA-5661:
--------------------------------------------

Right, that's what the max memory size cap is for: to make eviction more intelligent in times of memory pressure. And concurrency is limited, so as the dataset grows there wouldn't be a lot of items in each queue anyway.