[jira] [Commented] (CASSANDRA-2116) Separate out filesystem errors from generic IOErrors
[ https://issues.apache.org/jira/browse/CASSANDRA-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035111#comment-13035111 ]

Chris Goffinet commented on CASSANDRA-2116:
-------------------------------------------

Unfortunately the best we can get from Java is IOError. For example, we use this patch to detect when our RAID array dies; the OS tells Java to throw IOError. I think we should err on the side of: if data is corrupt, let the operator decide what mode he wants. For us, on any errors or any corruption of data, we want to take the node out right away. We have been testing this in production for a while and it works really well when disks die; we also ran tests that involved removing drives from the system while it was serving traffic. The Read/Write classes were a similar idea to how the Hadoop code base handles this very issue.

> Separate out filesystem errors from generic IOErrors
> ----------------------------------------------------
>
>                 Key: CASSANDRA-2116
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2116
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Chris Goffinet
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-Separate-out-filesystem-errors-from-generic-IOErrors.patch
>
>
> We throw IOErrors everywhere today in the codebase. We should separate out
> specific errors such as (reading, writing) from filesystem into FSReadError
> and FSWriteError. This makes it possible in the next ticket to allow certain
> failure modes (kill the server if reads or writes fail to disk).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2616) Add "DROP INDEX" command to CLI
[ https://issues.apache.org/jira/browse/CASSANDRA-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035108#comment-13035108 ]

Pavel Yaskevich commented on CASSANDRA-2616:
--------------------------------------------

Oh, I didn't know that Jackson had created an issue about that (I was deleting those files because we ran into an issue of indexes not being dropped properly). I will simplify everything then; thanks for pointing it out!

> Add "DROP INDEX" command to CLI
> -------------------------------
>
>                 Key: CASSANDRA-2616
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2616
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Pavel Yaskevich
>            Assignee: Pavel Yaskevich
>             Fix For: 0.8.1
>
>         Attachments: CASSANDRA-2616.patch
[jira] [Commented] (CASSANDRA-2616) Add "DROP INDEX" command to CLI
[ https://issues.apache.org/jira/browse/CASSANDRA-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035107#comment-13035107 ]

Jonathan Ellis commented on CASSANDRA-2616:
-------------------------------------------

If you just take the index definition out of the metadata, Cassandra will do the right thing (and mark those sstables as deleted). See CFS.reload / CFS.removeIndex (and CASSANDRA-2619, which fixed some bugs here).

> Add "DROP INDEX" command to CLI
> -------------------------------
>
>                 Key: CASSANDRA-2616
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2616
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Pavel Yaskevich
>            Assignee: Pavel Yaskevich
>             Fix For: 0.8.1
>
>         Attachments: CASSANDRA-2616.patch
[jira] [Updated] (CASSANDRA-2641) AbstractBounds.normalize should deal with overlapping ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2641:
--------------------------------------

         Reviewer: slebresne
    Fix Version/s: (was: 1.0)
                   0.8.1

> AbstractBounds.normalize should deal with overlapping ranges
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-2641
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2641
>             Project: Cassandra
>          Issue Type: Test
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>            Priority: Minor
>             Fix For: 0.8.1
>
>         Attachments: 0001-Assert-non-overlapping-ranges-in-normalize.txt,
>                      0002-Don-t-use-overlapping-ranges-in-tests.txt
>
>
> Apparently no consumers have encountered it in production, but
> AbstractBounds.normalize does not handle overlapping ranges. If given
> overlapping ranges, the output will be sorted but still overlapping, in which
> case SSTableReader.getPositionsForRanges will choose ranges in an SSTable
> that may overlap.
> We should add an assert, either in normalize() or in getPositionsForRanges(),
> to ensure that this never bites us in production.
[jira] [Commented] (CASSANDRA-2116) Separate out filesystem errors from generic IOErrors
[ https://issues.apache.org/jira/browse/CASSANDRA-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035102#comment-13035102 ]

Jonathan Ellis commented on CASSANDRA-2116:
-------------------------------------------

I'm not sure having different classes for read/write errors is necessary (code that is in a position to catch-and-do-something-reasonable knows what kind of op it's attempting). On the other hand, if a write op does a read as part of its implementation (indexes cause this to happen), we might need to distinguish the two.

I think it's more useful to distinguish between recoverable errors and non-recoverable ones: "I got EOF earlier than I expected" usually means the file is corrupt, not that the disk is dead. (I can't think of any read errors that absolutely mean disk-is-dead.) It would be useful to get some mileage out of Java's misguided checked exceptions, by keeping recoverable errors checked (IOException) and unrecoverable ones unchecked (IOError).

> Separate out filesystem errors from generic IOErrors
> ----------------------------------------------------
>
>                 Key: CASSANDRA-2116
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2116
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Chris Goffinet
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-Separate-out-filesystem-errors-from-generic-IOErrors.patch
>
>
> We throw IOErrors everywhere today in the codebase. We should separate out
> specific errors such as (reading, writing) from filesystem into FSReadError
> and FSWriteError. This makes it possible in the next ticket to allow certain
> failure modes (kill the server if reads or writes fail to disk).
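The split Jonathan sketches (recoverable errors stay checked IOExceptions, unrecoverable filesystem failures become unchecked) can be illustrated in a few lines. The class names FSReadError/FSWriteError come from the ticket itself; the helper class and its rethrow policy are illustrative assumptions, not the committed patch:

```java
import java.io.EOFException;
import java.io.IOException;

// Unrecoverable: the filesystem itself failed, so extend the unchecked IOError.
class FSReadError extends IOError
{
    FSReadError(Throwable cause) { super(cause); }
}

class FSWriteError extends IOError
{
    FSWriteError(Throwable cause) { super(cause); }
}

// Hypothetical helper showing the checked/unchecked policy: early EOF is
// treated as probable corruption (recoverable, stays checked); any other
// filesystem error is rethrown unchecked so the node can be taken down.
final class ErrorPolicy
{
    static void rethrowRead(IOException e) throws IOException
    {
        if (e instanceof EOFException)
            throw e;               // file is probably corrupt; caller may scrub/skip
        throw new FSReadError(e);  // disk is probably dead; let the operator's policy kick in
    }
}
```

The point of keeping corruption checked is that callers are forced to decide what to do about it, while a dead disk propagates as an Error past any casual catch block.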
[jira] [Commented] (CASSANDRA-2616) Add "DROP INDEX" command to CLI
[ https://issues.apache.org/jira/browse/CASSANDRA-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035101#comment-13035101 ]

Pavel Yaskevich commented on CASSANDRA-2616:
--------------------------------------------

Ok, should we delete the SSTables used by the index, or let them be?

> Add "DROP INDEX" command to CLI
> -------------------------------
>
>                 Key: CASSANDRA-2616
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2616
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Pavel Yaskevich
>            Assignee: Pavel Yaskevich
>             Fix For: 0.8.1
>
>         Attachments: CASSANDRA-2616.patch
[jira] [Commented] (CASSANDRA-2616) Add "DROP INDEX" command to CLI
[ https://issues.apache.org/jira/browse/CASSANDRA-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035099#comment-13035099 ]

Jonathan Ellis commented on CASSANDRA-2616:
-------------------------------------------

IMO we should do this at the client level (by creating appropriate metadata objects), not by adding a new thrift call.

> Add "DROP INDEX" command to CLI
> -------------------------------
>
>                 Key: CASSANDRA-2616
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2616
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Pavel Yaskevich
>            Assignee: Pavel Yaskevich
>             Fix For: 0.8.1
>
>         Attachments: CASSANDRA-2616.patch
[jira] [Commented] (CASSANDRA-2644) Make bootstrap retry
[ https://issues.apache.org/jira/browse/CASSANDRA-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035098#comment-13035098 ]

Jonathan Ellis commented on CASSANDRA-2644:
-------------------------------------------

bq. But there are still cases that retries will recover from... flapping/down nodes

Fair enough, but increasing the timeout is still unwarranted. Let's just make it wait for max(DEFAULT_TIMEOUT, BOOTSTRAP_TIMEOUT), with B_T equal to, say, 30s.

Committed patch 01 to the 0.8.1 branch, btw.

> Make bootstrap retry
> --------------------
>
>                 Key: CASSANDRA-2644
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2644
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.8.0 beta 2
>            Reporter: Chris Goffinet
>            Assignee: Chris Goffinet
>             Fix For: 0.8.1
>
>         Attachments: 0001-Make-ExpiringMap-have-objects-with-specific-timeouts.patch,
>                      0002-Make-bootstrap-retry-and-increment-timeout-for-every.patch
>
>
> We ran into a situation where we had rpc_timeout set to 1 second, and the
> node needing to compute the token took over a second (1.6 seconds). The
> bootstrapping node hangs forever without getting a token because the expiring
> map removes it before the reply comes back.
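The proposed wait amounts to a one-line computation. In this sketch, the 30s floor comes from the comment above and getDefaultCallbackTimeout() is the accessor added in r1104598; the class and method names themselves are hypothetical:

```java
// Hypothetical sketch: the bootstrap token request waits for the larger of
// the normal callback timeout and a bootstrap-specific floor, instead of
// raising the global timeout for everyone.
final class BootstrapTimeout
{
    // B_T from the comment: "equal to, say, 30s"
    static final long BOOTSTRAP_TIMEOUT_MS = 30_000L;

    static long tokenRequestTimeout(long defaultCallbackTimeoutMs)
    {
        return Math.max(defaultCallbackTimeoutMs, BOOTSTRAP_TIMEOUT_MS);
    }
}
```

With the default 1.1 * rpc_timeout of roughly 1.1s from the scenario in the description, this yields the 30s floor, while clusters that already run with a longer rpc_timeout keep their larger value.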
[jira] [Updated] (CASSANDRA-2481) C* .deb installs C* init.d scripts such that C* comes up before mdadm and related
[ https://issues.apache.org/jira/browse/CASSANDRA-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2481:
--------------------------------------

    Affects Version/s: (was: 0.7.0)
        Fix Version/s: 0.8.0

> C* .deb installs C* init.d scripts such that C* comes up before mdadm and
> related
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2481
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2481
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Packaging
>            Reporter: Matthew F. Dennis
>            Assignee: paul cannon
>            Priority: Minor
>             Fix For: 0.7.6, 0.8.0
>
>         Attachments: 2481.txt
>
>
> The C* .deb packages install the init.d scripts at S20, which is before mdadm
> and various other services. This means that when a node reboots, C* is
> started before the RAID sets are up and mounted, causing C* to think it has
> no data and attempt bootstrapping again.
svn commit: r1104598 - in /cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra: net/MessagingService.java utils/ExpiringMap.java
Author: jbellis
Date: Tue May 17 22:14:35 2011
New Revision: 1104598

URL: http://svn.apache.org/viewvc?rev=1104598&view=rev
Log: add per-callback timeouts to ExpiringMap

Modified:
    cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/net/MessagingService.java
    cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/utils/ExpiringMap.java

Modified: cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/net/MessagingService.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/net/MessagingService.java?rev=1104598&r1=1104597&r2=1104598&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/net/MessagingService.java (original)
+++ cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/net/MessagingService.java Tue May 17 22:14:35 2011
@@ -83,6 +83,7 @@ public final class MessagingService impl
     private final SimpleCondition listenGate;
     private final Map droppedMessages = new EnumMap(StorageService.Verb.class);
     private final List subscribers = new ArrayList();
+    private static final long DEFAULT_CALLBACK_TIMEOUT = (long) (1.1 * DatabaseDescriptor.getRpcTimeout());

     {
         for (StorageService.Verb verb : StorageService.Verb.values())
@@ -121,7 +122,7 @@ public final class MessagingService impl
                 return null;
             }
         };
-        callbacks = new ExpiringMap>((long) (1.1 * DatabaseDescriptor.getRpcTimeout()), timeoutReporter);
+        callbacks = new ExpiringMap>(DEFAULT_CALLBACK_TIMEOUT, timeoutReporter);

         MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
         try
@@ -256,7 +257,12 @@ public final class MessagingService impl
     private void addCallback(IMessageCallback cb, String messageId, InetAddress to)
     {
-        Pair previous = callbacks.put(messageId, new Pair(to, cb));
+        addCallback(cb, messageId, to, DEFAULT_CALLBACK_TIMEOUT);
+    }
+
+    private void addCallback(IMessageCallback cb, String messageId, InetAddress to, long timeout)
+    {
+        Pair previous = callbacks.put(messageId, new Pair(to, cb), timeout);
         assert previous == null;
     }

@@ -267,6 +273,14 @@ public final class MessagingService impl
         return Integer.toString(idGen.incrementAndGet());
     }

+    /*
+     * @see #sendRR(Message message, InetAddress to, IMessageCallback cb, long timeout)
+     */
+    public String sendRR(Message message, InetAddress to, IMessageCallback cb)
+    {
+        return sendRR(message, to, cb, DEFAULT_CALLBACK_TIMEOUT);
+    }
+
     /**
      * Send a message to a given endpoint. This method specifies a callback
      * which is invoked with the actual response.
@@ -275,12 +289,13 @@ public final class MessagingService impl
      * @param cb callback interface which is used to pass the responses or
      *           suggest that a timeout occurred to the invoker of the send().
      *           suggest that a timeout occurred to the invoker of the send().
+     * @param timeout the timeout used for expiration
      * @return an reference to message id used to match with the result
      */
-    public String sendRR(Message message, InetAddress to, IMessageCallback cb)
+    public String sendRR(Message message, InetAddress to, IMessageCallback cb, long timeout)
     {
         String id = nextId();
-        addCallback(cb, id, to);
+        addCallback(cb, id, to, timeout);
         sendOneWay(message, id, to);
         return id;
     }
@@ -624,4 +639,9 @@ public final class MessagingService impl
         completedTasks.put(entry.getKey().getHostAddress(), entry.getValue().ackCon.getCompletedMesssages());
         return completedTasks;
     }
+
+    public static long getDefaultCallbackTimeout()
+    {
+        return DEFAULT_CALLBACK_TIMEOUT;
+    }
 }

Modified: cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/utils/ExpiringMap.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/utils/ExpiringMap.java?rev=1104598&r1=1104597&r2=1104598&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/utils/ExpiringMap.java (original)
+++ cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/utils/ExpiringMap.java Tue May 17 22:14:35 2011
@@ -32,11 +32,13 @@ public class ExpiringMap
         private final T value;
         private final long age;
+        private final long expiration;

-        CacheableObject(T o)
+        CacheableObject(T o, long e)
         {
             assert o != null;
             value = o;
+            expiration = e;
             age = System.currentTimeMillis();
         }

@@ -45,26 +47,21 @@ public class ExpiringMap
             return value;
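The heart of the commit above is that each cached object now carries its own expiration instead of the map applying one shared period. A stripped-down, hypothetical sketch of that idea follows (field names mirror the diff; the clock is passed in explicitly so the behavior can be exercised deterministically, which the real class does not do):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of a map with per-entry timeouts, in the spirit of the
// ExpiringMap change in r1104598. Not the actual Cassandra class.
final class SimpleExpiringMap<K, V>
{
    private static final class CacheableObject<T>
    {
        final T value;
        final long expiration; // per-entry timeout in ms (the field added by the patch)
        final long age;        // insertion timestamp

        CacheableObject(T o, long expiration, long now)
        {
            assert o != null;
            this.value = o;
            this.expiration = expiration;
            this.age = now;
        }

        boolean isReadyToDie(long now)
        {
            return now - age > expiration;
        }
    }

    private final Map<K, CacheableObject<V>> cache = new ConcurrentHashMap<>();

    public void put(K key, V value, long timeoutMs, long now)
    {
        cache.put(key, new CacheableObject<>(value, timeoutMs, now));
    }

    /** Returns the value, or null if absent or already expired at {@code now}. */
    public V get(K key, long now)
    {
        CacheableObject<V> co = cache.get(key);
        if (co == null || co.isReadyToDie(now))
            return null;
        return co.value;
    }
}
```

This is what lets a bootstrap token request (CASSANDRA-2644) register a callback with a longer timeout than ordinary RPC callbacks sharing the same map.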
svn commit: r1104597 - in /cassandra/branches/cassandra-0.8.1: ./ conf/ contrib/ debian/ doc/cql/ drivers/py/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/cli/
Author: jbellis
Date: Tue May 17 22:12:59 2011
New Revision: 1104597

URL: http://svn.apache.org/viewvc?rev=1104597&view=rev
Log: merge from 0.8

Removed:
    cassandra/branches/cassandra-0.8.1/doc/cql/CQL.html
Modified:
    cassandra/branches/cassandra-0.8.1/   (props changed)
    cassandra/branches/cassandra-0.8.1/CHANGES.txt
    cassandra/branches/cassandra-0.8.1/NEWS.txt
    cassandra/branches/cassandra-0.8.1/build.xml
    cassandra/branches/cassandra-0.8.1/conf/cassandra.yaml
    cassandra/branches/cassandra-0.8.1/contrib/   (props changed)
    cassandra/branches/cassandra-0.8.1/debian/changelog
    cassandra/branches/cassandra-0.8.1/debian/init
    cassandra/branches/cassandra-0.8.1/debian/rules
    cassandra/branches/cassandra-0.8.1/doc/cql/CQL.textile
    cassandra/branches/cassandra-0.8.1/drivers/py/cqlsh
    cassandra/branches/cassandra-0.8.1/drivers/py/setup.py
    cassandra/branches/cassandra-0.8.1/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java   (props changed)
    cassandra/branches/cassandra-0.8.1/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java   (props changed)
    cassandra/branches/cassandra-0.8.1/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java   (props changed)
    cassandra/branches/cassandra-0.8.1/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java   (props changed)
    cassandra/branches/cassandra-0.8.1/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java   (props changed)
    cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/cli/Cli.g
    cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/cli/CliClient.java
    cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/cql/Cql.g
    cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
    cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/db/CompactionManager.java
    cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/db/HintedHandOffManager.java
    cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/io/sstable/SSTable.java
    cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/io/util/BufferedRandomAccessFile.java
    cassandra/branches/cassandra-0.8.1/src/resources/org/apache/cassandra/cli/CliHelp.yaml
    cassandra/branches/cassandra-0.8.1/test/unit/org/apache/cassandra/cli/CliTest.java

Propchange: cassandra/branches/cassandra-0.8.1/
------------------------------------------------------------------------------
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue May 17 22:12:59 2011
@@ -1,7 +1,7 @@
 /cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1081914,1083000
-/cassandra/branches/cassandra-0.7:1026516-1102046,1102337
+/cassandra/branches/cassandra-0.7:1026516-1104371
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
-/cassandra/branches/cassandra-0.8:1090935-1102339,1102345
+/cassandra/branches/cassandra-0.8:1090935-1104595
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689
 /incubator/cassandra/branches/cassandra-0.3:774578-796573
 /incubator/cassandra/branches/cassandra-0.4:810145-834239,834349-834350

Modified: cassandra/branches/cassandra-0.8.1/CHANGES.txt
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8.1/CHANGES.txt?rev=1104597&r1=1104596&r2=1104597&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.8.1/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.8.1/CHANGES.txt Tue May 17 22:12:59 2011
@@ -10,6 +10,14 @@
    (CASSANDRA-2583)

+0.8-?
+ * adjust hinted handoff page size to avoid OOM with large columns
+   (CASSANDRA-2652)
+ * update CQL consistency levels (CASSANDRA-2566)
+ * mark BRAF buffer invalid post-flush so we don't re-flush partial
+   buffers again, especially on CL writes (CASSANDRA-2660)
+
+
 0.8.0-rc1
  * faster flushes and compaction from fixing excessively pessimistic
    rebuffering in BRAF (CASSANDRA-2581)
@@ -37,8 +45,10 @@
  * initialize local ep state prior to gossip startup if needed (CASSANDRA-2638)
  * fix counter increment lost after restart (CASSANDRA-2642)
  * add quote-escaping via backslash to CLI (CASSANDRA-2623)
- * fig pig example script (CASSANDRA-2487)
+ * fix pig example script (CASSANDRA-2487)
  * fix dynamic snitch race in adding latencies (CASSANDRA-2618)
+ * Start/stop cassandra after more important services such as mdadm in
+   debian packaging (CASSANDRA-2481)

 0.8.0-beta2

Modified: cassandra/branches/cassandra-0.8.1/NEWS.txt
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8.1/NEWS.txt?rev=1104597&r1=1104596&r2=1104597&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.8.1/NEWS.txt (original)
+++ cassandra/branches/cassandra-0.8.1/NEWS.txt Tue May 17 22:12:59 2011
@@ -62,6 +62,15 @@
svn commit: r1104594 - in /cassandra/site: publish/download/index.html src/settings.py
Author: eevans
Date: Tue May 17 22:08:32 2011
New Revision: 1104594

URL: http://svn.apache.org/viewvc?rev=1104594&view=rev
Log: update site versioning for 0.8.0 rc1 release

Modified:
    cassandra/site/publish/download/index.html
    cassandra/site/src/settings.py

Modified: cassandra/site/publish/download/index.html
URL: http://svn.apache.org/viewvc/cassandra/site/publish/download/index.html?rev=1104594&r1=1104593&r2=1104594&view=diff
==============================================================================
--- cassandra/site/publish/download/index.html (original)
+++ cassandra/site/publish/download/index.html Tue May 17 22:08:32 2011
@@ -103,22 +103,22 @@
-  The latest development release is 0.8.0-beta2 (released on
-  2011-05-05).
+  The latest development release is 0.8.0-rc1 (released on
+  2011-05-17).

-http://www.apache.org/dyn/closer.cgi?path=/cassandra/0.8.0/apache-cassandra-0.8.0-beta2-bin.tar.gz";>apache-cassandra-0.8.0-beta2-bin.tar.gz
-[http://www.apache.org/dist/cassandra/0.8.0/apache-cassandra-0.8.0-beta2-bin.tar.gz.asc";>PGP]
-[http://www.apache.org/dist/cassandra/0.8.0/apache-cassandra-0.8.0-beta2-bin.tar.gz.md5";>MD5]
-[http://www.apache.org/dist/cassandra/0.8.0/apache-cassandra-0.8.0-beta2-bin.tar.gz.sha";>SHA1]
+http://www.apache.org/dyn/closer.cgi?path=/cassandra/0.8.0/apache-cassandra-0.8.0-rc1-bin.tar.gz";>apache-cassandra-0.8.0-rc1-bin.tar.gz
+[http://www.apache.org/dist/cassandra/0.8.0/apache-cassandra-0.8.0-rc1-bin.tar.gz.asc";>PGP]
+[http://www.apache.org/dist/cassandra/0.8.0/apache-cassandra-0.8.0-rc1-bin.tar.gz.md5";>MD5]
+[http://www.apache.org/dist/cassandra/0.8.0/apache-cassandra-0.8.0-rc1-bin.tar.gz.sha";>SHA1]

-http://www.apache.org/dyn/closer.cgi?path=/cassandra/0.8.0/apache-cassandra-0.8.0-beta2-src.tar.gz";>apache-cassandra-0.8.0-beta2-src.tar.gz
-[http://www.apache.org/dist/cassandra/0.8.0/apache-cassandra-0.8.0-beta2-src.tar.gz.asc";>PGP]
-[http://www.apache.org/dist/cassandra/0.8.0/apache-cassandra-0.8.0-beta2-src.tar.gz.md5";>MD5]
-[http://www.apache.org/dist/cassandra/0.8.0/apache-cassandra-0.8.0-beta2-src.tar.gz.sha";>SHA1]
+http://www.apache.org/dyn/closer.cgi?path=/cassandra/0.8.0/apache-cassandra-0.8.0-rc1-src.tar.gz";>apache-cassandra-0.8.0-rc1-src.tar.gz
+[http://www.apache.org/dist/cassandra/0.8.0/apache-cassandra-0.8.0-rc1-src.tar.gz.asc";>PGP]
+[http://www.apache.org/dist/cassandra/0.8.0/apache-cassandra-0.8.0-rc1-src.tar.gz.md5";>MD5]
+[http://www.apache.org/dist/cassandra/0.8.0/apache-cassandra-0.8.0-rc1-src.tar.gz.sha";>SHA1]

Modified: cassandra/site/src/settings.py
URL: http://svn.apache.org/viewvc/cassandra/site/src/settings.py?rev=1104594&r1=1104593&r2=1104594&view=diff
==============================================================================
--- cassandra/site/src/settings.py (original)
+++ cassandra/site/src/settings.py Tue May 17 22:08:32 2011
@@ -97,8 +97,8 @@ class CassandraDef(object):
     oldstable_exists = True
     stable_version = '0.7.5'
     stable_release_date = '2011-04-27'
-    devel_version = '0.8.0-beta2'
-    devel_release_date = '2011-05-05'
+    devel_version = '0.8.0-rc1'
+    devel_release_date = '2011-05-17'
     devel_exists = True
     _apache_base_url = 'http://www.apache.org'
     _svn_base_url = 'https://svn.apache.org/repos/asf'
svn commit: r1104576 - /cassandra/tags/cassandra-0.8.0-rc1/
Author: eevans
Date: Tue May 17 21:40:35 2011
New Revision: 1104576

URL: http://svn.apache.org/viewvc?rev=1104576&view=rev
Log: tagging 0.8.0 rc1

Added:
    cassandra/tags/cassandra-0.8.0-rc1/
      - copied from r1102510, cassandra/branches/cassandra-0.8/
[jira] [Updated] (CASSANDRA-833) fix consistencylevel during bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-833:
-------------------------------------

    Attachment: 833-v2.txt

v2 tweaks getWriteEndpoints to avoid new Collection creation where possible, using Iterables.concat instead. Otherwise lgtm.

> fix consistencylevel during bootstrap
> -------------------------------------
>
>                 Key: CASSANDRA-833
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-833
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.5
>            Reporter: Jonathan Ellis
>            Assignee: Sylvain Lebresne
>             Fix For: 0.8.1
>
>         Attachments: 0001-Increase-CL-with-boostrapping-leaving-node.patch,
>                      833-v2.txt
>
>
> As originally designed, bootstrap nodes should *always* get *all* writes
> under any consistencylevel, so when bootstrap finishes the operator can run
> cleanup on the old nodes w/o fear that he might lose data.
> But if a bootstrap operation fails or is aborted, that means all writes will
> fail until the ex-bootstrapping node is decommissioned. So starting in
> CASSANDRA-722, we just ignore dead nodes in consistencylevel calculations,
> but this breaks the original design. CASSANDRA-822 adds a partial fix for
> this (just adding bootstrap targets into the RF targets and hinting
> normally), but this is still broken under certain conditions. The real fix
> is to consider consistencylevel for two sets of nodes:
> 1. the RF targets as currently existing (no pending ranges)
> 2. the RF targets as they will exist after all movement ops are done
> If we satisfy CL for both sets then we will always be in good shape.
> I'm not sure if we can easily calculate 2. from the current TokenMetadata,
> though.
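Iterables.concat (from Guava, as used in the v2 patch) returns a lazy view over both collections rather than copying them into a freshly allocated one. A dependency-free sketch of the same idea, with hypothetical names:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Lazy view over two iterables, in the spirit of Guava's Iterables.concat:
// no merged collection is allocated; callers simply iterate both in order.
final class Concat
{
    static <T> Iterable<T> concat(Iterable<? extends T> first, Iterable<? extends T> second)
    {
        return () -> new Iterator<T>()
        {
            private Iterator<? extends T> current = first.iterator();
            private boolean onSecond = false;

            public boolean hasNext()
            {
                // Fall through to the second iterable exactly once.
                if (!current.hasNext() && !onSecond)
                {
                    current = second.iterator();
                    onSecond = true;
                }
                return current.hasNext();
            }

            public T next()
            {
                hasNext(); // advance to the second iterable if needed
                return current.next();
            }
        };
    }
}
```

For getWriteEndpoints this means the natural endpoints and the pending (bootstrapping) endpoints can be handed to the write path as one sequence without building an intermediate list on every write.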
[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.
[ https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035053#comment-13035053 ]

T Jake Luciani commented on CASSANDRA-2388:
-------------------------------------------

We need to return the list if replicas in the same DC

> ColumnFamilyRecordReader fails for a given split because a host is down, even
> if records could reasonably be read from other replica.
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2388
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Eldon Stegall
>              Labels: hadoop, inputformat
>             Fix For: 0.8.1
>
>         Attachments: 0002_On_TException_try_next_split.patch
>
>
> ColumnFamilyRecordReader only tries the first location for a given split. We
> should try multiple locations for a given split.
[jira] [Issue Comment Edited] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.
[ https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035053#comment-13035053 ]

T Jake Luciani edited comment on CASSANDRA-2388 at 5/17/11 9:09 PM:
--------------------------------------------------------------------

We need to return the list of replicas in the same DC

was (Author: tjake):
We need to return the list if replicas in the same DC

> ColumnFamilyRecordReader fails for a given split because a host is down, even
> if records could reasonably be read from other replica.
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2388
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Eldon Stegall
>              Labels: hadoop, inputformat
>             Fix For: 0.8.1
>
>         Attachments: 0002_On_TException_try_next_split.patch
>
>
> ColumnFamilyRecordReader only tries the first location for a given split. We
> should try multiple locations for a given split.
[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back
[ https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035041#comment-13035041 ]

Jonathan Ellis commented on CASSANDRA-2045:
-------------------------------------------

bq. Doesn't this mean that, given a very unstable cluster (e.g. EC2) writes using CL.ANY can cause nodes to fill up with data unexpectedly quickly?

Sort of. It means you can fill up by at most 1/RF faster than you thought, yes, since rows can only be stored on at most one node that is not a replica (the coordinator). The correct fix for that is "stabilize your cluster." :)

bq. It's probably a good idea to try to retain backwards compatibility here as much as possible so that rolling upgrades of a cluster is possible

Right, but as discussed above we're not planning to move to materialized hints entirely, so ripping out "classic" hints isn't an option anyway.

bq. I think Edward's idea of storing hints in a per-node CommitLog is a pretty elegant solution, unfortunately it's quite a lot more invasive and would be a nightmare for maintaining backwards compatibility.

Serialized mutation objects as columns in a row are pretty close to commitlog format, except that you can query them with normal tools.

> Simplify HH to decrease read load when nodes come back
> ------------------------------------------------------
>
>                 Key: CASSANDRA-2045
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Chris Goffinet
>             Fix For: 1.0
>
>
> Currently when HH is enabled, hints are stored, and when a node comes back,
> we begin sending that node data. We do a lookup on the local node for the row
> to send. To help reduce read load (if a node is offline for a long period of
> time) we should store the data we want to forward to the node locally
> instead. We wouldn't have to do any lookups, just take the byte[] and send it
> to the destination.
[jira] [Updated] (CASSANDRA-2433) Failed Streams Break Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2433:
--------------------------------------

             Reviewer: stuhood
          Component/s: Core
    Affects Version/s: (was: 0.7.4)

> Failed Streams Break Repair
> ---------------------------
>
>                 Key: CASSANDRA-2433
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2433
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Benjamin Coverston
>            Assignee: Sylvain Lebresne
>              Labels: repair
>             Fix For: 0.8.1
>
>         Attachments: 0001-Put-repair-session-on-a-Stage-and-add-a-method-to-re-v2.patch,
>                      0001-Put-repair-session-on-a-Stage-and-add-a-method-to-re.patch,
>                      0002-Register-in-gossip-to-handle-node-failures-v2.patch,
>                      0002-Register-in-gossip-to-handle-node-failures.patch,
>                      0003-Report-streaming-errors-back-to-repair-v2.patch,
>                      0003-Report-streaming-errors-back-to-repair.patch,
>                      0004-Reports-validation-compaction-errors-back-to-repair-v2.patch,
>                      0004-Reports-validation-compaction-errors-back-to-repair.patch
>
>
> Running repair in cases where a stream fails, we are seeing multiple problems.
> 1. Although retry is initiated and completes, the old stream doesn't seem to
>    clean itself up and repair hangs.
> 2. The temp files are left behind, and multiple failures can end up filling
>    up the data partition.
> Together, these issues are making repair very difficult for nearly everyone
> running it on a non-trivially sized data set.
> This issue is also being worked on w.r.t. CASSANDRA-2088; however, that was
> moved to 0.8 for a few reasons. This ticket is to fix the immediate issues
> that we are seeing in 0.7.
[jira] [Commented] (CASSANDRA-2643) read repair/reconciliation breaks slice based iteration at QUORUM
[ https://issues.apache.org/jira/browse/CASSANDRA-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035030#comment-13035030 ] Peter Schuller commented on CASSANDRA-2643: --- You're right of course - my example was bogus. I'll also agree about re-try being reasonable under the circumstances, though perhaps not optimal. With regards to the fix. Let me just make sure I understand you correctly. So given a read command with a limit N that yields read repair/reconciliation breaks slice based iteration at QUORUM > - > > Key: CASSANDRA-2643 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2643 > Project: Cassandra > Issue Type: Bug >Affects Versions: 0.7.5 >Reporter: Peter Schuller >Priority: Critical > Attachments: short_read.sh, slicetest.py > > > In short, I believe iterating over columns is impossible to do reliably with > QUORUM due to the way reconciliation works. > The problem is that the SliceQueryFilter is executing locally when reading on > a node, but no attempts seem to be made to consider limits when doing > reconciliation and/or read-repair (RowRepairResolver.resolveSuperset() and > ColumnFamily.resolve()). > If a node slices and comes up with 100 columns, and another node slices and > comes up with 100 columns, some of which are unique to each side, > reconciliation results in > 100 columns in the result set. In this case the > effect is limited to "client gets more than asked for", but the columns still > accurately represent the range. This is easily triggered by my test-case. > In addition to the client receiving "too many" columns, I believe some of > them will not be satisfying the QUORUM consistency level for the same reasons > as with deletions (see discussion below). > Now, there *should* be a problem for tombstones as well, but it's more > subtle. 
Suppose A has: > 1 > 2 > 3 > 4 > 5 > 6 > and B has: > 1 > del 2 > del 3 > del 4 > 5 > 6 > If you now slice 1-6 with count=3 the tombstones from B will reconcile with > those from A - fine. So you end up getting 1,5,6 back. This made it a bit > difficult to trigger in a test case until I realized what was going on. At > first I was "hoping" to see a "short" iteration result, which would mean that > the process of iterating until you get a short result will cause spurious > "end of columns" and thus make it impossible to iterate correctly. > So; due to 5-6 existing (and if they didn't, you legitimately reached > end-of-columns) we do indeed get a result of size 3 which contains 1,5 and 6. > However, only node B would have contributed columns 5 and 6; so there is > actually no QUORUM consistency on the co-ordinating node with respect to > these columns. If node A and C also had 5 and 6, they would not have been > considered. > Am I wrong? > In any case; using script I'm about to attach, you can trigger the > over-delivery case very easily: > (0) disable hinted hand-off to avoid that interacting with the test > (1) start three nodes > (2) create ks 'test' with rf=3 and cf 'slicetest' > (3) ./slicetest.py hostname_of_node_C insert # let it run for a few seconds, > then ctrl-c > (4) stop node A > (5) ./slicetest.py hostname_of_node_C insert # let it run for a few seconds, > then ctrl-c > (6) start node A, wait for B and C to consider it up > (7) ./slicetest.py hostname_of_node_A slice # make A co-ordinator though it > doesn't necessarily matter > You can also pass 'delete' (random deletion of 50% of contents) or > 'deleterange' (delete all in [0.2,0.8]) to slicetest, but you don't trigger a > short read by doing that (see discussion above). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
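The tombstone scenario above can be modeled in a few lines (a hypothetical Python simplification, not the actual RowRepairResolver/SliceQueryFilter code): each node slices up to `count` live columns locally, and the coordinator merges the replies without re-applying the limit or tracking which replicas contributed each column.

```python
TOMBSTONE = object()  # marker for a deleted column

def local_slice(columns, count):
    """Return up to `count` live columns in name order, carrying along
    any tombstones encountered on the way (as a node's local slice would)."""
    out, live = {}, 0
    for name in sorted(columns):
        out[name] = columns[name]
        if columns[name] is not TOMBSTONE:
            live += 1
            if live == count:
                break
    return out

def reconcile(*replies):
    """Merge replica replies; tombstones win (the deletes are newer here).
    No attempt is made to re-apply the slice limit -- that is the bug."""
    merged = {}
    for reply in replies:
        for name, value in reply.items():
            if value is TOMBSTONE or name not in merged:
                merged[name] = value
    return sorted(n for n, v in merged.items() if v is not TOMBSTONE)

# Node A: columns 1..6 live.  Node B: 2, 3, 4 deleted.
A = {n: "v" for n in range(1, 7)}
B = {1: "v", 2: TOMBSTONE, 3: TOMBSTONE, 4: TOMBSTONE, 5: "v", 6: "v"}

# Slice 1-6 with count=3: the coordinator returns [1, 5, 6], but columns
# 5 and 6 were only ever read from node B -- no quorum on them.
result = reconcile(local_slice(A, 3), local_slice(B, 3))
```

Note how A's local slice stops at 1, 2, 3, so it never contributes 5 and 6 even though it has them; the coordinator cannot tell that those columns were read from only one replica.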
[jira] [Updated] (CASSANDRA-2661) Canary CLHM v1.2
[ https://issues.apache.org/jira/browse/CASSANDRA-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Manes updated CASSANDRA-2661: -- Attachment: clhm-20110517.jar Packaged at r628 > Canary CLHM v1.2 > > > Key: CASSANDRA-2661 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2661 > Project: Cassandra > Issue Type: Task >Reporter: Benjamin Manes > Attachments: clhm-20110517.jar > > > I am hoping to release ConcurrentLinkedHashMap v1.2 by the end of the week. > This task is optional, but gives you the opportunity to canary the library > and provide any final feedback. There are currently 285 tests (some threaded) > plus a load test, so reliability-wise I'm fairly confident. > This release has numerous performance improvements. See the change log for > details. > It also includes a few useful features that may be of interest, > - Snapshot iteration in order of hotness (CASSANDRA-1966) > - Optionally defer LRU maintenance penalty to a background executor (instead > of amortized on caller threads) > http://code.google.com/p/concurrentlinkedhashmap/ -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CASSANDRA-2661) Canary CLHM v1.2
Canary CLHM v1.2 Key: CASSANDRA-2661 URL: https://issues.apache.org/jira/browse/CASSANDRA-2661 Project: Cassandra Issue Type: Task Reporter: Benjamin Manes I am hoping to release ConcurrentLinkedHashMap v1.2 by the end of the week. This task is optional, but gives you the opportunity to canary the library and provide any final feedback. There are currently 285 tests (some threaded) plus a load test, so reliability-wise I'm fairly confident. This release has numerous performance improvements. See the change log for details. It also includes a few useful features that may be of interest, - Snapshot iteration in order of hotness (CASSANDRA-1966) - Optionally defer LRU maintenance penalty to a background executor (instead of amortized on caller threads) http://code.google.com/p/concurrentlinkedhashmap/ -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2610) Have the repair of a range repair *all* the replica for that range
[ https://issues.apache.org/jira/browse/CASSANDRA-2610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2610: Attachment: 0001-Make-repair-repair-all-hosts.patch Patch against 0.8.1. It applies on top of CASSANDRA-2433 because it changes enough common code that I don't want to have to deal with the rebase back and forth (and it actually reuses some of the refactoring from CASSANDRA-2433 anyway). > Have the repair of a range repair *all* the replica for that range > -- > > Key: CASSANDRA-2610 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2610 > Project: Cassandra > Issue Type: Improvement > Components: Core >Affects Versions: 0.8 beta 1 >Reporter: Sylvain Lebresne >Assignee: Sylvain Lebresne >Priority: Minor > Fix For: 0.8.1 > > Attachments: 0001-Make-repair-repair-all-hosts.patch > > Original Estimate: 8h > Remaining Estimate: 8h > > Say you have a range R whose replicas for that range are A, B and C. If you > run repair on node A for that range R, when the repair ends you only know that > A is fully repaired. B and C are not. That is, B and C are up to date with A > after the repair, but are not up to date with one another. > It makes it a pain to schedule "optimal" cluster repairs, that is, repairing a > full cluster without doing work twice (because you would still have to > run a repair on B or C, which will make A, B and C redo a validation > compaction on R, and with more replicas it's even more annoying). > However, it is fairly easy during the first repair on A to have it compare > all the merkle trees, i.e. the ones for B and C, and ask B and C to stream > between them whatever differences they have. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2433) Failed Streams Break Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2433: Attachment: 0004-Reports-validation-compaction-errors-back-to-repair-v2.patch 0003-Report-streaming-errors-back-to-repair-v2.patch 0002-Register-in-gossip-to-handle-node-failures-v2.patch 0001-Put-repair-session-on-a-Stage-and-add-a-method-to-re-v2.patch Attaching rebased patch (against 0.8.1). It also changes the behavior a little bit so as not to fail repair right away if a problem occurs (it still throws an exception at the end if any problem occurred). It turns out to be slightly simpler that way, especially for CASSANDRA-1610. > Failed Streams Break Repair > --- > > Key: CASSANDRA-2433 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2433 > Project: Cassandra > Issue Type: Bug >Affects Versions: 0.7.4 >Reporter: Benjamin Coverston >Assignee: Sylvain Lebresne > Labels: repair > Fix For: 0.8.1 > > Attachments: > 0001-Put-repair-session-on-a-Stage-and-add-a-method-to-re-v2.patch, > 0001-Put-repair-session-on-a-Stage-and-add-a-method-to-re.patch, > 0002-Register-in-gossip-to-handle-node-failures-v2.patch, > 0002-Register-in-gossip-to-handle-node-failures.patch, > 0003-Report-streaming-errors-back-to-repair-v2.patch, > 0003-Report-streaming-errors-back-to-repair.patch, > 0004-Reports-validation-compaction-errors-back-to-repair-v2.patch, > 0004-Reports-validation-compaction-errors-back-to-repair.patch > > > When running repair in cases where a stream fails, we are seeing multiple problems. > 1. Although retry is initiated and completes, the old stream doesn't seem to > clean itself up and repair hangs. > 2. The temp files are left behind and multiple failures can end up filling up > the data partition. > These issues together are making repair very difficult for nearly everyone > running repair on a non-trivial sized data set. 
> This issue is also being worked on w.r.t CASSANDRA-2088, however that was > moved to 0.8 for a few reasons. This ticket is to fix the immediate issues > that we are seeing in 0.7. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back
[ https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034970#comment-13034970 ] Nicholas Telford commented on CASSANDRA-2045: - I've been looking into this and I have a few observations/questions, although I'm still quite new to the Cassandra codebase, so if I'm wrong, please let me know. * Currently, when a node receives a RowMutation containing a hint, it stores it to the application CF and places a hint in the system hints CF. This is fine in the general case, but writes using CL.ANY may result in hinted RowMutations being sent to nodes that don't own that key. They still write the RowMutation to their application CF so they can pass it on to the destination node when it recovers. But this data is only ever deleted during a manual cleanup. Doesn't this mean that, given a very unstable cluster (e.g. EC2), writes using CL.ANY can cause nodes to fill up with data unexpectedly quickly? * The JavaDoc for HintedHandOffManager mentions another issue caused by the current strategy: cleanup compactions on the application CF will cause the hints to become invalid. It goes on to suggest a strategy similar to what's being discussed here (placing the individual RowMutations in a separate HH CF). * It's probably a good idea to retain backwards compatibility here as much as possible so that rolling upgrades of a cluster are possible - hints stored for the old version need to be deliverable to nodes coming back up with the new version and vice versa. * I think Edward's idea of storing hints in a per-node CommitLog is a pretty elegant solution; unfortunately, it's quite a lot more invasive and would be a nightmare for maintaining backwards compatibility. Thoughts? 
> Simplify HH to decrease read load when nodes come back > -- > > Key: CASSANDRA-2045 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2045 > Project: Cassandra > Issue Type: Improvement >Reporter: Chris Goffinet > Fix For: 1.0 > > > Currently when HH is enabled, hints are stored, and when a node comes back, > we begin sending that node data. We do a lookup on the local node for the row > to send. To help reduce read load (if a node is offline for a long period of > time) we should store the data we want to forward to the node locally instead. We > wouldn't have to do any lookups, just take byte[] and send to the destination. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (CASSANDRA-1278) Make bulk loading into Cassandra less crappy, more pluggable
[ https://issues.apache.org/jira/browse/CASSANDRA-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis reassigned CASSANDRA-1278: - Assignee: Sylvain Lebresne (was: Matthew F. Dennis) I think we've been over-engineering the problem. Ed was on the right track: bq. I would personally like to see a JMX function like 'nodetool addsstable mykeyspace mycf mysstable-file' . Most people can generate and move an SSTable on their own (sstableWriter +scp) (This is, btw, the HBase bulk load approach, which despite some clunkiness does seem to solve the problem for those users.) The main drawback is that because of Cassandra's replication strategies, data from a naively-written sstable could span many nodes -- even the entire cluster. So we can improve the experience a lot with a simple tool that just streams ranges from a local table to the right nodes. Since it's doing the exact thing that existing node movement needs -- sending ranges from an existing sstable -- it should not require any new code from Streaming. Sylvain volunteered to take a stab at this. > Make bulk loading into Cassandra less crappy, more pluggable > > > Key: CASSANDRA-1278 > URL: https://issues.apache.org/jira/browse/CASSANDRA-1278 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Jeremy Hanna >Assignee: Sylvain Lebresne > Fix For: 0.8.1 > > Attachments: 1278-cassandra-0.7-v2.txt, 1278-cassandra-0.7.1.txt, > 1278-cassandra-0.7.txt > > Original Estimate: 40h > Time Spent: 40h 40m > Remaining Estimate: 0h > > Currently bulk loading into Cassandra is a black art. People are either > directed to just do it responsibly with thrift or a higher level client, or > they have to explore the contrib/bmt example - > http://wiki.apache.org/cassandra/BinaryMemtable That contrib module requires > delving into the code to find out how it works and then applying it to the > given problem. 
Using either method, the user also needs to keep in mind that > overloading the cluster is possible - which will hopefully be addressed in > CASSANDRA-685 > This improvement would be to create a contrib module or set of documents > dealing with bulk loading. Perhaps it could include code in the Core to make > it more pluggable for external clients of different types. > It is just that this is something that many that are new to Cassandra need to > do - bulk load their data into Cassandra. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Cassandra Wiki] Update of "Operations" by JonathanEllis
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification. The "Operations" page has been changed by JonathanEllis. The comment on this change is: alternating tokens is only viable w/ same number of nodes in each DC. http://wiki.apache.org/cassandra/Operations?action=diff&rev1=91&rev2=92 -- === Token selection === Using a strong hash function means !RandomPartitioner keys will, on average, be evenly spread across the Token space, but you can still have imbalances if your Tokens do not divide up the range evenly, so you should specify !InitialToken to your first nodes as `i * (2**127 / N)` for i = 0 .. N-1. In Cassandra 0.7, you should specify `initial_token` in `cassandra.yaml`. - With !NetworkTopologyStrategy, you should alternate data centers when assigning tokens. For example, with two nodes in each of two data centers, + With !NetworkTopologyStrategy, you should calculate the tokens for the nodes in each DC independently. Tokens still need to be unique, so you can add 1 to the tokens in the 2nd DC, add 2 in the 3rd, and so on. Thus, for a 4-node cluster in 2 datacenters, you would have + {{{ + DC1 + node 1 = 0 + node 2 = 85070591730234615865843651857942052864 + DC2 + node 3 = 1 + node 4 = 85070591730234615865843651857942052865 + }}} + + + If you happen to have the same number of nodes in each data center, you can also alternate data centers when assigning tokens: {{{ [DC1] node 1 = 0 [DC2] node 2 = 42535295865117307932921825928971026432 [DC1] node 3 = 85070591730234615865843651857942052864 [DC2] node 4 = 127605887595351923798765477786913079296 }}} + With order preserving partitioners, your key distribution will be application-dependent. You should still take your best guess at specifying initial tokens (guided by sampling actual data, if possible), but you will be more dependent on active load balancing (see below) and/or adding new nodes to hot spots. 
Once data is placed on the cluster, the partitioner may not be changed without wiping and starting over.
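The per-DC offset scheme from the wiki text above can be computed with a short script (a hypothetical helper for illustration, not a Cassandra tool): tokens are spaced evenly within each DC, and the DC index is added as an offset to keep tokens unique across the cluster.

```python
def tokens_per_dc(nodes_per_dc, num_dcs):
    """Compute RandomPartitioner initial tokens independently per DC,
    adding the DC index as an offset so every token stays unique."""
    ring = 2 ** 127  # size of the RandomPartitioner token space
    return {
        dc: [ring // nodes_per_dc * i + dc for i in range(nodes_per_dc)]
        for dc in range(num_dcs)
    }

# Two nodes in each of two datacenters reproduces the wiki example:
# DC 0 -> [0, 85070591730234615865843651857942052864]
# DC 1 -> [1, 85070591730234615865843651857942052865]
tokens = tokens_per_dc(nodes_per_dc=2, num_dcs=2)
```

The +1/+2 offsets are negligible relative to the ring size, so load stays balanced within each DC while token uniqueness is preserved.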
[jira] [Commented] (CASSANDRA-1610) Pluggable Compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034878#comment-13034878 ] Ryan King commented on CASSANDRA-1610: -- Agreed. > Pluggable Compaction > > > Key: CASSANDRA-1610 > URL: https://issues.apache.org/jira/browse/CASSANDRA-1610 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Chris Goffinet >Assignee: Alan Liang >Priority: Minor > Labels: compaction > Fix For: 1.0 > > Attachments: 0001-move-compaction-code-into-own-package.patch, > 0002-Pluggable-Compaction-and-Expiration.patch > > > In CASSANDRA-1608, I proposed some changes on how compaction works. I think > it also makes sense to allow the ability to have pluggable compaction per CF. > There could be many types of workloads where this makes sense. One example we > had at Digg was to completely throw away certain SSTables after N days. > The goal of this ticket is to make compaction pluggable enough to support > compaction based on max timestamp ordering of the sstables while satisfying > max sstable size, min and max compaction thresholds. Another goal is to allow > expiration of sstables based on a timestamp. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Cassandra Wiki] Update of "Operations" by JonathanEllis
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification. The "Operations" page has been changed by JonathanEllis. The comment on this change is: change NTS recommendation to alternate DCs. http://wiki.apache.org/cassandra/Operations?action=diff&rev1=90&rev2=91 -- === Token selection === Using a strong hash function means !RandomPartitioner keys will, on average, be evenly spread across the Token space, but you can still have imbalances if your Tokens do not divide up the range evenly, so you should specify !InitialToken to your first nodes as `i * (2**127 / N)` for i = 0 .. N-1. In Cassandra 0.7, you should specify `initial_token` in `cassandra.yaml`. - With !NetworkTopologyStrategy, you should calculate the tokens the nodes in each DC independantly. Tokens still neded to be unique, so you can add 1 to the tokens in the 2nd DC, add 2 in the 3rd, and so on. Thus, for a 4-node cluster in 2 datacenters, you would have + With !NetworkTopologyStrategy, you should alternate data centers when assigning tokens. For example, with two nodes in each of two data centers, + {{{ - DC1 - node 1 = 0 + [DC1] node 1 = 0 + [DC2] node 2 = 42535295865117307932921825928971026432 - node 2 = 85070591730234615865843651857942052864 + [DC1] node 3 = 85070591730234615865843651857942052864 + [DC2] node 4 = 127605887595351923798765477786913079296 - - DC2 - node 1 = 1 - node 2 = 85070591730234615865843651857942052865 }}} - With order preserving partitioners, your key distribution will be application-dependent. You should still take your best guess at specifying initial tokens (guided by sampling actual data, if possible), but you will be more dependent on active load balancing (see below) and/or adding new nodes to hot spots. Once data is placed on the cluster, the partitioner may not be changed without wiping and starting over.
[jira] [Updated] (CASSANDRA-2660) BRAF.sync() bug can cause massive commit log write magnification
[ https://issues.apache.org/jira/browse/CASSANDRA-2660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-2660: -- Component/s: Core Priority: Minor (was: Major) Fix Version/s: 0.8.1 0.7.7 merged to 0.8 branch cleanly > BRAF.sync() bug can cause massive commit log write magnification > > > Key: CASSANDRA-2660 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2660 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Peter Schuller >Assignee: Peter Schuller >Priority: Minor > Fix For: 0.7.7, 0.8.1 > > Attachments: CASSANDRA-2660-075.txt > > > This was discovered, fixed and tested on 0.7.5. Cursory examination shows it > should still be an issue on trunk/0.8. If people otherwise agree with the > patch I can rebase if necessary. > Problem: > BRAF.flush() is actually broken in the sense that it cannot be called without > close co-operation with the caller. rebuffer() does the co-op by adjusting > bufferOffset and validateBufferBytes appropriately, but sync() doesn't. This > means sync() is broken, and sync() is used by the commit log. > The attached patch moves the bufferOffset/validateBufferBytes handling out > into resetBuffer() and has both flush() and rebuffer() call that. This makes > sync() safe. > What happened was that for batched commit log mode, every time sync() was > called the data buffered so far would get written to the OS and fsync():ed. > But until rebuffer() is called for other reasons as part of the write path, > all subsequent sync():s would result in the very same data (plus whatever was > written since last time) being re-written and fsync():ed again. So first you > write+fsync N bytes, then N+N1, then N+N1+N2... (each N being a batch), until > at some point you trigger a rebuffer() and it starts all over again. > The result is that you see *a lot* more writes to the commit log than are in > fact written to the BRAF. 
And these writes translate into actual real writes > to the underlying storage device due to fsync(). We had crazy numbers where > we saw spikes upwards of 80 MB/second where the actual throughput was more > like ~ 1 MB/second of data to the commit log. > (One can make a possibly weak argument that it is also functionally incorrect > as I can imagine implementations where re-writing the same blocks does > copy-on-write in such a way that you're not necessarily guaranteed to see > before-or-after data, particularly in case of partial page writes. However > that's probably not a practical issue.) > Worthy of noting is that this probably causes added difficulties in fsync() > latencies since the average fsync() will contain a lot more data. Depending > on I/O scheduler and underlying device characteristics, the extra writes > *may* not have a detrimental effect, but I think it's pretty easy to point to > cases where it will be detrimental - in particular if the commit log is on a > non-battery backed drive. Even with a nice battery backed RAID with the > commit log on, the size of the writes probably contributes to difficulty in > making the write requests propagate down without being starved by reads (but > this is speculation, not tested, other than that I've observed commit log > writer starvation that seemed excessive). > This isn't the first subtle BRAF bug. What are people's thoughts on creating > separate abstractions for streaming I/O that can perhaps be a lot more > simple, and use BRAF only for random reads in response to live traffic? (Not > as part of this JIRA, just asking in general.) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
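The write magnification described in the report is easy to model (a toy Python simplification for illustration, not the actual BufferedRandomAccessFile code): a sync() that flushes the buffer without invalidating it re-writes every previously synced byte on each subsequent sync(), so disk traffic grows as N, N+N1, N+N1+N2...

```python
class BufferedWriter:
    """Toy model of the BRAF.sync() bug. With reset_on_sync=False, sync()
    flushes the whole in-memory buffer but never marks it invalid, so each
    later sync() re-writes all previously flushed bytes again."""

    def __init__(self, reset_on_sync=False):
        self.buffer = bytearray()
        self.bytes_written_to_disk = 0  # what the OS/device actually sees
        self.reset_on_sync = reset_on_sync

    def write(self, data):
        self.buffer.extend(data)

    def sync(self):
        # "flush + fsync" the current buffer contents
        self.bytes_written_to_disk += len(self.buffer)
        if self.reset_on_sync:
            self.buffer.clear()  # the fix: invalidate the buffer post-flush

buggy = BufferedWriter()
fixed = BufferedWriter(reset_on_sync=True)
for writer in (buggy, fixed):
    for batch in (b"a" * 10, b"b" * 10, b"c" * 10):
        writer.write(batch)
        writer.sync()

# buggy writes 10 + 20 + 30 = 60 bytes to disk; fixed writes only 30
```

Three 10-byte batches cost the buggy writer 60 bytes of device traffic versus 30 for the fixed one, and the gap widens with every batch until something resets the buffer, which matches the 80 MB/s-vs-1 MB/s spikes reported above.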
svn commit: r1104381 - in /cassandra/branches/cassandra-0.8: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/io/util/
Author: jbellis Date: Tue May 17 16:17:55 2011 New Revision: 1104381 URL: http://svn.apache.org/viewvc?rev=1104381&view=rev Log: merge from 0.7 Modified: cassandra/branches/cassandra-0.8/ (props changed) cassandra/branches/cassandra-0.8/CHANGES.txt cassandra/branches/cassandra-0.8/contrib/ (props changed) cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java (props changed) cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java (props changed) cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java (props changed) cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java (props changed) cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java (props changed) cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/io/util/BufferedRandomAccessFile.java Propchange: cassandra/branches/cassandra-0.8/ -- --- svn:mergeinfo (original) +++ svn:mergeinfo Tue May 17 16:17:55 2011 @@ -1,5 +1,5 @@ /cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1081914,1083000 -/cassandra/branches/cassandra-0.7:1026516-1103894 +/cassandra/branches/cassandra-0.7:1026516-1104371 /cassandra/branches/cassandra-0.7.0:1053690-1055654 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689 /cassandra/trunk:1090978-1090979 Modified: cassandra/branches/cassandra-0.8/CHANGES.txt URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1104381&r1=1104380&r2=1104381&view=diff == --- cassandra/branches/cassandra-0.8/CHANGES.txt (original) +++ cassandra/branches/cassandra-0.8/CHANGES.txt Tue May 17 16:17:55 2011 @@ -2,6 +2,8 @@ * adjust hinted handoff page size to avoid OOM with large columns (CASSANDRA-2652) * update CQL consistency levels (CASSANDRA-2566) + * mark BRAF buffer invalid post-flush so 
we don't re-flush partial + buffers again, especially on CL writes (CASSANDRA-2660) 0.8.0-rc1 Propchange: cassandra/branches/cassandra-0.8/contrib/ -- --- svn:mergeinfo (original) +++ svn:mergeinfo Tue May 17 16:17:55 2011 @@ -1,5 +1,5 @@ /cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009 -/cassandra/branches/cassandra-0.7/contrib:1026516-1103894 +/cassandra/branches/cassandra-0.7/contrib:1026516-1104371 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654 /cassandra/tags/cassandra-0.7.0-rc3/contrib:1051699-1053689 /cassandra/trunk/contrib:1090978-1090979 Propchange: cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java -- --- svn:mergeinfo (original) +++ svn:mergeinfo Tue May 17 16:17:55 2011 @@ -1,5 +1,5 @@ /cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1081914,1083000 -/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1103894 +/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1104371 /cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654 /cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689 /cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090978-1090979 Propchange: cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java -- --- svn:mergeinfo (original) +++ svn:mergeinfo Tue May 17 16:17:55 2011 @@ -1,5 +1,5 @@ /cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1081914,1083000 
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1103894 +/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1104371 /cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1055654 /cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1051699-1053689 /cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1090978-1090979 Pro
[jira] [Commented] (CASSANDRA-2660) BRAF.sync() bug can cause massive commit log write magnification
[ https://issues.apache.org/jira/browse/CASSANDRA-2660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034833#comment-13034833 ] Hudson commented on CASSANDRA-2660: --- Integrated in Cassandra-0.7 #487 (See [https://builds.apache.org/hudson/job/Cassandra-0.7/487/]) mark BRAF buffer invalid post-flush so we don't re-flush partial buffers again patch by Peter Schuller; reviewed by jbellis for CASSANDRA-2660 > BRAF.sync() bug can cause massive commit log write magnification > > > Key: CASSANDRA-2660 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2660 > Project: Cassandra > Issue Type: Bug >Reporter: Peter Schuller >Assignee: Peter Schuller > Attachments: CASSANDRA-2660-075.txt > > > This was discovered, fixed and tested on 0.7.5. Cursory examination shows it > should still be an issue on trunk/0.8. If people otherwise agree with the > patch I can rebase if necessary. > Problem: > BRAF.flush() is actually broken in the sense that it cannot be called without > close co-operation with the caller. rebuffer() does the co-op by adjusting > bufferOffset and validateBufferBytes appropriately, but sync() doesn't. This > means sync() is broken, and sync() is used by the commit log. > The attached patch moves the bufferOffset/validateBufferBytes handling out > into resetBuffer() and has both flush() and rebuffer() call that. This makes > sync() safe. > What happened was that for batched commit log mode, every time sync() was > called the data buffered so far would get written to the OS and fsync():ed. > But until rebuffer() is called for other reasons as part of the write path, > all subsequent sync():s would result in the very same data (plus whatever was > written since last time) being re-written and fsync():ed again. So first you > write+fsync N bytes, then N+N1, then N+N1+N2... (each N being a batch), until > at some point you trigger a rebuffer() and it starts all over again. 
> The result is that you see *a lot* more writes to the commit log than are in > fact written to the BRAF. And these writes translate into actual real writes > to the underlying storage device due to fsync(). We had crazy numbers where > we saw spikes upwards of 80 MB/second where the actual throughput was more > like ~ 1 MB/second of data to the commit log. > (One can make a possibly weak argument that it is also functionally incorrect > as I can imagine implementations where re-writing the same blocks does > copy-on-write in such a way that you're not necessarily guaranteed to see > before-or-after data, particularly in case of partial page writes. However > that's probably not a practical issue.) > Worthy of noting is that this probably causes added difficulties in fsync() > latencies since the average fsync() will contain a lot more data. Depending > on I/O scheduler and underlying device characteristics, the extra writes > *may* not have a detrimental effect, but I think it's pretty easy to point to > cases where it will be detrimental - in particular if the commit log is on a > non-battery backed drive. Even with a nice battery backed RAID with the > commit log on, the size of the writes probably contributes to difficulty in > making the write requests propagate down without being starved by reads (but > this is speculation, not tested, other than that I've observed commit log > writer starvation that seemed excessive). > This isn't the first subtle BRAF bug. What are people's thoughts on creating > separate abstractions for streaming I/O that can perhaps be a lot more > simple, and use BRAF only for random reads in response to live traffic? (Not > as part of this JIRA, just asking in general.) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Cassandra Wiki] Update of "Operations" by JonathanEllis
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification. The "Operations" page has been changed by JonathanEllis. The comment on this change is: add NTS notes to token selection. http://wiki.apache.org/cassandra/Operations?action=diff&rev1=89&rev2=90 -- === Token selection === Using a strong hash function means !RandomPartitioner keys will, on average, be evenly spread across the Token space, but you can still have imbalances if your Tokens do not divide up the range evenly, so you should specify !InitialToken to your first nodes as `i * (2**127 / N)` for i = 0 .. N-1. In Cassandra 0.7, you should specify `initial_token` in `cassandra.yaml`. + With !NetworkTopologyStrategy, you should calculate the tokens for the nodes in each DC independently. Tokens still need to be unique, so you can add 1 to the tokens in the 2nd DC, add 2 in the 3rd, and so on. Thus, for a 4-node cluster in 2 datacenters, you would have + {{{ + DC1 + node 1 = 0 + node 2 = 85070591730234615865843651857942052864 + + DC2 + node 1 = 1 + node 2 = 85070591730234615865843651857942052865 + }}} + With order preserving partitioners, your key distribution will be application-dependent. You should still take your best guess at specifying initial tokens (guided by sampling actual data, if possible), but you will be more dependent on active load balancing (see below) and/or adding new nodes to hot spots. Once data is placed on the cluster, the partitioner may not be changed without wiping and starting over. @@ -40, +51 @@ Replication factor is not really intended to be changed in a live cluster either, but increasing it is conceptually simple: update the replication_factor from the CLI (see below), then run repair against each node in your cluster so that all the new replicas that are supposed to have the data, actually do. 
Until repair is finished, you have 3 options: + * read at ConsistencyLevel.QUORUM or ALL (depending on your existing replication factor) to make sure that a replica that actually has the data is consulted * continue reading at lower CL, accepting that some requests will fail (usually only the first for a given query, if ReadRepair is enabled) * take downtime while repair runs @@ -49, +61 @@ Reducing replication factor is easily done and only requires running cleanup afterwards to remove extra replicas. To update the replication factor on a live cluster, forget about cassandra.yaml. Rather you want to use '''cassandra-cli''': + - update keyspace Keyspace1 with replication_factor = 3; + . update keyspace Keyspace1 with replication_factor = 3; === Network topology === Besides datacenters, you can also tell Cassandra which nodes are in the same rack within a datacenter. Cassandra will use this to route both reads and data movement for Range changes to the nearest replicas. This is configured by a user-pluggable !EndpointSnitch class in the configuration file. @@ -97, +110 @@ Here's a python program which can be used to calculate new tokens for the nodes. There's more info on the subject at Ben Black's presentation at Cassandra Summit 2010. http://www.datastax.com/blog/slides-and-videos-cassandra-summit-2010 - def tokens(nodes): + . def tokens(nodes): - for x in xrange(nodes): + . for x in xrange(nodes): - print 2 ** 127 / nodes * x +. print 2 ** 127 / nodes * x - In versions of Cassandra 0.7.* and lower, there's also `nodetool loadbalance`: essentially a convenience over decommission + bootstrap, only instead of telling the target node where to move on the ring it will choose its location based on the same heuristic as Token selection on bootstrap. You should not use this as it doesn't rebalance the entire ring. 
+ In versions of Cassandra 0.7.* and lower, there's also `nodetool loadbalance`: essentially a convenience over decommission + bootstrap, only instead of telling the target node where to move on the ring it will choose its location based on the same heuristic as Token selection on bootstrap. You should not use this as it doesn't rebalance the entire ring. - The status of move and balancing operations can be monitored using `nodetool` with the `netstat` argument. + The status of move and balancing operations can be monitored using `nodetool` with the `netstat` argument. (Cassandra 0.6.* and lower use the `streams` argument). - (Cassandra 0.6.* and lower use the `streams` argument). == Consistency == Cassandra allows clients to specify the desired consistency level on reads and writes. (See [[API]].) If R + W > N, where R, W, and N are respectively the read replica count, the write replica count, and the replication factor, all client reads will see the most recent write. Otherwise, readers '''may''' see older versions, for periods of typic
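The R + W > N rule above is easy to check numerically. A minimal illustration (`guarantees_strong_reads` is a hypothetical name, not a Cassandra API):

```python
def guarantees_strong_reads(r, w, n):
    """Quorum-overlap check: if R + W > N, every read replica set
    intersects every write replica set, so reads see the latest write."""
    return r + w > n

# QUORUM reads and writes with RF=3: 2 + 2 > 3 -> strongly consistent.
print(guarantees_strong_reads(2, 2, 3))  # True
# CL.ONE reads and writes with RF=3: 1 + 1 <= 3 -> may read stale data.
print(guarantees_strong_reads(1, 1, 3))  # False
```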
[jira] [Commented] (CASSANDRA-1610) Pluggable Compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034812#comment-13034812 ] Jonathan Ellis commented on CASSANDRA-1610: --- Let's keep this to "make compaction pluggable" and add extra strategies separately. > Pluggable Compaction > > > Key: CASSANDRA-1610 > URL: https://issues.apache.org/jira/browse/CASSANDRA-1610 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Chris Goffinet >Assignee: Alan Liang >Priority: Minor > Labels: compaction > Fix For: 1.0 > > Attachments: 0001-move-compaction-code-into-own-package.patch, > 0002-Pluggable-Compaction-and-Expiration.patch > > > In CASSANDRA-1608, I proposed some changes on how compaction works. I think > it also makes sense to allow the ability to have pluggable compaction per CF. > There could be many types of workloads where this makes sense. One example we > had at Digg was to completely throw away certain SSTables after N days. > The goal of this ticket is to make compaction pluggable enough to support > compaction based on max timestamp ordering of the sstables while satisfying > max sstable size, min and max compaction thresholds. Another goal is to allow > expiration of sstables based on a timestamp. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
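The drop-SSTables-after-N-days policy described in the ticket can be sketched as a simple filter over per-file max timestamps. This is illustrative only; the `(name, max_timestamp)` record shape and the `expired_sstables` helper are assumptions, not Cassandra's API:

```python
import time

def expired_sstables(sstables, max_age_days, now=None):
    """Pick SSTables whose newest cell is older than the retention
    window, so the whole file can be dropped without compacting it.
    `sstables` is a list of hypothetical (name, max_timestamp) pairs,
    with timestamps in epoch seconds."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return [name for name, max_ts in sstables if max_ts < cutoff]

now = 1_000_000_000
tables = [("sst-1", now - 40 * 86400),  # newest data 40 days old
          ("sst-2", now - 10 * 86400)]  # newest data 10 days old
print(expired_sstables(tables, max_age_days=30, now=now))  # ['sst-1']
```

A pluggable compaction strategy could apply such a predicate before ordinary size/threshold-based compaction, which is roughly the expiration goal stated in the ticket.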
[jira] [Resolved] (CASSANDRA-2660) BRAF.sync() bug can cause massive commit log write magnification
[ https://issues.apache.org/jira/browse/CASSANDRA-2660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-2660. --- Resolution: Fixed Reviewer: jbellis Ideally we wouldn't wipe out the buffer for read purposes but since we are mixing rw in the same buffer (see comments above) this is the best option. Committed. > BRAF.sync() bug can cause massive commit log write magnification > > > Key: CASSANDRA-2660 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2660 > Project: Cassandra > Issue Type: Bug >Reporter: Peter Schuller >Assignee: Peter Schuller > Attachments: CASSANDRA-2660-075.txt > > > This was discovered, fixed and tested on 0.7.5. Cursory examination shows it > should still be an issue on trunk/0.8. If people otherwise agree with the > patch I can rebase if necessary. > Problem: > BRAF.flush() is actually broken in the sense that it cannot be called without > close co-operation with the caller. rebuffer() does the co-op by adjusting > bufferOffset and validBufferBytes appropriately, but sync() doesn't. This > means sync() is broken, and sync() is used by the commit log. > The attached patch moves the bufferOffset/validBufferBytes handling out > into resetBuffer() and has both flush() and rebuffer() call that. This makes > sync() safe. > What happened was that for batched commit log mode, every time sync() was > called the data buffered so far would get written to the OS and fsync():ed. > But until rebuffer() is called for other reasons as part of the write path, > all subsequent sync():s would result in the very same data (plus whatever was > written since last time) being re-written and fsync():ed again. So first you > write+fsync N bytes, then N+N1, then N+N1+N2... (each N being a batch), until > at some point you trigger a rebuffer() and it starts all over again. > The result is that you see *a lot* more writes to the commit log than are in > fact written to the BRAF.
And these writes translate into actual real writes > to the underlying storage device due to fsync(). We had crazy numbers where > we saw spikes upwards of 80 mb/second where the actual throughput was more > like ~ 1 mb second of data to the commit log. > (One can make a possibly weak argument that it is also functionally incorrect > as I can imagine implementations where re-writing the same blocks does > copy-on-write in such a way that you're not necessarily guaranteed to see > before-or-after data, particularly in case of partial page writes. However > that's probably not a practical issue.) > Worthy of noting is that this probably causes added difficulties in fsync() > latencies since the average fsync() will contain a lot more data. Depending > on I/O scheduler and underlying device characteristics, the extra writes > *may* not have a detrimental effect, but I think it's pretty easy to point to > cases where it will be detrimental - in particular if the commit log is on a > non-battery backed drive. Even with a nice battery backed RAID with the > commit log on, the size of the writes probably contributes to difficulty in > making the write requests propagate down without being starved by reads (but > this is speculation, not tested, other than that I've observed commit log > writer starvation that seemed excessive). > This isn't the first subtle BRAF bug. What are people's thoughts on creating > separate abstractions for streaming I/O that can perhaps be a lot more > simple, and use BRAF only for random reads in response to live traffic? (Not > as part of this JIRA, just asking in general.) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
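The write magnification described in this ticket can be modeled with a toy sketch (not the real BufferedRandomAccessFile): without invalidating the buffer after sync(), each sync() re-writes everything accumulated since the last reBuffer(), so device traffic grows quadratically with the number of batches.

```python
# Toy model (illustrative only): total bytes pushed to the device when
# sync() re-flushes the whole buffer each time, versus when the buffer
# is invalidated after each sync (the CASSANDRA-2660 fix).
def bytes_synced(batches, reset_after_sync):
    buffered = 0   # bytes sitting in the write buffer
    device = 0     # cumulative bytes written + fsynced to the device
    for batch in batches:
        buffered += batch
        device += buffered          # sync() flushes everything buffered
        if reset_after_sync:
            buffered = 0            # resetBuffer(): old data not re-flushed
    return device

batches = [100] * 10                # ten 100-byte commit log batches
print(bytes_synced(batches, reset_after_sync=False))  # 5500 (100+200+...+1000)
print(bytes_synced(batches, reset_after_sync=True))   # 1000 (10 * 100)
```

With only ten batches the buggy path already writes 5.5x the real data, consistent with the large spikes reported in the ticket.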
svn commit: r1104305 - in /cassandra/branches/cassandra-0.7: CHANGES.txt src/java/org/apache/cassandra/io/util/BufferedRandomAccessFile.java
Author: jbellis Date: Tue May 17 14:53:26 2011 New Revision: 1104305 URL: http://svn.apache.org/viewvc?rev=1104305&view=rev Log: mark BRAF buffer invalid post-flush so we don't re-flush partial buffers again patch by Peter Schuller; reviewed by jbellis for CASSANDRA-2660 Modified: cassandra/branches/cassandra-0.7/CHANGES.txt cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/io/util/BufferedRandomAccessFile.java Modified: cassandra/branches/cassandra-0.7/CHANGES.txt URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/CHANGES.txt?rev=1104305&r1=1104304&r2=1104305&view=diff == --- cassandra/branches/cassandra-0.7/CHANGES.txt (original) +++ cassandra/branches/cassandra-0.7/CHANGES.txt Tue May 17 14:53:26 2011 @@ -1,6 +1,8 @@ 0.7.7 * adjust hinted handoff page size to avoid OOM with large columns (CASSANDRA-2652) + * mark BRAF buffer invalid post-flush so we don't re-flush partial + buffers again, especially on CL writes (CASSANDRA-2660) 0.7.6 Modified: cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/io/util/BufferedRandomAccessFile.java URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/io/util/BufferedRandomAccessFile.java?rev=1104305&r1=1104304&r2=1104305&view=diff == --- cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/io/util/BufferedRandomAccessFile.java (original) +++ cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/io/util/BufferedRandomAccessFile.java Tue May 17 14:53:26 2011 @@ -128,6 +128,9 @@ public class BufferedRandomAccessFile ex fd = CLibrary.getfd(this.getFD()); } +/** + * Flush (flush()) whatever writes are pending, and block until the data has been persistently committed (fsync()). + */ public void sync() throws IOException { if (syncNeeded) @@ -150,6 +153,11 @@ public class BufferedRandomAccessFile ex } } +/** + * If we are dirty, flush dirty contents to the operating system. Does not imply fsync(). 
+ * + * Currently, for implementation reasons, this also invalidates the buffer. + */ public void flush() throws IOException { if (isDirty) @@ -181,20 +189,25 @@ public class BufferedRandomAccessFile ex } +// Remember that we wrote, so we don't write it again on next flush(). +resetBuffer(); + isDirty = false; } } +private void resetBuffer() +{ +bufferOffset = current; +validBufferBytes = 0; +} + private void reBuffer() throws IOException { flush(); // synchronizing buffer and file on disk - -bufferOffset = current; +resetBuffer(); if (bufferOffset >= channel.size()) -{ -validBufferBytes = 0; return; -} if (bufferOffset < minBufferOffset) minBufferOffset = bufferOffset;
[jira] [Updated] (CASSANDRA-2659) Improve forceDeserialize/getCompactedRow encapsulation
[ https://issues.apache.org/jira/browse/CASSANDRA-2659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-2659: -- Attachment: 2659-v2.txt v2 addresses nits and re-adds ability to use EchoedRow in multi-row compaction. (See comments to CC.getCompactedRow and forceDeserialize variable in CM.doCompactionWithoutSizeEstimation). > Improve forceDeserialize/getCompactedRow encapsulation > -- > > Key: CASSANDRA-2659 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2659 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Jonathan Ellis >Assignee: Jonathan Ellis >Priority: Minor > Fix For: 0.8.1 > > Attachments: 2659-v2.txt, 2659.txt > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CASSANDRA-2547) CQL: support for "create columnfamily" option 'row_cache_provider'
[ https://issues.apache.org/jira/browse/CASSANDRA-2547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-2547. --- Resolution: Duplicate Assignee: (was: Pavel Yaskevich) > CQL: support for "create columnfamily" option 'row_cache_provider' > -- > > Key: CASSANDRA-2547 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2547 > Project: Cassandra > Issue Type: Improvement >Affects Versions: 0.8 beta 1 >Reporter: Cathy Daw >Priority: Minor > Labels: cql > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2660) BRAF.sync() bug can cause massive commit log write magnification
[ https://issues.apache.org/jira/browse/CASSANDRA-2660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034773#comment-13034773 ] Jonathan Ellis commented on CASSANDRA-2660: --- bq. What are people's thoughts on creating separate abstractions for streaming I/O that can perhaps be a lot more simple, and use BRAF only for random reads in response to live traffic? (Not as part of this JIRA, just asking in general.) Every time I've looked at doing this I've put it aside because making all writes two-pass (first pass to compute size, so we don't have to seek back after serializing the row itself) is such a pain. > BRAF.sync() bug can cause massive commit log write magnification > > > Key: CASSANDRA-2660 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2660 > Project: Cassandra > Issue Type: Bug >Reporter: Peter Schuller >Assignee: Peter Schuller > Attachments: CASSANDRA-2660-075.txt > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
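The two-pass problem described here (needing a row's serialized size before its data, to avoid seeking back) is commonly sidestepped by serializing into a memory buffer first, then writing length plus payload in a single forward pass. A sketch under that assumption (not Cassandra code; the helper name is hypothetical):

```python
import io
import struct

def write_length_prefixed(out, serialize):
    """Serialize into a memory buffer to learn the size, then emit
    length + payload in one forward pass -- no seek-back needed."""
    buf = io.BytesIO()
    serialize(buf)
    payload = buf.getvalue()
    out.write(struct.pack(">i", len(payload)))  # 4-byte big-endian length
    out.write(payload)

out = io.BytesIO()
write_length_prefixed(out, lambda b: b.write(b"row-data"))
print(out.getvalue())  # b'\x00\x00\x00\x08row-data'
```

The trade-off is memory proportional to the largest row being serialized, which is part of why this approach is painful in practice for large rows.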
[jira] [Commented] (CASSANDRA-2394) Faulty hd kills cluster performance
[ https://issues.apache.org/jira/browse/CASSANDRA-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034771#comment-13034771 ] Jonathan Ellis commented on CASSANDRA-2394: --- Yes. Here's what the cli has to say about that: {noformat} Note that disabling read repair entirely means that the dynamic snitch will not have any latency information from all the replicas to recognize when one is performing worse than usual. {noformat} > Faulty hd kills cluster performance > --- > > Key: CASSANDRA-2394 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2394 > Project: Cassandra > Issue Type: Bug >Affects Versions: 0.7.4 >Reporter: Thibaut >Priority: Minor > Fix For: 0.7.7 > > > Hi, > About every week, a node from our main cluster (>100 nodes) has a faulty hd > (Listing the cassandra data storage directory triggers an input/output error). > Whenever this occurs, I see many TimeoutExceptions in our application on > various nodes which cause everything to run very very slowly. Keyrange scans > just timeout and will sometimes never succeed. If I stop cassandra on the > faulty node, everything runs normally again. > It would be great to have some kind of monitoring thread in cassandra which > marks a node as "down" if there are multiple read/write errors to the data > directories. A single faulty hd on 1 node shouldn't affect global cluster > performance. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2547) CQL: support for "create columnfamily" option 'row_cache_provider'
[ https://issues.apache.org/jira/browse/CASSANDRA-2547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034736#comment-13034736 ] Pavel Yaskevich commented on CASSANDRA-2547: Branches cassandra-0.8/0.8.1 already have support for row_cache_provider in CreateColumnFamilyStatement.java, I have tested and it works. > CQL: support for "create columnfamily" option 'row_cache_provider' > -- > > Key: CASSANDRA-2547 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2547 > Project: Cassandra > Issue Type: Improvement >Affects Versions: 0.8 beta 1 >Reporter: Cathy Daw >Assignee: Pavel Yaskevich >Priority: Minor > Labels: cql > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CASSANDRA-2660) BRAF.sync() bug can cause massive commit log write magnification
BRAF.sync() bug can cause massive commit log write magnification Key: CASSANDRA-2660 URL: https://issues.apache.org/jira/browse/CASSANDRA-2660 Project: Cassandra Issue Type: Bug Reporter: Peter Schuller Attachments: CASSANDRA-2660-075.txt This was discovered, fixed and tested on 0.7.5. Cursory examination shows it should still be an issue on trunk/0.8. If people otherwise agree with the patch I can rebase if necessary. Problem: BRAF.flush() is actually broken in the sense that it cannot be called without close co-operation with the caller. rebuffer() does the co-op by adjusting bufferOffset and validBufferBytes appropriately, but sync() doesn't. This means sync() is broken, and sync() is used by the commit log. The attached patch moves the bufferOffset/validBufferBytes handling out into resetBuffer() and has both flush() and rebuffer() call that. This makes sync() safe. What happened was that for batched commit log mode, every time sync() was called the data buffered so far would get written to the OS and fsync():ed. But until rebuffer() is called for other reasons as part of the write path, all subsequent sync():s would result in the very same data (plus whatever was written since last time) being re-written and fsync():ed again. So first you write+fsync N bytes, then N+N1, then N+N1+N2... (each N being a batch), until at some point you trigger a rebuffer() and it starts all over again. The result is that you see *a lot* more writes to the commit log than are in fact written to the BRAF. And these writes translate into actual real writes to the underlying storage device due to fsync(). We had crazy numbers where we saw spikes upwards of 80 mb/second where the actual throughput was more like ~ 1 mb/second of data to the commit log.
(One can make a possibly weak argument that it is also functionally incorrect as I can imagine implementations where re-writing the same blocks does copy-on-write in such a way that you're not necessarily guaranteed to see before-or-after data, particularly in case of partial page writes. However that's probably not a practical issue.) Worthy of noting is that this probably causes added difficulties in fsync() latencies since the average fsync() will contain a lot more data. Depending on I/O scheduler and underlying device characteristics, the extra writes *may* not have a detrimental effect, but I think it's pretty easy to point to cases where it will be detrimental - in particular if the commit log is on a non-battery backed drive. Even with a nice battery backed RAID with the commit log on, the size of the writes probably contributes to difficulty in making the write requests propagate down without being starved by reads (but this is speculation, not tested, other than that I've observed commit log writer starvation that seemed excessive). This isn't the first subtle BRAF bug. What are people's thoughts on creating separate abstractions for streaming I/O that can perhaps be a lot more simple, and use BRAF only for random reads in response to live traffic? (Not as part of this JIRA, just asking in general.) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (CASSANDRA-2660) BRAF.sync() bug can cause massive commit log write magnification
[ https://issues.apache.org/jira/browse/CASSANDRA-2660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller reassigned CASSANDRA-2660: - Assignee: Peter Schuller > BRAF.sync() bug can cause massive commit log write magnification > > > Key: CASSANDRA-2660 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2660 > Project: Cassandra > Issue Type: Bug >Reporter: Peter Schuller >Assignee: Peter Schuller > Attachments: CASSANDRA-2660-075.txt > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2660) BRAF.sync() bug can cause massive commit log write magnification
[ https://issues.apache.org/jira/browse/CASSANDRA-2660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-2660: -- Attachment: CASSANDRA-2660-075.txt > BRAF.sync() bug can cause massive commit log write magnification > > > Key: CASSANDRA-2660 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2660 > Project: Cassandra > Issue Type: Bug >Reporter: Peter Schuller >Assignee: Peter Schuller > Attachments: CASSANDRA-2660-075.txt > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2268) CQL-enabled stress.java
[ https://issues.apache.org/jira/browse/CASSANDRA-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034720#comment-13034720 ] Pavel Yaskevich commented on CASSANDRA-2268: Thanks! I'm stuck doing other CQL related stuff right now. > CQL-enabled stress.java > --- > > Key: CASSANDRA-2268 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2268 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Eric Evans >Assignee: Aaron Morton >Priority: Minor > Labels: cql > Fix For: 0.8.1 > > > It would be great if stress.java had a CQL mode. For making the inevitable > RPC->CQL comparisons, but also as a basis for measuring optimizations, and > spotting performance regressions. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2659) Improve forceDeserialize/getCompactedRow encapsulation
[ https://issues.apache.org/jira/browse/CASSANDRA-2659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034708#comment-13034708 ] Sylvain Lebresne commented on CASSANDRA-2659: - nitpicks: * we could remove the descriptor argument of the first getCompactedRow() and call needDeserialize() for the EchoedRow case. * we could use that first getCompactedRow() in SSTableWriter (it's really only cosmetic as we forceDeserialize) * the comment of that first getCompactedRow() method is not completely correct, since the method may purge data (either if the sstable is of an old format or if forceDeserialize is set) while the comment suggests it never does. but those are nitpicks, so with or without +1 > Improve forceDeserialize/getCompactedRow encapsulation > -- > > Key: CASSANDRA-2659 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2659 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Jonathan Ellis >Assignee: Jonathan Ellis >Priority: Minor > Fix For: 0.8.1 > > Attachments: 2659.txt > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2394) Faulty hd kills cluster performance
[ https://issues.apache.org/jira/browse/CASSANDRA-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034671#comment-13034671 ] Thibaut commented on CASSANDRA-2394: I will do this next time and post the results. Could http://www.mail-archive.com/user@cassandra.apache.org/msg13407.html cause this? We are also using a read repair chance of 0. > Faulty hd kills cluster performance > --- > > Key: CASSANDRA-2394 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2394 > Project: Cassandra > Issue Type: Bug >Affects Versions: 0.7.4 >Reporter: Thibaut >Priority: Minor > Fix For: 0.7.7 > > > Hi, > About every week, a node in our main cluster (>100 nodes) has a faulty hd > (listing the Cassandra data storage directory triggers an input/output error). > Whenever this occurs, I see many TimeoutExceptions in our application on > various nodes, which cause everything to run very slowly. Keyrange scans > just time out and will sometimes never succeed. If I stop Cassandra on the > faulty node, everything runs normally again. > It would be great to have some kind of monitoring thread in Cassandra which > marks a node as "down" if there are multiple read/write errors to the data > directories. A single faulty hd on 1 node shouldn't affect global cluster > performance.
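The monitoring thread proposed in the report could look roughly like the sketch below. This is purely illustrative, not Cassandra code: the threshold and the markDown() action are hypothetical placeholders for whatever policy the operator wants (the ticket discussion suggests taking the node out of the ring).

```java
// Illustrative sketch of the proposed disk-health monitoring: count
// read/write failures against the data directories and take the node
// "down" once a threshold of errors is crossed.
import java.util.concurrent.atomic.AtomicInteger;

public class DiskHealthMonitor {
    private final int maxErrors;
    private final AtomicInteger errors = new AtomicInteger();
    private volatile boolean down = false;

    public DiskHealthMonitor(int maxErrors) {
        this.maxErrors = maxErrors;
    }

    // Called from I/O paths when a read/write to a data directory fails.
    public void reportError() {
        if (errors.incrementAndGet() >= maxErrors)
            markDown();
    }

    private void markDown() {
        // In a real node this would stop accepting/serving requests
        // (e.g. disable gossip and the client transport), not just flip a flag.
        down = true;
    }

    public boolean isDown() {
        return down;
    }
}
```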
buildbot success in ASF Buildbot on cassandra-trunk
The Buildbot has detected a restored build on builder cassandra-trunk while building ASF Buildbot. Full details are available at: http://ci.apache.org/builders/cassandra-trunk/builds/1317 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: isis_ubuntu Build Reason: scheduler Build Source Stamp: [branch cassandra/trunk] 1104054 Blamelist: slebresne Build succeeded! sincerely, -The Buildbot
svn commit: r1104054 - in /cassandra/trunk: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/db/marshal/
Author: slebresne
Date: Tue May 17 08:37:17 2011
New Revision: 1104054

URL: http://svn.apache.org/viewvc?rev=1104054&view=rev
Log: merge from 0.8.1

Modified:
    cassandra/trunk/ (props changed)
    cassandra/trunk/contrib/ (props changed)
    cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java (props changed)
    cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java (props changed)
    cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java (props changed)
    cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java (props changed)
    cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java (props changed)
    cassandra/trunk/src/java/org/apache/cassandra/db/marshal/ReversedType.java

Propchange: cassandra/trunk/
------------------------------------------------------------------------------
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue May 17 08:37:17 2011
@@ -2,7 +2,7 @@
 /cassandra/branches/cassandra-0.7:1026516-1102046,1102337
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
 /cassandra/branches/cassandra-0.8:1090935-1102339,1102345
-/cassandra/branches/cassandra-0.8.1:1101014-1102517
+/cassandra/branches/cassandra-0.8.1:1101014-1102517,1104052
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689
 /incubator/cassandra/branches/cassandra-0.3:774578-796573
 /incubator/cassandra/branches/cassandra-0.4:810145-834239,834349-834350

Propchange: cassandra/trunk/contrib/
------------------------------------------------------------------------------
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue May 17 08:37:17 2011
@@ -2,7 +2,7 @@
 /cassandra/branches/cassandra-0.7/contrib:1026516-1102046,1102337
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
 /cassandra/branches/cassandra-0.8/contrib:1090935-1102339,1102345
-/cassandra/branches/cassandra-0.8.1/contrib:1101014-1102517
+/cassandra/branches/cassandra-0.8.1/contrib:1101014-1102517,1104052
 /cassandra/tags/cassandra-0.7.0-rc3/contrib:1051699-1053689
 /incubator/cassandra/branches/cassandra-0.3/contrib:774578-796573
 /incubator/cassandra/branches/cassandra-0.4/contrib:810145-810987,810994-834239,834349-834350

Propchange: cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
------------------------------------------------------------------------------
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue May 17 08:37:17 2011
@@ -2,7 +2,7 @@
 /cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1102046,1102337
 /cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
 /cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090935-1102339,1102345
-/cassandra/branches/cassandra-0.8.1/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1101014-1102517
+/cassandra/branches/cassandra-0.8.1/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1101014-1102517,1104052
 /cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689
 /incubator/cassandra/branches/cassandra-0.3/interface/gen-java/org/apache/cassandra/service/Cassandra.java:774578-796573
 /incubator/cassandra/branches/cassandra-0.4/interface/gen-java/org/apache/cassandra/service/Cassandra.java:810145-834239,834349-834350

Propchange: cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
------------------------------------------------------------------------------
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue May 17 08:37:17 2011
@@ -2,7 +2,7 @@
 /cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1102046,1102337
 /cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1055654
 /cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1090935-1102339,1102345
-/cassandra/branches/cassandra-0.8.1/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1101014-1102517
+/cassandra/branches/cassandra-0.8.1/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1101014-1102517,1104052
 /cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1051699-1053689
 /incubator/cassandra/branches/cassandra-0.3/interface/gen-java/org/apache/cassandra/service/column_t.java:774578-792198
 /incubator/cassandra/branches/cassandra-0.4/interface/gen-java/org/apache/cassandra/service/Column.java:810145-834239,834349-834350

Propchange: cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
---
svn commit: r1104052 - /cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/db/marshal/ReversedType.java
Author: slebresne
Date: Tue May 17 08:36:02 2011
New Revision: 1104052

URL: http://svn.apache.org/viewvc?rev=1104052&view=rev
Log: Add missing method for CASSANDRA-2355

Modified:
    cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/db/marshal/ReversedType.java

Modified: cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/db/marshal/ReversedType.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/db/marshal/ReversedType.java?rev=1104052&r1=1104051&r2=1104052&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/db/marshal/ReversedType.java (original)
+++ cassandra/branches/cassandra-0.8.1/src/java/org/apache/cassandra/db/marshal/ReversedType.java Tue May 17 08:36:02 2011
@@ -21,6 +21,9 @@ package org.apache.cassandra.db.marshal;
 import java.nio.ByteBuffer;
 import java.util.HashMap;
 import java.util.Map;
+import java.util.List;
+
+import org.apache.cassandra.config.ConfigurationException;

 public class ReversedType extends AbstractType
 {
@@ -30,6 +33,14 @@ public class ReversedType extends Abs
     // package protected for unit tests sake
     final AbstractType baseType;

+    public static ReversedType getInstance(TypeParser parser) throws ConfigurationException
+    {
+        List types = parser.getTypeParameters();
+        if (types.size() != 1)
+            throw new ConfigurationException("ReversedType takes exactly one argument, " + types.size() + " given");
+        return getInstance(types.get(0));
+    }
+
     public static synchronized ReversedType getInstance(AbstractType baseType)
     {
         ReversedType type = instances.get(baseType);
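The pattern in this commit (a getInstance(parser) factory that validates its type-parameter count before delegating) can be illustrated with a self-contained sketch. TypeParser and ConfigurationException below are simplified stand-ins for the Cassandra classes in the diff; the real ConfigurationException is a checked org.apache.cassandra.config.ConfigurationException, made unchecked here for brevity.

```java
// Self-contained illustration of the exactly-one-type-parameter check
// added to ReversedType in r1104052. Stand-in types, not Cassandra code.
import java.util.Arrays;
import java.util.List;

public class ReversedTypeSketch {
    // Simplified: the real exception is checked and lives in
    // org.apache.cassandra.config.
    static class ConfigurationException extends RuntimeException {
        ConfigurationException(String msg) { super(msg); }
    }

    // Stand-in for Cassandra's TypeParser: just hands back its parameters.
    static class TypeParser {
        private final List<String> params;
        TypeParser(String... params) { this.params = Arrays.asList(params); }
        List<String> getTypeParameters() { return params; }
    }

    // Mirrors the new factory: exactly one argument, or fail with a clear message.
    static String getInstance(TypeParser parser) {
        List<String> types = parser.getTypeParameters();
        if (types.size() != 1)
            throw new ConfigurationException(
                "ReversedType takes exactly one argument, " + types.size() + " given");
        return "ReversedType(" + types.get(0) + ")";
    }
}
```

Validating at parse time like this turns a malformed schema string into a descriptive configuration error instead of a later IndexOutOfBoundsException.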