[jira] [Assigned] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson reassigned CASSANDRA-4338: -- Assignee: Marcus Eriksson (was: Aleksey Yeschenko) Experiment with direct buffer in SequentialWriter - Key: CASSANDRA-4338 URL: https://issues.apache.org/jira/browse/CASSANDRA-4338 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Assignee: Marcus Eriksson Priority: Minor Fix For: 2.1 Attachments: 4338-gc.tar.gz, gc-4338-patched.png, gc-trunk.png Using a direct buffer instead of a heap-based byte[] should let us avoid a copy into native memory when we flush the buffer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
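For readers unfamiliar with the idea in this ticket, here is a minimal sketch of buffering writes in a direct (off-heap) ByteBuffer and flushing it through a FileChannel. It is not the SequentialWriter code: the class name, method name, and 64 KiB buffer size are assumptions made for illustration. The motivation, as the ticket describes it, is that a heap-backed buffer typically has to be copied into native memory before the channel write, whereas a direct buffer can be handed to the channel as-is.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not the actual SequentialWriter implementation.
final class DirectBufferSketch {

    // Buffers `payload` in a direct ByteBuffer and flushes it to `file`
    // each time the buffer fills up (and once more at the end).
    static long writeWithDirectBuffer(Path file, byte[] payload) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocateDirect(64 * 1024); // assumed buffer size
        long written = 0;
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            int offset = 0;
            while (offset < payload.length) {
                int chunk = Math.min(buffer.remaining(), payload.length - offset);
                buffer.put(payload, offset, chunk);
                offset += chunk;
                if (!buffer.hasRemaining() || offset == payload.length) {
                    buffer.flip();                       // switch from filling to draining
                    while (buffer.hasRemaining()) {
                        written += channel.write(buffer);
                    }
                    buffer.clear();                      // ready for the next chunk
                }
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("direct-buffer-sketch", ".bin");
        try {
            byte[] payload = "hello, sstable".getBytes(StandardCharsets.UTF_8);
            System.out.println("bytes written: " + writeWithDirectBuffer(file, payload));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Whether this helps in practice depends on allocation cost and GC behaviour, which is presumably what the attached GC graphs were meant to measure.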
git commit: Update versions for 2.0.1 release
Updated Branches:
  refs/heads/cassandra-2.0 742f6a3e1 -> eb96db6c1

Update versions for 2.0.1 release

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/eb96db6c
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/eb96db6c
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/eb96db6c

Branch: refs/heads/cassandra-2.0
Commit: eb96db6c19515e6d1215230f29d25b46fcd005ef
Parents: 742f6a3
Author: Sylvain Lebresne <sylv...@datastax.com>
Authored: Thu Sep 19 13:48:30 2013 +0200
Committer: Sylvain Lebresne <sylv...@datastax.com>
Committed: Fri Sep 20 12:19:34 2013 +0200

--
 NEWS.txt         | 2 +-
 build.xml        | 2 +-
 debian/changelog | 6 ++
 3 files changed, 8 insertions(+), 2 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb96db6c/NEWS.txt
--
diff --git a/NEWS.txt b/NEWS.txt
index 3712073..fc257f4 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -13,7 +13,7 @@
 restore snapshots created with the previous major version using the 'sstableloader' tool. You can upgrade the file format of your snapshots using the provided 'sstableupgrade' tool.
-2.0.2
+2.0.1
 =
 Upgrading

http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb96db6c/build.xml
--
diff --git a/build.xml b/build.xml
index 72291e2..5731540 100644
--- a/build.xml
+++ b/build.xml
@@ -25,7 +25,7 @@
 <property name="debuglevel" value="source,lines,vars"/>
 <!-- default version and SCM information -->
-<property name="base.version" value="2.0.0"/>
+<property name="base.version" value="2.0.1"/>
 <property name="scm.connection" value="scm:git://git.apache.org/cassandra.git"/>
 <property name="scm.developerConnection" value="scm:git://git.apache.org/cassandra.git"/>
 <property name="scm.url" value="http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=tree"/>

http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb96db6c/debian/changelog
--
diff --git a/debian/changelog b/debian/changelog
index 9905726..61a91d7 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+cassandra (2.0.1) unstable; urgency=low
+
+  * New release
+
+ -- Sylvain Lebresne <slebre...@apache.org>  Thu, 19 Sep 2013 13:47:16 +0200
+
 cassandra (2.0.0) unstable; urgency=low

   * New release
Git Push Summary
Updated Tags: refs/tags/2.0.1-tentative [deleted] 72c50bd75
Git Push Summary
Updated Tags: refs/tags/2.0.1-tentative [created] eb96db6c1
[jira] [Created] (CASSANDRA-6070) Keep clients' remote addresses in ClientState
Aleksey Yeschenko created CASSANDRA-6070: Summary: Keep clients' remote addresses in ClientState Key: CASSANDRA-6070 URL: https://issues.apache.org/jira/browse/CASSANDRA-6070 Project: Cassandra Issue Type: Improvement Reporter: Aleksey Yeschenko Assignee: Aleksey Yeschenko Priority: Minor Fix For: 2.0.2 Keep clients' remote addresses in ClientState -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-6070) Keep clients' remote addresses in ClientState
[ https://issues.apache.org/jira/browse/CASSANDRA-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-6070: - Attachment: 6070.txt Keep clients' remote addresses in ClientState - Key: CASSANDRA-6070 URL: https://issues.apache.org/jira/browse/CASSANDRA-6070 Project: Cassandra Issue Type: Improvement Reporter: Aleksey Yeschenko Assignee: Aleksey Yeschenko Priority: Minor Fix For: 2.0.2 Attachments: 6070.txt Keep clients' remote addresses in ClientState -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-6037) Parens around WHERE condition break query
[ https://issues.apache.org/jira/browse/CASSANDRA-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6037: -- Reviewer: slebresne Parens around WHERE condition break query - Key: CASSANDRA-6037 URL: https://issues.apache.org/jira/browse/CASSANDRA-6037 Project: Cassandra Issue Type: Bug Environment: cqlsh, pdo_cassandra Reporter: Sergey Nagaytsev Assignee: Dave Brosius Priority: Minor Labels: cql3 Fix For: 1.2.11 Attachments: 6037.txt SELECT * FROM user WHERE (key=UUID); Bad Request: line 1:25 no viable alternative at input '(' SELECT * FROM user WHERE key=UUID; -- No parens -- Normal output The example provided is minimal, bug was discovered with AND logic on indexed columns. Parens-enclosed conditions is good SQL and so is produced by database abstraction layers in complex queries to avoid operation precedence problems. Fixing this at application side is no option - this will open the can of logic bugs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
git commit: allow parenthesis around where conditions in cql patch by dbrosius reviewed by slebresne for cassandra-6037
Updated Branches:
  refs/heads/cassandra-1.2 df046d6b4 -> a0fa69715

allow parenthesis around where conditions in cql
patch by dbrosius reviewed by slebresne for cassandra-6037

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a0fa6971
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a0fa6971
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a0fa6971

Branch: refs/heads/cassandra-1.2
Commit: a0fa69715f7913804fbd55c1280e0d35edd3bf0f
Parents: df046d6
Author: Dave Brosius <dbros...@apache.org>
Authored: Fri Sep 20 10:47:36 2013 -0400
Committer: Dave Brosius <dbros...@apache.org>
Committed: Fri Sep 20 10:47:36 2013 -0400

--
 src/java/org/apache/cassandra/cql3/Cql.g | 1 +
 1 file changed, 1 insertion(+)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a0fa6971/src/java/org/apache/cassandra/cql3/Cql.g
--
diff --git a/src/java/org/apache/cassandra/cql3/Cql.g b/src/java/org/apache/cassandra/cql3/Cql.g
index 2445bf2..7101c71 100644
--- a/src/java/org/apache/cassandra/cql3/Cql.g
+++ b/src/java/org/apache/cassandra/cql3/Cql.g
@@ -791,6 +791,7 @@ relation[List<Relation> clauses]
 }
 | name=cident K_IN { Relation rel = Relation.createInRelation($name.id); }
 '(' ( f1=term { rel.addInValue(f1); } (',' fN=term { rel.addInValue(fN); } )* )? ')' { $clauses.add(rel); }
+| '(' relation[$clauses] ')'
 ;
 comparatorType returns [CQL3Type t]
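The one-line grammar change in this commit adds a recursive alternative to the relation rule. As a rough illustration of what such a rule accepts, here is a hand-written recursive-descent sketch in Java. It is not generated from Cql.g: the class, the toy tokenizer, and the restriction to simple `name = value` relations are all assumptions. In this sketch a single relation may be wrapped in any number of parentheses, but a parenthesised group containing AND is still rejected, which appears to match the shape of the committed rule.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy parser; the real CQL grammar has many more alternatives.
final class ParenRelationSketch {
    private final String[] tokens;
    private int pos = 0;

    private ParenRelationSketch(String[] tokens) {
        this.tokens = tokens;
    }

    // whereClause := relation ( AND relation )*
    static List<String> parseWhere(String input) {
        String spaced = input.replace("(", " ( ").replace(")", " ) ").replace("=", " = ").trim();
        ParenRelationSketch p = new ParenRelationSketch(spaced.split("\\s+"));
        List<String> clauses = new ArrayList<>();
        p.relation(clauses);
        while (p.pos < p.tokens.length && p.tokens[p.pos].equalsIgnoreCase("AND")) {
            p.pos++;
            p.relation(clauses);
        }
        if (p.pos != p.tokens.length) {
            throw new IllegalArgumentException("unexpected token: " + p.tokens[p.pos]);
        }
        return clauses;
    }

    // relation := name '=' value | '(' relation ')'
    private void relation(List<String> clauses) {
        if (pos < tokens.length && tokens[pos].equals("(")) {
            pos++;                 // consume '('
            relation(clauses);     // the recursive alternative
            expect(")");
        } else {
            String name = next();
            expect("=");
            String value = next();
            clauses.add(name + "=" + value);
        }
    }

    private String next() {
        if (pos >= tokens.length) {
            throw new IllegalArgumentException("unexpected end of input");
        }
        return tokens[pos++];
    }

    private void expect(String token) {
        String got = next();
        if (!got.equals(token)) {
            throw new IllegalArgumentException("expected " + token + " but got " + got);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseWhere("(key=42)"));
        System.out.println(parseWhere("(a=1) AND ((b=2))"));
    }
}
```

Database abstraction layers that emit `(a=1 AND b=2)` as one group would need a further grammar change under this reading.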
[jira] [Resolved] (CASSANDRA-6037) Parens around WHERE condition break query
[ https://issues.apache.org/jira/browse/CASSANDRA-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Brosius resolved CASSANDRA-6037. - Resolution: Fixed committed as a0fa69715f7913804fbd55c1280e0d35edd3bf0f to cassandra-1.2 Parens around WHERE condition break query - Key: CASSANDRA-6037 URL: https://issues.apache.org/jira/browse/CASSANDRA-6037 Project: Cassandra Issue Type: Bug Environment: cqlsh, pdo_cassandra Reporter: Sergey Nagaytsev Assignee: Dave Brosius Priority: Minor Labels: cql3 Fix For: 1.2.11 Attachments: 6037.txt SELECT * FROM user WHERE (key=UUID); Bad Request: line 1:25 no viable alternative at input '(' SELECT * FROM user WHERE key=UUID; -- No parens -- Normal output The example provided is minimal, bug was discovered with AND logic on indexed columns. Parens-enclosed conditions is good SQL and so is produced by database abstraction layers in complex queries to avoid operation precedence problems. Fixing this at application side is no option - this will open the can of logic bugs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[1/2] git commit: allow parenthesis around where conditions in cql patch by dbrosius reviewed by slebresne for cassandra-6037
Updated Branches:
  refs/heads/cassandra-2.0 eb96db6c1 -> bcb4da739

allow parenthesis around where conditions in cql
patch by dbrosius reviewed by slebresne for cassandra-6037

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a0fa6971
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a0fa6971
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a0fa6971

Branch: refs/heads/cassandra-2.0
Commit: a0fa69715f7913804fbd55c1280e0d35edd3bf0f
Parents: df046d6
Author: Dave Brosius <dbros...@apache.org>
Authored: Fri Sep 20 10:47:36 2013 -0400
Committer: Dave Brosius <dbros...@apache.org>
Committed: Fri Sep 20 10:47:36 2013 -0400

--
 src/java/org/apache/cassandra/cql3/Cql.g | 1 +
 1 file changed, 1 insertion(+)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a0fa6971/src/java/org/apache/cassandra/cql3/Cql.g
--
diff --git a/src/java/org/apache/cassandra/cql3/Cql.g b/src/java/org/apache/cassandra/cql3/Cql.g
index 2445bf2..7101c71 100644
--- a/src/java/org/apache/cassandra/cql3/Cql.g
+++ b/src/java/org/apache/cassandra/cql3/Cql.g
@@ -791,6 +791,7 @@ relation[List<Relation> clauses]
 }
 | name=cident K_IN { Relation rel = Relation.createInRelation($name.id); }
 '(' ( f1=term { rel.addInValue(f1); } (',' fN=term { rel.addInValue(fN); } )* )? ')' { $clauses.add(rel); }
+| '(' relation[$clauses] ')'
 ;
 comparatorType returns [CQL3Type t]
[2/2] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0
Merge branch 'cassandra-1.2' into cassandra-2.0

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bcb4da73
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bcb4da73
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bcb4da73

Branch: refs/heads/cassandra-2.0
Commit: bcb4da739cc6a0fdb83f49772dc0de1659bc8ced
Parents: eb96db6 a0fa697
Author: Dave Brosius <dbros...@apache.org>
Authored: Fri Sep 20 10:49:10 2013 -0400
Committer: Dave Brosius <dbros...@apache.org>
Committed: Fri Sep 20 10:49:10 2013 -0400

--
 src/java/org/apache/cassandra/cql3/Cql.g | 1 +
 1 file changed, 1 insertion(+)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/bcb4da73/src/java/org/apache/cassandra/cql3/Cql.g
--
diff --cc src/java/org/apache/cassandra/cql3/Cql.g
index 6fb0db4,7101c71..17afb00
--- a/src/java/org/apache/cassandra/cql3/Cql.g
+++ b/src/java/org/apache/cassandra/cql3/Cql.g
@@@ -869,10 -789,9 +869,11 @@@ relation[List<Relation> clauses
 for (ColumnIdentifier id : l) $clauses.add(new Relation(id, type, t, true));
 }
+| name=cident K_IN { Term.Raw marker = null; } (QMARK { marker = newINBindVariables(null); } | ':' mid=cident { marker = newINBindVariables(mid); })
+{ $clauses.add(new Relation(name, Relation.Type.IN, marker)); }
 | name=cident K_IN { Relation rel = Relation.createInRelation($name.id); }
 '(' ( f1=term { rel.addInValue(f1); } (',' fN=term { rel.addInValue(fN); } )* )? ')' { $clauses.add(rel); }
+ | '(' relation[$clauses] ')'
 ;
 comparatorType returns [CQL3Type t]
[jira] [Commented] (CASSANDRA-6037) Parens around WHERE condition break query
[ https://issues.apache.org/jira/browse/CASSANDRA-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773029#comment-13773029 ] Sylvain Lebresne commented on CASSANDRA-6037: - +1 Parens around WHERE condition break query - Key: CASSANDRA-6037 URL: https://issues.apache.org/jira/browse/CASSANDRA-6037 Project: Cassandra Issue Type: Bug Environment: cqlsh, pdo_cassandra Reporter: Sergey Nagaytsev Assignee: Dave Brosius Priority: Minor Labels: cql3 Fix For: 1.2.11 Attachments: 6037.txt SELECT * FROM user WHERE (key=UUID); Bad Request: line 1:25 no viable alternative at input '(' SELECT * FROM user WHERE key=UUID; -- No parens -- Normal output The example provided is minimal, bug was discovered with AND logic on indexed columns. Parens-enclosed conditions is good SQL and so is produced by database abstraction layers in complex queries to avoid operation precedence problems. Fixing this at application side is no option - this will open the can of logic bugs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4430) optional pluggable o.a.c.metrics reporters
[ https://issues.apache.org/jira/browse/CASSANDRA-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773056#comment-13773056 ]

Jonathan Ellis commented on CASSANDRA-4430:
---

bq. I think if those deps are excluded the only consequence is less useful error messages.

Let's verify that, because I really don't want to increase our list of hand-maintained jars and licenses (which appear to be missing, btw) by 10% for a very optional feature. Especially when half of the four new ones are not final releases...

optional pluggable o.a.c.metrics reporters
--
Key: CASSANDRA-4430
URL: https://issues.apache.org/jira/browse/CASSANDRA-4430
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Chris Burroughs
Assignee: Chris Burroughs
Priority: Minor
Fix For: 2.1
Attachments: 4430-2.0.txt, 4430-trunk.txt, cassandra-ganglia-example.png

CASSANDRA-4009 expanded the use of the metrics library which has a set of reporter modules http://metrics.codahale.com/manual/core/#reporters You can report to flat files, ganglia, spit everything over http, etc. The next step is a mechanism for using those reporters with o.a.c.metrics. To avoid bundling everything I suggest following the mx4j approach of "enable only if on classpath" coupled with a reporter configuration file.

Strawman file:
{noformat}
console:
  time: 1
  timeunit: seconds
csv:
  - time: 1
    timeunit: minutes
    file: foo.csv
  - time: 10
    timeunit: seconds
    file: bar.csv
ganglia:
  - time: 30
    timeunit: seconds
    host: server-1
    port: 8649
  - time: 30
    timeunit: seconds
    host: server-2
    port: 8649
{noformat}
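The "enable only if on classpath" approach suggested in this ticket could look roughly like the following Java sketch. This is not the attached patch: the class, the method names, and the reporter class names in main are placeholders chosen for illustration. The idea is simply to probe for a reporter class by name without initialising it, and to skip any configured reporter whose class cannot be found.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; class names passed in are assumed to come from a config file.
final class OptionalReporterSketch {

    // Returns true if the named class can be located, without running its static initialisers.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, OptionalReporterSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false; // missing jar or missing transitive dependency
        }
    }

    // Keeps only the configured reporters whose implementation class is available.
    static List<String> enabledReporters(Map<String, String> reporterClassByName) {
        List<String> enabled = new ArrayList<>();
        for (Map.Entry<String, String> entry : reporterClassByName.entrySet()) {
            if (isOnClasspath(entry.getValue())) {
                enabled.add(entry.getKey());
            }
        }
        return enabled;
    }

    public static void main(String[] args) {
        Map<String, String> configured = new LinkedHashMap<>();
        configured.put("console", "java.util.ArrayList");             // stand-in for a bundled class
        configured.put("ganglia", "example.missing.GangliaReporter"); // placeholder, not a real class
        System.out.println(enabledReporters(configured));
    }
}
```

Catching LinkageError as well as ClassNotFoundException is one way to degrade quietly when a reporter jar is present but one of its own dependencies is not, which is the situation the comment above asks to verify.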
[jira] [Updated] (CASSANDRA-1158) Update the write/read paths to include KeyCache and RowCache
[ https://issues.apache.org/jira/browse/CASSANDRA-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-1158: -- Assignee: Tyler Hobbs Update the write/read paths to include KeyCache and RowCache Key: CASSANDRA-1158 URL: https://issues.apache.org/jira/browse/CASSANDRA-1158 Project: Cassandra Issue Type: Improvement Components: Documentation website Affects Versions: 0.6 Reporter: Michael Merickel Assignee: Tyler Hobbs Priority: Trivial Right now the caching is somewhat of a mystery - it'd be useful if the wiki was updated to illustrate the impact of the different caches, when they're updated/invalidated, etc. Also the WIKI currently references the old method (0.5) for setting KeysCachedFraction. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-1158) Update the write/read paths to include KeyCache and RowCache
[ https://issues.apache.org/jira/browse/CASSANDRA-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773101#comment-13773101 ] Jonathan Ellis commented on CASSANDRA-1158: --- Updating the read and write paths is probably overdue in general now. Update the write/read paths to include KeyCache and RowCache Key: CASSANDRA-1158 URL: https://issues.apache.org/jira/browse/CASSANDRA-1158 Project: Cassandra Issue Type: Improvement Components: Documentation website Affects Versions: 0.6 Reporter: Michael Merickel Assignee: Tyler Hobbs Priority: Trivial Right now the caching is somewhat of a mystery - it'd be useful if the wiki was updated to illustrate the impact of the different caches, when they're updated/invalidated, etc. Also the WIKI currently references the old method (0.5) for setting KeysCachedFraction. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CASSANDRA-674) New SSTable Format
[ https://issues.apache.org/jira/browse/CASSANDRA-674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-674. -- Resolution: Won't Fix New SSTable Format -- Key: CASSANDRA-674 URL: https://issues.apache.org/jira/browse/CASSANDRA-674 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stu Hood Attachments: 674-v1.diff, 674-v2.tgz, 674-v3.tgz, 674-ycsb.log, trunk-ycsb.log Various tickets exist due to limitations in the SSTable file format, including #16, #47 and #328. Attached is a proposed design/implementation of a new file format for SSTables that addresses a few of these limitations. This v2 implementation is not ready for serious use: see comments for remaining issues. It is roughly the format described here: http://wiki.apache.org/cassandra/FileFormatDesignDoc -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-5590) User defined types for CQL3
[ https://issues.apache.org/jira/browse/CASSANDRA-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773088#comment-13773088 ]

Aleksey Yeschenko commented on CASSANDRA-5590:
--

Re: reserved types. Suggesting the following list, or some subset of it:
- byte
- smallint (2bytes)
- complex
- enum
- money
- date (just the date, no time, 4 bytes)
- interval (time)
- macaddr
- bitstring

Geometric types, if we ever decide to go that way:
- point
- line
- lseg
- box
- path
- polygon
- circle

User defined types for CQL3
---
Key: CASSANDRA-5590
URL: https://issues.apache.org/jira/browse/CASSANDRA-5590
Project: Cassandra
Issue Type: New Feature
Components: API, Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Fix For: 2.1

A typical use case for a collection could be to store a bunch of addresses in a user profile. An address could typically be composed of a few properties: say a street, a city, a postal code and maybe a few phone numbers associated to it. To model that currently with collections, you might use a {{map<string, blob>}}, where the map key could be a string identifying the address, and the value would be all the infos of an address serialized manually (you can use {{text}} instead of {{blob}} and shove everything in a string if you prefer but the principle is the same).

This ticket suggests to make this more user friendly by allowing:
{noformat}
CREATE TYPE address (
    street text,
    city text,
    zip_code int,
    phones set<text>
)

CREATE TABLE users (
    id uuid PRIMARY KEY,
    name text,
    addresses map<string, address>
)
{noformat}
Under the hood, that type declaration would just be metadata on top of CompositeType (which does mean a limitation would be that we wouldn't allow re-ordering or removal of fields in a custom TYPE). Namely, the {{address}} type would be in practice a {{CompositeType(UTF8Type, UTF8Type, Int32Type, SetType(UTF8Type))}} + some metadata that records the name of each component. In other words, this would mostly be user-friendly syntactic sugar to create composite blobs.

I'll note that this would also be useful outside collections, as it might sometimes be more efficient/useful to have such simple composite blob. For instance, you could imagine to have a:
{noformat}
CREATE TYPE fullname (
    firstname text,
    lastname text
)
{noformat}
and to rewrite the {{users}} table above as
{noformat}
CREATE TABLE users (
    id uuid PRIMARY KEY,
    name fullname,
    addresses map<string, address>
)
{noformat}
In terms of inserts we'd need a syntax for those new struct. Could be:
{noformat}
INSERT INTO users (id, name) VALUES (2ad..., { firstname: 'Paul', lastname: 'smith'});
UPDATE users SET addresses = address + { 'home': { street: '...', city: 'SF', zip_code: 94102, phones: {} } } WHERE id=2ad...;
{noformat}
where the difference with a map is that the key would be a column name (in the CQL3 sense), not a value/literal. Though we might find that a bit confusing and find some other syntax.

On the query side, we could optionally allow things like:
{noformat}
SELECT name.firstname, name.lastname FROM users WHERE id=2ad...;
{noformat}
One open question however is what type do we send back in the result set for a query like:
{noformat}
SELECT name FROM users WHERE id=2ad...;
{noformat}
We could:
# return just that it's the user defined type named {{address}}, but that imply the client has to query the cluster metadata to find out the definition of the type.
# return the full definition of the type every time.

I also note that client side, it might be a tad harder to support such types cleanly in statically type languages than in dynamically typed ones, but that's not the end of the world either.
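The "metadata on top of a composite" idea in this ticket can be pictured with a small Java sketch: the stored value is just an ordered list of components, and the type definition only maps field names to positions. This is not Cassandra code; the class and method names are hypothetical. It also hints at why the ticket expects re-ordering or removing fields to be disallowed, since existing values would then be read through the wrong positions.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: field names are metadata over positional components.
final class NamedCompositeSketch {
    private final List<String> fieldNames;

    NamedCompositeSketch(String... fieldNames) {
        this.fieldNames = Arrays.asList(fieldNames);
    }

    // Looks up a field by name in a positionally stored value.
    Object field(List<Object> components, String name) {
        int index = fieldNames.indexOf(name);
        if (index < 0) {
            throw new IllegalArgumentException("unknown field: " + name);
        }
        return components.get(index);
    }

    public static void main(String[] args) {
        NamedCompositeSketch address = new NamedCompositeSketch("street", "city", "zip_code");
        List<Object> stored = Arrays.<Object>asList("1 Main St", "SF", 94102);
        System.out.println(address.field(stored, "city")); // prints SF
    }
}
```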
[jira] [Commented] (CASSANDRA-1956) Convert row cache to row+filter cache
[ https://issues.apache.org/jira/browse/CASSANDRA-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773105#comment-13773105 ] Jonathan Ellis commented on CASSANDRA-1956: --- This is basically a clunkier implementation of CASSANDRA-5357, right? Should we close it as duplicate? Convert row cache to row+filter cache - Key: CASSANDRA-1956 URL: https://issues.apache.org/jira/browse/CASSANDRA-1956 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stu Hood Assignee: Vijay Priority: Minor Fix For: 2.1 Attachments: 0001-1956-cache-updates-v0.patch, 0001-commiting-block-cache.patch, 0001-re-factor-row-cache.patch, 0001-row-cache-filter.patch, 0002-1956-updates-to-thrift-and-avro-v0.patch, 0002-add-query-cache.patch Changing the row cache to a row+filter cache would make it much more useful. We currently have to warn against using the row cache with wide rows, where the read pattern is typically a peek at the head, but this usecase would be perfect supported by a cache that stored only columns matching the filter. Possible implementations: * (copout) Cache a single filter per row, and leave the cache key as is * Cache a list of filters per row, leaving the cache key as is: this is likely to have some gotchas for weird usage patterns, and it requires the list overheard * Change the cache key to rowkey+filterid: basically ideal, but you need a secondary index to lookup cache entries by rowkey so that you can keep them in sync with the memtable * others? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-5357) Query cache
[ https://issues.apache.org/jira/browse/CASSANDRA-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773111#comment-13773111 ]

Jonathan Ellis commented on CASSANDRA-5357:
---

Getting back to this now that 2.0.1 is done. Sounds reasonable in general. Comments:
# I don't think dropping RowCacheSentinel is valid, unfortunately. Otherwise we still have the same problem as CASSANDRA-3862. (Write can invalidate the row, just before cache adds the pre-write value to it, so stale data will be cached indefinitely.)
# Serializing the entire QueryCacheValue for each lookup is going to kill performance on hot partitions. (Since you have to deserialize a large chunk of filters to just do the existence check.) Suggest that serializing just the CF data is going to work better.
# {{TODO do something here}} looks important :)
# There could be any number of queries but the data will not be repeated within them. Clever.
# There is a property which user can enable to cache the whole row. Not really a fan but I guess existing row cache users will demand it. :-|
# we might be pulling the whole data into memory -- if there's room, that's fine, but exceeding the configured memory budget is Bad. As long as we don't do that I'm fine.

Query cache
---
Key: CASSANDRA-5357
URL: https://issues.apache.org/jira/browse/CASSANDRA-5357
Project: Cassandra
Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Vijay

I think that most people expect the row cache to act like a query cache, because that's a reasonable model. Caching the entire partition is, in retrospect, not really reasonable, so it's not surprising that it catches people off guard, especially given the confusion we've inflicted on ourselves as to what a row constitutes. I propose replacing it with a true query cache.
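The race described in the first comment above (a write invalidating the cache entry just before a reader stores the pre-write value) is the kind of problem a sentinel entry is meant to close. The following Java sketch shows the general pattern only; it is not RowCacheSentinel or any Cassandra class, and all names are hypothetical. A reader first claims the key with a unique sentinel, loads from storage, then publishes the value only if its own sentinel is still in place. A concurrent invalidation removes the sentinel, so the conditional replace fails and the stale value is never cached.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch of the sentinel pattern; loaders must not return null.
final class SentinelCacheSketch {
    // Each read creates its own sentinel; identity equality makes it unique.
    private static final class Sentinel { }

    private final ConcurrentHashMap<String, Object> cache = new ConcurrentHashMap<>();

    String read(String key, Supplier<String> loadFromStorage) {
        Object cached = cache.get(key);
        if (cached instanceof String) {
            return (String) cached;                       // cache hit
        }
        Sentinel sentinel = new Sentinel();
        boolean claimed = cache.putIfAbsent(key, sentinel) == null;
        String value = loadFromStorage.get();
        if (claimed) {
            // Succeeds only if nobody invalidated the key while we were loading.
            cache.replace(key, sentinel, value);
        }
        return value;
    }

    // Write path: drop whatever is cached, including an in-flight sentinel.
    void invalidate(String key) {
        cache.remove(key);
    }

    // Returns the cached value, or null if absent or still a sentinel.
    String cachedValue(String key) {
        Object cached = cache.get(key);
        return cached instanceof String ? (String) cached : null;
    }

    public static void main(String[] args) {
        SentinelCacheSketch cache = new SentinelCacheSketch();
        // Simulate a write landing between the storage read and the cache fill.
        String seen = cache.read("k", () -> {
            String preWrite = "old";
            cache.invalidate("k");
            return preWrite;
        });
        System.out.println("reader saw: " + seen + ", cached: " + cache.cachedValue("k"));
    }
}
```

In the simulated interleaving the reader still returns the value it loaded, but nothing is left in the cache, so the next read goes back to storage.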
[jira] [Updated] (CASSANDRA-5202) CFs should have globally and temporally unique CF IDs to prevent reusing data from earlier incarnation of same CF name
[ https://issues.apache.org/jira/browse/CASSANDRA-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-5202: -- Summary: CFs should have globally and temporally unique CF IDs to prevent reusing data from earlier incarnation of same CF name (was: CFs should have non deterministic CF ID) CFs should have globally and temporally unique CF IDs to prevent reusing data from earlier incarnation of same CF name Key: CASSANDRA-5202 URL: https://issues.apache.org/jira/browse/CASSANDRA-5202 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.1.9 Environment: OS: Windows 7, Server: Cassandra 1.1.9 release drop Client: astyanax 1.56.21, JVM: Sun/Oracle JVM 64 bit (jdk1.6.0_27) Reporter: Marat Bedretdinov Assignee: Yuki Morishita Labels: test Fix For: 2.1 Attachments: 5202-1.1.txt, 5202-2.0.0.txt, astyanax-stress-driver.zip Attached is a driver that sequentially: 1. Drops keyspace 2. Creates keyspace 4. Creates 2 column families 5. Seeds 1M rows with 100 columns 6. Queries these 2 column families The above steps are repeated 1000 times. 
The following exception is observed at random (race - SEDA?):

ERROR [ReadStage:55] 2013-01-29 19:24:52,676 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[ReadStage:55,5,main]
java.lang.AssertionError: DecoratedKey(-1, ) != DecoratedKey(62819832764241410631599989027761269388, 313a31) in C:\var\lib\cassandra\data\user_role_reverse_index\business_entity_role\user_role_reverse_index-business_entity_role-hf-1-Data.db
    at org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:60)
    at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:67)
    at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:79)
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:256)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1367)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1229)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1164)
    at org.apache.cassandra.db.Table.getRow(Table.java:378)
    at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:69)
    at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:822)
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1271)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

This exception appears in the server at the time of client submitting a query request (row slice) and not at the time data is seeded. The client times out and this data can no longer be queried as the same exception would always occur from there on.
Also on iteration 201, it appears that dropping column families failed and as a result their recreation failed with unique column family name violation (see exception below). Note that the data files are actually gone, so it appears that the server runtime responsible for creating column family was out of sync with the piece that dropped them: Starting dropping column families Dropped column families Starting dropping keyspace Dropped keyspace Starting creating column families Created column families Starting seeding data Total rows inserted: 100 in 5105 ms Iteration: 200; Total running time for 1000 queries is 232; Average running time of 1000 queries is 0 ms Starting dropping column families Dropped column families Starting dropping keyspace Dropped keyspace Starting creating column families Created column families Starting seeding data Total rows inserted: 100 in 5361 ms Iteration: 201; Total running time for 1000 queries is 222; Average running time of 1000 queries is 0 ms Starting dropping column families Starting creating column families Exception in thread main com.netflix.astyanax.connectionpool.exceptions.BadRequestException: BadRequestException: [host=127.0.0.1(127.0.0.1):9160, latency=2468(2469), attempts=1]InvalidRequestException(why:Keyspace names must be case-insensitively unique (user_role_reverse_index conflicts with
[jira] [Commented] (CASSANDRA-5351) Avoid repairing already-repaired data by default
[ https://issues.apache.org/jira/browse/CASSANDRA-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773150#comment-13773150 ] Jonathan Ellis commented on CASSANDRA-5351: --- bq. I think it would be simpler to anticompact after repair This is straightforward for STCS (bucket repaired/non-repaired separately) but less so for LCS. Now that we're already doing STCS in L0, I suggest extending that here: reserve the levels for repaired data, and STCS until we can repair. This implies making repair as automatic as compaction, which is a big change for us. I think it's a lot more user friendly, but I'm not 100% confident the performance impact will be negligible. Any better ideas? Avoid repairing already-repaired data by default Key: CASSANDRA-5351 URL: https://issues.apache.org/jira/browse/CASSANDRA-5351 Project: Cassandra Issue Type: Task Components: Core Reporter: Jonathan Ellis Assignee: Lyuben Todorov Labels: repair Fix For: 2.1 Repair has always built its merkle tree from all the data in a columnfamily, which is guaranteed to work but is inefficient. We can improve this by remembering which sstables have already been successfully repaired, and only repairing sstables new since the last repair. (This automatically makes CASSANDRA-3362 much less of a problem too.) The tricky part is, compaction will (if not taught otherwise) mix repaired data together with non-repaired. So we should segregate unrepaired sstables from the repaired ones. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
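The "bucket repaired/non-repaired separately" suggestion for STCS in the comment above amounts to never offering a mixed set of sstables to a single compaction. A minimal Java sketch of that partitioning step follows. It is not Cassandra code: the SSTable class here is a stand-in, and the boolean repaired flag is an assumption about how repair status might be tracked.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch: split compaction candidates by repair status.
final class RepairBucketsSketch {

    // Stand-in for an sstable descriptor; `repaired` is an assumed flag.
    static final class SSTable {
        final String name;
        final boolean repaired;

        SSTable(String name, boolean repaired) {
            this.name = name;
            this.repaired = repaired;
        }
    }

    // Key true holds repaired sstable names, key false holds unrepaired ones.
    static Map<Boolean, List<String>> bucketByRepairStatus(List<SSTable> sstables) {
        return sstables.stream().collect(
                Collectors.partitioningBy(
                        s -> s.repaired,
                        Collectors.mapping(s -> s.name, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<SSTable> all = Arrays.asList(
                new SSTable("ks-cf-1", true),
                new SSTable("ks-cf-2", false),
                new SSTable("ks-cf-3", false));
        Map<Boolean, List<String>> buckets = bucketByRepairStatus(all);
        System.out.println("repaired:   " + buckets.get(true));
        System.out.println("unrepaired: " + buckets.get(false));
    }
}
```

A size-tiered strategy could then pick candidates within each bucket independently; how that interacts with levelled compaction is the open question raised in the comment.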
[jira] [Commented] (CASSANDRA-5357) Query cache
[ https://issues.apache.org/jira/browse/CASSANDRA-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773131#comment-13773131 ] Nate McCall commented on CASSANDRA-5357: bq. There is a property which user can enable to cache the whole row. Not really a fan but I guess existing row cache users will demand it. :-| Depends on how this is exposed to the APIs. If I want to cache the whole row, i'll get my (potentially paged) slice on with 'cache_me=true' or what have you. Particularly with the idea on sharing the same data from different queries. In general, having to pre-heat caches with queries *would not* be a new thing to developers. Query cache --- Key: CASSANDRA-5357 URL: https://issues.apache.org/jira/browse/CASSANDRA-5357 Project: Cassandra Issue Type: Bug Reporter: Jonathan Ellis Assignee: Vijay I think that most people expect the row cache to act like a query cache, because that's a reasonable model. Caching the entire partition is, in retrospect, not really reasonable, so it's not surprising that it catches people off guard, especially given the confusion we've inflicted on ourselves as to what a row constitutes. I propose replacing it with a true query cache. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
git commit: Replace Iterable->Collection in SSTCN
Updated Branches:
  refs/heads/cassandra-2.0 bcb4da739 -> 0d976a8fb

Replace Iterable->Collection in SSTCN

ninja-patch by Aleksey Yeschenko; ninja-reviewed by Jonathan Ellis

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0d976a8f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0d976a8f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0d976a8f

Branch: refs/heads/cassandra-2.0
Commit: 0d976a8fb57d6524e81a6a3033f7672e5b2be2ae
Parents: bcb4da7
Author: Aleksey Yeschenko alek...@apache.org
Authored: Fri Sep 20 19:43:15 2013 +0300
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Fri Sep 20 19:43:15 2013 +0300

--
 .../apache/cassandra/db/ColumnFamilyStore.java  |  2 +-
 .../org/apache/cassandra/db/DataTracker.java    | 30
 .../db/compaction/LeveledManifest.java          |  8 +++---
 .../SSTableListChangedNotification.java         |  8 --
 4 files changed, 22 insertions(+), 26 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0d976a8f/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 4c9f72d..1ff4832 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -1066,7 +1066,7 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
         data.markObsolete(sstables, compactionType);
     }
 
-    public void replaceCompactedSSTables(Collection<SSTableReader> sstables, Iterable<SSTableReader> replacements, OperationType compactionType)
+    public void replaceCompactedSSTables(Collection<SSTableReader> sstables, Collection<SSTableReader> replacements, OperationType compactionType)
     {
         data.replaceCompactedSSTables(sstables, replacements, compactionType);
     }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0d976a8f/src/java/org/apache/cassandra/db/DataTracker.java
--
diff --git a/src/java/org/apache/cassandra/db/DataTracker.java b/src/java/org/apache/cassandra/db/DataTracker.java
index e7d26b0..f30ec1e 100644
--- a/src/java/org/apache/cassandra/db/DataTracker.java
+++ b/src/java/org/apache/cassandra/db/DataTracker.java
@@ -44,14 +44,14 @@ public class DataTracker
 {
     private static final Logger logger = LoggerFactory.getLogger(DataTracker.class);
 
-    public final Collection<INotificationConsumer> subscribers = new CopyOnWriteArrayList<INotificationConsumer>();
+    public final Collection<INotificationConsumer> subscribers = new CopyOnWriteArrayList<>();
     public final ColumnFamilyStore cfstore;
     private final AtomicReference<View> view;
 
     public DataTracker(ColumnFamilyStore cfstore)
     {
         this.cfstore = cfstore;
-        this.view = new AtomicReference<View>();
+        this.view = new AtomicReference<>();
         this.init();
     }
 
@@ -231,7 +231,7 @@ public class DataTracker
         notifySSTablesChanged(sstables, Collections.<SSTableReader>emptyList(), compactionType);
     }
 
-    public void replaceCompactedSSTables(Collection<SSTableReader> sstables, Iterable<SSTableReader> replacements, OperationType compactionType)
+    public void replaceCompactedSSTables(Collection<SSTableReader> sstables, Collection<SSTableReader> replacements, OperationType compactionType)
     {
         replace(sstables, replacements);
         notifySSTablesChanged(sstables, replacements, compactionType);
@@ -285,15 +285,13 @@ public class DataTracker
     void removeUnreadableSSTables(File directory)
     {
         View currentView, newView;
-        List<SSTableReader> remaining = new ArrayList<SSTableReader>();
+        List<SSTableReader> remaining = new ArrayList<>();
         do
         {
             currentView = view.get();
             for (SSTableReader r : currentView.nonCompactingSStables())
-            {
                 if (!r.descriptor.directory.equals(directory))
                     remaining.add(r);
-            }
 
             if (remaining.size() == currentView.nonCompactingSStables().size())
                 return;
@@ -379,9 +377,7 @@ public class DataTracker
     {
         long n = 0;
         for (SSTableReader sstable : getSSTables())
-        {
             n += sstable.estimatedKeys();
-        }
         return n;
     }
 
@@ -415,13 +411,11 @@ public class DataTracker
         return 0;
     }
 
-    public void notifySSTablesChanged(Iterable<SSTableReader> removed, Iterable<SSTableReader> added, OperationType compactionType)
+    public void notifySSTablesChanged(Collection<SSTableReader> removed, Collection<SSTableReader>
[jira] [Updated] (CASSANDRA-5351) Avoid repairing already-repaired data by default
[ https://issues.apache.org/jira/browse/CASSANDRA-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-5351: -- Reviewer: slebresne Assignee: Lyuben Todorov Avoid repairing already-repaired data by default Key: CASSANDRA-5351 URL: https://issues.apache.org/jira/browse/CASSANDRA-5351 Project: Cassandra Issue Type: Task Components: Core Reporter: Jonathan Ellis Assignee: Lyuben Todorov Labels: repair Fix For: 2.1 Repair has always built its merkle tree from all the data in a columnfamily, which is guaranteed to work but is inefficient. We can improve this by remembering which sstables have already been successfully repaired, and only repairing sstables new since the last repair. (This automatically makes CASSANDRA-3362 much less of a problem too.) The tricky part is, compaction will (if not taught otherwise) mix repaired data together with non-repaired. So we should segregate unrepaired sstables from the repaired ones. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (CASSANDRA-5351) Avoid repairing already-repaired data by default
[ https://issues.apache.org/jira/browse/CASSANDRA-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773150#comment-13773150 ] Jonathan Ellis edited comment on CASSANDRA-5351 at 9/20/13 4:46 PM: bq. The more often you repair the less big a full separate set of levels for unrepaired data would be. So maybe that's the way to go. Which is to say, we'd be kicking repairs off as automatically as we currently kick off compaction. I still don't have any better ideas. [~krummas]? was (Author: jbellis): bq. I think it would be simpler to anticompact after repair This is straightforward for STCS (bucket repaired/non-repaired separately) but less so for LCS. Now that we're already doing STCS in L0, I suggest extending that here: reserve the levels for repaired data, and STCS until we can repair. This implies making repair as automatic as compaction, which is a big change for us. I think it's a lot more user friendly, but I'm not 100% confident the performance impact will be negligible. Any better ideas? Avoid repairing already-repaired data by default Key: CASSANDRA-5351 URL: https://issues.apache.org/jira/browse/CASSANDRA-5351 Project: Cassandra Issue Type: Task Components: Core Reporter: Jonathan Ellis Assignee: Lyuben Todorov Labels: repair Fix For: 2.1 Repair has always built its merkle tree from all the data in a columnfamily, which is guaranteed to work but is inefficient. We can improve this by remembering which sstables have already been successfully repaired, and only repairing sstables new since the last repair. (This automatically makes CASSANDRA-3362 much less of a problem too.) The tricky part is, compaction will (if not taught otherwise) mix repaired data together with non-repaired. So we should segregate unrepaired sstables from the repaired ones. -- This message is automatically generated by JIRA. 
[3/5] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0
Merge branch 'cassandra-1.2' into cassandra-2.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bcb4da73 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bcb4da73 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bcb4da73 Branch: refs/heads/trunk Commit: bcb4da739cc6a0fdb83f49772dc0de1659bc8ced Parents: eb96db6 a0fa697 Author: Dave Brosius dbros...@apache.org Authored: Fri Sep 20 10:49:10 2013 -0400 Committer: Dave Brosius dbros...@apache.org Committed: Fri Sep 20 10:49:10 2013 -0400 -- src/java/org/apache/cassandra/cql3/Cql.g | 1 + 1 file changed, 1 insertion(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/bcb4da73/src/java/org/apache/cassandra/cql3/Cql.g -- diff --cc src/java/org/apache/cassandra/cql3/Cql.g index 6fb0db4,7101c71..17afb00 --- a/src/java/org/apache/cassandra/cql3/Cql.g +++ b/src/java/org/apache/cassandra/cql3/Cql.g @@@ -869,10 -789,9 +869,11 @@@ relation[ListRelation clauses for (ColumnIdentifier id : l) $clauses.add(new Relation(id, type, t, true)); } +| name=cident K_IN { Term.Raw marker = null; } (QMARK { marker = newINBindVariables(null); } | ':' mid=cident { marker = newINBindVariables(mid); }) +{ $clauses.add(new Relation(name, Relation.Type.IN, marker)); } | name=cident K_IN { Relation rel = Relation.createInRelation($name.id); } '(' ( f1=term { rel.addInValue(f1); } (',' fN=term { rel.addInValue(fN); } )* )? ')' { $clauses.add(rel); } + | '(' relation[$clauses] ')' ; comparatorType returns [CQL3Type t]
[1/5] git commit: Update versions for 2.0.1 release
Updated Branches: refs/heads/trunk 309324171 - 4fb090481 Update versions for 2.0.1 release Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/eb96db6c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/eb96db6c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/eb96db6c Branch: refs/heads/trunk Commit: eb96db6c19515e6d1215230f29d25b46fcd005ef Parents: 742f6a3 Author: Sylvain Lebresne sylv...@datastax.com Authored: Thu Sep 19 13:48:30 2013 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Sep 20 12:19:34 2013 +0200 -- NEWS.txt | 2 +- build.xml| 2 +- debian/changelog | 6 ++ 3 files changed, 8 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb96db6c/NEWS.txt -- diff --git a/NEWS.txt b/NEWS.txt index 3712073..fc257f4 100644 --- a/NEWS.txt +++ b/NEWS.txt @@ -13,7 +13,7 @@ restore snapshots created with the previous major version using the 'sstableloader' tool. You can upgrade the file format of your snapshots using the provided 'sstableupgrade' tool. 
-2.0.2 +2.0.1 = Upgrading http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb96db6c/build.xml -- diff --git a/build.xml b/build.xml index 72291e2..5731540 100644 --- a/build.xml +++ b/build.xml @@ -25,7 +25,7 @@ property name=debuglevel value=source,lines,vars/ !-- default version and SCM information -- -property name=base.version value=2.0.0/ +property name=base.version value=2.0.1/ property name=scm.connection value=scm:git://git.apache.org/cassandra.git/ property name=scm.developerConnection value=scm:git://git.apache.org/cassandra.git/ property name=scm.url value=http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=tree/ http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb96db6c/debian/changelog -- diff --git a/debian/changelog b/debian/changelog index 9905726..61a91d7 100644 --- a/debian/changelog +++ b/debian/changelog @@ -1,3 +1,9 @@ +cassandra (2.0.1) unstable; urgency=low + + * New release + + -- Sylvain Lebresne slebre...@apache.org Thu, 19 Sep 2013 13:47:16 +0200 + cassandra (2.0.0) unstable; urgency=low * New release
[5/5] git commit: Merge branch 'cassandra-2.0' into trunk
Merge branch 'cassandra-2.0' into trunk Conflicts: NEWS.txt build.xml Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4fb09048 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4fb09048 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4fb09048 Branch: refs/heads/trunk Commit: 4fb090481d508aa9c1c18d79cd012702dfc8f45f Parents: 3093241 0d976a8 Author: Aleksey Yeschenko alek...@apache.org Authored: Fri Sep 20 19:49:00 2013 +0300 Committer: Aleksey Yeschenko alek...@apache.org Committed: Fri Sep 20 19:49:00 2013 +0300 -- NEWS.txt| 4 ++- debian/changelog| 6 src/java/org/apache/cassandra/cql3/Cql.g| 1 + .../apache/cassandra/db/ColumnFamilyStore.java | 2 +- .../org/apache/cassandra/db/DataTracker.java| 30 .../db/compaction/LeveledManifest.java | 8 +++--- .../SSTableListChangedNotification.java | 8 -- 7 files changed, 32 insertions(+), 27 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/4fb09048/NEWS.txt -- diff --cc NEWS.txt index 8990769,fc257f4..a7576d2 --- a/NEWS.txt +++ b/NEWS.txt @@@ -13,18 -13,7 +13,20 @@@ restore snapshots created with the prev 'sstableloader' tool. You can upgrade the file format of your snapshots using the provided 'sstableupgrade' tool. ++ +2.1 +=== ++ +Upgrading +- + - Rolling upgrades from anything pre-2.0 is not supported. + - For leveled compaction users, 2.0 must be atleast started before + upgrading to 2.1 due to the fact that the old JSON leveled + manifest is migrated into the sstable metadata files on startup + in 2.0 and this code is gone from 2.1. 
+ + - 2.0.2 + 2.0.1 = Upgrading http://git-wip-us.apache.org/repos/asf/cassandra/blob/4fb09048/src/java/org/apache/cassandra/db/ColumnFamilyStore.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/4fb09048/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java -- diff --cc src/java/org/apache/cassandra/db/compaction/LeveledManifest.java index 2d5aa27,82aa2d6..23f842d --- a/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java +++ b/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java @@@ -125,12 -127,12 +125,12 @@@ public class LeveledManifes return newLevel; } - public synchronized void replace(IterableSSTableReader removed, IterableSSTableReader added) + public synchronized void replace(CollectionSSTableReader removed, CollectionSSTableReader added) { - assert !Iterables.isEmpty(removed); // use add() instead of promote when adding new sstables + assert !removed.isEmpty(); // use add() instead of promote when adding new sstables logDistribution(); if (logger.isDebugEnabled()) -logger.debug(Replacing [ + toString(removed) + ]); +logger.debug(Replacing [{}], toString(removed)); // the level for the added sstables is the max of the removed ones, // plus one if the removed were all on the same level
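The practical effect of moving these signatures from Iterable to Collection can be sketched as follows: a Collection exposes isEmpty() and size() directly, so the precondition no longer needs a helper such as Guava's Iterables.isEmpty. This is a simplified illustration only; ReplaceSketch and its replace method are hypothetical, model sstables as plain strings, and are not the real LeveledManifest code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

final class ReplaceSketch
{
    // Hypothetical stand-in for a manifest's replace(): returns the live set
    // after swapping the removed sstables for the added ones.
    static List<String> replace(Collection<String> live, Collection<String> removed, Collection<String> added)
    {
        // With a Collection the emptiness check is a plain method call.
        if (removed.isEmpty())
            throw new IllegalArgumentException("use add() instead of replace() when adding new sstables");

        List<String> result = new ArrayList<>(live);
        result.removeAll(removed);
        result.addAll(added);
        return result;
    }

    public static void main(String[] args)
    {
        List<String> live = Arrays.asList("a", "b", "c");
        // prints [c, d]
        System.out.println(replace(live, Arrays.asList("a", "b"), Collections.singletonList("d")));
    }
}
```

The trade-off is that callers holding a lazy Iterable must materialise it first, which the commit appears to accept in exchange for simpler call sites.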
[4/5] git commit: Replace Iterable->Collection in SSTCN
Replace Iterable->Collection in SSTCN

ninja-patch by Aleksey Yeschenko; ninja-reviewed by Jonathan Ellis

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0d976a8f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0d976a8f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0d976a8f

Branch: refs/heads/trunk
Commit: 0d976a8fb57d6524e81a6a3033f7672e5b2be2ae
Parents: bcb4da7
Author: Aleksey Yeschenko alek...@apache.org
Authored: Fri Sep 20 19:43:15 2013 +0300
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Fri Sep 20 19:43:15 2013 +0300

--
 .../apache/cassandra/db/ColumnFamilyStore.java  |  2 +-
 .../org/apache/cassandra/db/DataTracker.java    | 30
 .../db/compaction/LeveledManifest.java          |  8 +++---
 .../SSTableListChangedNotification.java         |  8 --
 4 files changed, 22 insertions(+), 26 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0d976a8f/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 4c9f72d..1ff4832 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -1066,7 +1066,7 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
         data.markObsolete(sstables, compactionType);
     }
 
-    public void replaceCompactedSSTables(Collection<SSTableReader> sstables, Iterable<SSTableReader> replacements, OperationType compactionType)
+    public void replaceCompactedSSTables(Collection<SSTableReader> sstables, Collection<SSTableReader> replacements, OperationType compactionType)
     {
         data.replaceCompactedSSTables(sstables, replacements, compactionType);
     }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0d976a8f/src/java/org/apache/cassandra/db/DataTracker.java
--
diff --git a/src/java/org/apache/cassandra/db/DataTracker.java b/src/java/org/apache/cassandra/db/DataTracker.java
index e7d26b0..f30ec1e 100644
--- a/src/java/org/apache/cassandra/db/DataTracker.java
+++ b/src/java/org/apache/cassandra/db/DataTracker.java
@@ -44,14 +44,14 @@ public class DataTracker
 {
     private static final Logger logger = LoggerFactory.getLogger(DataTracker.class);
 
-    public final Collection<INotificationConsumer> subscribers = new CopyOnWriteArrayList<INotificationConsumer>();
+    public final Collection<INotificationConsumer> subscribers = new CopyOnWriteArrayList<>();
     public final ColumnFamilyStore cfstore;
     private final AtomicReference<View> view;
 
     public DataTracker(ColumnFamilyStore cfstore)
     {
         this.cfstore = cfstore;
-        this.view = new AtomicReference<View>();
+        this.view = new AtomicReference<>();
         this.init();
     }
 
@@ -231,7 +231,7 @@ public class DataTracker
         notifySSTablesChanged(sstables, Collections.<SSTableReader>emptyList(), compactionType);
     }
 
-    public void replaceCompactedSSTables(Collection<SSTableReader> sstables, Iterable<SSTableReader> replacements, OperationType compactionType)
+    public void replaceCompactedSSTables(Collection<SSTableReader> sstables, Collection<SSTableReader> replacements, OperationType compactionType)
     {
         replace(sstables, replacements);
         notifySSTablesChanged(sstables, replacements, compactionType);
@@ -285,15 +285,13 @@ public class DataTracker
     void removeUnreadableSSTables(File directory)
     {
         View currentView, newView;
-        List<SSTableReader> remaining = new ArrayList<SSTableReader>();
+        List<SSTableReader> remaining = new ArrayList<>();
         do
         {
             currentView = view.get();
             for (SSTableReader r : currentView.nonCompactingSStables())
-            {
                 if (!r.descriptor.directory.equals(directory))
                     remaining.add(r);
-            }
 
             if (remaining.size() == currentView.nonCompactingSStables().size())
                 return;
@@ -379,9 +377,7 @@ public class DataTracker
     {
         long n = 0;
         for (SSTableReader sstable : getSSTables())
-        {
             n += sstable.estimatedKeys();
-        }
         return n;
     }
 
@@ -415,13 +411,11 @@ public class DataTracker
         return 0;
     }
 
-    public void notifySSTablesChanged(Iterable<SSTableReader> removed, Iterable<SSTableReader> added, OperationType compactionType)
+    public void notifySSTablesChanged(Collection<SSTableReader> removed, Collection<SSTableReader> added, OperationType compactionType)
     {
+        INotification
[jira] [Resolved] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-5220. --- Resolution: Later Assignee: (was: Yuki Morishita) I suspect and hope that CASSANDRA-5351 will speed up repair enough that we won't need to tweak around the edges like this. Repair improvements when using vnodes - Key: CASSANDRA-5220 URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.2.0 beta 1 Reporter: Brandon Williams Fix For: 2.1 Currently when using vnodes, repair takes much longer to complete than without them. This appears at least in part because it's using a session per range and processing them sequentially. This generates a lot of log spam with vnodes, and while being gentler and lighter on hard disk deployments, ssd-based deployments would often prefer that repair be as fast as possible. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[2/5] git commit: allow parenthesis around where conditions in cql patch by dbrosius reviewed by slebresne for cassandra-6037
allow parenthesis around where conditions in cql patch by dbrosius reviewed by slebresne for cassandra-6037 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a0fa6971 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a0fa6971 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a0fa6971 Branch: refs/heads/trunk Commit: a0fa69715f7913804fbd55c1280e0d35edd3bf0f Parents: df046d6 Author: Dave Brosius dbros...@apache.org Authored: Fri Sep 20 10:47:36 2013 -0400 Committer: Dave Brosius dbros...@apache.org Committed: Fri Sep 20 10:47:36 2013 -0400 -- src/java/org/apache/cassandra/cql3/Cql.g | 1 + 1 file changed, 1 insertion(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a0fa6971/src/java/org/apache/cassandra/cql3/Cql.g -- diff --git a/src/java/org/apache/cassandra/cql3/Cql.g b/src/java/org/apache/cassandra/cql3/Cql.g index 2445bf2..7101c71 100644 --- a/src/java/org/apache/cassandra/cql3/Cql.g +++ b/src/java/org/apache/cassandra/cql3/Cql.g @@ -791,6 +791,7 @@ relation[ListRelation clauses] } | name=cident K_IN { Relation rel = Relation.createInRelation($name.id); } '(' ( f1=term { rel.addInValue(f1); } (',' fN=term { rel.addInValue(fN); } )* )? ')' { $clauses.add(rel); } +| '(' relation[$clauses] ')' ; comparatorType returns [CQL3Type t]
[jira] [Created] (CASSANDRA-6071) CqlStorage loading compact table adds an extraneous field to the pig schema
Sam Tunnicliffe created CASSANDRA-6071:
--
Summary: CqlStorage loading compact table adds an extraneous field to the pig schema
Key: CASSANDRA-6071
URL: https://issues.apache.org/jira/browse/CASSANDRA-6071
Project: Cassandra
Issue Type: Bug
Components: Hadoop
Reporter: Sam Tunnicliffe
Assignee: Sam Tunnicliffe
Priority: Minor

{code}
CREATE TABLE t (
  key text,
  field1 int,
  field2 int,
  PRIMARY KEY (key, field1)
) WITH COMPACT STORAGE;

INSERT INTO t (key,field1,field2) VALUES ('key1',1,2);
INSERT INTO t (key,field1,field2) VALUES ('key2',1,2);
INSERT INTO t (key,field1,field2) VALUES ('key3',1,2);
{code}

{code}
grunt> t = LOAD 'cql://ks/t' USING CqlStorage();
grunt> describe t;
t: {key: chararray,field1: int,field2: int,value: int}
dump t;
(key1,1,2,)
(key3,1,2,)
(key2,1,2,)
{code}
[jira] [Updated] (CASSANDRA-6071) CqlStorage loading compact table adds an extraneous field to the pig schema
[ https://issues.apache.org/jira/browse/CASSANDRA-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sam Tunnicliffe updated CASSANDRA-6071:
---
Attachment: 6071.txt

CqlStorage loading compact table adds an extraneous field to the pig schema
---
Key: CASSANDRA-6071
URL: https://issues.apache.org/jira/browse/CASSANDRA-6071
Project: Cassandra
Issue Type: Bug
Components: Hadoop
Reporter: Sam Tunnicliffe
Assignee: Sam Tunnicliffe
Priority: Minor
Attachments: 6071.txt

{code}
CREATE TABLE t (
  key text,
  field1 int,
  field2 int,
  PRIMARY KEY (key, field1)
) WITH COMPACT STORAGE;

INSERT INTO t (key,field1,field2) VALUES ('key1',1,2);
INSERT INTO t (key,field1,field2) VALUES ('key2',1,2);
INSERT INTO t (key,field1,field2) VALUES ('key3',1,2);
{code}

{code}
grunt> t = LOAD 'cql://ks/t' USING CqlStorage();
grunt> describe t;
t: {key: chararray,field1: int,field2: int,value: int}
dump t;
(key1,1,2,)
(key3,1,2,)
(key2,1,2,)
{code}
[jira] [Commented] (CASSANDRA-6071) CqlStorage loading compact table adds an extraneous field to the pig schema
[ https://issues.apache.org/jira/browse/CASSANDRA-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773191#comment-13773191 ]

Sam Tunnicliffe commented on CASSANDRA-6071:

This is caused by an additional ColumnDef being added in the CfDef used to construct the Pig schema. Where a value_alias is specified for the compact value column, CqlStorage.getKeysMeta adds a ColumnDef for it. The additional one is then added in AbstractCassandraStorage.getColumnMeta, so I've added an additional boolean arg to indicate whether the value_alias has already been processed. I've also refactored ACS.getColumnMeta a bit to (hopefully) make the logic clearer.

CqlStorage loading compact table adds an extraneous field to the pig schema
---
Key: CASSANDRA-6071
URL: https://issues.apache.org/jira/browse/CASSANDRA-6071
Project: Cassandra
Issue Type: Bug
Components: Hadoop
Reporter: Sam Tunnicliffe
Assignee: Sam Tunnicliffe
Priority: Minor
Attachments: 6071.txt

{code}
CREATE TABLE t (
  key text,
  field1 int,
  field2 int,
  PRIMARY KEY (key, field1)
) WITH COMPACT STORAGE;

INSERT INTO t (key,field1,field2) VALUES ('key1',1,2);
INSERT INTO t (key,field1,field2) VALUES ('key2',1,2);
INSERT INTO t (key,field1,field2) VALUES ('key3',1,2);
{code}

{code}
grunt> t = LOAD 'cql://ks/t' USING CqlStorage();
grunt> describe t;
t: {key: chararray,field1: int,field2: int,value: int}
dump t;
(key1,1,2,)
(key3,1,2,)
(key2,1,2,)
{code}
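The fix described in the comment (a boolean flag saying whether the value alias was already handled) might look roughly like the sketch below. It is not the attached patch: columns are modelled as plain strings rather than Thrift ColumnDef objects, and columnMeta, keysMeta and valueAliasProcessed are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ValueAliasSketch
{
    // keysMeta: columns already contributed by the keys step, which for a
    // compact table with a value_alias includes the aliased value column.
    // compactValueColumn: the name the column step would add for the same column.
    static List<String> columnMeta(List<String> keysMeta, String compactValueColumn, boolean valueAliasProcessed)
    {
        List<String> schema = new ArrayList<>(keysMeta);
        // Only add the compact value column if the keys step has not done so already.
        if (!valueAliasProcessed)
            schema.add(compactValueColumn);
        return schema;
    }

    public static void main(String[] args)
    {
        List<String> keysMeta = Arrays.asList("key", "field1", "field2");
        // Without the flag the extraneous field appears: [key, field1, field2, value]
        System.out.println(columnMeta(keysMeta, "value", false));
        // With the flag set the schema matches the table: [key, field1, field2]
        System.out.println(columnMeta(keysMeta, "value", true));
    }
}
```

The first call reproduces the shape of the reported schema, with the trailing value field; the second shows the schema once the duplicate is suppressed.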
[jira] [Commented] (CASSANDRA-4775) Counters 2.0
[ https://issues.apache.org/jira/browse/CASSANDRA-4775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773230#comment-13773230 ] Craig Hansen commented on CASSANDRA-4775: - I hate to pollute such a scholarly thread with this comment. But I've been researching all of the potential issues with cassandra counters for several days now, and I have to say I'm not too encouraged by everything I'm reading. I love how they work in development, however - just brilliant. While there is a lot of information relating to potential problems, there seems to be very little consensus regarding potential solutions. In my case, I'm just trying to figure out if they are good enough for my use cases, and whether or not there is any way to configure a cassandra cluster specifically to mitigate some of the risks of using counters. I'd be willing to create a specialized cluster for counter column families if the risks could be mitigated through configuration, various write consistency levels, etc. So at this point we're looking at using redis sets or cassandra counters intra-day just for speed, and summarizing transactional data to cassandra integer columns periodically for durability and historical accuracy. Any links to resources for such solutions would be greatly appreciated. Also any practical information relating to just how fragile they have proven to be would be helpful. The main thing is strategic: It would really help if I could get a sense of whether resolving counter issues is on the roadmap, or if they will remain in the OK use them, but it's gonna be dumb category. Counters 2.0 Key: CASSANDRA-4775 URL: https://issues.apache.org/jira/browse/CASSANDRA-4775 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Arya Goudarzi Assignee: Aleksey Yeschenko Labels: counters Fix For: 2.1 The existing partitioned counters remain a source of frustration for most users almost two years after being introduced. 
The remaining problems are inherent in the design, not something that can be fixed given enough time/eyeballs. Ideally a solution would give us - similar performance - less special cases in the code - potential for a retry mechanism -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-5540) Concurrent secondary index updates remove rows from the index
[ https://issues.apache.org/jira/browse/CASSANDRA-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773286#comment-13773286 ]

Robert Coli commented on CASSANDRA-5540:

Is it a correct assessment that this issue has been present since the initial implementation of 2i in 0.7? If not, since when?

Concurrent secondary index updates remove rows from the index
-
Key: CASSANDRA-5540
URL: https://issues.apache.org/jira/browse/CASSANDRA-5540
Project: Cassandra
Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Alexei Bakanov
Assignee: Sam Tunnicliffe
Fix For: 1.2.5
Attachments: 0001-Use-different-index-updater-for-live-updates-compact.patch, 5540.txt

Existing rows disappear from the secondary index when a row is updated concurrently with the same secondary index value. Here is a little pycassa script that reproduces the bug. The script inserts 4 rows with the same secondary index value, reads those rows back and checks that there are 4 of them. Please run two instances of the script simultaneously in two separate terminals in order to simulate concurrent updates:

{code}
---script.py START---
import pycassa
from pycassa.index import *

pool = pycassa.ConnectionPool('ks123')
cf = pycassa.ColumnFamily(pool, 'cf1')

while True:
    for rowKey in xrange(4):
        cf.insert(str(rowKey), {'indexedColumn': 'indexedValue'})
    index_expression = create_index_expression('indexedColumn', 'indexedValue')
    index_clause = create_index_clause([index_expression])
    rows = cf.get_indexed_slices(index_clause)
    length = len(list(rows))
    if length == 4:
        pass
    else:
        print 'found just %d rows out of 4' % length
pool.dispose()
---script.py FINISH---

---schema cli start---
create keyspace ks123
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {datacenter1 : 1}
  and durable_writes = true;

use ks123;

create column family cf1
  with column_type = 'Standard'
  and comparator = 'AsciiType'
  and default_validation_class = 'AsciiType'
  and key_validation_class = 'AsciiType'
  and read_repair_chance = 0.1
  and dclocal_read_repair_chance = 0.0
  and populate_io_cache_on_flush = false
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
  and caching = 'KEYS_ONLY'
  and column_metadata = [
    {column_name : 'indexedColumn',
     validation_class : AsciiType,
     index_name : 'INDEX1',
     index_type : 0}]
  and compression_options = {'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};
---schema cli finish---
{code}

Test cluster created with 'ccm create --cassandra-version 1.2.4 --nodes 1 --start testUpdate'
[jira] [Commented] (CASSANDRA-5351) Avoid repairing already-repaired data by default
[ https://issues.apache.org/jira/browse/CASSANDRA-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773178#comment-13773178 ] Benjamin Coverston commented on CASSANDRA-5351: --- I think this is a good idea. In fact, triggering when the un-repaired size gets to be above some configured threshold means that repairs can be tuned for your environment. If we had a set of SSTables that are 'repaired', another 'pending repair', and another set of 'not yet scheduled for repair' (or queued for future repair) you could actually monitor your system to see if repairs are 'keeping up' with your data volumes. Avoid repairing already-repaired data by default Key: CASSANDRA-5351 URL: https://issues.apache.org/jira/browse/CASSANDRA-5351 Project: Cassandra Issue Type: Task Components: Core Reporter: Jonathan Ellis Assignee: Lyuben Todorov Labels: repair Fix For: 2.1 Repair has always built its merkle tree from all the data in a columnfamily, which is guaranteed to work but is inefficient. We can improve this by remembering which sstables have already been successfully repaired, and only repairing sstables new since the last repair. (This automatically makes CASSANDRA-3362 much less of a problem too.) The tricky part is, compaction will (if not taught otherwise) mix repaired data together with non-repaired. So we should segregate unrepaired sstables from the repaired ones. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
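The segregation proposed in CASSANDRA-5351 above, together with the size-threshold trigger suggested in the comment, could be sketched roughly as follows. This is an illustration in Python only, not Cassandra's implementation: the `SSTable` record, its `repaired` flag, and the functions `compaction_buckets`, `unrepaired_bytes` and `repair_due` are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an sstable; only the fields the sketch needs."""
    name: str
    size_bytes: int
    repaired: bool


def compaction_buckets(sstables: List[SSTable]) -> Dict[str, List[SSTable]]:
    """Split sstables so that compaction never mixes repaired with unrepaired data."""
    buckets: Dict[str, List[SSTable]] = {"repaired": [], "unrepaired": []}
    for sstable in sstables:
        key = "repaired" if sstable.repaired else "unrepaired"
        buckets[key].append(sstable)
    return buckets


def unrepaired_bytes(sstables: List[SSTable]) -> int:
    """Size of the data that the next repair would still have to cover."""
    return sum(s.size_bytes for s in sstables if not s.repaired)


def repair_due(sstables: List[SSTable], threshold_bytes: int) -> bool:
    """Report whether the unrepaired size has grown past a configured threshold."""
    return unrepaired_bytes(sstables) > threshold_bytes
```

Compaction candidates would then be drawn from one bucket at a time, and monitoring `unrepaired_bytes` over time is one way to see whether repairs are keeping up with incoming data.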
[jira] [Commented] (CASSANDRA-5502) Secondary Index Storage
[ https://issues.apache.org/jira/browse/CASSANDRA-5502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773276#comment-13773276 ] Robert Coli commented on CASSANDRA-5502: You can do this without a Cassandra code change by simply implementing your own secondary index columnfamily. That said, given secondary indexes are columnfamilies and columnfamilies are in their own directories in part so that you can store them on different mountpoints, this ticket seems reasonable to me. +1 Secondary Index Storage --- Key: CASSANDRA-5502 URL: https://issues.apache.org/jira/browse/CASSANDRA-5502 Project: Cassandra Issue Type: Improvement Reporter: Brooke Bryan Currently, both the CF data, and the secondary index data are stored within the same folder on disk. Being able to split these into separate folders could be a great improvement with performance. With our data, and secondary index, we will query the index a lot more, so can optimise hardware for the two traffic shapes coming through. Something like /Keyspace/CF/indexes/indexname/*.db Not too sure how much would be involved to change this, but I would imagine fairly cheap cost to reward. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-5632) Cross-DC bandwidth-saving broken
[ https://issues.apache.org/jira/browse/CASSANDRA-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773293#comment-13773293 ] Robert Coli commented on CASSANDRA-5632: Do you have an affects version for this issue? Description says it started when a re-write for 2.0 started, but it affects 1.2.x so I'm confused? :D

Cross-DC bandwidth-saving broken Key: CASSANDRA-5632 URL: https://issues.apache.org/jira/browse/CASSANDRA-5632 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.2.0 Reporter: Jonathan Ellis Assignee: Jonathan Ellis Fix For: 1.2.6 Attachments: 5632.txt, 5632-v2.txt, cassandra-topology.properties, fix_patch_bug.log

We group messages by destination as follows to avoid sending multiple messages to a remote datacenter:
{code}
// Multimap that holds onto all the messages and addresses meant for a specific datacenter
Map<String, Multimap<Message, InetAddress>> dcMessages
{code}
When we cleaned out the MessageProducer stuff for 2.0, this code
{code}
Multimap<Message, InetAddress> messages = dcMessages.get(dc);
...
messages.put(producer.getMessage(Gossiper.instance.getVersion(destination)), destination);
{code}
turned into
{code}
Multimap<MessageOut, InetAddress> messages = dcMessages.get(dc);
...
messages.put(rm.createMessage(), destination);
{code}
Thus, we weren't actually grouping anything anymore -- each destination replica was stored under a separate Message key, unlike under the old CachingMessageProducer.
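The effect described in CASSANDRA-5632 above can be reproduced in miniature. The following is a sketch in Python rather than the Java in the ticket, and it assumes that messages compare by identity, as a class without custom equality would. `Message` and `group_by_message` are hypothetical names, and a plain dict of lists stands in for the Guava Multimap.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class Message:
    """Hypothetical message type; with no __eq__ defined it compares by identity."""

    def __init__(self, payload: str) -> None:
        self.payload = payload


def group_by_message(
    destinations: List[str], message_for: Callable[[str], Message]
) -> Dict[Message, List[str]]:
    """Emulate a multimap from message to addresses: one key per distinct message."""
    groups: Dict[Message, List[str]] = defaultdict(list)
    for destination in destinations:
        groups[message_for(destination)].append(destination)
    return dict(groups)


replicas = ["10.0.1.1", "10.0.1.2", "10.0.1.3"]

# A fresh message per destination: every replica lands under its own key.
ungrouped = group_by_message(replicas, lambda _dest: Message("mutation"))

# One cached message reused for every destination: a single key, three addresses.
cached = Message("mutation")
grouped = group_by_message(replicas, lambda _dest: cached)
```

With a fresh message per call the map ends up with as many keys as replicas, so nothing is grouped. Reusing one message instance, which is what a caching producer would effectively do, collapses them into a single entry.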
[jira] [Resolved] (CASSANDRA-2264) Restore From Snapshot eg: nodetool restore [filename] | ([keyspace] [columnfamily])
[ https://issues.apache.org/jira/browse/CASSANDRA-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Coverston resolved CASSANDRA-2264. --- Resolution: Not A Problem A lot has changed since I created this. It doesn't make much sense to pursue this today. Restore From Snapshot eg: nodetool restore [filename] | ([keyspace] [columnfamily]) - Key: CASSANDRA-2264 URL: https://issues.apache.org/jira/browse/CASSANDRA-2264 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 0.7.2 Reporter: Benjamin Coverston Store additional metadata in the SSTable including: generated_by: (From memtable, compaction, or cleanup) ancestors: A list of SSTableNames|MD5sum When executed it will copy the ancestors from the snapshot directory to the data directory. If given Keyspace and ColumnFamily arguments it will attempt to restore all SSTables in the Keyspace/Columnfamily on that node. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2731) Impelement in-house file caching.
[ https://issues.apache.org/jira/browse/CASSANDRA-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773389#comment-13773389 ] Jonathan Ellis commented on CASSANDRA-2731: --- Closing in favor of CASSANDRA-5863 Impelement in-house file caching. - Key: CASSANDRA-2731 URL: https://issues.apache.org/jira/browse/CASSANDRA-2731 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Pavel Yaskevich Assignee: Pavel Yaskevich Priority: Minor Implement FileCache, CachedRandomAccessFile (CRAF, to replace BufferedRandomAccessFile) and RadixTree (to serve as the backend cache storage) classes. The FileCache class will be responsible for storing and retrieving data from the radix tree, flushing dirty pages to disk, and page management such as adding new pages and utilizing old/unused pages. CRAF Linux-only features (via JNI): 1). O_DIRECT for both read and write operations. 2). Write batching via AIO's lio_listio. Provide the possibility to migrate hot data directly from the Memtable to the CRAF cache, so that data being read live always stays hot in memory. To minimise the effects of compaction, CRAF should provide a way to bypass the cache for data that is not already cached. Provide a way to make pointers in the cache, which will be useful to minimize the performance impact when a single column is distributed among multiple SSTable files (except counter columns). Use jemalloc (http://www.canonware.com/jemalloc/) for cache memory management.
[jira] [Resolved] (CASSANDRA-2731) Impelement in-house file caching.
[ https://issues.apache.org/jira/browse/CASSANDRA-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-2731. --- Resolution: Won't Fix Impelement in-house file caching. - Key: CASSANDRA-2731 URL: https://issues.apache.org/jira/browse/CASSANDRA-2731 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Pavel Yaskevich Assignee: Pavel Yaskevich Priority: Minor
[jira] [Resolved] (CASSANDRA-2830) Allow summing of counter columns in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-2830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-2830. --- Resolution: Fixed Allow summing of counter columns in CQL --- Key: CASSANDRA-2830 URL: https://issues.apache.org/jira/browse/CASSANDRA-2830 Project: Cassandra Issue Type: New Feature Components: API Reporter: Tomas Salfischberger Priority: Minor Labels: CQL CQL could be extended with a method to calculate the sum of a set of counter columns. This avoids transferring a long list of counter columns to be summed by the client, while the server could calculate the total and instead only transfer that result. My proposal for the syntax (based on the COUNT() suggestion in the comments of CASSANDRA-1704): {code}SELECT SUM(columnFrom..columnTo) FROM CF WHERE ...{code} The simplest approach would be to only allow summing of counters under the same key, thus a query with a WHERE part that specifies multiple keys would return 1 result per key. This avoids summing values from different nodes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
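The per-key behaviour proposed in CASSANDRA-2830 above (one total per row key, never summed across keys) might look like the following sketch. It is written in Python against an in-memory mapping, not against CQL or any Cassandra API. `sum_counters` is a hypothetical helper, and treating the `columnFrom..columnTo` range as inclusive and ordered by column name is an assumption.

```python
from typing import Dict, Iterable, Mapping


def sum_counters(
    rows: Mapping[str, Mapping[str, int]],
    keys: Iterable[str],
    column_from: str,
    column_to: str,
) -> Dict[str, int]:
    """Sum the counter columns in [column_from, column_to] separately for each row key."""
    totals: Dict[str, int] = {}
    for key in keys:
        columns = rows.get(key, {})
        totals[key] = sum(
            value
            for name, value in columns.items()
            if column_from <= name <= column_to
        )
    return totals


# Example data: per-page hit counters, one column per day.
hits = {
    "page1": {"2013-09-01": 3, "2013-09-02": 5, "2013-09-03": 7},
    "page2": {"2013-09-02": 1},
}
totals = sum_counters(hits, ["page1", "page2"], "2013-09-01", "2013-09-02")
```

Each key yields exactly one result, which matches the ticket's point that only the total, not the list of counters, would need to be sent to the client.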
[jira] [Commented] (CASSANDRA-2830) Allow summing of counter columns in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-2830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773390#comment-13773390 ] Jonathan Ellis commented on CASSANDRA-2830: --- Superseded by CASSANDRA-4914 Allow summing of counter columns in CQL --- Key: CASSANDRA-2830 URL: https://issues.apache.org/jira/browse/CASSANDRA-2830 Project: Cassandra Issue Type: New Feature Components: API Reporter: Tomas Salfischberger Priority: Minor Labels: CQL
[jira] [Updated] (CASSANDRA-6071) CqlStorage loading compact table adds an extraneous field to the pig schema
[ https://issues.apache.org/jira/browse/CASSANDRA-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-6071: Reviewer: alexliu68 (was: brandon.williams) CqlStorage loading compact table adds an extraneous field to the pig schema --- Key: CASSANDRA-6071 URL: https://issues.apache.org/jira/browse/CASSANDRA-6071 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: Sam Tunnicliffe Assignee: Sam Tunnicliffe Priority: Minor Fix For: 1.2.11 Attachments: 6071.txt
{code}
CREATE TABLE t (
  key text,
  field1 int,
  field2 int,
  PRIMARY KEY (key, field1)
) WITH COMPACT STORAGE;
INSERT INTO t (key,field1,field2) VALUES ('key1',1,2);
INSERT INTO t (key,field1,field2) VALUES ('key2',1,2);
INSERT INTO t (key,field1,field2) VALUES ('key3',1,2);
{code}
{code}
grunt> t = LOAD 'cql://ks/t' USING CqlStorage();
grunt> describe t;
t: {key: chararray,field1: int,field2: int,value: int}
dump t;
(key1,1,2,)
(key3,1,2,)
(key2,1,2,)
{code}
[jira] [Resolved] (CASSANDRA-2068) Improvements for Multi-tenant clusters
[ https://issues.apache.org/jira/browse/CASSANDRA-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-2068. --- Resolution: Later Improvements for Multi-tenant clusters -- Key: CASSANDRA-2068 URL: https://issues.apache.org/jira/browse/CASSANDRA-2068 Project: Cassandra Issue Type: Improvement Reporter: Chris Goffinet Priority: Minor It would be helpful if we could actually set some limits per CF to help Multi-tenant clusters. Here are some ideas I was thinking: (per CF) 1. Set an upper bound (max) for count when slicing or multi/get calls 2. Set an upper bound (max) for how much data in bytes can be returned (64KB, 512KB, 1MB, etc) This would introduce new exceptions that can be thrown. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CASSANDRA-1817) Dynamic Read Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-1817. --- Resolution: Later Dynamic Read Repair --- Key: CASSANDRA-1817 URL: https://issues.apache.org/jira/browse/CASSANDRA-1817 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stu Hood Priority: Minor Labels: ponies Read repair could (temporarily) adjust its own frequency (from the baseline) based on the necessity of the repair for particular nodes. For example, a successful read repair (caused data to be repaired) should bump the frequency for the node that needed repair, with the goal that a node that has been offline for a while should trend toward 100% read repair while it is successful. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
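One way to picture the adjustment described in CASSANDRA-1817 above is the sketch below. The ticket states only the goal (trend toward 100% read repair for a node while repairs keep succeeding), so the additive step, the decay back toward the baseline, and every name here (`DynamicReadRepair`, `step`, `decay`) are assumptions made for illustration.

```python
from typing import Dict


class DynamicReadRepair:
    """Per-node read repair chance that rises while repairs keep finding stale data.

    The step/decay policy is hypothetical; it is not taken from the ticket.
    """

    def __init__(self, baseline: float, step: float = 0.25, decay: float = 0.5) -> None:
        self.baseline = baseline
        self.step = step
        self.decay = decay
        self._chance: Dict[str, float] = {}

    def chance(self, node: str) -> float:
        """Current read repair chance for a node; the baseline if never adjusted."""
        return self._chance.get(node, self.baseline)

    def record(self, node: str, data_was_repaired: bool) -> float:
        """Update the node's chance after a read repair attempt and return it."""
        current = self.chance(node)
        if data_was_repaired:
            # The repair found stale data: check this node more often, up to 100%.
            updated = min(1.0, current + self.step)
        else:
            # Nothing needed repairing: move part of the way back to the baseline.
            updated = self.baseline + (current - self.baseline) * self.decay
        self._chance[node] = updated
        return updated
```

A node returning from a long outage would climb to a chance of 1.0 after a few successful repairs and drift back toward the baseline once reads stop finding differences.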
[jira] [Commented] (CASSANDRA-2609) Repair multi-DC awareness
[ https://issues.apache.org/jira/browse/CASSANDRA-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773387#comment-13773387 ] Jonathan Ellis commented on CASSANDRA-2609: --- I'm going to predict that this is unnecessary complexity once CASSANDRA-5351 and possibly CASSANDRA-3362 are done. Repair multi-DC awareness - Key: CASSANDRA-2609 URL: https://issues.apache.org/jira/browse/CASSANDRA-2609 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Sylvain Lebresne Priority: Minor Labels: ponies Repair has no multi-DC awareness: if you have 2 DCs with 3 replicas in each, a repair of a node will transfer 3 merkle trees cross-DC and potentially initiate as many cross-DC streaming sessions (likely with a non-empty intersection between them). In theory, we could repair separately in each DC, then repair between only two nodes cross-DC (that we know are up to date in their respective DCs) and finally re-repair in each DC separately (to pass the cross-DC changes to all nodes of the DC). It is as yet unclear to me whether we can make that efficient enough to be worth the added complexity, so this is really meant as an exploratory ticket.
[jira] [Resolved] (CASSANDRA-1599) Add sort/order support for secondary indexing
[ https://issues.apache.org/jira/browse/CASSANDRA-1599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-1599. --- Resolution: Later Add sort/order support for secondary indexing - Key: CASSANDRA-1599 URL: https://issues.apache.org/jira/browse/CASSANDRA-1599 Project: Cassandra Issue Type: New Feature Components: API Reporter: Todd Nine Original Estimate: 32h Remaining Estimate: 32h For a lot of users paging is a standard use case on many web applications. It would be nice to allow paging as part of a Boolean Expression. Page - start index - end index - page timestamp - Sort Order When sorting, is it possible to sort both ASC and DESC? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-5351) Avoid repairing already-repaired data by default
[ https://issues.apache.org/jira/browse/CASSANDRA-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773398#comment-13773398 ] Jonathan Ellis commented on CASSANDRA-5351: --- Then the question becomes, who should be the repair coordinator? If you say the first replica from the ReplicationStrategy, then what do you do if that node is down while other nodes receive data, and so doesn't know it's supposed to kick one off? It's a whole can of worms that I'd rather not open so I hope someone has a better idea. :) Avoid repairing already-repaired data by default Key: CASSANDRA-5351 URL: https://issues.apache.org/jira/browse/CASSANDRA-5351 Project: Cassandra Issue Type: Task Components: Core Reporter: Jonathan Ellis Assignee: Lyuben Todorov Labels: repair Fix For: 2.1
[jira] [Resolved] (CASSANDRA-2609) Repair multi-DC awareness
[ https://issues.apache.org/jira/browse/CASSANDRA-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-2609. --- Resolution: Won't Fix Repair multi-DC awareness - Key: CASSANDRA-2609 URL: https://issues.apache.org/jira/browse/CASSANDRA-2609 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Sylvain Lebresne Priority: Minor Labels: ponies
[jira] [Resolved] (CASSANDRA-1657) support in-memory column families
[ https://issues.apache.org/jira/browse/CASSANDRA-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-1657. --- Resolution: Later support in-memory column families - Key: CASSANDRA-1657 URL: https://issues.apache.org/jira/browse/CASSANDRA-1657 Project: Cassandra Issue Type: Improvement Reporter: Peter Schuller Priority: Minor Some workloads are such that you absolutely depend on column families being in-memory for performance, yet you most definitely want all the things that Cassandra offers in terms of replication, consistency, durability etc. In order to semi-deterministically ensure acceptable performance for such data, Cassandra could support in-memory column families. Such an in-memory column family would imply that mlock() be used on sstables for this column family. On start-up and on compaction completion, they could be mmap():ed with MAP_POPULATE (Linux specific) or else just mmap():ed + mlock():ed in such a way as to otherwise guarantee it is in-memory (such as userland traversal of the entire file). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3487) better repair session timeouts and retrys
[ https://issues.apache.org/jira/browse/CASSANDRA-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773403#comment-13773403 ] Jonathan Ellis commented on CASSANDRA-3487: --- Is this still an issue [~yukim]? better repair session timeouts and retrys - Key: CASSANDRA-3487 URL: https://issues.apache.org/jira/browse/CASSANDRA-3487 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Vijay Priority: Minor Fix For: 2.1 It would be great if we could time out a validation compaction that is taking too long or that hit an exception during validation. Repair can gossip its status to all the other nodes, so any node waiting for the response to a tree request knows to keep waiting until it completes; if the repair is not going to complete, because of an exception or because the node is too busy to take the incoming request, we can time out the user request. Bonus: by displaying the repair gossip via nodetool, the user or script running the request gets a better handle on what's going on in the cluster.
[jira] [Resolved] (CASSANDRA-3560) incorrect description for nodetool scrub/upgradesstables and invalidaterow/keycache
[ https://issues.apache.org/jira/browse/CASSANDRA-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-3560. --- Resolution: Cannot Reproduce incorrect description for nodetool scrub/upgradesstables and invalidaterow/keycache --- Key: CASSANDRA-3560 URL: https://issues.apache.org/jira/browse/CASSANDRA-3560 Project: Cassandra Issue Type: Bug Components: Tools Affects Versions: 1.0.5 Reporter: Ramesh Natarajan Priority: Trivial Description for the following commands needs to be corrected scrub [keyspace] [cfnames] - Scrub (rebuild sstables for) one or more column family upgradesstables [keyspace] [cfnames] - Scrub (rebuild sstables for) one or more column family invalidatekeycache [keyspace] [cfnames] - Invalidate the key cache of one or more column family invalidaterowcache [keyspace] [cfnames] - Invalidate the key cache of one or more column family -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CASSANDRA-3499) Make the 'load' interface pluggable
[ https://issues.apache.org/jira/browse/CASSANDRA-3499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-3499. --- Resolution: Later Make the 'load' interface pluggable --- Key: CASSANDRA-3499 URL: https://issues.apache.org/jira/browse/CASSANDRA-3499 Project: Cassandra Issue Type: Bug Reporter: Chris Goffinet Priority: Minor We should make the 'Load' attribute of the cluster at least pluggable. One use case we had was we could build a plugin that was specific to us, that could be tied directly into our time series database we have for all of our infrastructure. This would allow us to populate and expose more data per node instead of needing to gossip this data around (CPU/Network/Memory/etc) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CASSANDRA-1472) Add bitmap secondary indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-1472. --- Resolution: Later Reviewer: (was: tjake) Add bitmap secondary indexes Key: CASSANDRA-1472 URL: https://issues.apache.org/jira/browse/CASSANDRA-1472 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stu Hood Attachments: 0.7-1472-v5.tgz, 0.7-1472-v6.tgz, 1472-v3.tgz, 1472-v4.tgz, 1472-v5.tgz, anatomy.png, ASF.LICENSE.NOT.GRANTED--0001-CASSANDRA-1472-rebased-to-0.7-branch.txt, ASF.LICENSE.NOT.GRANTED--0019-Rename-bugfixes-and-fileclose.txt, v4-bench-c32.txt Bitmap indexes are a very efficient structure for dealing with immutable data. We can take advantage of the fact that SSTables are immutable by attaching them directly to SSTables as a new component (supported by CASSANDRA-1471). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CASSANDRA-1598) Add Boolean Expression to secondary querying
[ https://issues.apache.org/jira/browse/CASSANDRA-1598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-1598. --- Resolution: Later Add Boolean Expression to secondary querying Key: CASSANDRA-1598 URL: https://issues.apache.org/jira/browse/CASSANDRA-1598 Project: Cassandra Issue Type: New Feature Components: API Affects Versions: 0.7 beta 3 Reporter: Todd Nine Add boolean operators similar to Lucene style searches. Currently there is implicit support for the && operator. It would be helpful to also add support for ||/Union operators. I would envision this as the client would be required to construct the expression tree and pass it via the thrift interface. BooleanExpression -- BooleanOrIndexExpression -- BooleanOperator -- BooleanOrIndexExpression I'd like to take a crack at this since it will greatly improve my Datanucleus plugin
[jira] [Commented] (CASSANDRA-3578) Multithreaded commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773408#comment-13773408 ] Jonathan Ellis commented on CASSANDRA-3578: --- I've seen several cases of nodes failing to keep up with write requests where the commitlog was the bottleneck. These were all workloads throwing MB-sized columns around. Granted, that's not exactly our bread and butter. Multithreaded commitlog --- Key: CASSANDRA-3578 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578 Project: Cassandra Issue Type: Improvement Reporter: Jonathan Ellis Priority: Minor Attachments: 0001-CASSANDRA-3578.patch, parallel_commit_log_2.patch Brian Aker pointed out a while ago that allowing multiple threads to modify the commitlog simultaneously (reserving space for each with a CAS first, the way we do in the SlabAllocator.Region.allocate) can improve performance, since you're not bottlenecking on a single thread to do all the copying and CRC computation. Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes doable. (moved from CASSANDRA-622, which was getting a bit muddled.) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
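The "reserve space with a CAS first, then copy" idea from CASSANDRA-3578 above can be sketched as follows. This is a Python illustration, not the Java from the attached patches. Python has no native compare-and-set, so the small `AtomicInt` class emulates one with a lock; `Segment`, `allocate` and `append` are hypothetical names.

```python
import threading
from typing import Optional


class AtomicInt:
    """Lock-backed stand-in for an atomic integer with compare-and-set."""

    def __init__(self, value: int = 0) -> None:
        self._value = value
        self._lock = threading.Lock()

    def get(self) -> int:
        with self._lock:
            return self._value

    def compare_and_set(self, expected: int, new: int) -> bool:
        """Set to `new` only if the current value is still `expected`."""
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True


class Segment:
    """Hypothetical fixed-size log segment shared by several writer threads."""

    def __init__(self, capacity: int) -> None:
        self.buffer = bytearray(capacity)
        self._next_free = AtomicInt(0)

    def allocate(self, size: int) -> Optional[int]:
        """Reserve `size` bytes; return the start offset, or None if the segment is full."""
        while True:
            start = self._next_free.get()
            end = start + size
            if end > len(self.buffer):
                return None
            if self._next_free.compare_and_set(start, end):
                return start
            # Another thread reserved space first; retry from the new offset.

    def append(self, record: bytes) -> Optional[int]:
        """Reserve a region, then copy into it without holding any shared lock."""
        offset = self.allocate(len(record))
        if offset is not None:
            self.buffer[offset:offset + len(record)] = record
        return offset
```

Only the offset bump is contended. Each writer then copies into its own reserved, non-overlapping region, which is where the ticket expects the gain over a single thread doing all the copying and CRC computation.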
[jira] [Resolved] (CASSANDRA-3605) Exception swallowing in Hex.java
[ https://issues.apache.org/jira/browse/CASSANDRA-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-3605. --- Resolution: Won't Fix Exception swallowing in Hex.java Key: CASSANDRA-3605 URL: https://issues.apache.org/jira/browse/CASSANDRA-3605 Project: Cassandra Issue Type: Improvement Affects Versions: 1.0.5 Environment: all Reporter: Zoltan Farkas Priority: Minor

org.apache.cassandra.utils.Hex line 94:
    try {
        s = stringConstructor.newInstance(0, c.length, c);
    } catch (Exception e) {
        // Swallowing as we'll just use a copying constructor
    }
This code does not comply with the coding standard; the caught exception needs to be rethrown as a RuntimeException.
[jira] [Resolved] (CASSANDRA-3763) compactionstats throws ArithmeticException: / by zero
[ https://issues.apache.org/jira/browse/CASSANDRA-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-3763. --- Resolution: Cannot Reproduce compactionstats throws ArithmeticException: / by zero - Key: CASSANDRA-3763 URL: https://issues.apache.org/jira/browse/CASSANDRA-3763 Project: Cassandra Issue Type: Bug Components: Core, Tools Affects Versions: 1.0.7, 1.0.8, 1.0.9, 1.0.10, 1.1.0, 1.1.1, 1.1.2, 1.1.3, 1.1.4 Environment: debian linux - openvz kernel, oracle java 1.6.0.26 Reporter: Zenek Kraweznik Priority: Trivial

compactionstats looks like this:
# nodetool -h localhost compactionstats
Exception in thread "main" java.lang.ArithmeticException: / by zero
    at org.apache.cassandra.db.compaction.LeveledManifest.getEstimatedTasks(LeveledManifest.java:435)
    at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getEstimatedRemainingTasks(LeveledCompactionStrategy.java:128)
    at org.apache.cassandra.db.compaction.CompactionManager.getPendingTasks(CompactionManager.java:1060)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
    at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
    at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:65)
    at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:216)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:666)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)
    at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1404)
    at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
    at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
    at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
    at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:600)
    at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
    at sun.rmi.transport.Transport$1.run(Transport.java:159)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
#

nodetool is working fine in other actions:
# nodetool -h localhost netstats
Mode: NORMAL
Not sending any streams.
Not receiving any streams.
Pool Name    Active   Pending   Completed
Commands        n/a         0           2
Responses       n/a         0        1810
#
[jira] [Resolved] (CASSANDRA-3746) updating row_cache_provider does not take affect until after a node is restarted
[ https://issues.apache.org/jira/browse/CASSANDRA-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-3746.
Resolution: Later

updating row_cache_provider does not take affect until after a node is restarted

Key: CASSANDRA-3746
URL: https://issues.apache.org/jira/browse/CASSANDRA-3746
Project: Cassandra
Issue Type: Bug
Reporter: B. Todd Burruss

Using the CLI to update row_cache_provider does not take effect until after a node is restarted, even though describe shows it as set. I am not sure whether you consider this a bug, but it has caused me some grief while testing providers.
[jira] [Commented] (CASSANDRA-3487) better repair session timeouts and retrys
[ https://issues.apache.org/jira/browse/CASSANDRA-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773415#comment-13773415 ]

Yuki Morishita commented on CASSANDRA-3487:

Added exception handling for validation so that a failure fails the repair session in 2.0. We still don't have a timeout, though I think we don't need one: when the node is choked, the gossip/failure detector would kill the repair session.

better repair session timeouts and retrys

Key: CASSANDRA-3487
URL: https://issues.apache.org/jira/browse/CASSANDRA-3487
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Vijay
Priority: Minor
Fix For: 2.1

It would be great if we could time out a validation compaction that is taking too long or that hit an exception during validation. Repair can gossip its status to all the other nodes, so any node waiting for the response to a tree request can wait until it completes; if the repair is not going to complete, because of an exception or because the node is too busy to take the incoming request, we can time out the user request.

Bonus: by displaying the repair gossip via nodetool, the user or script running the request gets a better handle on what is going on in the cluster.
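The timeout requested in the description can be sketched with plain `java.util.concurrent` primitives. This is only an illustration of the idea, not Cassandra's repair code: the class `ValidationTimeoutSketch`, the `Outcome` enum, and the `await` method are hypothetical names, and the validation is assumed to be exposed as a `Future`.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch: bound the wait for a validation task. */
final class ValidationTimeoutSketch {
    enum Outcome { COMPLETED, FAILED, TIMED_OUT }

    /** Waits for the validation, distinguishing failure from timeout. */
    static Outcome await(Future<?> validation, long timeout, TimeUnit unit) {
        try {
            validation.get(timeout, unit);
            return Outcome.COMPLETED;
        } catch (TimeoutException e) {
            // Validation is taking too long: give up and try to stop it.
            validation.cancel(true);
            return Outcome.TIMED_OUT;
        } catch (ExecutionException e) {
            // Validation threw: the session can fail right away.
            return Outcome.FAILED;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            validation.cancel(true);
            return Outcome.FAILED;
        }
    }
}
```

A caller would fail the repair session on `FAILED` and surface `TIMED_OUT` to the user request, which are the two cases the ticket asks to tell apart.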
[jira] [Resolved] (CASSANDRA-1975) Multi NIC/IP support
[ https://issues.apache.org/jira/browse/CASSANDRA-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-1975.
Resolution: Won't Fix

Multi NIC/IP support

Key: CASSANDRA-1975
URL: https://issues.apache.org/jira/browse/CASSANDRA-1975
Project: Cassandra
Issue Type: Improvement
Reporter: Max Sanders
Priority: Minor
Labels: ponies

Every node should listen on more than one IP. When a node is contacted, a random IP for that node is chosen. If the node cannot be reached by this IP, another IP is used. This way you don't need multipathing switches or Distributed Split Multi-Link Trunking hardware, which is very expensive; you could use normal, inexpensive hardware. If a cable is unplugged or a switch fails, it is no problem. You can also use the second/third network for load balancing.
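The selection rule in the request (pick a random address, fall back to the others if it is unreachable) could look roughly like the following. Everything here is hypothetical: `MultiAddressSketch`, `pickReachable`, and the `canConnect` probe are illustrative names, and the probe stands in for whatever connection attempt a real implementation would make.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.Random;
import java.util.function.Predicate;

/** Hypothetical sketch of random address choice with fallback. */
final class MultiAddressSketch {
    /**
     * Tries the node's addresses in random order and returns the first
     * one the probe accepts, or empty if none is reachable.
     */
    static Optional<String> pickReachable(List<String> addresses,
                                          Predicate<String> canConnect,
                                          Random random) {
        List<String> candidates = new ArrayList<>(addresses);
        Collections.shuffle(candidates, random); // random first choice
        for (String address : candidates) {
            if (canConnect.test(address)) {
                return Optional.of(address);
            }
        }
        return Optional.empty();
    }
}
```

Shuffling up front gives both behaviours the ticket mentions: the random first choice spreads load across networks, and iterating over the rest provides the fallback.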
[jira] [Commented] (CASSANDRA-3998) CLI: NUL character for data not visible
[ https://issues.apache.org/jira/browse/CASSANDRA-3998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773454#comment-13773454 ]

Jonathan Ellis commented on CASSANDRA-3998:

[~iamaleksey] is this relevant to cqlsh?

CLI: NUL character for data not visible

Key: CASSANDRA-3998
URL: https://issues.apache.org/jira/browse/CASSANDRA-3998
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.0.8
Reporter: Tyler Hobbs

When using UTF8Type or AsciiType, if a column name or value is only 0x00 bytes, the CLI will not show any indication that data is there. Here's an example where the column value is 0x00:

{noformat}
[default@Foo] get Foo2['key'];
=> (column=a, value=, timestamp=1330925963085434)
{noformat}

I'm not sure what the best solution is, but the current behavior is deceptive.
[jira] [Resolved] (CASSANDRA-3813) Cassandra-cli doesn't give any useful error on index creation failure
[ https://issues.apache.org/jira/browse/CASSANDRA-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-3813.
Resolution: Won't Fix

Cassandra-cli doesn't give any useful error on index creation failure

Key: CASSANDRA-3813
URL: https://issues.apache.org/jira/browse/CASSANDRA-3813
Project: Cassandra
Issue Type: Bug
Affects Versions: 1.0.5
Environment: DSE 1.0.5 inside the Cassandra-cli. Still nothing useful even with --debug flag.
Reporter: Eric Lubow
Labels: cli

[default@linkcurrent] update column family report_by_account_content with comparator='UTF8Type' and column_metadata = [
... { column_name:'meta:account-id', validation_class:'UTF8Type',index_type:KEYS},
... { column_name:'meta:filter-hash', validation_class:'UTF8Type',index_type:KEYS}
... ];
null
[jira] [Updated] (CASSANDRA-4043) RecentBloomFilterFalseRatio and RecentBloomFilterFalsePositives reset each other
[ https://issues.apache.org/jira/browse/CASSANDRA-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-4043:
Assignee: Tyler Hobbs

RecentBloomFilterFalseRatio and RecentBloomFilterFalsePositives reset each other

Key: CASSANDRA-4043
URL: https://issues.apache.org/jira/browse/CASSANDRA-4043
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.0.8
Reporter: Tyler Hobbs
Assignee: Tyler Hobbs
Priority: Trivial
Labels: jmx

If either of the ColumnFamily JMX attributes {{RecentBloomFilterFalseRatio}} or {{RecentBloomFilterFalsePositives}} is read, both are reset. This means that if you try to read both attributes at the same time (as jconsole does, for example), one of them is guaranteed to be 0.

The solution might be to store a separate false positives counter for the ratio and for the normal count, and reset them separately. Some refactoring should be done at the same time so that the BloomFilterTracker calculates the false positive ratio itself instead of having DataTracker fetch both counters and calculate the ratio.

On a related note, why does nodetool not use the Recent versions of the bloom filter metrics?
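A minimal sketch of the separate-counter idea proposed in the description, assuming each "recent" reader keeps its own baseline so that reading one attribute cannot reset the other. The class and method names are hypothetical and do not reflect Cassandra's actual BloomFilterTracker.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: independent "recent" windows per attribute. */
final class RecentBloomStatsSketch {
    // Monotonic totals, never reset.
    private final AtomicLong falsePositives = new AtomicLong();
    private final AtomicLong truePositives = new AtomicLong();

    // Baseline used only by the recent-count reader.
    private long countBaselineFalse;
    // Baselines used only by the recent-ratio reader.
    private long ratioBaselineFalse;
    private long ratioBaselineTrue;

    void addFalsePositive() { falsePositives.incrementAndGet(); }
    void addTruePositive() { truePositives.incrementAndGet(); }

    /** False positives since the last call to this method. */
    synchronized long getRecentFalsePositiveCount() {
        long current = falsePositives.get();
        long recent = current - countBaselineFalse;
        countBaselineFalse = current;
        return recent;
    }

    /** False positive ratio since the last call to this method. */
    synchronized double getRecentFalseRatio() {
        long f = falsePositives.get();
        long t = truePositives.get();
        long recentFalse = f - ratioBaselineFalse;
        long recentTrue = t - ratioBaselineTrue;
        ratioBaselineFalse = f;
        ratioBaselineTrue = t;
        long total = recentFalse + recentTrue;
        return total == 0 ? 0.0 : (double) recentFalse / total;
    }
}
```

Because the tracker computes the ratio itself, a caller such as a JMX bean never needs to fetch and combine the two raw counters, which is the refactoring the description suggests.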
[jira] [Resolved] (CASSANDRA-1956) Convert row cache to row+filter cache
[ https://issues.apache.org/jira/browse/CASSANDRA-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vijay resolved CASSANDRA-1956.
Resolution: Duplicate

Yep, closing this as it is a duplicate of CASSANDRA-5357.

Convert row cache to row+filter cache

Key: CASSANDRA-1956
URL: https://issues.apache.org/jira/browse/CASSANDRA-1956
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Stu Hood
Assignee: Vijay
Priority: Minor
Fix For: 2.1
Attachments: 0001-1956-cache-updates-v0.patch, 0001-commiting-block-cache.patch, 0001-re-factor-row-cache.patch, 0001-row-cache-filter.patch, 0002-1956-updates-to-thrift-and-avro-v0.patch, 0002-add-query-cache.patch

Changing the row cache to a row+filter cache would make it much more useful. We currently have to warn against using the row cache with wide rows, where the read pattern is typically a peek at the head, but this use case would be perfectly supported by a cache that stored only columns matching the filter.

Possible implementations:
* (copout) Cache a single filter per row, and leave the cache key as is
* Cache a list of filters per row, leaving the cache key as is: this is likely to have some gotchas for weird usage patterns, and it requires the list overhead
* Change the cache key to rowkey+filterid: basically ideal, but you need a secondary index to look up cache entries by rowkey so that you can keep them in sync with the memtable
* others?
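The third option (cache key of rowkey+filterid plus a secondary index by rowkey) can be sketched as follows. All names are hypothetical, row keys, filter ids, and cached columns are simplified to strings, and the sketch invalidates a row's entries on write rather than updating them in place, which is a simplification of "keeping them in sync with the memtable".

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

/** Hypothetical sketch of a cache keyed by (rowKey, filterId). */
final class RowFilterCacheSketch {
    static final class Key {
        final String rowKey;
        final String filterId;

        Key(String rowKey, String filterId) {
            this.rowKey = rowKey;
            this.filterId = filterId;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key other = (Key) o;
            return rowKey.equals(other.rowKey) && filterId.equals(other.filterId);
        }

        @Override
        public int hashCode() {
            return Objects.hash(rowKey, filterId);
        }
    }

    private final Map<Key, String> entries = new HashMap<>();
    // Secondary index: every cache key that belongs to a given row.
    private final Map<String, Set<Key>> keysByRow = new HashMap<>();

    synchronized void put(String rowKey, String filterId, String cachedColumns) {
        Key key = new Key(rowKey, filterId);
        entries.put(key, cachedColumns);
        keysByRow.computeIfAbsent(rowKey, k -> new HashSet<>()).add(key);
    }

    synchronized String get(String rowKey, String filterId) {
        return entries.get(new Key(rowKey, filterId));
    }

    /** Called when the row is written: drop every filter cached for it. */
    synchronized void invalidateRow(String rowKey) {
        Set<Key> keys = keysByRow.remove(rowKey);
        if (keys == null) return;
        for (Key key : keys) {
            entries.remove(key);
        }
    }

    synchronized int size() {
        return entries.size();
    }
}
```

Without the `keysByRow` index, a write to a row would require scanning the whole cache to find its filter entries, which is the cost the bullet point is warning about.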
[jira] [Commented] (CASSANDRA-3848) EmbeddedCassandraService needs a stop() method
[ https://issues.apache.org/jira/browse/CASSANDRA-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773443#comment-13773443 ]

Jonathan Ellis commented on CASSANDRA-3848:

Closing since nobody cares enough to submit a new patch.

EmbeddedCassandraService needs a stop() method

Key: CASSANDRA-3848
URL: https://issues.apache.org/jira/browse/CASSANDRA-3848
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: David Hawthorne
Priority: Trivial

I just need a stop() method in EmbeddedCassandraService so I can shut it down as part of my unit tests, so I can test fail behavior.
[jira] [Resolved] (CASSANDRA-4075) Dropped keyspaces and cfs do not get deleted
[ https://issues.apache.org/jira/browse/CASSANDRA-4075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-4075.
Resolution: Cannot Reproduce

Dropped keyspaces and cfs do not get deleted

Key: CASSANDRA-4075
URL: https://issues.apache.org/jira/browse/CASSANDRA-4075
Project: Cassandra
Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Joaquin Casares
Labels: datastax_qa

Tested in 0.8.10, reported in 0.8.1.

Dropped keyspaces and column families have their sstables marked as Compacted, but will not disappear, even on restart.

Worked correctly in 1.0.8 where the sstables get deleted almost immediately following the column family drop.
[jira] [Resolved] (CASSANDRA-4048) SSTableLoader meets problem during the apply of the schema update
[ https://issues.apache.org/jira/browse/CASSANDRA-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-4048.
Resolution: Cannot Reproduce

SSTableLoader meets problem during the apply of the schema update

Key: CASSANDRA-4048
URL: https://issues.apache.org/jira/browse/CASSANDRA-4048
Project: Cassandra
Issue Type: Bug
Components: Tools
Affects Versions: 1.0.8
Reporter: Zhu Han
Priority: Minor

SSTableLoader tries to apply the drop column family metadata update and hits the problem below. It seems as though the column family is dropped multiple times?

user@luzhou:/data/apache-cassandra-1.0.8$ bin/sstableloader -i hostA,hostB /tmp/out/store/
Starting client (and waiting 30 seconds for gossip) ...
java.lang.IllegalArgumentException: Unknown CF 1000
    at org.apache.cassandra.db.Table.getColumnFamilyStore(Table.java:167)
    at org.apache.cassandra.db.Table.getColumnFamilyStore(Table.java:160)
    at org.apache.cassandra.db.migration.DropColumnFamily.applyModels(DropColumnFamily.java:70)
    at org.apache.cassandra.db.migration.Migration.apply(Migration.java:156)
    at org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:73)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
ERROR 18:35:51,423 Error in ThreadPoolExecutor
java.lang.IllegalArgumentException: Unknown CF 1000
    at org.apache.cassandra.db.Table.getColumnFamilyStore(Table.java:167)
    at org.apache.cassandra.db.Table.getColumnFamilyStore(Table.java:160)
    at org.apache.cassandra.db.migration.DropColumnFamily.applyModels(DropColumnFamily.java:70)
    at org.apache.cassandra.db.migration.Migration.apply(Migration.java:156)
    at org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:73)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
ERROR 18:35:51,595 Error in ThreadPoolExecutor
java.lang.IllegalArgumentException: Unknown CF 1000
    at org.apache.cassandra.db.Table.getColumnFamilyStore(Table.java:167)
    at org.apache.cassandra.db.Table.getColumnFamilyStore(Table.java:160)
    at org.apache.cassandra.db.migration.DropColumnFamily.applyModels(DropColumnFamily.java:70)
    at org.apache.cassandra.db.migration.Migration.apply(Migration.java:156)
    at org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:73)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
[jira] [Commented] (CASSANDRA-4053) IncomingTcpConnection can not be closed when the peer is brutaly terminated or switch is failed
[ https://issues.apache.org/jira/browse/CASSANDRA-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773458#comment-13773458 ]

Jonathan Ellis commented on CASSANDRA-4053:

Is this still relevant [~krummas]?

IncomingTcpConnection can not be closed when the peer is brutaly terminated or switch is failed

Key: CASSANDRA-4053
URL: https://issues.apache.org/jira/browse/CASSANDRA-4053
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.0.8
Reporter: Zhu Han
Assignee: Marcus Eriksson

IncomingTcpConnection has no way to detect that the peer is down when the peer loses power or the network infrastructure fails, and the thread is leaked.

For safety, at least SO_KEEPALIVE should be set on those IncomingTcpConnections. The better way is to close the incoming connections when the failure detector reports the peer failure, but that requires some extra bookkeeping.

Besides that, it would be better if IncomingTcpConnection and OutgoingTcpConnection were marked as daemon threads.
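The two low-cost measures from the description (SO_KEEPALIVE on the accepted socket, daemon reader threads) map onto standard JDK calls. The sketch below is illustrative only: `KeepAliveSketch` and `startReader` are hypothetical names and this is not how Cassandra's IncomingTcpConnection is written.

```java
import java.net.Socket;
import java.net.SocketException;

/** Hypothetical sketch: keepalive plus a daemon reader thread. */
final class KeepAliveSketch {
    /**
     * Enables TCP keepalive on the socket, then runs the read loop on a
     * daemon thread so that it cannot keep the JVM alive on shutdown.
     */
    static Thread startReader(Socket socket, Runnable readLoop) throws SocketException {
        // With SO_KEEPALIVE set, the OS probes an idle connection and
        // eventually reports a dead peer as an I/O error on the socket.
        socket.setKeepAlive(true);
        Thread reader = new Thread(readLoop, "incoming-connection-sketch");
        reader.setDaemon(true);
        reader.start();
        return reader;
    }
}
```

Keepalive probe timing is controlled by the operating system, so detection of a dead peer can still take a long time with default settings; that is one reason the description calls closing connections from the failure detector the better approach.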
[jira] [Updated] (CASSANDRA-4053) IncomingTcpConnection can not be closed when the peer is brutaly terminated or switch is failed
[ https://issues.apache.org/jira/browse/CASSANDRA-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-4053:
Assignee: Marcus Eriksson

IncomingTcpConnection can not be closed when the peer is brutaly terminated or switch is failed

Key: CASSANDRA-4053
URL: https://issues.apache.org/jira/browse/CASSANDRA-4053
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.0.8
Reporter: Zhu Han
Assignee: Marcus Eriksson

IncomingTcpConnection has no way to detect that the peer is down when the peer loses power or the network infrastructure fails, and the thread is leaked.

For safety, at least SO_KEEPALIVE should be set on those IncomingTcpConnections. The better way is to close the incoming connections when the failure detector reports the peer failure, but that requires some extra bookkeeping.

Besides that, it would be better if IncomingTcpConnection and OutgoingTcpConnection were marked as daemon threads.
[jira] [Resolved] (CASSANDRA-3848) EmbeddedCassandraService needs a stop() method
[ https://issues.apache.org/jira/browse/CASSANDRA-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-3848.
Resolution: Won't Fix

EmbeddedCassandraService needs a stop() method

Key: CASSANDRA-3848
URL: https://issues.apache.org/jira/browse/CASSANDRA-3848
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: David Hawthorne
Priority: Trivial

I just need a stop() method in EmbeddedCassandraService so I can shut it down as part of my unit tests, so I can test fail behavior.
[jira] [Commented] (CASSANDRA-4102) Upgrade to Jackson 2
[ https://issues.apache.org/jira/browse/CASSANDRA-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773462#comment-13773462 ]

Jonathan Ellis commented on CASSANDRA-4102:

Avro is gone now.

Upgrade to Jackson 2

Key: CASSANDRA-4102
URL: https://issues.apache.org/jira/browse/CASSANDRA-4102
Project: Cassandra
Issue Type: Bug
Reporter: Ben McCann
Priority: Minor

Cassandra is currently using Jackson 1.4.0. It would be nice to upgrade to Jackson 2, which is a smaller, lighter, and more modular library. I'm using Play Framework and SBT, which complain vociferously about Jackson 1 not having its javadoc jars in the Maven repository. Upgrading to Jackson 2 would fix this annoyance.

Files using Jackson are:
src/java/org/apache/cassandra/utils/FBUtilities.java
src/java/org/apache/cassandra/tools/SSTableExport.java
src/java/org/apache/cassandra/db/compaction/LeveledManifest.java

Info on Jackson 2 is available on Github and the wiki:
https://github.com/FasterXML/jackson-core
http://wiki.fasterxml.com/JacksonRelease20
[jira] [Resolved] (CASSANDRA-4105) Use WritableComparable / Writable in RecordReader
[ https://issues.apache.org/jira/browse/CASSANDRA-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-4105.
Resolution: Won't Fix

Use WritableComparable / Writable in RecordReader

Key: CASSANDRA-4105
URL: https://issues.apache.org/jira/browse/CASSANDRA-4105
Project: Cassandra
Issue Type: Wish
Components: Hadoop
Affects Versions: 0.8.11, 1.0.9, 1.1.1, 1.2.0 beta 1
Reporter: Patrik Modesto

Cassandra uses ByteBuffer/List<Mutation> as key/value in RecordWriter. This prevents the use of the MultipleOutputs class, which requires the key to be WritableComparable and the value to be Writable. MultipleOutputs is a very handy class that provides a way to write to several different OutputFormats from a reducer.

In our case I have a mapreduce job that produces two results, which I need to write to Cassandra and to a file respectively. For now I need to run that mapreduce job twice, which is quite expensive.
[jira] [Resolved] (CASSANDRA-3825) hung upon nodetool cleanup, kill
[ https://issues.apache.org/jira/browse/CASSANDRA-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-3825.
Resolution: Cannot Reproduce

hung upon nodetool cleanup, kill

Key: CASSANDRA-3825
URL: https://issues.apache.org/jira/browse/CASSANDRA-3825
Project: Cassandra
Issue Type: Bug
Affects Versions: 1.0.7
Reporter: Bayle Shanks
Priority: Minor

I ran

bin/nodetool -h localhost repair
bin/nodetool -h localhost compact

which terminated. Then I ran

bin/nodetool -h localhost cleanup

which did not. CPU usage was close to zero. I left it running overnight but it didn't terminate.

Cassandra was running with -f on screen, so I logged in and typed Ctrl-C. Cassandra said that it stopped listening to thrift clients, but it did not then shut down. Additional Ctrl-Cs didn't do anything. I had to kill -9 the process.
[jira] [Commented] (CASSANDRA-3998) CLI: NUL character for data not visible
[ https://issues.apache.org/jira/browse/CASSANDRA-3998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773463#comment-13773463 ]

Aleksey Yeschenko commented on CASSANDRA-3998:

{noformat}
cqlsh:test> insert into test2(id, val) VALUES ( 0, blobAsascii(0x004600));
cqlsh:test> select * from test2;

 id | val
----+-----------
  0 | \x00F\x00

(1 rows)
{noformat}

No.

CLI: NUL character for data not visible

Key: CASSANDRA-3998
URL: https://issues.apache.org/jira/browse/CASSANDRA-3998
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.0.8
Reporter: Tyler Hobbs

When using UTF8Type or AsciiType, if a column name or value is only 0x00 bytes, the CLI will not show any indication that data is there. Here's an example where the column value is 0x00:

{noformat}
[default@Foo] get Foo2['key'];
=> (column=a, value=, timestamp=1330925963085434)
{noformat}

I'm not sure what the best solution is, but the current behavior is deceptive.
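The cqlsh output quoted in the comment shows NUL bytes as `\x00` escapes. A display routine with that behavior could be sketched as below; this is not cqlsh's or the CLI's actual formatting code, the names are hypothetical, and the input is assumed to be an ASCII-typed value, so every byte outside the printable ASCII range is escaped.

```java
/** Hypothetical sketch: make non-printable bytes visible when displayed. */
final class ControlCharEscaperSketch {
    /** Renders printable ASCII as-is and every other byte as \xNN. */
    static String escape(byte[] raw) {
        StringBuilder out = new StringBuilder();
        for (byte b : raw) {
            int value = b & 0xFF; // treat the byte as unsigned
            if (value < 0x20 || value >= 0x7F) {
                out.append(String.format("\\x%02x", value));
            } else {
                out.append((char) value);
            }
        }
        return out.toString();
    }
}
```

For the bytes `0x00 0x46 0x00` from the comment's example, this produces `\x00F\x00`, matching the cqlsh rendering, whereas printing the raw string shows only an apparent `F`.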
[jira] [Resolved] (CASSANDRA-3697) 'nodetool -h localhost repair' fails when there is an empty ks
[ https://issues.apache.org/jira/browse/CASSANDRA-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-3697.
Resolution: Not A Problem

'nodetool -h localhost repair' fails when there is an empty ks

Key: CASSANDRA-3697
URL: https://issues.apache.org/jira/browse/CASSANDRA-3697
Project: Cassandra
Issue Type: Bug
Affects Versions: 1.0.6
Reporter: Joaquin Casares
Priority: Minor
Attachments: 3697.diff

If there is an empty KS, the assertion error thrown kills the entire repair, when a keyspace is not specified.

To replicate:
Start a new cluster
nodetool -h localhost repair
Create a new keyspace with no column families
nodetool -h localhost repair
Assertion error is thrown to the prompt
[jira] [Resolved] (CASSANDRA-4134) Do not send hints before a node is fully up
[ https://issues.apache.org/jira/browse/CASSANDRA-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-4134.
Resolution: Cannot Reproduce

Do not send hints before a node is fully up

Key: CASSANDRA-4134
URL: https://issues.apache.org/jira/browse/CASSANDRA-4134
Project: Cassandra
Issue Type: Bug
Reporter: Joaquin Casares
Priority: Minor

After seeing this on a cluster and working with Pavel, we have seen the following errors disappear after all migrations have been applied:

{noformat}
ERROR [MutationStage:1] 2012-04-09 18:16:00,240 RowMutationVerbHandler.java (line 61) Error in row mutation
org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=1028
    at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:129)
    at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:401)
    at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:409)
    at org.apache.cassandra.db.RowMutation.fromBytes(RowMutation.java:357)
    at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:42)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

and

ERROR [ReadStage:69] 2012-04-09 18:16:01,715 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[ReadStage:69,5,main]
java.lang.IllegalArgumentException: Unknown ColumnFamily content_indexes in keyspace linkcurrent
    at org.apache.cassandra.config.Schema.getComparator(Schema.java:223)
    at org.apache.cassandra.db.ColumnFamily.getComparatorFor(ColumnFamily.java:300)
    at org.apache.cassandra.db.ReadCommand.getComparator(ReadCommand.java:92)
    at org.apache.cassandra.db.SliceByNamesReadCommand.<init>(SliceByNamesReadCommand.java:44)
    at org.apache.cassandra.db.SliceByNamesReadCommandSerializer.deserialize(SliceByNamesReadCommand.java:106)
    at org.apache.cassandra.db.SliceByNamesReadCommandSerializer.deserialize(SliceByNamesReadCommand.java:74)
    at org.apache.cassandra.db.ReadCommandSerializer.deserialize(ReadCommand.java:132)
    at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:51)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
{noformat}

It seems as though as soon as the correct Migration is applied, the Hints are accepted.
[jira] [Commented] (CASSANDRA-5202) CFs should have globally and temporally unique CF IDs to prevent reusing data from earlier incarnation of same CF name
[ https://issues.apache.org/jira/browse/CASSANDRA-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773133#comment-13773133 ]

Jonathan Ellis commented on CASSANDRA-5202:

CASSANDRA-6060 is related since it also contemplates changing CFID assignment (back to unique ints via CAS).

CFs should have globally and temporally unique CF IDs to prevent reusing data from earlier incarnation of same CF name

Key: CASSANDRA-5202
URL: https://issues.apache.org/jira/browse/CASSANDRA-5202
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.1.9
Environment: OS: Windows 7, Server: Cassandra 1.1.9 release drop Client: astyanax 1.56.21, JVM: Sun/Oracle JVM 64 bit (jdk1.6.0_27)
Reporter: Marat Bedretdinov
Assignee: Yuki Morishita
Labels: test
Fix For: 2.1
Attachments: 5202-1.1.txt, 5202-2.0.0.txt, astyanax-stress-driver.zip

Attached is a driver that sequentially:
1. Drops keyspace
2. Creates keyspace
4. Creates 2 column families
5. Seeds 1M rows with 100 columns
6. Queries these 2 column families

The above steps are repeated 1000 times.

The following exception is observed at random (race - SEDA?):

ERROR [ReadStage:55] 2013-01-29 19:24:52,676 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[ReadStage:55,5,main]
java.lang.AssertionError: DecoratedKey(-1, ) != DecoratedKey(62819832764241410631599989027761269388, 313a31) in C:\var\lib\cassandra\data\user_role_reverse_index\business_entity_role\user_role_reverse_index-business_entity_role-hf-1-Data.db
    at org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:60)
    at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:67)
    at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:79)
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:256)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1367)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1229)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1164)
    at org.apache.cassandra.db.Table.getRow(Table.java:378)
    at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:69)
    at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:822)
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1271)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

This exception appears in the server at the time of client submitting a query request (row slice) and not at the time data is seeded. The client times out and this data can no longer be queried as the same exception would always occur from there on.

Also on iteration 201, it appears that dropping column families failed and as a result their recreation failed with unique column family name violation (see exception below). Note that the data files are actually gone, so it appears that the server runtime responsible for creating column family was out of sync with the piece that dropped them:

Starting dropping column families
Dropped column families
Starting dropping keyspace
Dropped keyspace
Starting creating column families
Created column families
Starting seeding data
Total rows inserted: 100 in 5105 ms
Iteration: 200; Total running time for 1000 queries is 232; Average running time of 1000 queries is 0 ms
Starting dropping column families
Dropped column families
Starting dropping keyspace
Dropped keyspace
Starting creating column families
Created column families
Starting seeding data
Total rows inserted: 100 in 5361 ms
Iteration: 201; Total running time for 1000 queries is 222; Average running time of 1000 queries is 0 ms
Starting dropping column families
Starting creating column families
Exception in thread "main" com.netflix.astyanax.connectionpool.exceptions.BadRequestException: BadRequestException: [host=127.0.0.1(127.0.0.1):9160, latency=2468(2469), attempts=1]InvalidRequestException(why:Keyspace names must be case-insensitively unique (user_role_reverse_index conflicts with
[jira] [Commented] (CASSANDRA-2443) Stream key/row caches during bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773437#comment-13773437 ] Yuki Morishita commented on CASSANDRA-2443: --- We can't just stream the key cache, because file names and positions change. It may be possible, though, to build a Bloom filter of the cached keys and stream it before the files, so that the bootstrapping node can build its own key cache as it receives SSTables. Stream key/row caches during bootstrap -- Key: CASSANDRA-2443 URL: https://issues.apache.org/jira/browse/CASSANDRA-2443 Project: Cassandra Issue Type: Improvement Reporter: Chris Goffinet Priority: Minor Labels: ponies When adding new nodes to an existing cluster, if we streamed the key and row caches over right before the node came into the cluster, we could minimize the impact of a cold node and reduce the time for the node to get 'warmed' up.
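The Bloom filter idea in the comment above could look roughly like the following. This is only a sketch under stated assumptions, not Cassandra's streaming code: the filter size, the SHA-1 based hashing, and the names `BloomFilter`, `summarize_key_cache` and `rebuild_key_cache` are all hypothetical. The source node folds its cached keys into a filter that is sent ahead of the files, and the bootstrapping node keeps a key-cache entry (with the position in its own new file) for every received key the filter reports as hot.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; size and hash scheme are illustrative assumptions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-1 with an index byte.
        for i in range(self.num_hashes):
            digest = hashlib.sha1(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def summarize_key_cache(cached_keys):
    """Source node: fold the hot keys into a filter streamed before the files."""
    hot_filter = BloomFilter()
    for key in cached_keys:
        hot_filter.add(key)
    return hot_filter


def rebuild_key_cache(hot_filter, received_index):
    """Bootstrapping node: received_index maps key -> position in the NEW local file."""
    return {key: pos for key, pos in received_index.items()
            if hot_filter.might_contain(key)}
```

Because a Bloom filter can return false positives, the rebuilt cache may contain a few keys that were not hot on the source; it never misses a key that was.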
[jira] [Updated] (CASSANDRA-3916) Do not bind the storage_port if internode_encryption = all
[ https://issues.apache.org/jira/browse/CASSANDRA-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-3916: -- Assignee: Dave Brosius Do not bind the storage_port if internode_encryption = all -- Key: CASSANDRA-3916 URL: https://issues.apache.org/jira/browse/CASSANDRA-3916 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.0.7 Environment: Any Reporter: Wade Poziombka Assignee: Dave Brosius We are highly security conscious, and having additional clear text ports open is undesirable. I have modified this locally to get around it, but it seems a very trivial fix to bind the clear text storage_port only if internode_encryption is not 'all'. If 'all' is selected, then no clear text communication should be permitted.
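The requested behaviour can be written down as a small decision function. This is a minimal sketch, not the patch itself: `ports_to_bind` is a hypothetical helper, and it assumes the cassandra.yaml values for internode_encryption ('none', 'all', 'dc', 'rack') and the default storage_port / ssl_storage_port of 7000 / 7001.

```python
def ports_to_bind(internode_encryption, storage_port=7000, ssl_storage_port=7001):
    """Return the internode ports a node should listen on (hypothetical helper)."""
    ports = []
    if internode_encryption != "all":
        # Some peers may still connect unencrypted, so keep the clear text port.
        ports.append(storage_port)
    if internode_encryption != "none":
        # Any encrypted mode needs the SSL port.
        ports.append(ssl_storage_port)
    return ports
```

With 'all' only the SSL port is returned, which is the point of the ticket; 'dc' and 'rack' still need both ports because traffic inside the local datacenter or rack stays unencrypted.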
[jira] [Updated] (CASSANDRA-5791) A nodetool command to validate all sstables in a node
[ https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-5791: -- Assignee: (was: Lyuben Todorov) Okay, sounds like we're feature creeping a bit here. CASSANDRA-4165 is open for adding a digest to compressed sstables. If you want a full scrub, then you should use scrub. :) A nodetool command to validate all sstables in a node - Key: CASSANDRA-5791 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Priority: Minor Fix For: 1.2.11 Currently there is no nodetool command to validate all sstables on disk. The only way to do this is to run a repair and see if it succeeds, but we cannot repair the system keyspace. We can also run upgrade sstables, but that rewrites all the sstables. This command should check the hash of all sstables and return whether all data is readable or not. This should NOT care about consistency. The compressed sstables do not have a hash, so it is not clear how this will work there.
[jira] [Resolved] (CASSANDRA-3889) malformed UPDATE queries causing NumberFormatException, AssertionError
[ https://issues.apache.org/jira/browse/CASSANDRA-3889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-3889. --- Resolution: Won't Fix malformed UPDATE queries causing NumberFormatException, AssertionError -- Key: CASSANDRA-3889 URL: https://issues.apache.org/jira/browse/CASSANDRA-3889 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.1.0 Environment: Cassandra 1.1 Reporter: paul cannon Priority: Minor Labels: cql Given a columnfamily like: {code} CREATE TABLE CounterCF (KEY text PRIMARY KEY, count_me counter) WITH comparator = ascii AND default_validation = counter; {code} This query causes Cassandra to throw a NumberFormatException and close the thrift connection (yes, it's not valid, cause it's a counter CF) {code} UPDATE CounterCF SET count_me = count_me + 2, x = 'a' WHERE key = 'counter1'; {code} And this variant causes an AssertionError and a permanently unresponsive thrift connection: {code} update CounterCF set count_me=count_me+2, x='' where key = 'counter1'; {code} When a valid hex string (with a multiple of 2 hex digits) is used instead of 'a' or '', then the expected InvalidRequestException is seen. This is related to CASSANDRA-2851, but seems unimportant and incidental enough that a new ticket is more appropriate than reopening that one. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CASSANDRA-3487) better repair session timeouts and retrys
[ https://issues.apache.org/jira/browse/CASSANDRA-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-3487. --- Resolution: Won't Fix better repair session timeouts and retrys - Key: CASSANDRA-3487 URL: https://issues.apache.org/jira/browse/CASSANDRA-3487 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Vijay Priority: Minor Fix For: 2.1 It would be great if we could time out a validation compaction that is taking too long or that hit an exception during validation. Repair can gossip its status to all the other nodes, so any node that is waiting for the response to a tree request can keep waiting while the repair is still making progress; if the repair is not going to complete, because of an exception or because the node is too busy to take the incoming request, we can time out the user request. Bonus: by displaying the repair gossip via nodetool, the user or script running the request can get a better handle on what's going on in the cluster.
[jira] [Resolved] (CASSANDRA-3998) CLI: NUL character for data not visible
[ https://issues.apache.org/jira/browse/CASSANDRA-3998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-3998. --- Resolution: Won't Fix CLI: NUL character for data not visible --- Key: CASSANDRA-3998 URL: https://issues.apache.org/jira/browse/CASSANDRA-3998 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0.8 Reporter: Tyler Hobbs When using UTF8Type or AsciiType, if a column name or value is only 0x00 bytes, the CLI will not show any indication that data is there. Here's an example where the column value is 0x00:
{noformat}
[default@Foo] get Foo2['key'];
=> (column=a, value=, timestamp=1330925963085434)
{noformat}
I'm not sure what the best solution is, but the current behavior is deceptive.
[jira] [Updated] (CASSANDRA-6071) CqlStorage loading compact table adds an extraneous field to the pig schema
[ https://issues.apache.org/jira/browse/CASSANDRA-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6071: -- Reviewer: brandon.williams CqlStorage loading compact table adds an extraneous field to the pig schema --- Key: CASSANDRA-6071 URL: https://issues.apache.org/jira/browse/CASSANDRA-6071 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: Sam Tunnicliffe Assignee: Sam Tunnicliffe Priority: Minor Fix For: 1.2.11 Attachments: 6071.txt
{code}
CREATE TABLE t (
    key text,
    field1 int,
    field2 int,
    PRIMARY KEY (key, field1)
) WITH COMPACT STORAGE;
INSERT INTO t (key,field1,field2) VALUES ('key1',1,2);
INSERT INTO t (key,field1,field2) VALUES ('key2',1,2);
INSERT INTO t (key,field1,field2) VALUES ('key3',1,2);
{code}
{code}
grunt> t = LOAD 'cql://ks/t' USING CqlStorage();
grunt> describe t;
t: {key: chararray,field1: int,field2: int,value: int}
dump t;
(key1,1,2,)
(key3,1,2,)
(key2,1,2,)
{code}
[jira] [Commented] (CASSANDRA-5540) Concurrent secondary index updates remove rows from the index
[ https://issues.apache.org/jira/browse/CASSANDRA-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773309#comment-13773309 ] Jonathan Ellis commented on CASSANDRA-5540: --- No, affects 1.2.0+ as indicated. Concurrent secondary index updates remove rows from the index - Key: CASSANDRA-5540 URL: https://issues.apache.org/jira/browse/CASSANDRA-5540 Project: Cassandra Issue Type: Bug Affects Versions: 1.2.0 Reporter: Alexei Bakanov Assignee: Sam Tunnicliffe Fix For: 1.2.5 Attachments: 0001-Use-different-index-updater-for-live-updates-compact.patch, 5540.txt Existing rows disappear from the secondary index when doing simultaneous updates of a row with the same secondary index value. Here is a little pycassa script that reproduces the bug. The script inserts 4 rows with the same secondary index value, reads those rows back and checks that there are 4 of them. Please run two instances of the script simultaneously in two separate terminals in order to simulate concurrent updates:
{code}
---script.py START---
# Python 2 script (pycassa)
import pycassa
from pycassa.index import *

pool = pycassa.ConnectionPool('ks123')
cf = pycassa.ColumnFamily(pool, 'cf1')
while True:
    # Rewrite the same 4 rows, all with the same indexed value
    for rowKey in xrange(4):
        cf.insert(str(rowKey), {'indexedColumn': 'indexedValue'})
    # Read them back through the secondary index
    index_expression = create_index_expression('indexedColumn', 'indexedValue')
    index_clause = create_index_clause([index_expression])
    rows = cf.get_indexed_slices(index_clause)
    length = len(list(rows))
    if length != 4:
        print 'found just %d rows out of 4' % length
pool.dispose()
---script.py FINISH---

---schema cli start---
create keyspace ks123
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {datacenter1 : 1}
  and durable_writes = true;

use ks123;

create column family cf1
  with column_type = 'Standard'
  and comparator = 'AsciiType'
  and default_validation_class = 'AsciiType'
  and key_validation_class = 'AsciiType'
  and read_repair_chance = 0.1
  and dclocal_read_repair_chance = 0.0
  and populate_io_cache_on_flush = false
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
  and caching = 'KEYS_ONLY'
  and column_metadata = [
    {column_name : 'indexedColumn',
     validation_class : AsciiType,
     index_name : 'INDEX1',
     index_type : 0}]
  and compression_options = {'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};
---schema cli finish---
{code}
Test cluster created with 'ccm create --cassandra-version 1.2.4 --nodes 1 --start testUpdate'
[jira] [Commented] (CASSANDRA-5632) Cross-DC bandwidth-saving broken
[ https://issues.apache.org/jira/browse/CASSANDRA-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773487#comment-13773487 ] Jeremy Hanna commented on CASSANDRA-5632: - I believe the issue that was fixed here originated in 1.2 and was present up through 1.2.5. Cross-DC bandwidth-saving broken Key: CASSANDRA-5632 URL: https://issues.apache.org/jira/browse/CASSANDRA-5632 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.2.0 Reporter: Jonathan Ellis Assignee: Jonathan Ellis Fix For: 1.2.6 Attachments: 5632.txt, 5632-v2.txt, cassandra-topology.properties, fix_patch_bug.log We group messages by destination as follows to avoid sending multiple messages to a remote datacenter:
{code}
// Multimap that holds onto all the messages and addresses meant for a specific datacenter
Map<String, Multimap<Message, InetAddress>> dcMessages
{code}
When we cleaned out the MessageProducer stuff for 2.0, this code
{code}
Multimap<Message, InetAddress> messages = dcMessages.get(dc);
...
messages.put(producer.getMessage(Gossiper.instance.getVersion(destination)), destination);
{code}
turned into
{code}
Multimap<MessageOut, InetAddress> messages = dcMessages.get(dc);
...
messages.put(rm.createMessage(), destination);
{code}
Thus, we weren't actually grouping anything anymore -- each destination replica was stored under a separate Message key, unlike under the old CachingMessageProducer.
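The effect described in this ticket can be reproduced in miniature. The sketch below is an analogy in Python, not the Cassandra code: `Message`, `group_fresh` and `group_shared` are hypothetical names, and a plain dict of lists stands in for the multimap. It assumes only what the description states, namely that destinations are grouped under a message key and that creating a new message per destination defeats the grouping.

```python
from collections import defaultdict


class Message:
    """Stand-in message object. Like a Java object without equals/hashCode
    overrides, instances compare and hash by identity."""

    def __init__(self, payload):
        self.payload = payload


def group_fresh(payload, destinations):
    # Broken pattern: a new message is created for every destination,
    # so each destination ends up under its own key.
    groups = defaultdict(list)
    for dest in destinations:
        groups[Message(payload)].append(dest)
    return groups


def group_shared(payload, destinations):
    # Intended pattern: one message instance is reused,
    # so all destinations collect under a single key.
    groups = defaultdict(list)
    message = Message(payload)
    for dest in destinations:
        groups[message].append(dest)
    return groups
```

For three replicas in a remote datacenter, `group_fresh` yields three one-element groups (three cross-DC sends), while `group_shared` yields a single group of three.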
[jira] [Commented] (CASSANDRA-1632) Thread workflow and cpu affinity
[ https://issues.apache.org/jira/browse/CASSANDRA-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773430#comment-13773430 ] Stu Hood commented on CASSANDRA-1632: - Regarding 1): ForkJoinPool implements per-worker queues with work-stealing to deal with this problem. Would be interesting to just drop ForkJoinPool in and see how it does. Thread workflow and cpu affinity Key: CASSANDRA-1632 URL: https://issues.apache.org/jira/browse/CASSANDRA-1632 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Goffinet Assignee: Jason Brown Here are some thoughts I wanted to write down; we need to run some serious benchmarks to see the benefits: 1) All thread pools for our stages use a shared queue per stage. For some stages we could move to a model where each thread has its own queue. This would reduce lock contention on the shared queue. This model only suits stages whose tasks have no variance, else you run into thread starvation. A stage where this might work: ROW-MUTATION. 2) Set cpu affinity for each thread in each stage. If we can pin threads to specific cores, and control the workflow of a message from Thrift down to each stage, we should see improvements from reduced L1 cache misses. We would need to build a JNI extension (to set cpu affinity), as I could not find anywhere in the JDK where it is exposed. 3) Batching the delivery of requests across stage boundaries. Peter Schuller hasn't looked deep enough yet into the JDK, but he thinks there may be significant improvements to be had there, especially in high-throughput situations, if on each consumption you were to consume everything in the queue rather than implying a synchronization point between each request.
[jira] [Assigned] (CASSANDRA-1632) Thread workflow and cpu affinity
[ https://issues.apache.org/jira/browse/CASSANDRA-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Brown reassigned CASSANDRA-1632: -- Assignee: Jason Brown Thread workflow and cpu affinity Key: CASSANDRA-1632 URL: https://issues.apache.org/jira/browse/CASSANDRA-1632 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Goffinet Assignee: Jason Brown Here are some thoughts I wanted to write down; we need to run some serious benchmarks to see the benefits: 1) All thread pools for our stages use a shared queue per stage. For some stages we could move to a model where each thread has its own queue. This would reduce lock contention on the shared queue. This model only suits stages whose tasks have no variance, else you run into thread starvation. A stage where this might work: ROW-MUTATION. 2) Set cpu affinity for each thread in each stage. If we can pin threads to specific cores, and control the workflow of a message from Thrift down to each stage, we should see improvements from reduced L1 cache misses. We would need to build a JNI extension (to set cpu affinity), as I could not find anywhere in the JDK where it is exposed. 3) Batching the delivery of requests across stage boundaries. Peter Schuller hasn't looked deep enough yet into the JDK, but he thinks there may be significant improvements to be had there, especially in high-throughput situations, if on each consumption you were to consume everything in the queue rather than implying a synchronization point between each request.
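Point 3 of the description (consume everything in the queue on each consumption) can be sketched as follows. This is an illustration with Python's standard `queue` module, not a proposal for the actual stage executors; `drain_batch` and its `max_items` cap are assumptions that do not appear in the ticket.

```python
import queue


def drain_batch(q, max_items=None):
    """Block for one request, then take whatever else is already queued.

    max_items is an assumed safety cap on the batch size, not from the ticket.
    """
    batch = [q.get()]  # wait until there is at least one request
    while max_items is None or len(batch) < max_items:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break
    return batch
```

A worker that calls `drain_batch` in a loop blocks once per batch instead of once per request, which is the saving the description hints at for high-throughput situations.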
[jira] [Updated] (CASSANDRA-4165) Generate Digest file for compressed SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-4165: -- Attachment: 4165-rebased.txt We've had CASSANDRA-5791 come up as well, so I think it's reasonable to say that this is useful for more than just Spotify. Attached is Marcus's patch rebased to 2.0, but we may need to dig a little deeper; I'm not quite sure what the right treatment of the CRC component is, but it appears to duplicate the inline checksum that we compute for the compressed writer which seems odd. Git places the blame on CASSANDRA-3648. Any light to shed [~vijay2...@yahoo.com] [~jasobrown]? Generate Digest file for compressed SSTables Key: CASSANDRA-4165 URL: https://issues.apache.org/jira/browse/CASSANDRA-4165 Project: Cassandra Issue Type: Improvement Reporter: Marcus Eriksson Priority: Minor Attachments: 0001-Generate-digest-for-compressed-files-as-well.patch, 4165-rebased.txt We use the generated *Digest.sha1-files to verify backups, would be nice if they were generated for compressed sstables as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CASSANDRA-4521) OutboundTcpConnection could drop outgoing messages and not log it.
[ https://issues.apache.org/jira/browse/CASSANDRA-4521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-4521. --- Resolution: Duplicate OutboundTcpConnection could drop outgoing messages and not log it. --- Key: CASSANDRA-4521 URL: https://issues.apache.org/jira/browse/CASSANDRA-4521 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.2.0 beta 1 Environment: trunk Reporter: sankalp kohli Priority: Minor Labels: message Original Estimate: 0.1h Remaining Estimate: 0.1h Since there is one connection between two nodes and all writes are handled by a single thread, there is a chance that a message gets old enough to be dropped in OutboundTcpConnection. These dropped messages are not logged by MessageService. We should definitely log them.
[jira] [Commented] (CASSANDRA-4725) VERSION string conflict in C++ programs
[ https://issues.apache.org/jira/browse/CASSANDRA-4725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773545#comment-13773545 ] Jonathan Ellis commented on CASSANDRA-4725: --- Doesn't seem like it's worth breaking existing clients to fix this; new clients should use native protocol instead. VERSION string conflict in C++ programs --- Key: CASSANDRA-4725 URL: https://issues.apache.org/jira/browse/CASSANDRA-4725 Project: Cassandra Issue Type: Bug Components: Core Reporter: Jochen Topf In cassandra.thrift there is a definition like this: const string VERSION = "19.32.0" When building the C++ code with thrift, this leads to the files cassandra_constants.h and cassandra_constants.cpp, which contain the following lines: cassandra_constants.cpp: VERSION = "19.32.0"; cassandra_constants.h: std::string VERSION; Unfortunately VERSION is all uppercase, which in C++ is generally reserved for macros, and a macro named VERSION is used in many programs, for instance when using GNU autoconf. If there is a VERSION macro it will be expanded and those lines will break. Maybe we can rename this to Version or so?
[jira] [Resolved] (CASSANDRA-4725) VERSION string conflict in C++ programs
[ https://issues.apache.org/jira/browse/CASSANDRA-4725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-4725. --- Resolution: Won't Fix VERSION string conflict in C++ programs --- Key: CASSANDRA-4725 URL: https://issues.apache.org/jira/browse/CASSANDRA-4725 Project: Cassandra Issue Type: Bug Components: Core Reporter: Jochen Topf In cassandra.thrift there is a definition like this: const string VERSION = "19.32.0" When building the C++ code with thrift, this leads to the files cassandra_constants.h and cassandra_constants.cpp, which contain the following lines: cassandra_constants.cpp: VERSION = "19.32.0"; cassandra_constants.h: std::string VERSION; Unfortunately VERSION is all uppercase, which in C++ is generally reserved for macros, and a macro named VERSION is used in many programs, for instance when using GNU autoconf. If there is a VERSION macro it will be expanded and those lines will break. Maybe we can rename this to Version or so?
[jira] [Commented] (CASSANDRA-4603) use Map internally in schema_ tables where appropriate
[ https://issues.apache.org/jira/browse/CASSANDRA-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773528#comment-13773528 ] Jonathan Ellis commented on CASSANDRA-4603: --- bq. we'd be able to remove them in 2.1 provided we formalize that 2.0 is a mandatory stop before upgrading to 2.1+. We've already signed up for that (CASSANDRA-5996). use Map internally in schema_ tables where appropriate -- Key: CASSANDRA-4603 URL: https://issues.apache.org/jira/browse/CASSANDRA-4603 Project: Cassandra Issue Type: Improvement Components: API, Core Affects Versions: 1.2.0 Reporter: Jonathan Ellis Priority: Minor Labels: cql3 Fix For: 2.1 {replication, compression, compaction}_parameters should be stored as Map type. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CASSANDRA-6072) key_alias can be null for tables created from thrift.
Jeremiah Jordan created CASSANDRA-6072: -- Summary: key_alias can be null for tables created from thrift. Key: CASSANDRA-6072 URL: https://issues.apache.org/jira/browse/CASSANDRA-6072 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: Jeremiah Jordan key_alias can be null for tables created from thrift, which causes an NPE here: https://github.com/apache/cassandra/blob/cassandra-1.2/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java#L633
[jira] [Created] (CASSANDRA-6073) Changes for Pig collections break CQL prepared statements
Chad Johnston created CASSANDRA-6073: Summary: Changes for Pig collections break CQL prepared statements Key: CASSANDRA-6073 URL: https://issues.apache.org/jira/browse/CASSANDRA-6073 Project: Cassandra Issue Type: Bug Components: Hadoop Environment: 1.2.10-tentative branch Reporter: Chad Johnston I've checked out and built the 1.2.10-tentative branch, and I've noticed that all of my CQL prepared statements are now broken. Looking into the code, it looks like the # -> = and @ -> ? translations were removed. I tried to replace these in one of my scripts with = and ?, but there's other code that splits the query string on =, causing the prepared statement to be malformed. If I look at the comments on https://issues.apache.org/jira/browse/CASSANDRA-5867, where this change was made, I see a single mention of URL encoding the CQL query. Is this the expectation going forward? Was there a reason that the # and @ mappings were removed? Further: I've tried URL encoding, and changing the CqlStorage code back to its previous behavior. I get the same error in this case of a long being a different size than expected.
[jira] [Commented] (CASSANDRA-4638) Patch to bin/cassandra to use 64bit JVM if available
[ https://issues.apache.org/jira/browse/CASSANDRA-4638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773530#comment-13773530 ] Jonathan Ellis commented on CASSANDRA-4638: --- Well, that didn't work very well did it, [~urandom]? Patch to bin/cassandra to use 64bit JVM if available Key: CASSANDRA-4638 URL: https://issues.apache.org/jira/browse/CASSANDRA-4638 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.1.4 Environment: Tested on Solaris 11 with Oracle supplied JVM 1.6 and 1.7 Reporter: Bernhard Roth Labels: 64bit, linux, solaris Attachments: cassandra.patch Original Estimate: 0.25h Remaining Estimate: 0.25h By default, Cassandra uses the java binary at $JAVA_HOME/bin and complains at startup that the 64-bit version should be used. Even if the 64-bit Java version is installed, Cassandra still does not use it. The attached patch solves this problem by checking whether the $JAVA_HOME/bin/amd64/java binary exists. If it does, it is used for Cassandra.
[jira] [Updated] (CASSANDRA-4638) Patch to bin/cassandra to use 64bit JVM if available
[ https://issues.apache.org/jira/browse/CASSANDRA-4638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-4638: -- Reviewer: urandom (was: eevans) Patch to bin/cassandra to use 64bit JVM if available Key: CASSANDRA-4638 URL: https://issues.apache.org/jira/browse/CASSANDRA-4638 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.1.4 Environment: Tested on Solaris 11 with Oracle supplied JVM 1.6 and 1.7 Reporter: Bernhard Roth Labels: 64bit, linux, solaris Attachments: cassandra.patch Original Estimate: 0.25h Remaining Estimate: 0.25h By default, Cassandra uses the java binary at $JAVA_HOME/bin and complains at startup that the 64-bit version should be used. Even if the 64-bit Java version is installed, Cassandra still does not use it. The attached patch solves this problem by checking whether the $JAVA_HOME/bin/amd64/java binary exists. If it does, it is used for Cassandra.
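The selection rule from the description can be illustrated as below. The attached patch itself modifies the bin/cassandra shell script and is not reproduced here; this is only a sketch of the same check, and `pick_java` is a hypothetical helper name.

```python
import os


def pick_java(java_home):
    """Prefer $JAVA_HOME/bin/amd64/java when it exists, else $JAVA_HOME/bin/java."""
    candidate = os.path.join(java_home, "bin", "amd64", "java")
    if os.path.isfile(candidate):
        return candidate
    return os.path.join(java_home, "bin", "java")
```

The bin/amd64 layout is the one named in the ticket (tested on Solaris 11); other platforms place the 64-bit binary elsewhere, so the fallback to bin/java matters.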