[jira] [Commented] (CASSANDRA-6587) Slow query when using token range and secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932952#comment-13932952 ] Otto Chrons commented on CASSANDRA-6587: We are experiencing the same problem with Cassandra 2.0.6. Funnily enough, recreating (DROP / CREATE) the index fixed the performance, whereas neither rebuilding the index nor repairing the CF using nodetool made any difference. > Slow query when using token range and secondary index > - > > Key: CASSANDRA-6587 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6587 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jan Chochol > > We are using token ranges to simulate pagination on an external API. To achieve > this, we use queries similar to: > {noformat} > SELECT * FROM table WHERE TOKEN(partition_key) > TOKEN('offset') AND > secondary_key = 'value' LIMIT 1000; > {noformat} > We found that such a statement is quite ineffective, and we do not know how to > solve it. > Let's try an example.
> You can fill Cassandra with the following script: > {noformat} > perl -e "print(\"DROP KEYSPACE t;\nCREATE KEYSPACE t WITH replication = > {'class': 'SimpleStrategy', 'replication_factor' : 1};\nuse t;\nCREATE TABLE > t (a varchar PRIMARY KEY, b varchar, c varchar, d varchar);\nCREATE INDEX t_b > ON t (b);\nCREATE INDEX t_c ON t (c);\nCREATE INDEX t_d ON t (d);\n\");\$max > = 10; for(\$i = 0; \$i < \$max; \$i++) { \$j = int(\$i * 10 / \$max); \$k > = int(\$i * 100 / \$max); print(\"INSERT INTO t (a, b, c, d) VALUES ('a\$i', > 'b\$j', 'c\$k', 'd\$i');\n\")}; for(\$i = 0; \$i < \$max; \$i++) { > print(\"INSERT INTO t (a, b, c, d) VALUES ('e\$i', 'f\$j', 'g\$k', > 'h\$i');\n\")}" | cqlsh > {noformat} > First we looked for the last-but-one partition key: > {noformat} > [root@jch3-devel:~/c4] echo "SELECT a FROM t.t WHERE b = 'b1' LIMIT 10;" > | cqlsh | tail > a18283 > a11336 > a14712 > a11476 > a19396 > a14269 > a10719 > a14521 > a13934 > {noformat} > Then we issued the following commands, which show some interesting behaviour: > {noformat} > SELECT a, d FROM t.t WHERE b = 'b1' AND TOKEN(a) = TOKEN('a14521') LIMIT > 1000; > SELECT a, d FROM t.t WHERE b = 'b1' AND TOKEN(a) = TOKEN('a14521') LIMIT 10; > SELECT a, d FROM t.t WHERE b = 'b1' AND a = 'a14521' LIMIT 10; > {noformat} > And here is the result: > {noformat} > [root@jch3-devel:~/c4] time echo "SELECT a, d FROM t.t WHERE b = 'b1' AND > TOKEN(a) = TOKEN('a14521') LIMIT 1000;" | cqlsh > a | d > + > a14521 | d14521 > real 0m0.647s > user 0m0.307s > sys 0m0.076s > [root@jch3-devel:~/c4] time echo "SELECT a, d FROM t.t WHERE b = 'b1' AND > TOKEN(a) = TOKEN('a14521') LIMIT 10;" | cqlsh > a | d > + > a14521 | d14521 > real 0m16.454s > user 0m0.341s > sys 0m0.090s > [root@jch3-devel:~/c4] time echo "SELECT a, d FROM t.t WHERE b = 'b1' AND a = > 'a14521' LIMIT 10;" | cqlsh > a | d > + > a14521 | d14521 > real 0m0.404s > user 0m0.309s > sys 0m0.071s > {noformat} > The problem with {{LIMIT}} is described in CASSANDRA-6348, and is quite funny - > the lower the 
limit, the slower the request (and with a different structure of data it > can be even worse). > This query is quite silly in reality (asking by a secondary key when you > have the primary key), but it is as close as possible to our use case: > {noformat} > SELECT a, d FROM t.t WHERE b = 'b1' AND TOKEN(a) > TOKEN('a14521') LIMIT 10; > {noformat} > But we simply can not do: > {noformat} > SELECT a, d FROM t.t WHERE b = 'b1' AND a > 'a14521' LIMIT 10; > {noformat} > As this is unsupported. > {{CompositesSearcher.java}} gives us some clue about the problem: > {noformat} > /* > * XXX: If the range requested is a token range, we'll have to start > at the beginning (and stop at the end) of > * the indexed row unfortunately (which will be inefficient), because > we have not way to intuit the small > * possible key having a given token. A fix would be to actually > store the token along the key in the > * indexed row. > */ > {noformat} > The index row contains partition keys in partition key ordering (the ordering exposed in > CQL3 as {{TOKEN(partition_key)}}), so these two requests are expected to > return the same values: > {noformat} > SELECT a, d FROM t.t WHERE b = 'b1' AND TOKEN(a) > TOKEN('a14521') LIMIT 1; > SELECT a, d FROM t.t WHERE b = 'b1' AND a > 'a14521' LIMIT 1; > {noformat} > But the second is not supported. > Currently we are considering going to production with this patch: > {noformat} > diff --git > a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java > b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java > index 44a1e64..0228c3a 100644 > --
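The pattern the reporter is after — an index whose entries are ordered by token, so a scan can resume strictly after a previous page's last token — can be sketched outside Cassandra. This is an illustrative sketch only: the hash below is a toy stand-in for Murmur3, and the in-memory "index" is hypothetical, not the actual index row layout.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class TokenPaging {
    // Toy deterministic hash standing in for Murmur3; NOT the real token function.
    public static long token(String key) {
        long h = 1125899906842597L;
        for (int i = 0; i < key.length(); i++) h = 31 * h + key.charAt(i);
        return h;
    }

    // An index row lists partition keys in token order, so a page can resume
    // strictly after the last token seen on the previous page.
    public static List<String> page(List<String> indexedKeys, long afterToken, int limit) {
        return indexedKeys.stream()
                .sorted(Comparator.comparingLong(TokenPaging::token))
                .filter(k -> token(k) > afterToken)
                .limit(limit)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a1", "a2", "a3", "a4", "a5");
        List<String> page1 = page(keys, Long.MIN_VALUE, 2);
        List<String> page2 = page(keys, token(page1.get(1)), 2);
        // The two pages are disjoint because the resume condition is a strict >
        System.out.println(page1 + " then " + page2);
    }
}
```

The strict `>` on the resume token is what makes pages non-overlapping; the XXX comment quoted above explains why the real index cannot do this cheaply today — it stores keys, not tokens, so it cannot seek to "the smallest key with token greater than X".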
[jira] [Created] (CASSANDRA-6851) Improve anticompaction after incremental repair
Marcus Eriksson created CASSANDRA-6851: -- Summary: Improve anticompaction after incremental repair Key: CASSANDRA-6851 URL: https://issues.apache.org/jira/browse/CASSANDRA-6851 Project: Cassandra Issue Type: Improvement Reporter: Marcus Eriksson Priority: Minor Fix For: 2.1 beta2 After an incremental repair we iterate over all sstables and split them in two parts, one containing the repaired data and one the unrepaired. We could in theory double the number of sstables on a node. To avoid this we could make anticompaction also do a compaction, for example, if we are to anticompact 10 sstables, we could anticompact those to 2. Note that we need to avoid creating too big sstables though, if we anticompact all sstables on a node it would essentially be a major compaction. -- This message was sent by Atlassian JIRA (v6.2#6252)
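The proposal above — merge the anticompaction inputs into a few outputs while capping output size so the operation never degenerates into a major compaction — can be sketched as a greedy grouping over sstable sizes. Class, method, and parameter names here are hypothetical illustrations, not Cassandra code.

```java
import java.util.ArrayList;
import java.util.List;

public class AnticompactionGrouping {
    // Greedily pack sstable sizes (bytes) into groups, starting a new group
    // whenever adding the next sstable would exceed the cap. Each group would
    // then be anticompacted together into one repaired + one unrepaired output,
    // instead of producing two outputs per input sstable.
    public static List<List<Long>> group(List<Long> sizes, long maxGroupBytes) {
        List<List<Long>> groups = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentSize = 0;
        for (long s : sizes) {
            if (!current.isEmpty() && currentSize + s > maxGroupBytes) {
                groups.add(current);
                current = new ArrayList<>();
                currentSize = 0;
            }
            current.add(s);
            currentSize += s;
        }
        if (!current.isEmpty()) groups.add(current);
        return groups;
    }
}
```

With a sensible cap, anticompacting 10 sstables yields a handful of bounded outputs rather than 20 small ones or one huge one.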
[jira] [Commented] (CASSANDRA-6847) The binary transport doesn't load truststore file
[ https://issues.apache.org/jira/browse/CASSANDRA-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932991#comment-13932991 ] Sylvain Lebresne commented on CASSANDRA-6847: - bq. I have no recollection of why we bailed on the trust store for the native protocol. I suspect that this has to do with the fact that CASSANDRA-5120, which adds the require_client_auth for client connections, is newer than CASSANDRA-5031. > The binary transport doesn't load truststore file > - > > Key: CASSANDRA-6847 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6847 > Project: Cassandra > Issue Type: Bug >Reporter: Mikhail Stepura >Priority: Minor > Labels: ssl > > {code:title=org.apache.cassandra.transport.Server.SecurePipelineFactory} > this.sslContext = SSLFactory.createSSLContext(encryptionOptions, false); > {code} > {{false}} there means that the {{truststore}} file won't be loaded in any case. > And that means that the file will not be used to validate clients when > {{require_client_auth==true}}, making > http://www.datastax.com/documentation/cassandra/2.0/cassandra/security/secureNewTrustedUsers_t.html > meaningless. > The only way to work around that currently is to start C* with > {{-Djavax.net.ssl.trustStore=conf/.truststore}} > I believe we should load {{truststore}} when {{require_client_auth==true}},
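For reference, loading an explicit truststore into an SSLContext with plain JSSE looks roughly like this. It is a minimal sketch, not Cassandra's SSLFactory: the path and password are whatever the deployment uses (e.g. the conf/.truststore mentioned above), and key managers are omitted.

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;
import java.io.FileInputStream;
import java.security.KeyStore;

public class TrustStoreSketch {
    // Build an SSLContext whose trust managers come from an explicit
    // truststore file, so peer (client) certificates can be validated when
    // require_client_auth is true.
    public static SSLContext contextWithTrustStore(String path, char[] password) throws Exception {
        KeyStore ts = KeyStore.getInstance(KeyStore.getDefaultType());
        try (FileInputStream in = new FileInputStream(path)) {
            ts.load(in, password); // a null password skips the integrity check
        }
        TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(ts);
        SSLContext ctx = SSLContext.getInstance("TLS");
        // null key managers: this sketch only cares about the trust side
        ctx.init(null, tmf.getTrustManagers(), null);
        return ctx;
    }
}
```

Passing a truststore-backed trust manager here is exactly what the `-Djavax.net.ssl.trustStore` workaround achieves implicitly via the JVM-wide default.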
[jira] [Updated] (CASSANDRA-6774) Cleanup fails with assertion error after stopping previous run
[ https://issues.apache.org/jira/browse/CASSANDRA-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-6774: --- Attachment: 0001-Dont-continue-after-failing-to-cancel-in-progress-co.patch > Cleanup fails with assertion error after stopping previous run > -- > > Key: CASSANDRA-6774 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6774 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: 2.0.5 >Reporter: Keith Wright >Assignee: Marcus Eriksson > Fix For: 2.0.7 > > Attachments: > 0001-Dont-continue-after-failing-to-cancel-in-progress-co.patch > > > I am stress testing a new 2.0.5 cluster and did the following: > - start decommission during heavy write, moderate read load > - trigger cleanup on non-decommissioning node (nodetool cleanup) > - Started to see higher GC load, so stopped cleanup via nodetool stop CLEANUP > - attempt to launch cleanup now fails with the following message in console. > Cassandra log shows: > WARN 18:45:38,420 Unable to cancel in-progress compactions for > shard_user_lookup. Probably there is an unusually large row in progress > somewhere. 
It is also possible that buggy code left some sstables compacting > after it was done with them > Exception in thread "main" java.lang.AssertionError: > [SSTableReader(path='/data/1/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6129-Data.db'), > > SSTableReader(path='/data/2/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6938-Data.db'), > > SSTableReader(path='/data/2/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6698-Data.db'), > > SSTableReader(path='/data/1/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6884-Data.db'), > > SSTableReader(path='/data/1/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6854-Data.db'), > > SSTableReader(path='/data/1/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6898-Data.db'), > > SSTableReader(path='/data/1/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6522-Data.db'), > > SSTableReader(path='/data/2/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6692-Data.db'), > > SSTableReader(path='/data/2/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6815-Data.db'), > > SSTableReader(path='/data/2/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6677-Data.db'), > > SSTableReader(path='/data/1/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6917-Data.db'), > > SSTableReader(path='/data/1/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6929-Data.db'), > > SSTableReader(path='/data/1/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-7048-Data.db'), > > SSTableReader(path='/data/2/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6911-Data.db'), > > SSTableReader(path='/data/1/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6876-Data.db'), > > SSTableReader(path='/data/1/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-7046-Data.db'), > > 
SSTableReader(path='/data/2/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6762-Data.db'), > > SSTableReader(path='/data/2/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6712-Data.db'), > > SSTableReader(path='/data/1/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6906-Data.db'), > > SSTableReader(path='/data/2/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6886-Data.db'), > > SSTableReader(path='/data/1/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-7053-Data.db'), > > SSTableReader(path='/data/2/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6696-Data.db'), > > SSTableReader(path='/data/1/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6964-Data.db'), > > SSTableReader(path='/data/1/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-7043-Data.db'), > > SSTableReader(path='/data/1/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6983-Data.db'), > > SSTableReader(path='/data/2/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6672-Data.db'), > > SSTableReader(path='/data/2/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6714-Data.db'), > > SSTableReader(path='/data/1/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6992-Data.db'), > > SSTableReader(path='/data/1/cassandra/data/users/shard_user_lookup/users-shard_user_lookup-jb-6234-Data.db'), > > SSTableReader(path='/data/1/cassandra/data/users/shard_user_lookup/users-shard_user_looku
[jira] [Commented] (CASSANDRA-6745) Require specifying rows_per_partition_to_cache
[ https://issues.apache.org/jira/browse/CASSANDRA-6745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933005#comment-13933005 ] Marcus Eriksson commented on CASSANDRA-6745: +1 > Require specifying rows_per_partition_to_cache > -- > > Key: CASSANDRA-6745 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6745 > Project: Cassandra > Issue Type: Bug >Reporter: Jonathan Ellis >Assignee: Marcus Eriksson >Priority: Trivial > Fix For: 2.1 beta2 > > Attachments: 0001-wip-caching-options.patch > > > We should require specifying rows_to_cache_per_partition for new tables or > newly ALTERed when row caching is enabled. > Pre-upgrade should be grandfathered in as ALL to match existing semantics.
git commit: Move ByteBuffer functions to ByteBufferUtil and avoid duplication
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 3e2c61057 -> 8a52f5af4 Move ByteBuffer functions to ByteBufferUtil and avoid duplication Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8a52f5af Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8a52f5af Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8a52f5af Branch: refs/heads/cassandra-2.1 Commit: 8a52f5af4f97a9a3062fca2db914ad2fe7e93162 Parents: 3e2c610 Author: Sylvain Lebresne Authored: Thu Mar 13 09:11:44 2014 +0100 Committer: Sylvain Lebresne Committed: Thu Mar 13 09:12:13 2014 +0100 -- .../db/composites/AbstractComposite.java| 5 +- .../db/marshal/AbstractCompositeType.java | 63 .../cassandra/db/marshal/CollectionType.java| 6 -- .../cassandra/db/marshal/CompositeType.java | 12 ++-- .../db/marshal/DynamicCompositeType.java| 14 ++--- .../serializers/CollectionSerializer.java | 6 -- .../cassandra/serializers/ListSerializer.java | 9 ++- .../cassandra/serializers/MapSerializer.java| 17 ++ .../cassandra/serializers/SetSerializer.java| 9 ++- .../apache/cassandra/utils/ByteBufferUtil.java | 37 10 files changed, 80 insertions(+), 98 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/8a52f5af/src/java/org/apache/cassandra/db/composites/AbstractComposite.java -- diff --git a/src/java/org/apache/cassandra/db/composites/AbstractComposite.java b/src/java/org/apache/cassandra/db/composites/AbstractComposite.java index fbff930..9741767 100644 --- a/src/java/org/apache/cassandra/db/composites/AbstractComposite.java +++ b/src/java/org/apache/cassandra/db/composites/AbstractComposite.java @@ -22,6 +22,7 @@ import java.nio.ByteBuffer; import org.apache.cassandra.db.filter.ColumnSlice; import org.apache.cassandra.db.marshal.AbstractCompositeType; import org.apache.cassandra.db.marshal.CompositeType; +import org.apache.cassandra.utils.ByteBufferUtil; public abstract class AbstractComposite 
implements Composite { @@ -75,12 +76,12 @@ public abstract class AbstractComposite implements Composite // See org.apache.cassandra.db.marshal.CompositeType for details. ByteBuffer result = ByteBuffer.allocate(dataSize() + 3 * size() + (isStatic() ? 2 : 0)); if (isStatic()) -AbstractCompositeType.putShortLength(result, CompositeType.STATIC_MARKER); +ByteBufferUtil.writeShortLength(result, CompositeType.STATIC_MARKER); for (int i = 0; i < size(); i++) { ByteBuffer bb = get(i); -AbstractCompositeType.putShortLength(result, bb.remaining()); +ByteBufferUtil.writeShortLength(result, bb.remaining()); result.put(bb.duplicate()); result.put((byte)0); } http://git-wip-us.apache.org/repos/asf/cassandra/blob/8a52f5af/src/java/org/apache/cassandra/db/marshal/AbstractCompositeType.java -- diff --git a/src/java/org/apache/cassandra/db/marshal/AbstractCompositeType.java b/src/java/org/apache/cassandra/db/marshal/AbstractCompositeType.java index 236abc7..8f3aec4 100644 --- a/src/java/org/apache/cassandra/db/marshal/AbstractCompositeType.java +++ b/src/java/org/apache/cassandra/db/marshal/AbstractCompositeType.java @@ -17,15 +17,16 @@ */ package org.apache.cassandra.db.marshal; -import org.apache.cassandra.serializers.TypeSerializer; -import org.apache.cassandra.serializers.BytesSerializer; -import org.apache.cassandra.serializers.MarshalException; - import java.nio.ByteBuffer; import java.util.ArrayList; import java.util.Collections; import java.util.List; +import org.apache.cassandra.serializers.TypeSerializer; +import org.apache.cassandra.serializers.BytesSerializer; +import org.apache.cassandra.serializers.MarshalException; +import org.apache.cassandra.utils.ByteBufferUtil; + /** * A class avoiding class duplication between CompositeType and * DynamicCompositeType. 
@@ -34,44 +35,6 @@ import java.util.List; */ public abstract class AbstractCompositeType extends AbstractType { - -// changes bb position -public static int getShortLength(ByteBuffer bb) -{ -int length = (bb.get() & 0xFF) << 8; -return length | (bb.get() & 0xFF); -} - -// Doesn't change bb position -protected static int getShortLength(ByteBuffer bb, int position) -{ -int length = (bb.get(position) & 0xFF) << 8; -return length | (bb.get(position + 1) & 0xFF); -} - -// changes bb position -public static void putShortLength(ByteBuffer bb, int length) -{ -bb.put((byte) ((leng
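For context, the helpers being moved encode a length as a 2-byte big-endian unsigned short ahead of each component. A standalone round-trip illustration (the names mirror the removed helpers, but this class is not the patched file itself):

```java
import java.nio.ByteBuffer;

public class ShortLength {
    // Write a length as 2 bytes, big-endian: high byte first, then low byte.
    public static void putShortLength(ByteBuffer bb, int length) {
        bb.put((byte) ((length >> 8) & 0xFF));
        bb.put((byte) (length & 0xFF));
    }

    // Read the 2 bytes back, masking to treat them as unsigned.
    public static int getShortLength(ByteBuffer bb) {
        int length = (bb.get() & 0xFF) << 8;
        return length | (bb.get() & 0xFF);
    }

    public static void main(String[] args) {
        ByteBuffer bb = ByteBuffer.allocate(2);
        putShortLength(bb, 65535); // largest value that fits in 2 bytes
        bb.flip();
        System.out.println(getShortLength(bb)); // prints 65535
    }
}
```

The `& 0xFF` masks matter: without them, a component longer than 32767 bytes would read back negative, which is why the encoding is handled centrally in ByteBufferUtil rather than re-implemented per type.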
[jira] [Updated] (CASSANDRA-6793) NPE in Hadoop Word count example
[ https://issues.apache.org/jira/browse/CASSANDRA-6793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chander S Pechetty updated CASSANDRA-6793: -- Reproduced In: 2.1 beta1, 2.0.0 (was: 2.0.0, 2.1 beta1) Attachment: trunk-6793-v2.txt patch v2 addresses the following: * Simplify the schema for both input and output tables. Traditionally the word count example uses a line as its input, so the input table can just be changed to {noformat} ( id uuid, line text, PRIMARY KEY (id)) {noformat}. Removed category, sub category, and title from the input table, using a UUID as the key and a single text column to represent a line of text. Changed the output schema to {noformat}(word text primary key, count int) {noformat} as suggested earlier. * Remove the toString method and printing, as it adds unnecessary clutter in the mapper. * Remove the filter clauses, as they are not relevant to the word count example. However, I still see thrift interfaces in the WordCountSetup class and runtime dependencies on cassandra in the bin folder. I didn't go into details, but can someone shed some light on why these are needed? I think having clearly defined client APIs/dependencies will be useful for the end user. > NPE in Hadoop Word count example > > > Key: CASSANDRA-6793 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6793 > Project: Cassandra > Issue Type: Bug > Components: Examples >Reporter: Chander S Pechetty >Assignee: Chander S Pechetty >Priority: Minor > Labels: hadoop > Attachments: trunk-6793-v2.txt, trunk-6793.txt > > > The partition keys requested in WordCount.java do not match the primary key > set up in the table output_words. It looks like this patch was not merged properly > from > [CASSANDRA-5622|https://issues.apache.org/jira/browse/CASSANDRA-5622]. The > attached patch addresses the NPE and uses the correct keys defined in #5622. 
> I am assuming there is no need to fix the actual NPE like throwing an > InvalidRequestException back to user to fix the partition keys, as it would > be trivial to get the same from the TableMetadata using the driver API. > java.lang.NullPointerException > at > org.apache.cassandra.dht.Murmur3Partitioner.getToken(Murmur3Partitioner.java:92) > at > org.apache.cassandra.dht.Murmur3Partitioner.getToken(Murmur3Partitioner.java:40) > at org.apache.cassandra.client.RingCache.getRange(RingCache.java:117) > at > org.apache.cassandra.hadoop.cql3.CqlRecordWriter.write(CqlRecordWriter.java:163) > at > org.apache.cassandra.hadoop.cql3.CqlRecordWriter.write(CqlRecordWriter.java:63) > at > org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:587) > at > org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) > at WordCount$ReducerToCassandra.reduce(Unknown Source) > at WordCount$ReducerToCassandra.reduce(Unknown Source) > at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176) > at > org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649) > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417) > at > org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260) -- This message was sent by Atlassian JIRA (v6.2#6252)
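The failure mode above reduces to a map lookup: the record writer resolves the table's partition key column against the column map the reducer supplies, and a name mismatch yields a null ByteBuffer that Murmur3Partitioner.getToken then dereferences. A toy reconstruction — the column names and method are illustrative, not the Hadoop/Cassandra API:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class KeyMismatch {
    // The writer asks the reducer-supplied map for the table's partition key
    // column; if the reducer used a stale column name, this returns null,
    // and the partitioner NPEs when hashing it.
    public static ByteBuffer lookupPartitionKey(Map<String, ByteBuffer> supplied, String pkColumn) {
        return supplied.get(pkColumn);
    }

    public static void main(String[] args) {
        Map<String, ByteBuffer> reducerOutput = new LinkedHashMap<>();
        reducerOutput.put("word", ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8)));
        // Matching name: non-null. A stale name like "title" would be null.
        System.out.println(lookupPartitionKey(reducerOutput, "word") != null);
    }
}
```

This is why fixing the example's keys (rather than the NPE itself) resolves the report: the correct column names make the lookup succeed.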
[jira] [Updated] (CASSANDRA-6783) Collections should have a proper compare() method for UDT
[ https://issues.apache.org/jira/browse/CASSANDRA-6783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-6783: Attachment: 6783-2.txt I've committed the BB methods refactor separately, so attaching v2 that only does what this ticket is about. It fixes the typo from above and adds a unit test too. > Collections should have a proper compare() method for UDT > - > > Key: CASSANDRA-6783 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6783 > Project: Cassandra > Issue Type: Bug >Reporter: Sylvain Lebresne >Assignee: Sylvain Lebresne > Fix For: 2.1 beta2 > > Attachments: 6783-2.txt, 6783.txt > > > So far, ListType, SetType and MapType don't have a proper implementation of > compare() (they throw UnsupportedOperationException) because we haven't needed > one: as far as the cell comparator is concerned, only parts of a > collection end up in the comparator and need to be compared, but the full > collection itself does not. > But a UDT can nest a collection, and that sometimes requires being able to > compare them. 
Typically, I pushed a dtest > [here|https://github.com/riptano/cassandra-dtest/commit/290e9496d1b2c45158c7d7f5487d09ba48897a7f] > that ends up throwing: > {noformat} > java.lang.UnsupportedOperationException: CollectionType should not be use > directly as a comparator > at > org.apache.cassandra.db.marshal.CollectionType.compare(CollectionType.java:72) > ~[main/:na] > at > org.apache.cassandra.db.marshal.CollectionType.compare(CollectionType.java:37) > ~[main/:na] > at > org.apache.cassandra.db.marshal.AbstractType.compareCollectionMembers(AbstractType.java:174) > ~[main/:na] > at > org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:101) > ~[main/:na] > at > org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:35) > ~[main/:na] > at java.util.TreeMap.compare(TreeMap.java:1188) ~[na:1.7.0_45] > at java.util.TreeMap.put(TreeMap.java:531) ~[na:1.7.0_45] > at java.util.TreeSet.add(TreeSet.java:255) ~[na:1.7.0_45] > at org.apache.cassandra.cql3.Sets$DelayedValue.bind(Sets.java:205) > ~[main/:na] > at org.apache.cassandra.cql3.Sets$Literal.prepare(Sets.java:91) > ~[main/:na] > at > org.apache.cassandra.cql3.UserTypes$Literal.prepare(UserTypes.java:60) > ~[main/:na] > at > org.apache.cassandra.cql3.Operation$SetElement.prepare(Operation.java:221) > ~[main/:na] > at > org.apache.cassandra.cql3.statements.UpdateStatement$ParsedUpdate.prepareInternal(UpdateStatement.java:201) > ~[main/:na] > ... > {noformat} > Note that this stack doesn't involve cell name comparison at all; it's just > that CQL3 sometimes uses a SortedSet underneath to deal with set literals > (since internal sets are sorted by their value), and so when a set contains > UDTs that themselves have sets, we need the collection comparison. That being > said, for some cases, like having a UDT as a map key, we would need > collections to be comparable for the purpose of cell name comparison. > Attaching a relatively simple patch. 
The patch is a bit bigger than it should > be because, while adding the 3 simple compare() methods, I realized that we had > methods to read a short length (an unsigned short over 2 bytes) from a ByteBuffer > duplicated all over the place, and that it was time to consolidate them in > ByteBufferUtil where they should have been from day one (thus removing the > duplication). I can separate that trivial refactor into a separate patch if we > really need to, but really, the new stuff is the compare() method > implementations in ListType, SetType and MapType, and the rest is a bit of > trivial cleanup.
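The compare() being added is essentially lexicographic: compare elements pairwise, and on a tie the shorter collection sorts first. A generic sketch of that shape — an illustration, not the actual ListType/SetType/MapType code, which operates on serialized ByteBuffers:

```java
import java.util.Comparator;
import java.util.List;

public class CollectionCompare {
    // Element-wise (lexicographic) comparison: the first differing element
    // decides; if one list is a prefix of the other, the shorter one is smaller.
    public static <T> int compareLists(List<T> a, List<T> b, Comparator<T> elemCmp) {
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++) {
            int c = elemCmp.compare(a.get(i), b.get(i));
            if (c != 0) return c;
        }
        return Integer.compare(a.size(), b.size());
    }
}
```

This is exactly what the TreeSet in the stack trace above needs: a total order over whole collections, so that set literals containing UDTs (which may nest collections) can be sorted.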
[jira] [Commented] (CASSANDRA-6436) AbstractColumnFamilyInputFormat does not use start and end tokens configured via ConfigHelper.setInputRange()
[ https://issues.apache.org/jira/browse/CASSANDRA-6436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933076#comment-13933076 ] Piotr Kołaczkowski commented on CASSANDRA-6436: --- Yeah, sure. > AbstractColumnFamilyInputFormat does not use start and end tokens configured > via ConfigHelper.setInputRange() > - > > Key: CASSANDRA-6436 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6436 > Project: Cassandra > Issue Type: Bug > Components: Hadoop >Reporter: Paulo Ricardo Motta Gomes > Labels: hadoop, patch > Attachments: cassandra-1.2-6436.txt, cassandra-1.2-6436.txt > > > ConfigHelper allows to set a token input range via the setInputRange(conf, > startToken, endToken) call (ConfigHelper:254). > We used this feature to limit a hadoop job range to a single Cassandra node's > range, or even to single row key, mostly for testing purposes. > This worked before the fix for CASSANDRA-5536 > (https://github.com/apache/cassandra/commit/aaf18bd08af50bbaae0954d78d5e6cbb684aded9), > but after this ColumnFamilyInputFormat never uses the value of > KeyRange.start_token when defining the input splits > (AbstractColumnFamilyInputFormat:142-160), but only KeyRange.start_key, which > needs an order preserving partitioner to work. > I propose the attached fix in order to allow defining Cassandra token ranges > for a given Hadoop job even when using a non-order preserving partitioner. 
> Example use of ConfigHelper.setInputRange(conf, startToken, endToken) to > limit the range to a single Cassandra Key with RandomPartitioner: > IPartitioner part = ConfigHelper.getInputPartitioner(job.getConfiguration()); > Token token = part.getToken(ByteBufferUtil.bytes("Cassandra Key")); > BigInteger endToken = (BigInteger) new > BigIntegerConverter().convert(BigInteger.class, > part.getTokenFactory().toString(token)); > BigInteger startToken = endToken.subtract(new BigInteger("1")); > ConfigHelper.setInputRange(job.getConfiguration(), startToken.toString(), > endToken.toString()); -- This message was sent by Atlassian JIRA (v6.2#6252)
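The quoted snippet works because Cassandra token ranges are half-open intervals (start, end], so setting startToken = endToken − 1 selects exactly the one token. The arithmetic with plain BigInteger (illustrative only, no Cassandra classes):

```java
import java.math.BigInteger;

public class SingleKeyRange {
    // A Cassandra KeyRange over tokens is the half-open interval (start, end].
    public static boolean inRange(BigInteger t, BigInteger start, BigInteger end) {
        return t.compareTo(start) > 0 && t.compareTo(end) <= 0;
    }

    public static void main(String[] args) {
        BigInteger token = BigInteger.valueOf(42); // hypothetical RandomPartitioner token
        BigInteger start = token.subtract(BigInteger.ONE);
        // Only `token` itself falls in (token - 1, token]
        System.out.println(inRange(token, start, token));
    }
}
```

With RandomPartitioner, tokens are non-negative BigIntegers, so the subtraction never wraps; the same trick needs more care with partitioners whose token spaces wrap around.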
[jira] [Updated] (CASSANDRA-6436) AbstractColumnFamilyInputFormat does not use start and end tokens configured via ConfigHelper.setInputRange()
[ https://issues.apache.org/jira/browse/CASSANDRA-6436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Kołaczkowski updated CASSANDRA-6436: -- Reviewer: Piotr Kołaczkowski > AbstractColumnFamilyInputFormat does not use start and end tokens configured > via ConfigHelper.setInputRange() > - > > Key: CASSANDRA-6436 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6436 > Project: Cassandra > Issue Type: Bug > Components: Hadoop >Reporter: Paulo Ricardo Motta Gomes > Labels: hadoop, patch > Attachments: cassandra-1.2-6436.txt, cassandra-1.2-6436.txt > > > ConfigHelper allows to set a token input range via the setInputRange(conf, > startToken, endToken) call (ConfigHelper:254). > We used this feature to limit a hadoop job range to a single Cassandra node's > range, or even to single row key, mostly for testing purposes. > This worked before the fix for CASSANDRA-5536 > (https://github.com/apache/cassandra/commit/aaf18bd08af50bbaae0954d78d5e6cbb684aded9), > but after this ColumnFamilyInputFormat never uses the value of > KeyRange.start_token when defining the input splits > (AbstractColumnFamilyInputFormat:142-160), but only KeyRange.start_key, which > needs an order preserving partitioner to work. > I propose the attached fix in order to allow defining Cassandra token ranges > for a given Hadoop job even when using a non-order preserving partitioner. 
> Example use of ConfigHelper.setInputRange(conf, startToken, endToken) to > limit the range to a single Cassandra Key with RandomPartitioner: > IPartitioner part = ConfigHelper.getInputPartitioner(job.getConfiguration()); > Token token = part.getToken(ByteBufferUtil.bytes("Cassandra Key")); > BigInteger endToken = (BigInteger) new > BigIntegerConverter().convert(BigInteger.class, > part.getTokenFactory().toString(token)); > BigInteger startToken = endToken.subtract(new BigInteger("1")); > ConfigHelper.setInputRange(job.getConfiguration(), startToken.toString(), > endToken.toString()); -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6746) Reads have a slow ramp up in speed
[ https://issues.apache.org/jira/browse/CASSANDRA-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933089#comment-13933089 ] Benedict commented on CASSANDRA-6746: - bq. I get that, but the reason this was introduced in the first place was because the default behavior of (at least some) Linux kernels was to evict other data in favor of the newly flushed. Do you have a reference for that discussion? I couldn't find it searching JIRA. Whilst my research can't rule this out as a possibility, it seems as though it would be unlikely. The age of the recently written data would be low, certainly lower than any hot data, so that once it is actually synced to disk it is likely to be in the inactive_clean list and free for reclaim. It's possible that the non-trickle-fsync default interplays badly with this, with us permitting the entire sstable to hit the page cache and evict everything else whilst the OS catches up. But without that scenario I would be really surprised to see this behaviour of keeping written once pages over hotly read data. Either way, in the scenario that we are compacting hot data (probably more likely, since amount of compaction performed to data should decline with age, so we'll be mostly compacting younger data) the current behaviour is the worst possible scenario, with the apparently still going strong 2.6 (and possibly later) kernels definitely trashing the hot cache. So I think unless we detect the kernel version and set the default based on the known better behaviour of DONTNEED, it seems this is the better default to me. But we could perhaps change the defaults for trickle fsync as well (say, set it to true and 100MB by default) so that the OS has plenty of opportunity to reclaim the pages we're writing if it needs to. 
> Reads have a slow ramp up in speed > -- > > Key: CASSANDRA-6746 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6746 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Ryan McGuire >Assignee: Benedict > Labels: performance > Fix For: 2.1 beta2 > > Attachments: 2.1_vs_2.0_read.png, 6746-patched.png, 6746.txt, > cassandra-2.0-bdplab-trial-fincore.tar.bz2, > cassandra-2.1-bdplab-trial-fincore.tar.bz2 > > > On a physical four node cluster I am doing a big write and then a big read. > The read takes a long time to ramp up to respectable speeds. > !2.1_vs_2.0_read.png! > [See data > here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.2.1_vs_2.0_vs_1.2.retry1.json&metric=interval_op_rate&operation=stress-read&smoothing=1]
[jira] [Created] (CASSANDRA-6852) can't repair -pr part of data when not replicating data everywhere (multiDCs)
Cyril Scetbon created CASSANDRA-6852: Summary: can't repair -pr part of data when not replicating data everywhere (multiDCs) Key: CASSANDRA-6852 URL: https://issues.apache.org/jira/browse/CASSANDRA-6852 Project: Cassandra Issue Type: Bug Components: Core Reporter: Cyril Scetbon Our environment is as follows: - 3 DCs: dc1, dc2 and dc3 - replicate all keyspaces to dc1 and dc2 - replicate a few keyspaces to dc3, as we have less hardware and use it for computing statistics We use repair -pr everywhere regularly. FYI, a full repair takes almost 20 hours per node. The problem is that we can't use "repair -pr" anymore for tokens stored on dc3 for keyspaces that are not replicated there. We should have a way to repair those ranges without doing a FULL REPAIR everywhere
[jira] [Commented] (CASSANDRA-6848) stress (2.1) spams console with java.util.NoSuchElementException when run against nodes recently created
[ https://issues.apache.org/jira/browse/CASSANDRA-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933113#comment-13933113 ] Benedict commented on CASSANDRA-6848: - Can you reproduce this consistently? Does this only occur on the remote cluster, or also when you run locally with ccm? > stress (2.1) spams console with java.util.NoSuchElementException when run > against nodes recently created > > > Key: CASSANDRA-6848 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6848 > Project: Cassandra > Issue Type: Bug >Reporter: Russ Hatch >Assignee: Benedict > Fix For: 2.1 beta2 > > > I don't get any stack trace on the console, but I get two > java.util.NoSuchElementException for each operation stress is doing. > This seems to occur when stress is being run against a recently created node > (such as one from ccm). > To reproduce: create a ccm cluster, and run stress against it within a few > minutes. Run a simple stress command like cassandra-stress write n=10.
[jira] [Updated] (CASSANDRA-6848) stress (2.1) spams console with java.util.NoSuchElementException when run against nodes recently created
[ https://issues.apache.org/jira/browse/CASSANDRA-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-6848: Fix Version/s: 2.1 beta2 > stress (2.1) spams console with java.util.NoSuchElementException when run > against nodes recently created
[jira] [Commented] (CASSANDRA-6835) cassandra-stress should support a variable number of counter columns
[ https://issues.apache.org/jira/browse/CASSANDRA-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933147#comment-13933147 ] Benedict commented on CASSANDRA-6835: - Looks good. I made a couple of changes (incl. fixing one of the CQL2 removes that predates this, where the wrong branch was selected) and pushed [here|https://github.com/belliottsmith/cassandra/tree/iss-6835-trunk] > cassandra-stress should support a variable number of counter columns > > > Key: CASSANDRA-6835 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6835 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Minor > Fix For: 2.1 beta2 > > Attachments: CASSANDRA-6835-merge-with-trunk.patch > > > Can you review, [~xedin]?
[jira] [Commented] (CASSANDRA-6841) ConcurrentModificationException in commit-log-writer after local schema reset
[ https://issues.apache.org/jira/browse/CASSANDRA-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933183#comment-13933183 ] Benedict commented on CASSANDRA-6841: - Updated the tree. Also converted one of those ISE to an IAE check prior to modifying the map > ConcurrentModificationException in commit-log-writer after local schema reset > - > > Key: CASSANDRA-6841 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6841 > Project: Cassandra > Issue Type: Bug > Environment: Linux 3.2.0 (Debian Wheezy) Cassandra 2.0.6, Oracle JVM > 1.7.0_51 > Almost default cassandra.yaml (IPs and cluster name changed) > This is the 2nd node in a 2-node ring. It has ~2500 keyspaces and very low > traffic. (Only new keyspaces see reads and writes.) >Reporter: Pas >Assignee: Benedict >Priority: Minor > Fix For: 2.0.7 > > > {code} > INFO [RMI TCP Connection(38)-192.168.36.171] 2014-03-12 11:37:54,013 > MigrationManager.java (line 329) Starting local schema reset... 
> INFO [RMI TCP Connection(38)-192.168.36.171] 2014-03-12 11:37:54,016 > ColumnFamilyStore.java (line 785) Enqueuing flush of > Memtable-local@394448776(114/1140 serialized/live bytes, 3 ops) > INFO [FlushWriter:6] 2014-03-12 11:37:54,016 Memtable.java (line 331) > Writing Memtable-local@394448776(114/1140 serialized/live bytes, 3 ops) > INFO [FlushWriter:6] 2014-03-12 11:37:54,182 Memtable.java (line 371) > Completed flushing > /var/lib/cassandra/data/system/local/system-local-jb-398-Data.db (145 bytes) > for commitlog position ReplayPosition(segmentId=1394620057452, > position=33159822) > INFO [RMI TCP Connection(38)-192.168.36.171] 2014-03-12 11:37:54,185 > ColumnFamilyStore.java (line 785) Enqueuing flush of > Memtable-local@1087210140(62/620 serialized/live bytes, 1 ops) > INFO [FlushWriter:6] 2014-03-12 11:37:54,185 Memtable.java (line 331) > Writing Memtable-local@1087210140(62/620 serialized/live bytes, 1 ops) > INFO [FlushWriter:6] 2014-03-12 11:37:54,357 Memtable.java (line 371) > Completed flushing > /var/lib/cassandra/data/system/local/system-local-jb-399-Data.db (96 bytes) > for commitlog position ReplayPosition(segmentId=1394620057452, > position=33159959) > INFO [RMI TCP Connection(38)-192.168.36.171] 2014-03-12 11:37:54,361 > ColumnFamilyStore.java (line 785) Enqueuing flush of > Memtable-local@768887091(62/620 serialized/live bytes, 1 ops) > INFO [FlushWriter:6] 2014-03-12 11:37:54,361 Memtable.java (line 331) > Writing Memtable-local@768887091(62/620 serialized/live bytes, 1 ops) > INFO [FlushWriter:6] 2014-03-12 11:37:54,516 Memtable.java (line 371) > Completed flushing > /var/lib/cassandra/data/system/local/system-local-jb-400-Data.db (96 bytes) > for commitlog position ReplayPosition(segmentId=1394620057452, > position=33160096) > INFO [CompactionExecutor:38] 2014-03-12 11:37:54,517 CompactionTask.java > (line 115) Compacting > [SSTableReader(path='/var/lib/cassandra/data/system/local/system-local-jb-398-Data.db'), > > 
SSTableReader(path='/var/lib/cassandra/data/system/local/system-local-jb-400-Data.db'), > > SSTableReader(path='/var/lib/cassandra/data/system/local/system-local-jb-399-Data.db'), > > SSTableReader(path='/var/lib/cassandra/data/system/local/system-local-jb-397-Data.db')] > INFO [RMI TCP Connection(38)-192.168.36.171] 2014-03-12 11:37:54,519 > ColumnFamilyStore.java (line 785) Enqueuing flush of > Memtable-local@271993477(62/620 serialized/live bytes, 1 ops) > INFO [FlushWriter:6] 2014-03-12 11:37:54,519 Memtable.java (line 331) > Writing Memtable-local@271993477(62/620 serialized/live bytes, 1 ops) > INFO [FlushWriter:6] 2014-03-12 11:37:54,794 Memtable.java (line 371) > Completed flushing > /var/lib/cassandra/data/system/local/system-local-jb-401-Data.db (96 bytes) > for commitlog position ReplayPosition(segmentId=1394620057452, > position=33160233) > INFO [RMI TCP Connection(38)-192.168.36.171] 2014-03-12 11:37:54,799 > MigrationManager.java (line 357) Local schema reset is complete. > INFO [CompactionExecutor:38] 2014-03-12 11:37:54,848 CompactionTask.java > (line 275) Compacted 4 sstables to > [/var/lib/cassandra/data/system/local/system-local-jb-402,]. 6,099 bytes to > 5,821 (~95% of original) in 330ms = 0.016822MB/s. 4 total partitions merged > to 1. Partition merge counts were {4:1, } > INFO [OptionalTasks:1] 2014-03-12 11:37:55,110 ColumnFamilyStore.java (line > 785) Enqueuing flush of > Memtable-schema_columnfamilies@106276050(181506/509164 serialized/live bytes, > 3276 ops) > INFO [FlushWriter:6] 2014-03-12 11:37:55,110 Memtable.java (line 331) > Writing Memtable-schema_columnfamilies@106276050(181506/509164 > serialized/live bytes, 3276 ops) > INFO [OptionalTasks:1] 2014-03-12
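The failure mode behind this ticket can be sketched in a few lines (editor's illustration; the `segments` map and function names are invented, not Cassandra's code): mutating a map while it is being iterated. Python's dict raises RuntimeError for this, analogous to Java's ConcurrentModificationException.

```python
segments = {1: "active", 2: "flushing"}

def bad_recycle(segs):
    # Deleting entries while iterating the same dict blows up mid-iteration.
    for sid in segs:
        if segs[sid] == "flushing":
            del segs[sid]

try:
    bad_recycle(dict(segments))
    raised = False
except RuntimeError:  # "dictionary changed size during iteration"
    raised = True

def good_recycle(segs):
    # Decide first, mutate afterwards -- the same spirit as validating
    # arguments (the ISE -> IAE change mentioned above) before touching
    # the shared map.
    doomed = [sid for sid, state in segs.items() if state == "flushing"]
    for sid in doomed:
        del segs[sid]
    return segs
```

In concurrent Java code the usual fixes are snapshot-then-mutate as above, or a concurrent map implementation.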
[jira] [Commented] (CASSANDRA-6841) ConcurrentModificationException in commit-log-writer after local schema reset
[ https://issues.apache.org/jira/browse/CASSANDRA-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933193#comment-13933193 ] Carl Yeksigian commented on CASSANDRA-6841: --- +1 > ConcurrentModificationException in commit-log-writer after local schema reset
[jira] [Resolved] (CASSANDRA-6850) cqlsh won't startup
[ https://issues.apache.org/jira/browse/CASSANDRA-6850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-6850. --- Resolution: Invalid > cqlsh won't startup > --- > > Key: CASSANDRA-6850 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6850 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Chander S Pechetty >Priority: Minor > Labels: cqlsh > > The scales library is missing: > {noformat} > ../lib/cassandra-driver-internal-only-1.0.2.post.zip/cassandra-driver-1.0.2.post/cassandra/metrics.py", > line 4, in > ImportError: No module named greplin > {noformat} > It would be useful if users didn't have to install scales explicitly for > cqlsh to function
[jira] [Commented] (CASSANDRA-6850) cqlsh won't startup
[ https://issues.apache.org/jira/browse/CASSANDRA-6850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933203#comment-13933203 ] Jonathan Ellis commented on CASSANDRA-6850: --- [~thobbs] is this something we could reasonably stub out in the Python driver? > cqlsh won't startup
[jira] [Commented] (CASSANDRA-6587) Slow query when using token range and secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933209#comment-13933209 ] Jonathan Ellis commented on CASSANDRA-6587: --- Can you attach examples of slow and fast traces? > Slow query when using token range and secondary index > - > > Key: CASSANDRA-6587 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6587 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jan Chochol > > We are using token ranges to simulate pagination on external API. To achieve > this, we use similar queries: > {noformat} > SELECT * FROM table WHERE TOKEN(partition_key) > TOKEN('offset') AND > secondary_key = 'value' LIMIT 1000; > {noformat} > We found that such statement is quite ineffective, and we do not know how to > solve it. > Let's try some example. > You can fill Cassandra with folowing script: > {noformat} > perl -e "print(\"DROP KEYSPACE t;\nCREATE KEYSPACE t WITH replication = > {'class': 'SimpleStrategy', 'replication_factor' : 1};\nuse t;\nCREATE TABLE > t (a varchar PRIMARY KEY, b varchar, c varchar, d varchar);\nCREATE INDEX t_b > ON t (b);\nCREATE INDEX t_c ON t (c);\nCREATE INDEX t_d ON t (d);\n\");\$max > = 10; for(\$i = 0; \$i < \$max; \$i++) { \$j = int(\$i * 10 / \$max); \$k > = int(\$i * 100 / \$max); print(\"INSERT INTO t (a, b, c, d) VALUES ('a\$i', > 'b\$j', 'c\$k', 'd\$i');\n\")}; for(\$i = 0; \$i < \$max; \$i++) { > print(\"INSERT INTO t (a, b, c, d) VALUES ('e\$i', 'f\$j', 'g\$k', > 'h\$i');\n\")}" | cqlsh > {noformat} > First we looked for last but one parition key: > {noformat} > [root@jch3-devel:~/c4] echo "SELECT a FROM t.t WHERE b = 'b1' LIMIT 10;" > | cqlsh | tail > a18283 > a11336 > a14712 > a11476 > a19396 > a14269 > a10719 > a14521 > a13934 > {noformat} > Than we issue following commands for some interesting behaviour: > {noformat} > SELECT a, d FROM t.t WHERE b = 'b1' AND TOKEN(a) = TOKEN('a14521') LIMIT > 1000; > SELECT a, d FROM t.t 
WHERE b = 'b1' AND TOKEN(a) = TOKEN('a14521') LIMIT 10; > SELECT a, d FROM t.t WHERE b = 'b1' AND a = 'a14521' LIMIT 10; > {noformat} > And here is result: > {noformat} > [root@jch3-devel:~/c4] time echo "SELECT a, d FROM t.t WHERE b = 'b1' AND > TOKEN(a) = TOKEN('a14521') LIMIT 1000;" | cqlsh > a | d > + > a14521 | d14521 > real0m0.647s > user0m0.307s > sys 0m0.076s > [root@jch3-devel:~/c4] time echo "SELECT a, d FROM t.t WHERE b = 'b1' AND > TOKEN(a) = TOKEN('a14521') LIMIT 10;" | cqlsh > a | d > + > a14521 | d14521 > real0m16.454s > user0m0.341s > sys 0m0.090s > [root@jch3-devel:~/c4] time echo "SELECT a, d FROM t.t WHERE b = 'b1' AND a = > 'a14521' LIMIT 10;" | cqlsh > a | d > + > a14521 | d14521 > real0m0.404s > user0m0.309s > sys 0m0.071s > {noformat} > Problem with {{LIMIT}} is described in CASSANDRA-6348, and is quite funny - > lower the limit, slower the requst (and with different structure of data it > can be even worse). > This query is quite silly in reality (asking with secondary key, when you > have primary key), but is close as possible to our use case: > {noformat} > SELECT a, d FROM t.t WHERE b = 'b1' AND TOKEN(a) > TOKEN('a14521') LIMIT 10; > {noformat} > But we simply can not do: > {noformat} > SELECT a, d FROM t.t WHERE b = 'b1' AND a > 'a14521' LIMIT 10; > {noformat} > As this is unsupported. > {{CompositesSearcher.java}} gives us some clue about the problem: > {noformat} > /* > * XXX: If the range requested is a token range, we'll have to start > at the beginning (and stop at the end) of > * the indexed row unfortunately (which will be inefficient), because > we have not way to intuit the small > * possible key having a given token. A fix would be to actually > store the token along the key in the > * indexed row. 
> */ > {noformat} > Index row contains parition keys in partion key ordering (ordering exposed in > CQL3 as {{TOKEN(partition_key)}}), so these two request are expected to > return same values: > {noformat} > SELECT a, d FROM t.t WHERE b = 'b1' AND TOKEN(a) > TOKEN('a14521') LIMIT 1; > SELECT a, d FROM t.t WHERE b = 'b1' AND a > 'a14521' LIMIT 1; > {noformat} > But the second is not supported. > Currently we are considering to go to our production with this patch: > {noformat} > diff --git > a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java > b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java > index 44a1e64..0228c3a 100644 > --- a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java > +++ b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java > @@ -1123,8 +1123,10 @@ public
[jira] [Commented] (CASSANDRA-6587) Slow query when using token range and secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933214#comment-13933214 ] Jan Chochol commented on CASSANDRA-6587: My guess about the different results: if you use {{LIMIT}} with a low value, you can hit the problem described in CASSANDRA-6348. In that case {{nodetool flush}} will make your queries slow again (the problem is a non-optimal heuristic depending on SSTable statistics, which are not present before the first SSTable is created). > Slow query when using token range and secondary index
[jira] [Commented] (CASSANDRA-6587) Slow query when using token range and secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933223#comment-13933223 ] Otto Chrons commented on CASSANDRA-6587: Might also be related to vnodes (3 node cluster, vnodes=256); the same problem does not surface on a single-node setup. While the index is being created, a query fetching data via the secondary index is fast, but after the index is complete it gets really slow: about 5 minutes to query 8700 rows from a set of 107k rows. Reading all 107k rows (not using the secondary index) takes only a dozen seconds, so something is seriously wrong here. And this was without {{LIMIT}} or any token functions. The dataset is only about 9MB and the index about 4.7MB, so this is not an I/O issue in any way. > Slow query when using token range and secondary index
[jira] [Commented] (CASSANDRA-6587) Slow query when using token range and secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933227#comment-13933227 ] Ondřej Černoš commented on CASSANDRA-6587: -- I replayed the example Jan provided above with tracing on. This is the result:
{noformat}
cqlsh> SELECT a, d FROM t.t WHERE b = 'b1' AND TOKEN(a) = TOKEN('a14521') LIMIT 1000;

 a      | d
--------+--------
 a14521 | d14521

Tracing session: 4c221710-aab2-11e3-80fa-5f25e090b27a

 activity | timestamp | source | source_elapsed
----------+-----------+--------+----------------
 execute_cql3_query | 14:20:50,947 | 127.0.0.1 | 0
 Parsing SELECT a, d FROM t.t WHERE b = 'b1' AND TOKEN(a) = TOKEN('a14521') LIMIT 1000; | 14:20:50,947 | 127.0.0.1 | 1377
 Preparing statement | 14:20:50,948 | 127.0.0.1 | 1943
 Determining replicas to query | 14:20:50,948 | 127.0.0.1 | 2169
 Candidate index mean cardinalities are org.apache.cassandra.db.index.composites.CompositesIndex@1527e0c3:13470. Scanning with t.t_b. | 14:20:50,949 | 127.0.0.1 | 2723
 Executing indexed scan for (min(9221619202503120164), max(9221619202503120164)] | 14:20:50,949 | 127.0.0.1 | 2858
 Candidate index mean cardinalities are org.apache.cassandra.db.index.composites.CompositesIndex@1527e0c3:13470. Scanning with t.t_b. | 14:20:50,949 | 127.0.0.1 | 2920
 Executing single-partition query on t.t_b | 14:20:50,949 | 127.0.0.1 | 2970
 Acquiring sstable references | 14:20:50,949 | 127.0.0.1 | 2991
 Merging memtable tombstones | 14:20:50,949 | 127.0.0.1 | 3009
 Key cache hit for sstable 1 | 14:20:50,949 | 127.0.0.1 | 3055
 Seeking to partition beginning in data file | 14:20:50,949 | 127.0.0.1 | 3087
 Merging data from memtables and 1 sstables | 14:20:50,950 | 127.0.0.1 | 3532
 Read 743 live and 0 tombstoned cells | 14:20:50,951 | 127.0.0.1 | 5506
 Executing single-partition query on t.t_b | 14:20:50,952 | 127.0.0.1 | 6463
 Acquiring sstable references | 14:20:50,952 | 127.0.0.1 | 6475
 Merging memtable tombstones | 14:20:50,952 | 127.0.0.1 | 6490
 Key cache hit for sstable 1 | 14:20:50,953 | 127.0.0.1 | 6534
 Seeking to partition indexed section in data file | 14:20:50,953 | 127.0.0.1 | 6570
 Merging data from memtables and 1 sstables | 14:20:50,953 | 127.0.0.1 | 6598
 Read 743 live and 0 tombstoned cells | 14:20:50,955 | 127.0.0.1 | 9511
 Executing single-partition query on t.t_b | 14:20:50,956 | 127.0.0.1 | 10388
 Acquiring sstable ref
[jira] [Commented] (CASSANDRA-6587) Slow query when using token range and secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933290#comment-13933290 ] Jonathan Ellis commented on CASSANDRA-6587: --- To Jan's original description: bq. Index row contains parition keys in partion key ordering (ordering exposed in CQL3 as TOKEN(partition_key)) Correct. bq. so these two request are expected to return same values No, that doesn't follow at all. a > b does not imply token(a) > token(b) or vice versa. > Slow query when using token range and secondary index > - > > Key: CASSANDRA-6587 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6587 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jan Chochol > > We are using token ranges to simulate pagination on external API. To achieve > this, we use similar queries: > {noformat} > SELECT * FROM table WHERE TOKEN(partition_key) > TOKEN('offset') AND > secondary_key = 'value' LIMIT 1000; > {noformat} > We found that such statement is quite ineffective, and we do not know how to > solve it. > Let's try some example. 
> You can fill Cassandra with the following script: > {noformat} > perl -e "print(\"DROP KEYSPACE t;\nCREATE KEYSPACE t WITH replication = > {'class': 'SimpleStrategy', 'replication_factor' : 1};\nuse t;\nCREATE TABLE > t (a varchar PRIMARY KEY, b varchar, c varchar, d varchar);\nCREATE INDEX t_b > ON t (b);\nCREATE INDEX t_c ON t (c);\nCREATE INDEX t_d ON t (d);\n\");\$max > = 10; for(\$i = 0; \$i < \$max; \$i++) { \$j = int(\$i * 10 / \$max); \$k > = int(\$i * 100 / \$max); print(\"INSERT INTO t (a, b, c, d) VALUES ('a\$i', > 'b\$j', 'c\$k', 'd\$i');\n\")}; for(\$i = 0; \$i < \$max; \$i++) { > print(\"INSERT INTO t (a, b, c, d) VALUES ('e\$i', 'f\$j', 'g\$k', > 'h\$i');\n\")}" | cqlsh > {noformat} > First we looked for the last-but-one partition key: > {noformat} > [root@jch3-devel:~/c4] echo "SELECT a FROM t.t WHERE b = 'b1' LIMIT 10;" > | cqlsh | tail > a18283 > a11336 > a14712 > a11476 > a19396 > a14269 > a10719 > a14521 > a13934 > {noformat} > Then we issue the following commands for some interesting behaviour: > {noformat} > SELECT a, d FROM t.t WHERE b = 'b1' AND TOKEN(a) = TOKEN('a14521') LIMIT > 1000; > SELECT a, d FROM t.t WHERE b = 'b1' AND TOKEN(a) = TOKEN('a14521') LIMIT 10; > SELECT a, d FROM t.t WHERE b = 'b1' AND a = 'a14521' LIMIT 10; > {noformat} > And here is the result: > {noformat} > [root@jch3-devel:~/c4] time echo "SELECT a, d FROM t.t WHERE b = 'b1' AND > TOKEN(a) = TOKEN('a14521') LIMIT 1000;" | cqlsh > a | d > + > a14521 | d14521 > real 0m0.647s > user 0m0.307s > sys 0m0.076s > [root@jch3-devel:~/c4] time echo "SELECT a, d FROM t.t WHERE b = 'b1' AND > TOKEN(a) = TOKEN('a14521') LIMIT 10;" | cqlsh > a | d > + > a14521 | d14521 > real 0m16.454s > user 0m0.341s > sys 0m0.090s > [root@jch3-devel:~/c4] time echo "SELECT a, d FROM t.t WHERE b = 'b1' AND a = > 'a14521' LIMIT 10;" | cqlsh > a | d > + > a14521 | d14521 > real 0m0.404s > user 0m0.309s > sys 0m0.071s > {noformat} > The problem with {{LIMIT}} is described in CASSANDRA-6348, and is quite funny - the lower the 
limit, the slower the request (and with a different structure of data it > can be even worse). > This query is quite silly in reality (querying by secondary key when you > have the primary key), but it is as close as possible to our use case: > {noformat} > SELECT a, d FROM t.t WHERE b = 'b1' AND TOKEN(a) > TOKEN('a14521') LIMIT 10; > {noformat} > But we simply cannot do: > {noformat} > SELECT a, d FROM t.t WHERE b = 'b1' AND a > 'a14521' LIMIT 10; > {noformat} > As this is unsupported. > {{CompositesSearcher.java}} gives us some clue about the problem: > {noformat} > /* > * XXX: If the range requested is a token range, we'll have to start > at the beginning (and stop at the end) of > * the indexed row unfortunately (which will be inefficient), because > we have no way to intuit the smallest > * possible key having a given token. A fix would be to actually > store the token along the key in the > * indexed row. > */ > {noformat} > The index row contains partition keys in partition key ordering (ordering exposed in > CQL3 as {{TOKEN(partition_key)}}), so these two requests are expected to > return the same values: > {noformat} > SELECT a, d FROM t.t WHERE b = 'b1' AND TOKEN(a) > TOKEN('a14521') LIMIT 1; > SELECT a, d FROM t.t WHERE b = 'b1' AND a > 'a14521' LIMIT 1; > {noformat} > But the second is not supported. > Currently we are considering going to production with this patch: > {noformat} > diff --git > a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java > b/src/java/org/apache/cass
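Jonathan's point in the comment above (key order and token order are unrelated) can be seen with a small standalone sketch. The hash below is a generic 64-bit FNV-1a stand-in, an assumption for illustration only, not Cassandra's actual Murmur3Partitioner; it merely shows that a hash-based partitioner scrambles the natural ordering of keys:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class TokenOrdering {
    // Stand-in "token": 64-bit FNV-1a hash of the key (illustrative only;
    // Cassandra's Murmur3Partitioner uses a different hash).
    static long token(String key) {
        long h = 0xcbf29ce484222325L;
        for (byte b : key.getBytes()) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        return h;
    }

    // Keys in the order a token-range scan would visit them.
    static List<String> sortedByToken(Collection<String> keys) {
        List<String> out = new ArrayList<>(keys);
        out.sort(Comparator.comparingLong(TokenOrdering::token));
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a10719", "a14521", "a18283", "a19396");
        List<String> lex = new ArrayList<>(keys);
        Collections.sort(lex);
        // The two orders generally differ, which is why a > 'a14521' and
        // TOKEN(a) > TOKEN('a14521') select different sets of rows.
        System.out.println("lexicographic: " + lex);
        System.out.println("by token:      " + sortedByToken(keys));
    }
}
```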
[jira] [Commented] (CASSANDRA-6587) Slow query when using token range and secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933294#comment-13933294 ] Otto Chrons commented on CASSANDRA-6587: The potential values for the indexed field are COMPLETED, SUPERVISED and IN_PROGRESS, where most of the rows are in the COMPLETED category. The query for SUPERVISED rows currently covers about 8% of all data. I will do some additional debugging and tracing once I get the stuff transferred to a new "non-vnodes" cluster.
[jira] [Commented] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933296#comment-13933296 ] Gary Dusbabek commented on CASSANDRA-6846: -- +1 I support this. I have believed for a long time that components of Cassandra could be extracted to create an excellent platform for building distributed systems in general. This kind of thinking gets us pointed in that direction. Even though it is designed to be a database, I would support the standardization of a few internal APIs. Practically speaking this will be hard, and I would expect a bumpy road, as the API would occasionally be trampled by new database features. A few good tests in the projects that rely on these APIs are all that's needed to monitor things. > Provide standard interface for deep application server integration > -- > > Key: CASSANDRA-6846 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6846 > Project: Cassandra > Issue Type: New Feature > Components: API, Core > Reporter: Tupshin Harper > Assignee: Tupshin Harper > Priority: Minor > Labels: (╯°□°)╯︵┻━┻, ponies > > Instead of creating a pluggable interface for Thrift, I'd like to create a > pluggable interface for arbitrary app-server deep integration. > Inspired by both the existence of intravert-ug, as well as there being a long > history of various parties embedding tomcat or jetty servlet engines inside > Cassandra, I'd like to propose the creation of an internal, somewhat stable > (versioned?) interface that could allow any app server to achieve deep > integration with Cassandra. As a result, these servers could > 1) host their own APIs (REST, for example) > 2) extend core functionality by having limited (see triggers and wide row > scanners) access to the internals of Cassandra > The hand-wavey part comes in because, while I have been mulling this over for a > while, I have not spent any significant time looking at the actual > surface area of intravert-ug's integration. But, using it as a model, and > also keeping in mind the general needs of your more traditional servlet/j2ee > containers, I believe we could come up with a reasonable interface that allows > any JVM app server to be integrated and maintained in or out of the Cassandra > tree. > This would satisfy the need that many of us (both Ed and I, for example) have for > a much greater degree of control over server-side execution, and to be > able to start building much more interesting (and simpler) tiered > applications. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (CASSANDRA-6852) can't repair -pr part of data when not replicating data everywhere (multiDCs)
[ https://issues.apache.org/jira/browse/CASSANDRA-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuki Morishita resolved CASSANDRA-6852. --- Resolution: Duplicate See CASSANDRA-5424 > can't repair -pr part of data when not replicating data everywhere (multiDCs) > - > > Key: CASSANDRA-6852 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6852 > Project: Cassandra > Issue Type: Bug > Components: Core > Reporter: Cyril Scetbon > Labels: multi-dcs, ranges, repair > > Our environment is as follows: > - 3 DCs: dc1, dc2 and dc3 > - all keyspaces are replicated to dc1 and dc2 > - a few keyspaces are replicated to dc3, as we have less hardware there and use it for > computing statistics > We use repair -pr everywhere regularly. FYI, a full repair takes almost 20 > hours per node. The problem is that we can't use "repair -pr" anymore on dc3 for > ranges of the keyspaces that are not replicated there. We should have a > way to repair those ranges without doing a FULL REPAIR everywhere -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6587) Slow query when using token range and secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933298#comment-13933298 ] Jan Chochol commented on CASSANDRA-6587: {quote} No, that doesn't follow at all. a > b does not imply token(a) > token(b) or vice versa. {quote} Yes, you are right. For inequality this returns different results. Fortunately that is not a problem for our use case - we need this for paging, and we do not care about items' positions within pages (as long as all items are returned). So we do not care about the ordering - we just want any ordering.
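Jan's use case (exhaustive paging where any stable order will do) can be modelled in a few lines: index the rows by token, and fetch each page with the equivalent of TOKEN(a) > TOKEN(last_seen). This is an in-memory sketch with a stand-in FNV-1a hash, not the driver or server API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class TokenPaging {
    // Stand-in token function (FNV-1a); assumes no collisions for the demo.
    static long token(String key) {
        long h = 0xcbf29ce484222325L;
        for (byte b : key.getBytes()) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        return h;
    }

    // One page: up to `limit` keys whose token is strictly greater than
    // `afterToken`, the moral equivalent of
    // SELECT ... WHERE TOKEN(a) > TOKEN('last_seen') LIMIT n
    static List<String> page(NavigableMap<Long, String> byToken, long afterToken, int limit) {
        List<String> out = new ArrayList<>();
        for (String k : byToken.tailMap(afterToken, false).values()) {
            if (out.size() == limit) break;
            out.add(k);
        }
        return out;
    }

    // Walk every row exactly once, in token order rather than key order.
    static List<String> pageAll(Collection<String> keys, int limit) {
        NavigableMap<Long, String> byToken = new TreeMap<>();
        for (String k : keys) byToken.put(token(k), k);
        List<String> seen = new ArrayList<>();
        long after = Long.MIN_VALUE;
        List<String> p;
        while (!(p = page(byToken, after, limit)).isEmpty()) {
            seen.addAll(p);
            after = token(p.get(p.size() - 1)); // resume after the last key seen
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a1", "a2", "a3", "a4", "a5", "a6", "a7");
        System.out.println(pageAll(keys, 3)); // all 7 keys, visited once each
    }
}
```

Since paging only needs some total order, the hash order is as good as any; this is exactly why the unordered results Jan describes are acceptable for his API.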
[jira] [Commented] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933300#comment-13933300 ] Edward Capriolo commented on CASSANDRA-6846: The current perspective seems to be that ALL the code of Cassandra is made to serve only the CQL language. Everything is "INTERNAL". We should close this issue; no one is going to allow/support anything like this. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6587) Slow query when using token range and secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933285#comment-13933285 ] Jonathan Ellis commented on CASSANDRA-6587: --- bq. While the index is being created, a query to fetch data based on the secondary index is fast, but after the index is complete it gets really slow This sounds to me like you have a low-cardinality dataset and you shouldn't be using the index at all.
[jira] [Resolved] (CASSANDRA-6587) Slow query when using token range and secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-6587. --- Resolution: Duplicate Created CASSANDRA-6853 with the actionable part of this ticket more clearly expressed.
[jira] [Updated] (CASSANDRA-6853) Allow filtering on primary key expressions in 2i queries
[ https://issues.apache.org/jira/browse/CASSANDRA-6853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6853: -- Labels: indexes (was: ) > Allow filtering on primary key expressions in 2i queries > > > Key: CASSANDRA-6853 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6853 > Project: Cassandra > Issue Type: New Feature >Reporter: Jonathan Ellis >Assignee: Sylvain Lebresne >Priority: Minor > Labels: indexes > Fix For: 3.0 > > > We allow > {code} > SELECT a, d FROM t.t WHERE b = 'b1' AND a = 'a14521' > {code} > and > {code} > SELECT a, d FROM t.t WHERE b = 'b1' AND token(a) > token( 'a14521') > {code} > but not > {code} > SELECT a, d FROM t.t WHERE b = 'b1' AND a > 'a14521' > {code} > (given an index on {{b}}, with primary key {{a}}) > we allow combining other predicates with an indexed one and filtering those > in a nested loop; we should allow the same for primary keys -- This message was sent by Atlassian JIRA (v6.2#6252)
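The nested-loop filtering this ticket asks for can be sketched as: walk the rows matching the indexed predicate (b = 'b1'), and test each candidate against the extra primary-key predicate (a > 'a14521'). The row type and data here are hypothetical stand-ins, not Cassandra internals:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class IndexFilter {
    // Hypothetical row: partition key `a`, indexed column `b`.
    static class Row {
        final String a, b;
        Row(String a, String b) { this.a = a; this.b = b; }
    }

    // Drive the query from the index (rows already matching b = 'b1'),
    // then apply the remaining primary-key predicate in a nested loop.
    static List<Row> filter(List<Row> indexHits, Predicate<Row> pkPredicate, int limit) {
        List<Row> out = new ArrayList<>();
        for (Row r : indexHits) {
            if (pkPredicate.test(r)) {
                out.add(r);
                if (out.size() == limit) break;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> hits = Arrays.asList(
                new Row("a10719", "b1"), new Row("a14521", "b1"), new Row("a19396", "b1"));
        // The currently unsupported query: ... WHERE b = 'b1' AND a > 'a14521'
        List<Row> res = filter(hits, r -> r.a.compareTo("a14521") > 0, 10);
        System.out.println(res.size()); // 1 (only a19396 sorts after a14521)
    }
}
```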
[jira] [Created] (CASSANDRA-6853) Allow filtering on primary key expressions in 2i queries
Jonathan Ellis created CASSANDRA-6853: - Summary: Allow filtering on primary key expressions in 2i queries Key: CASSANDRA-6853 URL: https://issues.apache.org/jira/browse/CASSANDRA-6853 Project: Cassandra Issue Type: New Feature Reporter: Jonathan Ellis Assignee: Sylvain Lebresne Priority: Minor Fix For: 3.0 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6659) Allow "intercepting" query by user provided custom classes
[ https://issues.apache.org/jira/browse/CASSANDRA-6659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6659: -- Reviewer: Sam Tunnicliffe (was: Benjamin Coverston) > Allow "intercepting" query by user provided custom classes > -- > > Key: CASSANDRA-6659 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6659 > Project: Cassandra > Issue Type: Improvement >Reporter: Sylvain Lebresne >Assignee: Sylvain Lebresne >Priority: Minor > Attachments: 6659.txt > > > The idea for this ticket is to abstract the main execution methods of > QueryProcessor into an interface, something like: > {noformat} > public interface QueryHandler > { > public ResultSet process(String query, QueryState state, QueryOptions > options); > public ResultMessage.Prepared prepare(String query, QueryState state); > public ResultSet processPrepared(CQLStatement statement, QueryState > state, QueryOptions options); > public ResultSet processBatch(BatchStatement statement, QueryState state, > BatchQueryOptions options); > } > {noformat} > and to allow users to provide a specific class of their own (implementing > said interface) to which the native protocol would handoff queries to (so by > default queries would go to QueryProcessor, but you would have a way to use a > custom class instead). > A typical use case for that could be to allow some form of custom logging of > incoming queries and/or of their results. But this could probably also have > some application for testing as one could have a handler that completely > bypass QueryProcessor if you want, say, do perf regression tests for a given > driver (and don't want to actually execute the query as you're perf testing > the driver, not C*) without needing to patch the sources. Those being just > examples, the mechanism is generic enough to allow for other ideas. > Most importantly, it requires very little code in C*. 
As for how users would > register their "handler", it can be as simple as a startup flag indicating > the class to use, or a yaml setting, or both. -- This message was sent by Atlassian JIRA (v6.2#6252)
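The interface proposed above lends itself to a decorator pattern for the "custom logging" use case the ticket mentions. The following is a rough, self-contained sketch of that idea only: the String-based signatures, class names, and "executed:" result are stand-ins for illustration, not Cassandra's actual QueryState/QueryOptions/ResultSet types or its real registration mechanism.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the proposed QueryHandler hook. The real interface
// (per the ticket) takes QueryState/QueryOptions and returns ResultSet or
// ResultMessage.Prepared; String stand-ins keep this sketch self-contained.
public class QueryHandlerSketch {

    interface QueryHandler {
        String process(String query);
    }

    // Stand-in for QueryProcessor, the default execution path.
    static class DefaultProcessor implements QueryHandler {
        @Override
        public String process(String query) {
            return "executed: " + query;
        }
    }

    // A custom handler that intercepts queries for logging, then hands
    // off to the wrapped handler -- the typical use case the ticket cites.
    static class LoggingHandler implements QueryHandler {
        private final QueryHandler delegate;
        final List<String> log = new ArrayList<>();

        LoggingHandler(QueryHandler delegate) {
            this.delegate = delegate;
        }

        @Override
        public String process(String query) {
            log.add(query);                  // record the incoming query
            return delegate.process(query);  // delegate to the default path
        }
    }

    public static void main(String[] args) {
        QueryHandler handler = new LoggingHandler(new DefaultProcessor());
        System.out.println(handler.process("SELECT * FROM t"));
    }
}
```

Under this shape, swapping in a custom handler is just a matter of which implementation the server is pointed at, which matches the startup-flag/yaml registration idea discussed above.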
[jira] [Commented] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933319#comment-13933319 ] Russell Bradberry commented on CASSANDRA-6846: -- [~gdusbabek] I completely agree. Riak did something similar by separating the core distribution from the storage layer, allowing people to use components of Riak to build a distributed system of their own. I'm not saying this is the right path for C*, but modularity would make everything a little easier; it would also open the door for more awesome features in DSE, IMO. > Provide standard interface for deep application server integration > -- > > Key: CASSANDRA-6846 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6846 > Project: Cassandra > Issue Type: New Feature > Components: API, Core > Reporter: Tupshin Harper > Assignee: Tupshin Harper > Priority: Minor > Labels: (╯°□°)╯︵┻━┻, ponies > > Instead of creating a pluggable interface for Thrift, I'd like to create a pluggable interface for arbitrary app-server deep integration. > Inspired by both the existence of intravert-ug and the long history of various parties embedding Tomcat or Jetty servlet engines inside Cassandra, I'd like to propose the creation of an internal, somewhat stable (versioned?) interface that could allow any app server to achieve deep integration with Cassandra. As a result, these servers could: > 1) host their own APIs (REST, for example) > 2) extend core functionality by having limited (see triggers and wide row scanners) access to the internals of Cassandra > The hand-wavey part comes in because, while I have been mulling this over for a while, I have not spent any significant time looking at the actual surface area of intravert-ug's integration. But, using it as a model, and also keeping in mind the general needs of your more traditional servlet/J2EE containers, I believe we could come up with a reasonable interface that would allow any JVM app server to be integrated and maintained in or out of the Cassandra tree. > This would satisfy the need that many of us (both Ed and I, for example) have for a much greater degree of control over server-side execution, and let us start building much more interesting (and simpler) tiered applications. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933366#comment-13933366 ] Edward Capriolo commented on CASSANDRA-6846: [~devdazed] DSE already does this (well, at least this is how Brisk worked). Brisk put Solr and Hadoop inside Cassandra. Yet at the same time the project is saying that CQL is the only use case that will be supported. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933380#comment-13933380 ] Russell Bradberry commented on CASSANDRA-6846: -- [~appodictic] What I'm saying is that if the project were componentized, there wouldn't be a need to put it INSIDE Cassandra. It would just connect using the available interface. Want to add Solr? Just drop in a JAR. Want to add Hadoop? Drop in a JAR. Etc., rather than having a custom-built, completely different project. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933404#comment-13933404 ] Tupshin Harper commented on CASSANDRA-6846: --- You are correct that there is substantial overlap between the needs driving this ticket and the needs that have driven some users to embed Cassandra in a larger app. I'm currently neutral on whether it should be approached from that angle or from the original angle of this ticket (or both), but there is a lot of equivalency between the two. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933432#comment-13933432 ] Jonathan Ellis edited comment on CASSANDRA-6846 at 3/13/14 3:48 PM: bq. DSE already does this. (Well, at least this is how Brisk worked.) Brisk embedded the Hadoop jobtracker and tasktracker for a bunch of reasons that ended up not being worth the trouble. (I'm pretty sure the latest DSE moved it back out of process.) But all the actual functionality was done with the public API: http://github.com/riptano/brisk was (Author: jbellis): bq. DSE already does this. (Well, at least this is how Brisk worked.) Brisk embedded the Hadoop jobtracker for a bunch of reasons that ended up not being worth the trouble. (I'm pretty sure the latest DSE moved it back out of process.) But all the actual functionality was done with the public API: http://github.com/riptano/brisk -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933432#comment-13933432 ] Jonathan Ellis commented on CASSANDRA-6846: --- bq. DSE already does this. (Well, at least this is how Brisk worked.) Brisk embedded the Hadoop jobtracker for a bunch of reasons that ended up not being worth the trouble. (I'm pretty sure the latest DSE moved it back out of process.) But all the actual functionality was done with the public API: http://github.com/riptano/brisk -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6833) Add json data type
[ https://issues.apache.org/jira/browse/CASSANDRA-6833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6833: -- Fix Version/s: (was: 2.0.7) Labels: ponies (was: ) Issue Type: New Feature (was: Bug) > Add json data type > -- > > Key: CASSANDRA-6833 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6833 > Project: Cassandra > Issue Type: New Feature > Components: Core >Reporter: Jonathan Ellis >Priority: Minor > Labels: ponies > > While recognizing that UDT (CASSANDRA-5590) is the Right Way to store > hierarchical data in C*, it can still be useful to store json blobs as text. > Adding a json type would allow validating that data. (And adding formatting > support in cqlsh?) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933433#comment-13933433 ] Jeremiah Jordan commented on CASSANDRA-6846: Are we building the fastest, most performant database we can, or a development platform? You can't have both: to be a development platform you must sacrifice many of the optimizations that would otherwise be possible, because they would break the "development platform" interface. Riak went the development-platform way, which I think was to their detriment as a database. Having pluggable everything stops them from being able to make assumptions about how things work in another layer. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933463#comment-13933463 ] Nate McCall commented on CASSANDRA-6846: [~jjordan] I don't think 'pluggable everything' is the right approach either. [~gdusbabek]'s "standardization of a few internal APIs" feels like the right idea here. Facilitating greater hackability does not have to have a negative effect on performance. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6846: -- Labels: ponies (was: (╯°□°)╯︵┻━┻ ponies) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933457#comment-13933457 ] Edward Capriolo commented on CASSANDRA-6846: The dev discussion that prompted this ticket was clear in indicating that the committers do not want to undertake the burden of supporting multiple APIs, the burden of other transports, or other Turing-complete pluggability ideas. I agree with them. If the burden of supporting an already-working Thrift is deemed not worth it, it is hard to justify the burden of supporting deep server integration. If the server only officially supports one true API, this second interface will end up just like Thrift: a second-class citizen that no one is interested in supporting. What is the incentive for a developer to expose a very complicated piece of logic through some pluggable interface? None, just as there was no incentive to expose features to Thrift. "If any new use cases come to light that can be done with Thrift but not CQL, we will commit to supporting those in CQL." All we are attempting to do with this ticket is replace "Thrift" with "Deep Application Server Integration". DSE is a very successful fork. This work belongs outside the project, as do many things. I will open up a fork in a few days and share the details. I think the goal of the fork will be "say yes". You want to add a memcache-compatible API to Cassandra? YES! You want a REST interface? YES! You want to make the CQL interface accept CLI commands? YES! You want new Thrift methods? YES! You want to bring Brisk back to life? YES! These things will likely never be wanted by the upstream project. That is fine; so be it. Better we move forward separately. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933467#comment-13933467 ] Peter commented on CASSANDRA-6846: -- I want Cassandra to be a performant database first. My desire is for easier and more friendly query capabilities. I want to be able to run queries like:
{noformat}
select col1, col2, col3 from MyColFamily where col1='blah1' or col2='blah2'
select * from MyColFamily2 where col3 IN ('blah4','blah5','blah6') group by col2
select * from MyColFamily3 where col5 LIKE '%blah' and col8 LIKE 'blah2%'
{noformat}
I don't know which approach is the right way to go, but I do feel discussion can help flesh out the pros and cons. At first I thought about doing all of this on the client side by writing a SQL parser, query planner, and query optimizer, but that would need to be done for every single language. When I ported Hector to C#, I thought about implementing a full LINQ implementation so that users can do more complex queries, but then I would have two completely different code bases doing similar things. Having "some" enhancement on the server can reduce the amount of work for driver implementers. Another approach would be to use Solr libraries to build a custom index, which I'm sure others have thought about doing. Then there are approaches like Spark/Shark, Presto, etc. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6852) can't repair -pr part of data when not replicating data everywhere (multiDCs)
[ https://issues.apache.org/jira/browse/CASSANDRA-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933466#comment-13933466 ] Cyril Scetbon commented on CASSANDRA-6852: -- I don't think the original JIRA can really help me :( I understand that you chose not to repair -pr on a node not storing data for a keyspace. However, here is my problem (or show me where I'm wrong): http://pastebin.com/VrDnWvAu So the incriminated node owns 10.1% of the tokens, which seem to be responsible for rows in keyspace ks1. However, as ks1 is not replicated to that node, how can I repair it without a FULL REPAIR on all nodes? At the beginning I thought like [~jjordan] when I read his [comment|https://issues.apache.org/jira/browse/CASSANDRA-5424?focusedCommentId=13673260&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13673260]. However, I can see that tokens on the incriminated node (not replicating data for ks1) refer to rows ... I suppose that is what we call primary ranges, and if not, why do I find ks1's data for those tokens? > can't repair -pr part of data when not replicating data everywhere (multiDCs) > - > > Key: CASSANDRA-6852 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6852 > Project: Cassandra > Issue Type: Bug > Components: Core > Reporter: Cyril Scetbon > Labels: multi-dcs, ranges, repair > > Our environment is as follows: > - 3 DCs: dc1, dc2 and dc3 > - all keyspaces replicated to dc1 and dc2 > - a few keyspaces replicated to dc3, as we have less hardware there and use it for computing statistics > We use repair -pr everywhere regularly. FYI, a full repair takes almost 20 hours per node. The problem is that we can't use "repair -pr" anymore for tokens stored on dc3 concerning keyspaces that are not replicated there. We should have a way to repair those ranges without doing a FULL REPAIR everywhere -- This message was sent by Atlassian JIRA (v6.2#6252)
[6/6] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e3443145
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e3443145
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e3443145
Branch: refs/heads/cassandra-2.1
Commit: e3443145c7126a1933d24246efa7ff17accd542e
Parents: 8a52f5a 92b6d7a
Author: Jonathan Ellis
Authored: Thu Mar 13 12:01:13 2014 -0500
Committer: Jonathan Ellis
Committed: Thu Mar 13 12:01:13 2014 -0500
--
 CHANGES.txt | 1 +
 1 file changed, 1 insertion(+)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/e3443145/CHANGES.txt
--
[4/6] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0
Merge branch 'cassandra-1.2' into cassandra-2.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/92b6d7a3 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/92b6d7a3 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/92b6d7a3 Branch: refs/heads/cassandra-2.1 Commit: 92b6d7a3900dbaf97a8ae7283aecd23aa1f2a25e Parents: faffbf2 5030d42 Author: Jonathan Ellis Authored: Thu Mar 13 12:00:56 2014 -0500 Committer: Jonathan Ellis Committed: Thu Mar 13 12:00:56 2014 -0500 -- CHANGES.txt | 1 + 1 file changed, 1 insertion(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/92b6d7a3/CHANGES.txt --
[2/6] git commit: add #2635 to CHANGES
add #2635 to CHANGES Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5030d421 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5030d421 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5030d421 Branch: refs/heads/cassandra-2.0 Commit: 5030d4215d9d8c380bf48c25fddee7c863599382 Parents: 91d220b Author: Jonathan Ellis Authored: Thu Mar 13 12:00:47 2014 -0500 Committer: Jonathan Ellis Committed: Thu Mar 13 12:00:47 2014 -0500 -- CHANGES.txt | 1 + 1 file changed, 1 insertion(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/5030d421/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index b3c0a35..a68e8ac 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -889,6 +889,7 @@ Merged from 1.0: 1.1.1 + * add populate_io_cache_on_flush option (CASSANDRA-2635) * allow larger cache capacities than 2GB (CASSANDRA-4150) * add getsstables command to nodetool (CASSANDRA-4199) * apply parent CF compaction settings to secondary index CFs (CASSANDRA-4280)
[1/6] git commit: add #2635 to CHANGES
Repository: cassandra Updated Branches: refs/heads/cassandra-1.2 91d220b35 -> 5030d4215 refs/heads/cassandra-2.0 faffbf2fa -> 92b6d7a39 refs/heads/cassandra-2.1 8a52f5af4 -> e3443145c add #2635 to CHANGES Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5030d421 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5030d421 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5030d421 Branch: refs/heads/cassandra-1.2 Commit: 5030d4215d9d8c380bf48c25fddee7c863599382 Parents: 91d220b Author: Jonathan Ellis Authored: Thu Mar 13 12:00:47 2014 -0500 Committer: Jonathan Ellis Committed: Thu Mar 13 12:00:47 2014 -0500 -- CHANGES.txt | 1 + 1 file changed, 1 insertion(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/5030d421/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index b3c0a35..a68e8ac 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -889,6 +889,7 @@ Merged from 1.0: 1.1.1 + * add populate_io_cache_on_flush option (CASSANDRA-2635) * allow larger cache capacities than 2GB (CASSANDRA-4150) * add getsstables command to nodetool (CASSANDRA-4199) * apply parent CF compaction settings to secondary index CFs (CASSANDRA-4280)
[3/6] git commit: add #2635 to CHANGES
add #2635 to CHANGES Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5030d421 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5030d421 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5030d421 Branch: refs/heads/cassandra-2.1 Commit: 5030d4215d9d8c380bf48c25fddee7c863599382 Parents: 91d220b Author: Jonathan Ellis Authored: Thu Mar 13 12:00:47 2014 -0500 Committer: Jonathan Ellis Committed: Thu Mar 13 12:00:47 2014 -0500 -- CHANGES.txt | 1 + 1 file changed, 1 insertion(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/5030d421/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index b3c0a35..a68e8ac 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -889,6 +889,7 @@ Merged from 1.0: 1.1.1 + * add populate_io_cache_on_flush option (CASSANDRA-2635) * allow larger cache capacities than 2GB (CASSANDRA-4150) * add getsstables command to nodetool (CASSANDRA-4199) * apply parent CF compaction settings to secondary index CFs (CASSANDRA-4280)
[5/6] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0
Merge branch 'cassandra-1.2' into cassandra-2.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/92b6d7a3 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/92b6d7a3 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/92b6d7a3 Branch: refs/heads/cassandra-2.0 Commit: 92b6d7a3900dbaf97a8ae7283aecd23aa1f2a25e Parents: faffbf2 5030d42 Author: Jonathan Ellis Authored: Thu Mar 13 12:00:56 2014 -0500 Committer: Jonathan Ellis Committed: Thu Mar 13 12:00:56 2014 -0500 -- CHANGES.txt | 1 + 1 file changed, 1 insertion(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/92b6d7a3/CHANGES.txt --
[jira] [Commented] (CASSANDRA-6746) Reads have a slow ramp up in speed
[ https://issues.apache.org/jira/browse/CASSANDRA-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933539#comment-13933539 ] Jonathan Ellis commented on CASSANDRA-6746: --- CASSANDRA-2635 > Reads have a slow ramp up in speed > -- > > Key: CASSANDRA-6746 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6746 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Ryan McGuire >Assignee: Benedict > Labels: performance > Fix For: 2.1 beta2 > > Attachments: 2.1_vs_2.0_read.png, 6746-patched.png, 6746.txt, > cassandra-2.0-bdplab-trial-fincore.tar.bz2, > cassandra-2.1-bdplab-trial-fincore.tar.bz2 > > > On a physical four-node cluster I am doing a big write and then a big read. > The read takes a long time to ramp up to respectable speeds. > !2.1_vs_2.0_read.png! > [See data > here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.2.1_vs_2.0_vs_1.2.retry1.json&metric=interval_op_rate&operation=stress-read&smoothing=1] -- This message was sent by Atlassian JIRA (v6.2#6252)
git commit: Minor tweaks for 'caching' option
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 e3443145c -> fab550989 Minor tweaks for 'caching' option Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fab55098 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fab55098 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fab55098 Branch: refs/heads/cassandra-2.1 Commit: fab5509898152072ba1f6ecb3d208e2bbe234495 Parents: e344314 Author: Mikhail Stepura Authored: Wed Mar 12 19:56:28 2014 -0700 Committer: Mikhail Stepura Committed: Thu Mar 13 10:06:56 2014 -0700 -- pylib/cqlshlib/cql3handling.py | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/fab55098/pylib/cqlshlib/cql3handling.py -- diff --git a/pylib/cqlshlib/cql3handling.py b/pylib/cqlshlib/cql3handling.py index ae03cde..8e9f987 100644 --- a/pylib/cqlshlib/cql3handling.py +++ b/pylib/cqlshlib/cql3handling.py @@ -419,6 +419,8 @@ def cf_prop_val_completer(ctxt, cass): return ["{'sstable_compression': '"] if this_opt == 'compaction': return ["{'class': '"] +if this_opt == 'caching': +return ["{'keys': '"] if any(this_opt == opt[0] for opt in CqlRuleSet.obsolete_cf_options): return ["''"] if this_opt in ('read_repair_chance', 'bloom_filter_fp_chance', @@ -472,9 +474,9 @@ def cf_prop_val_mapval_completer(ctxt, cass): return [Hint('')] elif opt == 'caching': if key == 'rows_per_partition': -return [Hint('ALL'), Hint('NONE'), Hint('#rows_per_partition')] +return ["'ALL'", "'NONE'", Hint('#rows_per_partition')] elif key == 'keys': -return [Hint('ALL'), Hint('NONE')] +return ["'ALL'", "'NONE'"] return () def cf_prop_val_mapender_completer(ctxt, cass):
[jira] [Commented] (CASSANDRA-4771) Setting TTL to Integer.MAX causes columns to not be persisted.
[ https://issues.apache.org/jira/browse/CASSANDRA-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933572#comment-13933572 ] Randy Fradin commented on CASSANDRA-4771: - Isn't this going to break again in less than 4 years when (now + 20 years) > 2038-01-19? > Setting TTL to Integer.MAX causes columns to not be persisted. > -- > > Key: CASSANDRA-4771 > URL: https://issues.apache.org/jira/browse/CASSANDRA-4771 > Project: Cassandra > Issue Type: Bug > Components: Core >Affects Versions: 1.0.12 >Reporter: Todd Nine >Assignee: Dave Brosius > Fix For: 1.1.6 > > Attachments: 4771.txt, 4771_b.txt > > > When inserting columns via batch mutation, we have an edge case where columns > will be set to Integer.MAX. When setting the column expiration time to > Integer.MAX, the columns do not appear to be persisted. > Fails: > Integer.MAX_VALUE > Integer.MAX_VALUE/2 > Works: > Integer.MAX_VALUE/3 -- This message was sent by Atlassian JIRA (v6.2#6252)
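Randy's arithmetic can be checked directly. The reported failure pattern matches a signed 32-bit expiration timestamp: `now + ttl` wraps negative for Integer.MAX_VALUE and MAX/2 (circa 2012), but still fits for MAX/3, and a cap of `now + 20 years` itself wraps once that sum passes 2038-01-19. A sketch simulating Java int wraparound in Python (the overflow model is inferred from the ticket discussion, not verified against Cassandra's source):

```python
INT_MAX = 2**31 - 1

def to_int32(x):
    """Simulate Java's signed 32-bit int wraparound."""
    x &= 0xFFFFFFFF
    return x - 2**32 if x > INT_MAX else x

now_2012 = 1349049600        # ~Oct 2012 epoch seconds, when the ticket was filed

# Matches the Fails/Works list in the ticket:
print(to_int32(now_2012 + INT_MAX))       # negative: wrapped, looks "already expired"
print(to_int32(now_2012 + INT_MAX // 2))  # negative: 1.35e9 + 1.07e9 > 2^31 - 1
print(to_int32(now_2012 + INT_MAX // 3))  # positive: ~2.06e9 still fits

# The fix caps the effective TTL at 20 years, but the cap itself
# overflows once (now + 20 years) passes 2038-01-19 -- Randy's point:
twenty_years = 20 * 365 * 24 * 3600
now_2019 = 1546300800        # 2019-01-01 UTC
print(to_int32(now_2012 + twenty_years) > now_2012)  # True: fine in 2012
print(to_int32(now_2019 + twenty_years) > now_2019)  # False: already wraps by 2019
```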
[jira] [Comment Edited] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933619#comment-13933619 ] Tupshin Harper edited comment on CASSANDRA-6846 at 3/13/14 5:49 PM: Consider this comment to be an RFC flipping this ticket on its head, and making it into a meta-ticket for the following: 1) Instead of Cassandra turning into the container, focus on making it easier to embed Cassandra in your own custom application or container. This would become a separate independent JIRA, and could be figured out on its own. 2) Focus additional community efforts on figuring out a good, easy-to-implement triggers interface to be used as the primary programmatic hook at ingestion time. Another existing or new ticket would be used for that. 3) Focus additional community efforts on defining a proper UDF (user-defined function) interface using hive-style UDTFs as the primary source of inspiration. This would be the primary programmatic hook for processing results, and could also be used in conjunction with triggers for inbound processing. There are already a few tickets around this area, and we would consolidate on one or create a new one. 4) Define a set of semi-frozen medium level apis (along the lines of what Nate suggested above) that would be stable for different Zs in X.Y.Z releases, but free to change in any larger release. The devil is still in the details of this, but this, too, would be an independent ticket subject to its own debate. As a result, we would be constantly focusing on CQL as the primary (and really only) consumable API coming out of Cassandra (as both triggers and UDFs would be extending CQL). If, however, you needed to do more than was possible with just the triggers and UDF interfaces, you would have to instead embed Cassandra and would now be responsible for the life-cycle. 
With additional power (access to that medium level api), you have incurred the responsibility of being the container, setting up the environment, and managing the overall run-time. That's a serious product design question that should not be taken lightly, but would make sense for a small handful. I would love to hear if any of the above is a non-starter for any reason, or, if it were fully realized, if it would be insufficient for anybody's needs. was (Author: tupshin): Consider this comment to be an RFC flipping this ticket on its head, and making it into a meta-ticket for the following: 1) Instead of Cassandra turning into the container, focus on making it easier to embed Cassandra in your own custom application or container. This would become a separate independent JIRA, and could be figured out on its own. 2) Focus additional community efforts on figuring out a good and easy to implement triggers interface to be used as the primary programmatic hook at ingestion time. Another existing or new ticket would be used for that. 3) Focus additional community efforts on defining a proper UDF (use defined function) interface using hive-style UDTFs as primary source of inspiration. This would be the primary programmatic hook for processing results, and could also be used in conjunction with triggers for inbound processing. There are already a few tickets around this area, and we would consolidate on one or create a new one. 4) Define a set of semi-frozen medium level apis (along the lines of what Nate suggested above) that would be stable for different Zs in X.Y.Z releases, but free to change in any larger release. The devil is still in the details of this, but this, too, would be an independent ticket subject to its own debate. As a result, we would be constantly focusing on CQL as the primary (and really only) consumable API coming out of Cassandra (as both triggers and UTFs would be extending CQL). 
If, however, you needed to do more, that was not possible with just the triggers and UDF interfaces, you would have to instead embed Cassandra and would now be responsible for the life-cycle. With additional power (access to that medium level api), you have incurred the responsibility of being the container, setting up the environment, and managing the overall run-time. That's a serious product design question that should not be taken lightly, but would make sense for a small handful. I would love to hear if any of the above is a non-starter for any reason, or, if it were fully realized, if it would be insufficient for anybody's needs. > Provide standard interface for deep application server integration > -- > > Key: CASSANDRA-6846 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6846 > Project: Cassandra > Issue Type: New Feature > Components: API, Core >Reporter: Tupshin Harper >
[jira] [Commented] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933619#comment-13933619 ] Tupshin Harper commented on CASSANDRA-6846: --- Consider this comment to be an RFC flipping this ticket on its head, and making it into a meta-ticket for the following: 1) Instead of Cassandra turning into the container, focus on making it easier to embed Cassandra in your own custom application or container. This would become a separate independent JIRA, and could be figured out on its own. 2) Focus additional community efforts on figuring out a good, easy-to-implement triggers interface to be used as the primary programmatic hook at ingestion time. Another existing or new ticket would be used for that. 3) Focus additional community efforts on defining a proper UDF (user-defined function) interface using hive-style UDTFs as the primary source of inspiration. This would be the primary programmatic hook for processing results, and could also be used in conjunction with triggers for inbound processing. There are already a few tickets around this area, and we would consolidate on one or create a new one. 4) Define a set of semi-frozen medium level apis (along the lines of what Nate suggested above) that would be stable for different Zs in X.Y.Z releases, but free to change in any larger release. The devil is still in the details of this, but this, too, would be an independent ticket subject to its own debate. As a result, we would be constantly focusing on CQL as the primary (and really only) consumable API coming out of Cassandra (as both triggers and UDFs would be extending CQL). If, however, you needed to do more than was possible with just the triggers and UDF interfaces, you would have to instead embed Cassandra and would now be responsible for the life-cycle. 
With additional power (access to that medium level api), you have incurred the responsibility of being the container, setting up the environment, and managing the overall run-time. That's a serious product design question that should not be taken lightly, but would make sense for a small handful. I would love to hear if any of the above is a non-starter for any reason, or, if it were fully realized, if it would be insufficient for anybody's needs. > Provide standard interface for deep application server integration > -- > > Key: CASSANDRA-6846 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6846 > Project: Cassandra > Issue Type: New Feature > Components: API, Core >Reporter: Tupshin Harper >Assignee: Tupshin Harper >Priority: Minor > Labels: ponies > > Instead of creating a pluggable interface for Thrift, I'd like to create a > pluggable interface for arbitrary app-server deep integration. > Inspired by both the existence of intravert-ug, as well as there being a long > history of various parties embedding tomcat or jetty servlet engines inside > Cassandra, I'd like to propose the creation of an internal somewhat stable > (versioned?) interface that could allow any app server to achieve deep > integration with Cassandra, and as a result, these servers could > 1) host their own apis (REST, for example) > 2) extend core functionality by having limited (see triggers and wide row > scanners) access to the internals of cassandra > The hand wavey part comes because while I have been mulling this over for a > while, I have not spent any significant time looking at the actual > surface area of intravert-ug's integration. But, using it as a model, and > also keeping in mind the general needs of your more traditional servlet/j2ee > containers, I believe we could come up with a reasonable interface to allow > any jvm app server to be integrated and maintained in or out of the Cassandra > tree. 
> This would satisfy the need that many of us (both Ed and I, for example) have > for a much greater degree of control over server side execution, and to be > able to start building much more interestingly (and simply) tiered > applications. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Reopened] (CASSANDRA-6850) cqlsh won't startup
[ https://issues.apache.org/jira/browse/CASSANDRA-6850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Stepura reopened CASSANDRA-6850: Assignee: Mikhail Stepura I'll double check if I committed the properly packaged driver > cqlsh won't startup > --- > > Key: CASSANDRA-6850 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6850 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Chander S Pechetty >Assignee: Mikhail Stepura >Priority: Minor > Labels: cqlsh > > Scales library is missing > {noformat} > ../lib/cassandra-driver-internal-only-1.0.2.post.zip/cassandra-driver-1.0.2.post/cassandra/metrics.py", > line 4, in > ImportError: No module named greplin > {noformat} > It would be useful if users don't have to install scales explicitly for the > functioning of cqlsh -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6850) cqlsh won't startup
[ https://issues.apache.org/jira/browse/CASSANDRA-6850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Stepura updated CASSANDRA-6850: --- Reproduced In: 2.1 beta2 Since Version: 2.1 beta2 (was: 2.1) > cqlsh won't startup > --- > > Key: CASSANDRA-6850 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6850 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Chander S Pechetty >Assignee: Mikhail Stepura >Priority: Minor > Labels: cqlsh > > Scales library is missing > {noformat} > ../lib/cassandra-driver-internal-only-1.0.2.post.zip/cassandra-driver-1.0.2.post/cassandra/metrics.py", > line 4, in > ImportError: No module named greplin > {noformat} > It would be useful if users don't have to install scales explicitly for the > functioning of cqlsh -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6788) Race condition silently kills thrift server
[ https://issues.apache.org/jira/browse/CASSANDRA-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6788: -- Attachment: 6793-v3-rebased.txt > Race condition silently kills thrift server > --- > > Key: CASSANDRA-6788 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6788 > Project: Cassandra > Issue Type: Bug >Reporter: Christian Rolf >Assignee: Christian Rolf > Attachments: 6788-v2.txt, 6788-v3.txt, 6793-v3-rebased.txt, > race_patch.diff > > > There's a race condition in CustomTThreadPoolServer that can cause the thrift > server to silently stop listening for connections. > It happens when the executor service throws a RejectedExecutionException, > which is not caught. > > Silent in the sense that OpsCenter doesn't notice any problem since JMX is > still running fine. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6788) Race condition silently kills thrift server
[ https://issues.apache.org/jira/browse/CASSANDRA-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933641#comment-13933641 ] Jonathan Ellis commented on CASSANDRA-6788: --- (attaching my rebase of v3 for posterity) > Race condition silently kills thrift server > --- > > Key: CASSANDRA-6788 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6788 > Project: Cassandra > Issue Type: Bug >Reporter: Christian Rolf >Assignee: Christian Rolf > Attachments: 6788-v2.txt, 6788-v3.txt, 6793-v3-rebased.txt, > race_patch.diff > > > There's a race condition in CustomTThreadPoolServer that can cause the thrift > server to silently stop listening for connections. > It happens when the executor service throws a RejectedExecutionException, > which is not caught. > > Silent in the sense that OpsCenter doesn't notice any problem since JMX is > still running fine. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6788) Race condition silently kills thrift server
[ https://issues.apache.org/jira/browse/CASSANDRA-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933635#comment-13933635 ] Jonathan Ellis commented on CASSANDRA-6788: --- Hmm, doesn't this mean we're back to dying ignominiously if we still happen to get a REE? I would prefer v2 for that reason. > Race condition silently kills thrift server > --- > > Key: CASSANDRA-6788 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6788 > Project: Cassandra > Issue Type: Bug >Reporter: Christian Rolf >Assignee: Christian Rolf > Attachments: 6788-v2.txt, 6788-v3.txt, 6793-v3-rebased.txt, > race_patch.diff > > > There's a race condition in CustomTThreadPoolServer that can cause the thrift > server to silently stop listening for connections. > It happens when the executor service throws a RejectedExecutionException, > which is not caught. > > Silent in the sense that OpsCenter doesn't notice any problem since JMX is > still running fine. -- This message was sent by Atlassian JIRA (v6.2#6252)
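The failure mode under discussion — a task rejection escaping the accept loop and ending it, while JMX keeps the process looking healthy — can be illustrated in Python. Here `ThreadPoolExecutor` raising `RuntimeError` after shutdown plays the role of Java's `RejectedExecutionException`; this is an analogy to the patch's idea of catching the rejection, not the actual CustomTThreadPoolServer code:

```python
from concurrent.futures import ThreadPoolExecutor

def serve(connections, executor):
    """Accept-loop sketch. Without the except clause, the first rejected
    submit would propagate out of the loop and the 'server' would stop
    accepting connections silently. Catching it keeps the loop alive."""
    handled, rejected = [], []
    for conn in connections:
        try:
            executor.submit(handled.append, conn)
        except RuntimeError:       # stands in for RejectedExecutionException
            rejected.append(conn)  # drop (and log) this connection, keep going
    return handled, rejected

pool = ThreadPoolExecutor(max_workers=2)
pool.shutdown()                    # from now on, every submit is "rejected"
handled, rejected = serve(["c1", "c2", "c3"], pool)
print(handled)     # []
print(rejected)    # ['c1', 'c2', 'c3'] -- the loop survived all three rejections
```

The trade-off Jonathan raises maps onto this sketch: swallowing every rejection keeps the loop alive but can also hide a genuinely broken executor, which is why he prefers the v2 approach.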
[2/4] git commit: merge from 1.2
merge from 1.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a3b31490 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a3b31490 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a3b31490 Branch: refs/heads/cassandra-2.1 Commit: a3b314908b5b7b666e7ce782ef4a1a91615847f1 Parents: 92b6d7a 52cf09d Author: Jonathan Ellis Authored: Thu Mar 13 13:05:11 2014 -0500 Committer: Jonathan Ellis Committed: Thu Mar 13 13:05:11 2014 -0500 -- CHANGES.txt | 3 +++ src/java/org/apache/cassandra/db/ColumnFamilyStore.java | 2 +- 2 files changed, 4 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a3b31490/CHANGES.txt -- diff --cc CHANGES.txt index ce01e5c,325c623..2413b51 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,54 -1,9 +1,57 @@@ -1.2.16 +2.0.7 + * Fix saving triggers to schema (CASSANDRA-6789) + * Fix trigger mutations when base mutation list is immutable (CASSANDRA-6790) + * Fix accounting in FileCacheService to allow re-using RAR (CASSANDRA-6838) + * Fix static counter columns (CASSANDRA-6827) + * Restore expiring->deleted (cell) compaction optimization (CASSANDRA-6844) + * Fix CompactionManager.needsCleanup (CASSANDRA-6845) + * Correctly compare BooleanType values other than 0 and 1 (CASSANDRA-6779) ++Merged from 1.2: + * fix nodetool getsstables for blob PK (CASSANDRA-6803) ++ + +2.0.6 + * Avoid race-prone second "scrub" of system keyspace (CASSANDRA-6797) + * Pool CqlRecordWriter clients by inetaddress rather than Range + (CASSANDRA-6665) + * Fix compaction_history timestamps (CASSANDRA-6784) + * Compare scores of full replica ordering in DES (CASSANDRA-6883) + * fix CME in SessionInfo updateProgress affecting netstats (CASSANDRA-6577) + * Allow repairing between specific replicas (CASSANDRA-6440) + * Allow per-dc enabling of hints (CASSANDRA-6157) + * Add compatibility for Hadoop 0.2.x (CASSANDRA-5201) + * Fix EstimatedHistogram races 
(CASSANDRA-6682) + * Failure detector correctly converts initial value to nanos (CASSANDRA-6658) + * Add nodetool taketoken to relocate vnodes (CASSANDRA-4445) + * Fix upgradesstables NPE for non-CF-based indexes (CASSANDRA-6645) + * Improve nodetool cfhistograms formatting (CASSANDRA-6360) + * Expose bulk loading progress over JMX (CASSANDRA-4757) + * Correctly handle null with IF conditions and TTL (CASSANDRA-6623) + * Account for range/row tombstones in tombstone drop + time histogram (CASSANDRA-6522) + * Stop CommitLogSegment.close() from calling sync() (CASSANDRA-6652) + * Make commitlog failure handling configurable (CASSANDRA-6364) + * Avoid overlaps in LCS (CASSANDRA-6688) + * Improve support for paginating over composites (CASSANDRA-4851) + * Fix count(*) queries in a mixed cluster (CASSANDRA-6707) + * Improve repair tasks(snapshot, differencing) concurrency (CASSANDRA-6566) + * Fix replaying pre-2.0 commit logs (CASSANDRA-6714) + * Add static columns to CQL3 (CASSANDRA-6561) + * Optimize single partition batch statements (CASSANDRA-6737) + * Disallow post-query re-ordering when paging (CASSANDRA-6722) + * Fix potential paging bug with deleted columns (CASSANDRA-6748) + * Fix NPE on BulkLoader caused by losing StreamEvent (CASSANDRA-6636) + * Fix truncating compression metadata (CASSANDRA-6791) + * Fix UPDATE updating PRIMARY KEY columns implicitly (CASSANDRA-6782) + * Fix IllegalArgumentException when updating from 1.2 with SuperColumns + (CASSANDRA-6733) + * FBUtilities.singleton() should use the CF comparator (CASSANDRA-6778) + * Fix CQLSStableWriter.addRow(Map) (CASSANDRA-6526) + * Fix HSHA server introducing corrupt data (CASSANDRA-6285) + * Fix CAS conditions for COMPACT STORAGE tables (CASSANDRA-6813) +Merged from 1.2: * Add CMSClassUnloadingEnabled JVM option (CASSANDRA-6541) * Catch memtable flush exceptions during shutdown (CASSANDRA-6735) - * Don't attempt cross-dc forwarding in mixed-version cluster with 1.1 - (CASSANDRA-6732) * Fix broken 
streams when replacing with same IP (CASSANDRA-6622) * Fix upgradesstables NPE for non-CF-based indexes (CASSANDRA-6645) * Fix partition and range deletes not triggering flush (CASSANDRA-6655) http://git-wip-us.apache.org/repos/asf/cassandra/blob/a3b31490/src/java/org/apache/cassandra/db/ColumnFamilyStore.java --
[4/4] git commit: Merge remote-tracking branch 'origin/cassandra-2.1' into cassandra-2.1
Merge remote-tracking branch 'origin/cassandra-2.1' into cassandra-2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c3429658 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c3429658 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c3429658 Branch: refs/heads/cassandra-2.1 Commit: c342965810302bf13ed4bbb2f2301c6358dfb4a4 Parents: 105dcdb fab5509 Author: Jonathan Ellis Authored: Thu Mar 13 13:06:28 2014 -0500 Committer: Jonathan Ellis Committed: Thu Mar 13 13:06:28 2014 -0500 -- pylib/cqlshlib/cql3handling.py | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) --
git commit: Python-driver without a hard dependency on `scales`
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 c34296581 -> a2ceb22c5 Python-driver without a hard dependency on `scales` Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a2ceb22c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a2ceb22c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a2ceb22c Branch: refs/heads/cassandra-2.1 Commit: a2ceb22c59e9c2d1bb3717b2703a7cf181997d3d Parents: c342965 Author: Mikhail Stepura Authored: Thu Mar 13 11:06:23 2014 -0700 Committer: Mikhail Stepura Committed: Thu Mar 13 11:07:09 2014 -0700 -- ...assandra-driver-internal-only-1.0.2.post.zip | Bin 95836 -> 95846 bytes 1 file changed, 0 insertions(+), 0 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2ceb22c/lib/cassandra-driver-internal-only-1.0.2.post.zip -- diff --git a/lib/cassandra-driver-internal-only-1.0.2.post.zip b/lib/cassandra-driver-internal-only-1.0.2.post.zip index 9f6af56..7ccd5f7 100644 Binary files a/lib/cassandra-driver-internal-only-1.0.2.post.zip and b/lib/cassandra-driver-internal-only-1.0.2.post.zip differ
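Removing a "hard dependency" like this presumably follows the standard optional-import pattern: try to import the library, and fall back to a no-op stand-in so the module still imports when it is absent. A sketch of that pattern (the `_NullStat` shim and its usage are illustrative, not the bundled driver's actual code):

```python
# Optional dependency: use the `scales` metrics library if installed,
# otherwise install a no-op stand-in so importing never raises
# "ImportError: No module named greplin" (the cqlsh startup failure above).
try:
    from greplin import scales            # the real metrics library
    HAVE_SCALES = True
except ImportError:
    HAVE_SCALES = False

    class _NullStat:
        """Accepts any metric update and discards it."""
        def __iadd__(self, n):
            return self

    class scales:                         # minimal hypothetical shim
        @staticmethod
        def IntStat(name):
            return _NullStat()

requests = scales.IntStat("request_count")
requests += 1                             # safe with or without scales installed
print(HAVE_SCALES)
```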
[1/3] git commit: fix nodetool getsstables for blob PK patch by Nate McCall; reviewed by jbellis for CASSANDRA-6803
Repository: cassandra Updated Branches: refs/heads/cassandra-1.2 5030d4215 -> 52cf09dba refs/heads/cassandra-2.0 92b6d7a39 -> a3b314908 fix nodetool getsstables for blob PK patch by Nate McCall; reviewed by jbellis for CASSANDRA-6803 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/52cf09db Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/52cf09db Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/52cf09db Branch: refs/heads/cassandra-1.2 Commit: 52cf09dbace356bafb846cb9f1fc9df71344f61f Parents: 5030d42 Author: Jonathan Ellis Authored: Thu Mar 13 13:04:14 2014 -0500 Committer: Jonathan Ellis Committed: Thu Mar 13 13:04:14 2014 -0500 -- CHANGES.txt | 2 ++ src/java/org/apache/cassandra/db/ColumnFamilyStore.java | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/52cf09db/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index a68e8ac..325c623 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 1.2.16 + * fix nodetool getsstables for blob PK (CASSANDRA-6803) * Add CMSClassUnloadingEnabled JVM option (CASSANDRA-6541) * Catch memtable flush exceptions during shutdown (CASSANDRA-6735) * Don't attempt cross-dc forwarding in mixed-version cluster with 1.1 @@ -21,6 +22,7 @@ * Fix bootstrapping when there is no schema (CASSANDRA-6685) * Fix truncating compression metadata (CASSANDRA-6791) + 1.2.15 * Move handling of migration event source to solve bootstrap race (CASSANDRA-6648) * Make sure compaction throughput value doesn't overflow with int math (CASSANDRA-6647) http://git-wip-us.apache.org/repos/asf/cassandra/blob/52cf09db/src/java/org/apache/cassandra/db/ColumnFamilyStore.java -- diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java index 3841397..9e6987d 100644 --- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java +++ 
b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java @@ -1370,7 +1370,7 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean public List getSSTablesForKey(String key) { -DecoratedKey dk = new DecoratedKey(partitioner.getToken(ByteBuffer.wrap(key.getBytes())), ByteBuffer.wrap(key.getBytes())); +DecoratedKey dk = partitioner.decorateKey(metadata.getKeyValidator().fromString(key)); ViewFragment view = markReferenced(dk); try {
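Why the one-line fix above matters: for a blob partition key, `key.getBytes()` yields the ASCII bytes of the hex string the operator typed, while the key validator (`BytesType.fromString`) parses that string into the actual binary value — two different byte sequences, hence two different tokens, so the old code looked up the wrong position. A Python sketch of the distinction (MD5 here is just a stand-in for the partitioner's token function):

```python
import hashlib

def token(key_bytes):
    """Stand-in for the partitioner's token: a hash of the raw key bytes."""
    return hashlib.md5(key_bytes).hexdigest()

typed = "cafebabe"             # what the operator passes to nodetool getsstables

wrong = typed.encode("ascii")  # old code: ASCII bytes of the hex string (8 bytes)
right = bytes.fromhex(typed)   # fixed code: key validator parses the blob (4 bytes)

print(wrong)                        # b'cafebabe'
print(right)                        # b'\xca\xfe\xba\xbe'
print(token(wrong) == token(right)) # False: the old token never matches the row
```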
[3/3] git commit: merge from 1.2
merge from 1.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a3b31490 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a3b31490 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a3b31490 Branch: refs/heads/cassandra-2.0 Commit: a3b314908b5b7b666e7ce782ef4a1a91615847f1 Parents: 92b6d7a 52cf09d Author: Jonathan Ellis Authored: Thu Mar 13 13:05:11 2014 -0500 Committer: Jonathan Ellis Committed: Thu Mar 13 13:05:11 2014 -0500 -- CHANGES.txt | 3 +++ src/java/org/apache/cassandra/db/ColumnFamilyStore.java | 2 +- 2 files changed, 4 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a3b31490/CHANGES.txt -- diff --cc CHANGES.txt index ce01e5c,325c623..2413b51 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,54 -1,9 +1,57 @@@ -1.2.16 +2.0.7 + * Fix saving triggers to schema (CASSANDRA-6789) + * Fix trigger mutations when base mutation list is immutable (CASSANDRA-6790) + * Fix accounting in FileCacheService to allow re-using RAR (CASSANDRA-6838) + * Fix static counter columns (CASSANDRA-6827) + * Restore expiring->deleted (cell) compaction optimization (CASSANDRA-6844) + * Fix CompactionManager.needsCleanup (CASSANDRA-6845) + * Correctly compare BooleanType values other than 0 and 1 (CASSANDRA-6779) ++Merged from 1.2: + * fix nodetool getsstables for blob PK (CASSANDRA-6803) ++ + +2.0.6 + * Avoid race-prone second "scrub" of system keyspace (CASSANDRA-6797) + * Pool CqlRecordWriter clients by inetaddress rather than Range + (CASSANDRA-6665) + * Fix compaction_history timestamps (CASSANDRA-6784) + * Compare scores of full replica ordering in DES (CASSANDRA-6883) + * fix CME in SessionInfo updateProgress affecting netstats (CASSANDRA-6577) + * Allow repairing between specific replicas (CASSANDRA-6440) + * Allow per-dc enabling of hints (CASSANDRA-6157) + * Add compatibility for Hadoop 0.2.x (CASSANDRA-5201) + * Fix EstimatedHistogram races 
(CASSANDRA-6682) + * Failure detector correctly converts initial value to nanos (CASSANDRA-6658) + * Add nodetool taketoken to relocate vnodes (CASSANDRA-4445) + * Fix upgradesstables NPE for non-CF-based indexes (CASSANDRA-6645) + * Improve nodetool cfhistograms formatting (CASSANDRA-6360) + * Expose bulk loading progress over JMX (CASSANDRA-4757) + * Correctly handle null with IF conditions and TTL (CASSANDRA-6623) + * Account for range/row tombstones in tombstone drop + time histogram (CASSANDRA-6522) + * Stop CommitLogSegment.close() from calling sync() (CASSANDRA-6652) + * Make commitlog failure handling configurable (CASSANDRA-6364) + * Avoid overlaps in LCS (CASSANDRA-6688) + * Improve support for paginating over composites (CASSANDRA-4851) + * Fix count(*) queries in a mixed cluster (CASSANDRA-6707) + * Improve repair tasks(snapshot, differencing) concurrency (CASSANDRA-6566) + * Fix replaying pre-2.0 commit logs (CASSANDRA-6714) + * Add static columns to CQL3 (CASSANDRA-6561) + * Optimize single partition batch statements (CASSANDRA-6737) + * Disallow post-query re-ordering when paging (CASSANDRA-6722) + * Fix potential paging bug with deleted columns (CASSANDRA-6748) + * Fix NPE on BulkLoader caused by losing StreamEvent (CASSANDRA-6636) + * Fix truncating compression metadata (CASSANDRA-6791) + * Fix UPDATE updating PRIMARY KEY columns implicitly (CASSANDRA-6782) + * Fix IllegalArgumentException when updating from 1.2 with SuperColumns + (CASSANDRA-6733) + * FBUtilities.singleton() should use the CF comparator (CASSANDRA-6778) + * Fix CQLSStableWriter.addRow(Map) (CASSANDRA-6526) + * Fix HSHA server introducing corrupt data (CASSANDRA-6285) + * Fix CAS conditions for COMPACT STORAGE tables (CASSANDRA-6813) +Merged from 1.2: * Add CMSClassUnloadingEnabled JVM option (CASSANDRA-6541) * Catch memtable flush exceptions during shutdown (CASSANDRA-6735) - * Don't attempt cross-dc forwarding in mixed-version cluster with 1.1 - (CASSANDRA-6732) * Fix broken 
streams when replacing with same IP (CASSANDRA-6622) * Fix upgradesstables NPE for non-CF-based indexes (CASSANDRA-6645) * Fix partition and range deletes not triggering flush (CASSANDRA-6655) http://git-wip-us.apache.org/repos/asf/cassandra/blob/a3b31490/src/java/org/apache/cassandra/db/ColumnFamilyStore.java --
[2/3] git commit: fix nodetool getsstables for blob PK patch by Nate McCall; reviewed by jbellis for CASSANDRA-6803
fix nodetool getsstables for blob PK patch by Nate McCall; reviewed by jbellis for CASSANDRA-6803 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/52cf09db Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/52cf09db Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/52cf09db Branch: refs/heads/cassandra-2.0 Commit: 52cf09dbace356bafb846cb9f1fc9df71344f61f Parents: 5030d42 Author: Jonathan Ellis Authored: Thu Mar 13 13:04:14 2014 -0500 Committer: Jonathan Ellis Committed: Thu Mar 13 13:04:14 2014 -0500 -- CHANGES.txt | 2 ++ src/java/org/apache/cassandra/db/ColumnFamilyStore.java | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/52cf09db/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index a68e8ac..325c623 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 1.2.16 + * fix nodetool getsstables for blob PK (CASSANDRA-6803) * Add CMSClassUnloadingEnabled JVM option (CASSANDRA-6541) * Catch memtable flush exceptions during shutdown (CASSANDRA-6735) * Don't attempt cross-dc forwarding in mixed-version cluster with 1.1 @@ -21,6 +22,7 @@ * Fix bootstrapping when there is no schema (CASSANDRA-6685) * Fix truncating compression metadata (CASSANDRA-6791) + 1.2.15 * Move handling of migration event source to solve bootstrap race (CASSANDRA-6648) * Make sure compaction throughput value doesn't overflow with int math (CASSANDRA-6647) http://git-wip-us.apache.org/repos/asf/cassandra/blob/52cf09db/src/java/org/apache/cassandra/db/ColumnFamilyStore.java -- diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java index 3841397..9e6987d 100644 --- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java +++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java @@ -1370,7 +1370,7 @@ public class ColumnFamilyStore implements 
ColumnFamilyStoreMBean
 public List getSSTablesForKey(String key)
 {
-        DecoratedKey dk = new DecoratedKey(partitioner.getToken(ByteBuffer.wrap(key.getBytes())), ByteBuffer.wrap(key.getBytes()));
+        DecoratedKey dk = partitioner.decorateKey(metadata.getKeyValidator().fromString(key));
         ViewFragment view = markReferenced(dk);
         try
         {
[3/4] git commit: merge from 2.0
merge from 2.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/105dcdbf Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/105dcdbf Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/105dcdbf Branch: refs/heads/cassandra-2.1 Commit: 105dcdbf320a068de9b28c4c182fd37bd1292906 Parents: e344314 a3b3149 Author: Jonathan Ellis Authored: Thu Mar 13 13:06:07 2014 -0500 Committer: Jonathan Ellis Committed: Thu Mar 13 13:06:07 2014 -0500 -- CHANGES.txt | 8 src/java/org/apache/cassandra/db/ColumnFamilyStore.java | 2 +- 2 files changed, 9 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/105dcdbf/CHANGES.txt -- diff --cc CHANGES.txt index 27cb235,2413b51..eb82a1c --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,22 -1,16 +1,30 @@@ -2.0.7 +2.1.0-beta2 + * Apply DONTNEED fadvise to commitlog segments (CASSANDRA-6759) + * Switch CRC component to Adler and include it for compressed sstables + (CASSANDRA-4165) + * Allow cassandra-stress to set compaction strategy options (CASSANDRA-6451) + * Add broadcast_rpc_address option to cassandra.yaml (CASSANDRA-5899) + * Auto reload GossipingPropertyFileSnitch config (CASSANDRA-5897) + * Fix overflow of memtable_total_space_in_mb (CASSANDRA-6573) + * Fix ABTC NPE and apply update function correctly (CASSANDRA-6692) + * Allow nodetool to use a file or prompt for password (CASSANDRA-6660) + * Fix AIOOBE when concurrently accessing ABSC (CASSANDRA-6742) + * Fix assertion error in ALTER TYPE RENAME (CASSANDRA-6705) + * Scrub should not always clear out repaired status (CASSANDRA-5351) + * Improve handling of range tombstone for wide partitions (CASSANDRA-6446) + * Fix ClassCastException for compact table with composites (CASSANDRA-6738) + * Fix potentially repairing with wrong nodes (CASSANDRA-6808) + * Change caching option syntax (CASSANDRA-6745) + * Fix stress to do proper counter reads (CASSANDRA-6835) +Merged from 
2.0: ++ * fix nodetool getsstables for blob PK (CASSANDRA-6803) + * Fix saving triggers to schema (CASSANDRA-6789) + * Fix trigger mutations when base mutation list is immutable (CASSANDRA-6790) + * Fix accounting in FileCacheService to allow re-using RAR (CASSANDRA-6838) + * Fix static counter columns (CASSANDRA-6827) + * Restore expiring->deleted (cell) compaction optimization (CASSANDRA-6844) + * Fix CompactionManager.needsCleanup (CASSANDRA-6845) + * Correctly compare BooleanType values other than 0 and 1 (CASSANDRA-6779) -Merged from 1.2: - * fix nodetool getsstables for blob PK (CASSANDRA-6803) - - -2.0.6 * Avoid race-prone second "scrub" of system keyspace (CASSANDRA-6797) * Pool CqlRecordWriter clients by inetaddress rather than Range (CASSANDRA-6665) http://git-wip-us.apache.org/repos/asf/cassandra/blob/105dcdbf/src/java/org/apache/cassandra/db/ColumnFamilyStore.java --
[1/4] git commit: fix nodetool getsstables for blob PK patch by Nate McCall; reviewed by jbellis for CASSANDRA-6803
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 fab550989 -> c34296581 fix nodetool getsstables for blob PK patch by Nate McCall; reviewed by jbellis for CASSANDRA-6803 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/52cf09db Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/52cf09db Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/52cf09db Branch: refs/heads/cassandra-2.1 Commit: 52cf09dbace356bafb846cb9f1fc9df71344f61f Parents: 5030d42 Author: Jonathan Ellis Authored: Thu Mar 13 13:04:14 2014 -0500 Committer: Jonathan Ellis Committed: Thu Mar 13 13:04:14 2014 -0500 -- CHANGES.txt | 2 ++ src/java/org/apache/cassandra/db/ColumnFamilyStore.java | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/52cf09db/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index a68e8ac..325c623 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 1.2.16 + * fix nodetool getsstables for blob PK (CASSANDRA-6803) * Add CMSClassUnloadingEnabled JVM option (CASSANDRA-6541) * Catch memtable flush exceptions during shutdown (CASSANDRA-6735) * Don't attempt cross-dc forwarding in mixed-version cluster with 1.1 @@ -21,6 +22,7 @@ * Fix bootstrapping when there is no schema (CASSANDRA-6685) * Fix truncating compression metadata (CASSANDRA-6791) + 1.2.15 * Move handling of migration event source to solve bootstrap race (CASSANDRA-6648) * Make sure compaction throughput value doesn't overflow with int math (CASSANDRA-6647) http://git-wip-us.apache.org/repos/asf/cassandra/blob/52cf09db/src/java/org/apache/cassandra/db/ColumnFamilyStore.java -- diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java index 3841397..9e6987d 100644 --- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java +++ 
b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -1370,7 +1370,7 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
 public List getSSTablesForKey(String key)
 {
-        DecoratedKey dk = new DecoratedKey(partitioner.getToken(ByteBuffer.wrap(key.getBytes())), ByteBuffer.wrap(key.getBytes()));
+        DecoratedKey dk = partitioner.decorateKey(metadata.getKeyValidator().fromString(key));
         ViewFragment view = markReferenced(dk);
         try
         {
[jira] [Commented] (CASSANDRA-6746) Reads have a slow ramp up in speed
[ https://issues.apache.org/jira/browse/CASSANDRA-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933665#comment-13933665 ] Benedict commented on CASSANDRA-6746: - It seems like all of the discussion on the prior ticket CASSANDRA-1470 that introduced it circles around how to implement it, and not on whether implementing it is actually necessary. It seems to be taken as read that it is, but I'm not totally convinced by that. Either way, probably the best long-term solution that would definitely work is to perform the incremental replacement I previously suggested, as this would allow us to DONTNEED the old sstables incrementally, thereby saving at minimum as much memory churn as we can save optimally with this approach, and then leave the new pages to the OS to decide what to do with. If they're not hot, the newly freed memory from the old tables should give plenty of room for the regular ageing algorithm to kick in and ensure they're selected for eviction in preference to anything that is in use; it also bounds how much of the system memory can churn, which is currently unbounded (although large trickle_fsync would achieve this also). > Reads have a slow ramp up in speed > -- > > Key: CASSANDRA-6746 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6746 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Ryan McGuire >Assignee: Benedict > Labels: performance > Fix For: 2.1 beta2 > > Attachments: 2.1_vs_2.0_read.png, 6746-patched.png, 6746.txt, > cassandra-2.0-bdplab-trial-fincore.tar.bz2, > cassandra-2.1-bdplab-trial-fincore.tar.bz2 > > > On a physical four node cluster I am doing a big write and then a big read. > The read takes a long time to ramp up to respectable speeds. > !2.1_vs_2.0_read.png!
> [See data > here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.2.1_vs_2.0_vs_1.2.retry1.json&metric=interval_op_rate&operation=stress-read&smoothing=1] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6803) nodetool getsstables fails with 'blob' type primary keys
[ https://issues.apache.org/jira/browse/CASSANDRA-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933667#comment-13933667 ] Jonathan Ellis commented on CASSANDRA-6803: --- committed; thanks! > nodetool getsstables fails with 'blob' type primary keys > > > Key: CASSANDRA-6803 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6803 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Nate McCall >Assignee: Nate McCall > Fix For: 1.2.16, 2.0.7 > > Attachments: sstables_for_key_blob_support.txt, > sstables_for_key_blob_support_2.0.txt > > > Trivial fix, just need to get the bytebuffer from the CfMetaData's key > validator as opposed to just calling String#getBytes (which breaks for keys > of BytesType).
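The failure mode the ticket describes is easy to see in isolation: for a BytesType key, nodetool receives a hex string on the command line, and `String#getBytes` yields the bytes of that *string* rather than the decoded key, so the computed token never matches. A minimal sketch of the difference; `fromHexString` here is a hypothetical stand-in for what a BytesType key validator's `fromString` does, not Cassandra's actual code:

```java
import java.nio.ByteBuffer;

public class BlobKeyDemo {
    // Hypothetical stand-in for a BytesType validator's fromString:
    // decodes a hex string into the raw key bytes.
    static ByteBuffer fromHexString(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++)
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        return ByteBuffer.wrap(out);
    }

    public static void main(String[] args) {
        String key = "cafebabe"; // key as typed on the nodetool command line
        // Pre-fix behaviour: the ASCII bytes of the string itself (8 bytes)
        ByteBuffer broken = ByteBuffer.wrap(key.getBytes());
        // Post-fix behaviour: the 4 bytes the blob key actually contains
        ByteBuffer decoded = fromHexString(key);
        System.out.println(broken.remaining() + " vs " + decoded.remaining()); // prints "8 vs 4"
    }
}
```

Tokenizing the 8-byte string representation instead of the 4-byte key is why lookups for blob primary keys missed every sstable.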
[jira] [Commented] (CASSANDRA-6824) fix help text for stress counterwrite
[ https://issues.apache.org/jira/browse/CASSANDRA-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933670#comment-13933670 ] Jonathan Ellis commented on CASSANDRA-6824: --- ([~xedin] to review) > fix help text for stress counterwrite > - > > Key: CASSANDRA-6824 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6824 > Project: Cassandra > Issue Type: Bug >Reporter: Russ Hatch >Priority: Trivial > Fix For: 2.1 > > Attachments: 6824-2.txt, trunk-6824.txt > > > the help output for counterwrite shows 'counteradd' in the syntax instead of > 'counterwrite'. > {noformat} > rhatch@whatup:~/git/cstar/cassandra/tools$ ./bin/cassandra-stress help > counterwrite > Usage: counteradd [err?] [n OR > Usage: counteradd n=? [tries=?] [ignore_errors] [cl=?] > err the mean is below this fraction > n>? (default=30) Run at least this many iterations > before accepting uncertainty convergence > n before accepting uncertainty convergence > tries=? (default=9) Number of tries to perform for > each operation before failing > ignore_errorsDo not print/log errors > cl=? (default=ONE) Consistency level to use > n=? Number of operations to perform > {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6850) cqlsh won't startup
[ https://issues.apache.org/jira/browse/CASSANDRA-6850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933672#comment-13933672 ] Mikhail Stepura commented on CASSANDRA-6850: [~chander] I've committed the new driver's version into {{cassandra-2.1}}. Could you please check if that works for you? > cqlsh won't startup > --- > > Key: CASSANDRA-6850 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6850 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Chander S Pechetty >Assignee: Mikhail Stepura >Priority: Minor > Labels: cqlsh > > Scales library is missing > {noformat} > ../lib/cassandra-driver-internal-only-1.0.2.post.zip/cassandra-driver-1.0.2.post/cassandra/metrics.py", > line 4, in > ImportError: No module named greplin > {noformat} > It would be useful if users don't have to install scales explicitly for the > functioning of cqlsh -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (CASSANDRA-6847) The binary transport doesn't load truststore file
[ https://issues.apache.org/jira/browse/CASSANDRA-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Stepura reassigned CASSANDRA-6847: -- Assignee: Mikhail Stepura > The binary transport doesn't load truststore file > - > > Key: CASSANDRA-6847 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6847 > Project: Cassandra > Issue Type: Bug >Reporter: Mikhail Stepura >Assignee: Mikhail Stepura >Priority: Minor > Labels: ssl > > {code:title=org.apache.cassandra.transport.Server.SecurePipelineFactory} > this.sslContext = SSLFactory.createSSLContext(encryptionOptions, false); > {code} > {{false}} there means that {{truststore}} file won't be loaded in any case. > And that means that the file will not be used to validate clients when > {{require_client_auth==true}}, making > http://www.datastax.com/documentation/cassandra/2.0/cassandra/security/secureNewTrustedUsers_t.html > meaningless. > The only way to workaround that currently is to start C* with > {{-Djavax.net.ssl.trustStore=conf/.truststore}} > I believe we should load {{truststore}} when {{require_client_auth==true}}.
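For reference, the standard JSSE way to build an `SSLContext` from an explicit truststore (rather than relying on the `javax.net.ssl.trustStore` system property workaround) looks roughly like this. This is a generic sketch of the pattern, not the actual `SSLFactory` change:

```java
import java.security.KeyStore;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

public class TrustStoreSketch {
    // Build an SSLContext whose TrustManagers come from the given truststore,
    // so client certificates can be validated when client auth is required.
    static SSLContext contextFor(KeyStore truststore) throws Exception {
        TrustManagerFactory tmf =
            TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(truststore);
        SSLContext ctx = SSLContext.getInstance("TLS");
        // Null key managers / SecureRandom fall back to defaults; the trust
        // managers are the part the truststore file must feed.
        ctx.init(null, tmf.getTrustManagers(), null);
        return ctx;
    }

    public static void main(String[] args) throws Exception {
        // In real use the truststore would be loaded from a file such as
        // conf/.truststore; an empty in-memory keystore keeps this self-contained.
        KeyStore ts = KeyStore.getInstance(KeyStore.getDefaultType());
        ts.load(null, null);
        System.out.println(contextFor(ts).getProtocol()); // prints "TLS"
    }
}
```

Passing `null` trust managers to `SSLContext.init` instead (which is what ignoring the truststore amounts to) silently falls back to the JVM defaults, which is why the misconfiguration is easy to miss.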
[3/3] git commit: merge from 2.0
merge from 2.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4c22b165 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4c22b165 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4c22b165 Branch: refs/heads/cassandra-2.1 Commit: 4c22b165cb5ba89f0454f563eeb76e4de2b01367 Parents: a2ceb22 da2d971 Author: Jonathan Ellis Authored: Thu Mar 13 13:14:15 2014 -0500 Committer: Jonathan Ellis Committed: Thu Mar 13 13:14:15 2014 -0500 -- CHANGES.txt | 1 + src/java/org/apache/cassandra/streaming/StreamWriter.java | 8 2 files changed, 5 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/4c22b165/CHANGES.txt -- diff --cc CHANGES.txt index eb82a1c,e208e21..6a44d80 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,23 -1,5 +1,24 @@@ -2.0.7 +2.1.0-beta2 + * Apply DONTNEED fadvise to commitlog segments (CASSANDRA-6759) + * Switch CRC component to Adler and include it for compressed sstables + (CASSANDRA-4165) + * Allow cassandra-stress to set compaction strategy options (CASSANDRA-6451) + * Add broadcast_rpc_address option to cassandra.yaml (CASSANDRA-5899) + * Auto reload GossipingPropertyFileSnitch config (CASSANDRA-5897) + * Fix overflow of memtable_total_space_in_mb (CASSANDRA-6573) + * Fix ABTC NPE and apply update function correctly (CASSANDRA-6692) + * Allow nodetool to use a file or prompt for password (CASSANDRA-6660) + * Fix AIOOBE when concurrently accessing ABSC (CASSANDRA-6742) + * Fix assertion error in ALTER TYPE RENAME (CASSANDRA-6705) + * Scrub should not always clear out repaired status (CASSANDRA-5351) + * Improve handling of range tombstone for wide partitions (CASSANDRA-6446) + * Fix ClassCastException for compact table with composites (CASSANDRA-6738) + * Fix potentially repairing with wrong nodes (CASSANDRA-6808) + * Change caching option syntax (CASSANDRA-6745) + * Fix stress to do proper counter reads (CASSANDRA-6835) +Merged from 
2.0: + * Fix leaking validator FH in StreamWriter (CASSANDRA-6832) + * fix nodetool getsstables for blob PK (CASSANDRA-6803) * Fix saving triggers to schema (CASSANDRA-6789) * Fix trigger mutations when base mutation list is immutable (CASSANDRA-6790) * Fix accounting in FileCacheService to allow re-using RAR (CASSANDRA-6838) http://git-wip-us.apache.org/repos/asf/cassandra/blob/4c22b165/src/java/org/apache/cassandra/streaming/StreamWriter.java --
[2/3] git commit: Fix leaking validator FH in StreamWriter patch by Joshua McKenzie; reviewed by jbellis for CASSANDRA-6832
Fix leaking validator FH in StreamWriter patch by Joshua McKenzie; reviewed by jbellis for CASSANDRA-6832 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/da2d9714 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/da2d9714 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/da2d9714 Branch: refs/heads/cassandra-2.1 Commit: da2d97142b9c76e5fb81df5f94c3d52ef46bf244 Parents: a3b3149 Author: Jonathan Ellis Authored: Thu Mar 13 13:13:48 2014 -0500 Committer: Jonathan Ellis Committed: Thu Mar 13 13:13:48 2014 -0500 -- CHANGES.txt | 1 + src/java/org/apache/cassandra/streaming/StreamWriter.java | 8 2 files changed, 5 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/da2d9714/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 2413b51..e208e21 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.0.7 + * Fix leaking validator FH in StreamWriter (CASSANDRA-6832) * Fix saving triggers to schema (CASSANDRA-6789) * Fix trigger mutations when base mutation list is immutable (CASSANDRA-6790) * Fix accounting in FileCacheService to allow re-using RAR (CASSANDRA-6838) http://git-wip-us.apache.org/repos/asf/cassandra/blob/da2d9714/src/java/org/apache/cassandra/streaming/StreamWriter.java -- diff --git a/src/java/org/apache/cassandra/streaming/StreamWriter.java b/src/java/org/apache/cassandra/streaming/StreamWriter.java index 04301ba..dbc7390 100644 --- a/src/java/org/apache/cassandra/streaming/StreamWriter.java +++ b/src/java/org/apache/cassandra/streaming/StreamWriter.java @@ -71,10 +71,9 @@ public class StreamWriter { long totalSize = totalSize(); RandomAccessReader file = sstable.openDataReader(); -ChecksumValidator validator = null; -if (new File(sstable.descriptor.filenameFor(Component.CRC)).exists()) -validator = DataIntegrityMetadata.checksumValidator(sstable.descriptor); - +ChecksumValidator validator = new 
File(sstable.descriptor.filenameFor(Component.CRC)).exists()
+            ? DataIntegrityMetadata.checksumValidator(sstable.descriptor)
+            : null;
         transferBuffer = validator == null ? new byte[DEFAULT_CHUNK_SIZE] : new byte[validator.chunkSize];
         // setting up data compression stream
@@ -114,6 +113,7 @@ public class StreamWriter
         {
             // no matter what happens close file
             FileUtils.closeQuietly(file);
+            FileUtils.closeQuietly(validator);
         }
         // release reference only when completed successfully
[1/3] git commit: Fix leaking validator FH in StreamWriter patch by Joshua McKenzie; reviewed by jbellis for CASSANDRA-6832
Repository: cassandra Updated Branches: refs/heads/cassandra-2.0 a3b314908 -> da2d97142 refs/heads/cassandra-2.1 a2ceb22c5 -> 4c22b165c Fix leaking validator FH in StreamWriter patch by Joshua McKenzie; reviewed by jbellis for CASSANDRA-6832 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/da2d9714 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/da2d9714 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/da2d9714 Branch: refs/heads/cassandra-2.0 Commit: da2d97142b9c76e5fb81df5f94c3d52ef46bf244 Parents: a3b3149 Author: Jonathan Ellis Authored: Thu Mar 13 13:13:48 2014 -0500 Committer: Jonathan Ellis Committed: Thu Mar 13 13:13:48 2014 -0500 -- CHANGES.txt | 1 + src/java/org/apache/cassandra/streaming/StreamWriter.java | 8 2 files changed, 5 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/da2d9714/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 2413b51..e208e21 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.0.7 + * Fix leaking validator FH in StreamWriter (CASSANDRA-6832) * Fix saving triggers to schema (CASSANDRA-6789) * Fix trigger mutations when base mutation list is immutable (CASSANDRA-6790) * Fix accounting in FileCacheService to allow re-using RAR (CASSANDRA-6838) http://git-wip-us.apache.org/repos/asf/cassandra/blob/da2d9714/src/java/org/apache/cassandra/streaming/StreamWriter.java -- diff --git a/src/java/org/apache/cassandra/streaming/StreamWriter.java b/src/java/org/apache/cassandra/streaming/StreamWriter.java index 04301ba..dbc7390 100644 --- a/src/java/org/apache/cassandra/streaming/StreamWriter.java +++ b/src/java/org/apache/cassandra/streaming/StreamWriter.java @@ -71,10 +71,9 @@ public class StreamWriter { long totalSize = totalSize(); RandomAccessReader file = sstable.openDataReader(); -ChecksumValidator validator = null; -if (new 
File(sstable.descriptor.filenameFor(Component.CRC)).exists())
-            validator = DataIntegrityMetadata.checksumValidator(sstable.descriptor);
-
+        ChecksumValidator validator = new File(sstable.descriptor.filenameFor(Component.CRC)).exists()
+            ? DataIntegrityMetadata.checksumValidator(sstable.descriptor)
+            : null;
         transferBuffer = validator == null ? new byte[DEFAULT_CHUNK_SIZE] : new byte[validator.chunkSize];
         // setting up data compression stream
@@ -114,6 +113,7 @@ public class StreamWriter
         {
             // no matter what happens close file
             FileUtils.closeQuietly(file);
+            FileUtils.closeQuietly(validator);
         }
         // release reference only when completed successfully
[jira] [Commented] (CASSANDRA-6832) File handle leak in StreamWriter.java
[ https://issues.apache.org/jira/browse/CASSANDRA-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933681#comment-13933681 ] Jonathan Ellis commented on CASSANDRA-6832: --- committed to 2.0.7 (assuming it doesn't affect 1.2) > File handle leak in StreamWriter.java > - > > Key: CASSANDRA-6832 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6832 > Project: Cassandra > Issue Type: Bug > Environment: Windows, 2.0.5, leakdetect patch >Reporter: Joshua McKenzie >Assignee: Joshua McKenzie >Priority: Minor > Fix For: 2.0.7 > > Attachments: 6832_v1.patch > > > Reference CASSANDRA-6283 where this first came up. nodetool.bat repair -par > on 2.0.5 pops up the following stack: > ERROR [Finalizer] 2014-02-17 09:21:52,922 RandomAccessReader.java (line 399) > LEAK finalizer had to clean up > java.lang.Exception: RAR for > C:\var\lib\cassandra\data\Keyspace1\Standard1\Keyspace1-Standard1-jb-41-CRC.db > allocated > at > org.apache.cassandra.io.util.RandomAccessReader.(RandomAccessReader.java:66) > at > org.apache.cassandra.io.util.RandomAccessReader.open(RandomAccessReader.java:106) > at > org.apache.cassandra.io.util.RandomAccessReader.open(RandomAccessReader.java:98) > at > org.apache.cassandra.io.util.DataIntegrityMetadata$ChecksumValidator.(DataIntegrityMetadata.java:53) > at > org.apache.cassandra.io.util.DataIntegrityMetadata.checksumValidator(DataIntegrityMetadata.java:40) > at > org.apache.cassandra.streaming.StreamWriter.write(StreamWriter.java:76) > at > org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:59) > at > org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42) > at > org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) > at > org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:383) > at > 
org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:355) > at java.lang.Thread.run(Thread.java:744) > This leak doesn't look like it's breaking anything but is still worth fixing. -
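The shape of the fix is the null-safe close-in-finally pattern: every handle opened in the method is released in the `finally` block, and the close helper swallows exceptions so cleanup cannot mask the original error. A self-contained sketch using stand-in `Closeable` resources rather than Cassandra's `RandomAccessReader`/`ChecksumValidator`:

```java
import java.io.Closeable;
import java.io.IOException;

public class CloseQuietlyDemo {
    // Null-safe, exception-swallowing close -- the shape of
    // FileUtils.closeQuietly, which the patch also applies to the validator.
    static void closeQuietly(Closeable c) {
        if (c == null)
            return; // e.g. the validator is null when no CRC component exists
        try {
            c.close();
        } catch (IOException e) {
            // deliberately ignored: cleanup must not hide the real failure
        }
    }

    // Tiny stand-in resource that records whether it was closed.
    static final class Tracked implements Closeable {
        boolean closed;
        public void close() { closed = true; }
    }

    public static void main(String[] args) {
        Tracked file = new Tracked();
        Tracked validator = new Tracked(); // may legitimately be null
        try {
            // ... stream the sstable sections, possibly throwing ...
        } finally {
            // no matter what happens, close every handle
            closeQuietly(file);
            closeQuietly(validator);
            closeQuietly(null); // safe: no NPE
        }
        System.out.println(file.closed && validator.closed); // prints "true"
    }
}
```

Without the second `closeQuietly`, the validator's underlying reader was only reclaimed by its finalizer, which is exactly the LEAK warning in the stack above.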
[jira] [Updated] (CASSANDRA-6824) fix help text for stress counterwrite
[ https://issues.apache.org/jira/browse/CASSANDRA-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6824: -- Reviewer: Pavel Yaskevich > fix help text for stress counterwrite > - > > Key: CASSANDRA-6824 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6824 > Project: Cassandra > Issue Type: Bug >Reporter: Russ Hatch >Priority: Trivial > Fix For: 2.1 > > Attachments: 6824-2.txt, trunk-6824.txt > > > the help output for counterwrite shows 'counteradd' in the syntax instead of > 'counterwrite'. > {noformat} > rhatch@whatup:~/git/cstar/cassandra/tools$ ./bin/cassandra-stress help > counterwrite > Usage: counteradd [err?] [n OR > Usage: counteradd n=? [tries=?] [ignore_errors] [cl=?] > err the mean is below this fraction > n>? (default=30) Run at least this many iterations > before accepting uncertainty convergence > n before accepting uncertainty convergence > tries=? (default=9) Number of tries to perform for > each operation before failing > ignore_errorsDo not print/log errors > cl=? (default=ONE) Consistency level to use > n=? Number of operations to perform > {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6848) stress (2.1) spams console with java.util.NoSuchElementException when run against nodes recently created
[ https://issues.apache.org/jira/browse/CASSANDRA-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933802#comment-13933802 ] Russ Hatch commented on CASSANDRA-6848: --- happens every time on my austin test cluster (stress-2.1 targeting various versions) happens more sporadically locally, but once it happens for a given cluster it seems to happen every time. I'm getting it to reproduce today with stress from 2.1, and a ccm cluster built from 2.1. Happens using the stress counterwrite command I mentioned two comments above this one. > stress (2.1) spams console with java.util.NoSuchElementException when run > against nodes recently created > > > Key: CASSANDRA-6848 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6848 > Project: Cassandra > Issue Type: Bug >Reporter: Russ Hatch >Assignee: Benedict > Fix For: 2.1 beta2 > > > I don't get any stack trace on the console, but I get two > java.util.NoSuchElementException for each operation stress is doing. > This seems to occur when stress is being run against a recently created node > (such as one from ccm). > To reproduce: create a ccm cluster, and run stress against it within a few > minutes . Run a simple stress command like cassandra-stress write n=10 . -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6848) stress (2.1) spams console with java.util.NoSuchElementException when run against nodes recently created
[ https://issues.apache.org/jira/browse/CASSANDRA-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933804#comment-13933804 ] Russ Hatch commented on CASSANDRA-6848: --- To add 2 cents to the discussion above, it will be a great help if we can expect stress-2.1 to work against 2.0 clusters so we can make meaningful comparisons between these versions (otherwise it's going to be apples to oranges). But I'm getting this to happen with stress-2.1 targeting a 2.1 cluster as well, so I think version may have nothing to do with it (or maybe two different problems are manifesting as the exceptions output to the console). > stress (2.1) spams console with java.util.NoSuchElementException when run > against nodes recently created > > > Key: CASSANDRA-6848 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6848 > Project: Cassandra > Issue Type: Bug >Reporter: Russ Hatch >Assignee: Benedict > Fix For: 2.1 beta2 > > > I don't get any stack trace on the console, but I get two > java.util.NoSuchElementException for each operation stress is doing. > This seems to occur when stress is being run against a recently created node > (such as one from ccm). > To reproduce: create a ccm cluster, and run stress against it within a few > minutes . Run a simple stress command like cassandra-stress write n=10 . -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6746) Reads have a slow ramp up in speed
[ https://issues.apache.org/jira/browse/CASSANDRA-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933805#comment-13933805 ] Jonathan Ellis commented on CASSANDRA-6746: --- The 2635 issue description is from a production cluster. ("We've applied this patch locally in order to turn off page skipping... It's better than completely disabling DONTNEED because the cache skipping does make sense and has no relevant (that I can see) detrimental effects in some cases, like when dumping caches.") This replaced an approach of DONTNEEDing the old sstables, which would crater read requests since old sstables will still be in use during compaction before the new ones are completely live. > Reads have a slow ramp up in speed > -- > > Key: CASSANDRA-6746 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6746 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Ryan McGuire >Assignee: Benedict > Labels: performance > Fix For: 2.1 beta2 > > Attachments: 2.1_vs_2.0_read.png, 6746-patched.png, 6746.txt, > cassandra-2.0-bdplab-trial-fincore.tar.bz2, > cassandra-2.1-bdplab-trial-fincore.tar.bz2 > > > On a physical four node cluster I am doing a big write and then a big read. > The read takes a long time to ramp up to respectable speeds. > !2.1_vs_2.0_read.png! > [See data > here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.2.1_vs_2.0_vs_1.2.retry1.json&metric=interval_op_rate&operation=stress-read&smoothing=1]
[jira] [Commented] (CASSANDRA-6832) File handle leak in StreamWriter.java
[ https://issues.apache.org/jira/browse/CASSANDRA-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933808#comment-13933808 ] Joshua McKenzie commented on CASSANDRA-6832: It's part of "streaming 2.0" so we should be clear. > File handle leak in StreamWriter.java > - > > Key: CASSANDRA-6832 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6832 > Project: Cassandra > Issue Type: Bug > Environment: Windows, 2.0.5, leakdetect patch >Reporter: Joshua McKenzie >Assignee: Joshua McKenzie >Priority: Minor > Fix For: 2.0.7 > > Attachments: 6832_v1.patch > > > Reference CASSANDRA-6283 where this first came up. nodetool.bat repair -par > on 2.0.5 pops up the following stack: > ERROR [Finalizer] 2014-02-17 09:21:52,922 RandomAccessReader.java (line 399) > LEAK finalizer had to clean up > java.lang.Exception: RAR for > C:\var\lib\cassandra\data\Keyspace1\Standard1\Keyspace1-Standard1-jb-41-CRC.db > allocated > at > org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:66) > at > org.apache.cassandra.io.util.RandomAccessReader.open(RandomAccessReader.java:106) > at > org.apache.cassandra.io.util.RandomAccessReader.open(RandomAccessReader.java:98) > at > org.apache.cassandra.io.util.DataIntegrityMetadata$ChecksumValidator.<init>(DataIntegrityMetadata.java:53) > at > org.apache.cassandra.io.util.DataIntegrityMetadata.checksumValidator(DataIntegrityMetadata.java:40) > at > org.apache.cassandra.streaming.StreamWriter.write(StreamWriter.java:76) > at > org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:59) > at > org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42) > at > org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) > at > org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:383) > at 
org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:355) > at java.lang.Thread.run(Thread.java:744) > This leak doesn't look like it's breaking anything but is still worth fixing.
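The stack above shows the ChecksumValidator's RandomAccessReader being reclaimed by the finalizer instead of being closed explicitly. A minimal sketch of the shape of such a fix, using a plain RandomAccessFile as a stand-in (names and structure are illustrative, not the actual 6832_v1.patch):

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class LeakFixSketch {
    // Stand-in for the checksum-validation step in StreamWriter.write():
    // the reader is closed on every path, never left to a finalizer.
    static long checksumFile(File f) throws IOException {
        long sum = 0;
        try (RandomAccessFile reader = new RandomAccessFile(f, "r")) {
            int b;
            while ((b = reader.read()) != -1) {
                sum += b;
            }
        } // reader closed here even if an exception was thrown
        return sum;
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("crc", ".db");
        tmp.deleteOnExit();
        try (RandomAccessFile w = new RandomAccessFile(tmp, "rw")) {
            w.write(new byte[] {1, 2, 3});
        }
        System.out.println(checksumFile(tmp)); // prints 6
    }
}
```

With try-with-resources the handle is released deterministically, so the "LEAK finalizer had to clean up" path never fires for this reader.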
[jira] [Commented] (CASSANDRA-6206) Thrift socket listen backlog
[ https://issues.apache.org/jira/browse/CASSANDRA-6206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933811#comment-13933811 ] Jonathan Ellis commented on CASSANDRA-6206: --- Ping [~xedin] > Thrift socket listen backlog > > > Key: CASSANDRA-6206 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6206 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: Debian Linux, Java 7 >Reporter: Nenad Merdanovic > Fix For: 2.0.7 > > Attachments: cassandra-v2.patch, cassandra.patch > > > Although Thrift is a deprecated method of accessing Cassandra, the default > backlog is way too low on that socket. It shouldn't be a problem to implement > it and I am including a POC patch for this (sorry, really low on time with > limited Java knowledge so just to give an idea). > This is an old report which was never addressed and the bug remains to this > day, except in my case I have a much larger scale application with 3rd party > software which I cannot modify to include connection pooling: > https://issues.apache.org/jira/browse/CASSANDRA-1663 > There is also a pending change in Thrift itself which Cassandra should be > able to use for parts using TServerSocket (SSL): > https://issues.apache.org/jira/browse/THRIFT-1868
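For context on what the patch is adjusting: the listen backlog is the kernel's queue of accepted-but-not-yet-handled connections, and java.net.ServerSocket defaults it to 50 when it is not passed explicitly. A small self-contained illustration (the value 1024 is just an example, not what the attached patch chooses, and the OS may clamp it to its somaxconn limit):

```java
import java.io.IOException;
import java.net.ServerSocket;

public class BacklogExample {
    public static void main(String[] args) throws IOException {
        // Second constructor argument is the listen backlog. Without it,
        // ServerSocket falls back to 50, which a busy Thrift endpoint
        // serving clients without connection pooling can overflow quickly.
        int backlog = 1024; // illustrative value
        try (ServerSocket server = new ServerSocket(0, backlog)) {
            System.out.println("listening on port " + server.getLocalPort());
        }
    }
}
```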
[jira] [Commented] (CASSANDRA-6746) Reads have a slow ramp up in speed
[ https://issues.apache.org/jira/browse/CASSANDRA-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933823#comment-13933823 ] Benedict commented on CASSANDRA-6746: - But that's the opposite issue, surely? They found disabling DONTNEED was good - for exactly the same reason, that their OS was obeying it unequivocally - not that _enabling_ it (for the opposite situation, flushing) was beneficial.
[jira] [Updated] (CASSANDRA-6847) The binary transport doesn't load truststore file
[ https://issues.apache.org/jira/browse/CASSANDRA-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Stepura updated CASSANDRA-6847: --- Attachment: cassandra-2.0-6847.patch Attaching a trivial patch > The binary transport doesn't load truststore file > - > > Key: CASSANDRA-6847 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6847 > Project: Cassandra > Issue Type: Bug >Reporter: Mikhail Stepura >Assignee: Mikhail Stepura >Priority: Minor > Labels: ssl > Attachments: cassandra-2.0-6847.patch > > > {code:title=org.apache.cassandra.transport.Server.SecurePipelineFactory} > this.sslContext = SSLFactory.createSSLContext(encryptionOptions, false); > {code} > {{false}} there means that {{truststore}} file won't be loaded in any case. > And that means that the file will not be used to validate clients when > {{require_client_auth==true}}, making > http://www.datastax.com/documentation/cassandra/2.0/cassandra/security/secureNewTrustedUsers_t.html > meaningless. > The only way to workaround that currently is to start C* with > {{-Djavax.net.ssl.trustStore=conf/.truststore}} > I believe we should load {{truststore}} when {{require_client_auth==true}},
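A sketch of the conditional trust-manager wiring the report implies: only initialize trust managers when client authentication is required. This is illustrative, not the actual SSLFactory code, and it substitutes an empty in-memory keystore for the configured truststore file so it stays runnable:

```java
import java.security.KeyStore;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;

public class TrustStoreSketch {
    // Hypothetical helper, not Cassandra's SSLFactory: wire in trust
    // managers only when client certificates must be validated.
    static SSLContext createContext(boolean requireClientAuth) throws Exception {
        TrustManager[] trustManagers = null; // null => JVM default trust
        if (requireClientAuth) {
            // In Cassandra this would load the configured truststore file;
            // an empty in-memory keystore keeps the sketch self-contained.
            KeyStore ts = KeyStore.getInstance(KeyStore.getDefaultType());
            ts.load(null, null);
            TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
            tmf.init(ts);
            trustManagers = tmf.getTrustManagers();
        }
        SSLContext ctx = SSLContext.getInstance("TLS");
        ctx.init(null, trustManagers, null);
        return ctx;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(createContext(true).getProtocol()); // prints TLS
    }
}
```

Passing a hard-coded {{false}}, as in the quoted SecurePipelineFactory line, skips the truststore branch entirely, which is exactly the reported bug.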
[jira] [Commented] (CASSANDRA-6666) Avoid accumulating tombstones after partial hint replay
[ https://issues.apache.org/jira/browse/CASSANDRA-6666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933837#comment-13933837 ] Jonathan Ellis commented on CASSANDRA-6666: --- Can you attach a log where this happens with debug enabled on o.a.c.db.compaction? > Avoid accumulating tombstones after partial hint replay > --- > > Key: CASSANDRA-6666 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6666 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jonathan Ellis >Assignee: Jonathan Ellis >Priority: Minor > Labels: hintedhandoff > Fix For: 1.2.16, 2.0.6 > > Attachments: 6666.txt >
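For reference, enabling that debug logging on a 1.2/2.0 node is a one-line change (path and property syntax assume the log4j configuration those versions ship with):

```properties
# conf/log4j-server.properties
log4j.logger.org.apache.cassandra.db.compaction=DEBUG
```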
[jira] [Commented] (CASSANDRA-6824) fix help text for stress counterwrite
[ https://issues.apache.org/jira/browse/CASSANDRA-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933853#comment-13933853 ] Benedict commented on CASSANDRA-6824: - [~xedin] I've pushed a [branch|https://github.com/belliottsmith/cassandra/tree/iss-6824] merged with CASSANDRA-6835 to make things easier for you, as they would conflict. > fix help text for stress counterwrite > - > > Key: CASSANDRA-6824 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6824 > Project: Cassandra > Issue Type: Bug >Reporter: Russ Hatch >Priority: Trivial > Fix For: 2.1 > > Attachments: 6824-2.txt, trunk-6824.txt > > > the help output for counterwrite shows 'counteradd' in the syntax instead of > 'counterwrite'. > {noformat} > rhatch@whatup:~/git/cstar/cassandra/tools$ ./bin/cassandra-stress help > counterwrite > Usage: counteradd [err<?] [n>?] [n<?] OR > Usage: counteradd n=? [tries=?] [ignore_errors] [cl=?] > err<? Run until the standard error of the mean is below this fraction > n>? (default=30) Run at least this many iterations > before accepting uncertainty convergence > n<? Run at most this many iterations > before accepting uncertainty convergence > tries=? (default=9) Number of tries to perform for > each operation before failing > ignore_errors Do not print/log errors > cl=? (default=ONE) Consistency level to use > n=? Number of operations to perform > {noformat}
[jira] [Commented] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933852#comment-13933852 ] Edward Capriolo commented on CASSANDRA-6846: [~tupshin] https://www.apache.org/foundation/voting.html {quote} but -1 votes are vetos and kill the proposal dead until all vetoers withdraw their -1 votes. {quote} This has been voted dead. > Provide standard interface for deep application server integration > -- > > Key: CASSANDRA-6846 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6846 > Project: Cassandra > Issue Type: New Feature > Components: API, Core >Reporter: Tupshin Harper >Assignee: Tupshin Harper >Priority: Minor > Labels: ponies > > Instead of creating a pluggable interface for Thrift, I'd like to create a > pluggable interface for arbitrary app-server deep integration. > Inspired by both the existence of intravert-ug, as well as there being a long > history of various parties embedding tomcat or jetty servlet engines inside > Cassandra, I'd like to propose the creation an internal somewhat stable > (versioned?) interface that could allow any app server to achieve deep > integration with Cassandra, and as a result, these servers could > 1) host their own apis (REST, for example > 2) extend core functionality by having limited (see triggers and wide row > scanners) access to the internals of cassandra > The hand wavey part comes because while I have been mulling this about for a > while, I have not spent any significant time into looking at the actual > surface area of intravert-ug's integration. But, using it as a model, and > also keeping in minds the general needs of your more traditional servlet/j2ee > containers, I believe we could come up with a reasonable interface to allow > any jvm app server to be integrated and maintained in or out of the Cassandra > tree. 
> This would satisfy the needs that many of us (Both Ed and I, for example) to > have a much greater degree of control over server side execution, and to be > able to start building much more interestingly (and simply) tiered > applications.
[jira] [Commented] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933865#comment-13933865 ] Tupshin Harper commented on CASSANDRA-6846: --- I'm not concerned about that. Votes can be changed, and this would, in principle and probably in practice, be a brand new ticket anyway. Let's focus on the issues at hand, people. :)
[jira] [Commented] (CASSANDRA-6847) The binary transport doesn't load truststore file
[ https://issues.apache.org/jira/browse/CASSANDRA-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933879#comment-13933879 ] Jason Brown commented on CASSANDRA-6847: Yup, that was gonna be my patch :). +1
[jira] [Commented] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933883#comment-13933883 ] Edward Capriolo commented on CASSANDRA-6846: Your points are great. Although I feel they are practices that Cassandra should already be engaged in. #2 and #3 are way too high level. Sitting on top of triggers and CQL is hardly "deep integration"; that can already be done client side.
[jira] [Commented] (CASSANDRA-6848) stress (2.1) spams console with java.util.NoSuchElementException when run against nodes recently created
[ https://issues.apache.org/jira/browse/CASSANDRA-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933890#comment-13933890 ] Russ Hatch commented on CASSANDRA-6848: --- I was able to isolate this a little bit more. Looks like if the first stress command I run is the 'counterwrite' above, then the cluster is no good for running stress after that (for even just a simple write n=10). Seems like there's some state being preserved by stress or the cluster since the outcome of a previous command is having an impact on subsequent runs. However, if I run a simple write test first (write n=10), then follow with my counterwrite test the problem seems to go away.
[jira] [Commented] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933888#comment-13933888 ] Tupshin Harper commented on CASSANDRA-6846: --- So let's continue to use this ticket to explore what the gap is between what I proposed, and what you see as a minimally viable version of the need expressed by #2 and #3.
[jira] [Comment Edited] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933888#comment-13933888 ] Tupshin Harper edited comment on CASSANDRA-6846 at 3/13/14 6:58 PM: So let's continue to use this ticket to explore what the gap is between what I proposed, and what you see as a minimally viable version of the need expressed by #2 and #3. FWIW, I do still think that UDFs can fulfill your goals for wide partition scanners, if defined capably enough. was (Author: tupshin): So let's continue to use this ticket to explore what the gap is between what I proposed, and what you see as a minimally viable version of the need expressed by #2 and #3.
[jira] [Commented] (CASSANDRA-6846) Provide standard interface for deep application server integration
[ https://issues.apache.org/jira/browse/CASSANDRA-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933893#comment-13933893 ] Russell Bradberry commented on CASSANDRA-6846: -- [~appodictic] {quote} To prevent vetos from being used capriciously, they must be accompanied by a technical justification showing why the change is bad (opens a security exposure, negatively affects performance, etc. ). A veto without a justification is invalid and has no weight. {quote} The veto was accompanied only by an opinion, not a technical justification.
[jira] [Commented] (CASSANDRA-6849) Add verbose logging to cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-6849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933896#comment-13933896 ] Benedict commented on CASSANDRA-6849: - Simple patch [here|https://github.com/belliottsmith/cassandra/tree/iss-6849], based on CASSANDRA-6835 and CASSANDRA-6824, to avoid merge nightmares. > Add verbose logging to cassandra-stress > --- > > Key: CASSANDRA-6849 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6849 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Benedict >Priority: Minor > Fix For: 2.1 beta2 > > > Debugging unexpected errors with stress is tough, as they're truncated to the > message only. If we introduce an option to log the full exception then when a > problem isn't easily reproducible we can still maybe get enough information > to figure out what's gone wrong.
[jira] [Commented] (CASSANDRA-6848) stress (2.1) spams console with java.util.NoSuchElementException when run against nodes recently created
[ https://issues.apache.org/jira/browse/CASSANDRA-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933900#comment-13933900 ] Benedict commented on CASSANDRA-6848: - Weird. Stress retains no state. Can you try running stress from the branch I uploaded for CASSANDRA-6849, and passing "-log level=verbose" ?