[jira] [Created] (CASSANDRA-8134) cassandra crashes sporadically on windows

2014-10-16 Thread Stefan Gusenbauer (JIRA)
Stefan Gusenbauer created CASSANDRA-8134:


 Summary: cassandra crashes sporadically on windows
 Key: CASSANDRA-8134
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8134
 Project: Cassandra
  Issue Type: Bug
 Environment: OS: Windows Server 2012 R2 , 64 bit Build 9600

CPU:total 2 (2 cores per cpu, 1 threads per core) family 6 model 37 stepping 1, 
cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, aes, tsc, 
tscinvbit

Memory: 4k page, physical 8388148k(3802204k free), swap 8388148k(4088948k free)

vm_info: Java HotSpot(TM) 64-Bit Server VM (24.60-b09) for windows-amd64 JRE 
(1.7.0_60-b19), built on May 7 2014 12:55:18 by "java_re" with unknown MS 
VC++:1600
Reporter: Stefan Gusenbauer
 Attachments: hs_err_pid1180.log, hs_err_pid5732.log

During our test runs, Cassandra crashes from time to time with the stack 
trace below.

A similar bug can be found here: 
https://issues.apache.org/jira/browse/CASSANDRA-5256

The operating system is

--- S Y S T E M ---

OS: Windows Server 2012 R2 , 64 bit Build 9600

CPU:total 2 (2 cores per cpu, 1 threads per core) family 6 model 37 stepping 1, 
cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, aes, tsc, 
tscinvbit

Memory: 4k page, physical 8388148k(3802204k free), swap 8388148k(4088948k free)

vm_info: Java HotSpot(TM) 64-Bit Server VM (24.60-b09) for windows-amd64 JRE 
(1.7.0_60-b19), built on May 7 2014 12:55:18 by "java_re" with unknown MS 
VC++:1600

time: Wed Oct 15 09:32:30 2014 
elapsed time: 16 seconds

Several hs_err files are attached as well.

j org.apache.cassandra.io.util.Memory.getLong(J)J+14
j org.apache.cassandra.io.compress.CompressionMetadata.chunkFor(J)Lorg/apache/cassandra/io/compress/CompressionMetadata$Chunk;+53
j org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer()V+9
j org.apache.cassandra.io.compress.CompressedThrottledReader.reBuffer()V+13
J 258 C2 org.apache.cassandra.io.util.RandomAccessReader.read()I (128 bytes) @ 0x0250cbcc [0x0250cae0+0xec]
J 306 C2 java.io.RandomAccessFile.readUnsignedShort()I (33 bytes) @ 0x025475e4 [0x02547480+0x164]
J 307 C2 org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(Ljava/io/DataInput;)Ljava/nio/ByteBuffer; (9 bytes) @ 0x0254c290 [0x0254c140+0x150]
j org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next()Lorg/apache/cassandra/db/columniterator/OnDiskAtomIterator;+65
j org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next()Ljava/lang/Object;+1
j org.apache.cassandra.io.sstable.SSTableScanner.next()Lorg/apache/cassandra/db/columniterator/OnDiskAtomIterator;+41
j org.apache.cassandra.io.sstable.SSTableScanner.next()Ljava/lang/Object;+1
j org.apache.cassandra.utils.MergeIterator$Candidate.advance()Z+19
j org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(Ljava/util/List;Ljava/util/Comparator;Lorg/apache/cassandra/utils/MergeIterator$Reducer;)V+71
j org.apache.cassandra.utils.MergeIterator.get(Ljava/util/List;Ljava/util/Comparator;Lorg/apache/cassandra/utils/MergeIterator$Reducer;)Lorg/apache/cassandra/utils/IMergeIterator;+46
j org.apache.cassandra.db.compaction.CompactionIterable.iterator()Lorg/apache/cassandra/utils/CloseableIterator;+15
j org.apache.cassandra.db.compaction.CompactionTask.runWith(Ljava/io/File;)V+319
j org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow()V+89
j org.apache.cassandra.utils.WrappedRunnable.run()V+1
j org.apache.cassandra.db.compaction.CompactionTask.executeInternal(Lorg/apache/cassandra/db/compaction/CompactionManager$CompactionExecutorStatsCollector;)I+6
j org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(Lorg/apache/cassandra/db/compaction/CompactionManager$CompactionExecutorStatsCollector;)I+2
j org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run()V+164
j java.util.concurrent.Executors$RunnableAdapter.call()Ljava/lang/Object;+4
j java.util.concurrent.FutureTask.run()V+42
j java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
j java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
j java.lang.Thread.run()V+11
v ~StubRoutines::call_stub
V [jvm.dll+0x1ce043]
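The frames above show Cassandra reading compressed-sstable metadata through its off-heap Memory class. Such reads go to raw native memory rather than the Java heap, so a bad offset, or a read after the allocation has been released, kills the whole process with an hs_err log instead of throwing. A minimal sketch of why guarding the offset and the allocation's lifecycle converts the same mistake into a catchable exception (the `CheckedMemory` class is hypothetical, not Cassandra's actual `Memory` implementation):

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: bounds- and lifecycle-checked off-heap reads.
// An unchecked native read at a stale or out-of-range offset would
// crash the JVM (producing an hs_err_pid*.log); these guards turn the
// same error into an ordinary Java exception.
public class CheckedMemory {
    private final ByteBuffer buf;          // stand-in for a raw native allocation
    private volatile boolean freed = false;

    public CheckedMemory(int size) {
        this.buf = ByteBuffer.allocateDirect(size);  // direct buffers start zeroed
    }

    public long getLong(long offset) {
        if (freed)
            throw new IllegalStateException("read after free at offset " + offset);
        if (offset < 0 || offset + 8 > buf.capacity())
            throw new IndexOutOfBoundsException("offset " + offset + " out of range");
        return buf.getLong((int) offset);  // safe: bounds already validated
    }

    public void free() {
        freed = true;                      // real code would also release the native memory
    }
}
```

In the attached hs_err logs the fault occurs under `CompressionMetadata.chunkFor`, which is consistent with an offset or lifecycle problem of this kind, though the logs alone cannot confirm which.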



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7966) 1.2.18 -> 2.0.10 upgrade compactions_in_progress: java.lang.IllegalArgumentException

2014-10-16 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174763#comment-14174763
 ] 

Marcus Eriksson commented on CASSANDRA-7966:


Could you post a bit more of the log leading up to that exception?

> 1.2.18 -> 2.0.10 upgrade compactions_in_progress: 
> java.lang.IllegalArgumentException
> 
>
> Key: CASSANDRA-7966
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7966
> Project: Cassandra
>  Issue Type: Bug
> Environment: JDK 1.7
>Reporter: Karl Mueller
>Assignee: Marcus Eriksson
>Priority: Minor
>
> This happened on a new node when starting 2.0.10 after 1.2.18 with complete 
> upgradesstables run:
> {noformat}
>  INFO 15:31:11,532 Enqueuing flush of 
> Memtable-compactions_in_progress@1366724594(0/0 serialized/live bytes, 1 ops)
>  INFO 15:31:11,532 Writing Memtable-compactions_in_progress@1366724594(0/0 
> serialized/live bytes, 1 ops)
>  INFO 15:31:11,547 Completed flushing 
> /data2/data-cassandra/system/compactions_in_progress/system-compactions_in_progress-jb-10-Data.db
>  (42 bytes) for commitlog position ReplayPosition(segmentId=1410993002452, 
> position=164409)
> ERROR 15:31:11,550 Exception in thread Thread[CompactionExecutor:36,1,main]
> java.lang.IllegalArgumentException
> at java.nio.Buffer.limit(Buffer.java:267)
> at org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:587)
> at org.apache.cassandra.utils.ByteBufferUtil.readBytesWithShortLength(ByteBufferUtil.java:596)
> at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:61)
> at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:36)
> at org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(ArrayBackedSortedColumns.java:112)
> at org.apache.cassandra.db.ColumnFamily.addColumn(ColumnFamily.java:116)
> at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:150)
> at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:186)
> at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:98)
> at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:85)
> at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:196)
> at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:74)
> at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:55)
> at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:115)
> at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98)
> at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
> at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:143)
> at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
> at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
> at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:724)
> {noformat}
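The failure point in the trace above is `Buffer.limit` called from `readBytesWithShortLength`: the on-disk format stores a value as an unsigned 16-bit length prefix followed by that many bytes, and a corrupt or format-incompatible prefix can decode to a length larger than the buffer can hold, which `Buffer.limit` rejects with `IllegalArgumentException`. A minimal sketch of the pattern (simplified from, and not identical to, `org.apache.cassandra.utils.ByteBufferUtil`):

```java
import java.nio.ByteBuffer;

// Simplified sketch of a short-length-prefixed read. When the decoded
// length exceeds the buffer's capacity, Buffer.limit(newLimit) throws
// IllegalArgumentException, matching the stack trace above.
public class ShortLengthReader {
    public static ByteBuffer readWithShortLength(ByteBuffer in) {
        int length = in.getShort() & 0xFFFF;     // unsigned 16-bit length prefix
        ByteBuffer slice = in.duplicate();
        slice.limit(slice.position() + length);  // throws if length can't fit
        in.position(in.position() + length);     // advance past the value
        return slice.slice();                    // view of exactly `length` bytes
    }
}
```

A bogus prefix written by an older, incompatible sstable format would fail in exactly this way, which is why the surrounding log context Marcus asked for matters.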





[jira] [Resolved] (CASSANDRA-7368) Compaction stops after org.apache.cassandra.io.sstable.CorruptSSTableException

2014-10-16 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson resolved CASSANDRA-7368.

Resolution: Cannot Reproduce

The compaction stopping can have a few causes (now fixed): first, CASSANDRA-7745, 
where we wrongly said that there were no more compactions to do; and second, 
multi-threaded compaction was really shaky, and it is now gone (in 2.0).

I would recommend upgrading to a newer version and trying to reproduce it there.

> Compaction stops after org.apache.cassandra.io.sstable.CorruptSSTableException
> --
>
> Key: CASSANDRA-7368
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7368
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: OS: RHEL 6.5
> Cassandra version: 1.2.16
>Reporter: Francois Richard
>Assignee: Marcus Eriksson
>
> Hi,
> We are getting a case where compaction stops totally on a node after an 
> exception related to: org.apache.cassandra.io.sstable.CorruptSSTableException.
> nodetool compactionstats remains at the same level for hours:
> {code}
> pending tasks: 1451
>   compaction type   keyspace   column family             completed   total       unit    progress
>        Compaction   SyncCore   ContactPrefixBytesIndex   257799931   376785179   bytes   68.42%
> Active compaction remaining time :        n/a
> {code}
> Here is the exception log:
> {code}
> ERROR [Deserialize SSTableReader(path='/home/y/var/cassandra/data/SyncCore/ContactPrefixBytesIndex/SyncCore-ContactPrefixBytesIndex-ic-116118-Data.db')] 2014-06-09 06:39:37,570 CassandraDaemon.java (line 191) Exception in thread Thread[Deserialize SSTableReader(path='/home/y/var/cassandra/data/SyncCore/ContactPrefixBytesIndex/SyncCore-ContactPrefixBytesIndex-ic-116118-Data.db'),1,main]
> org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: dataSize of 7421941880990663551 starting at 257836699 would be larger than file /home/y/var/cassandra/data/SyncCore/ContactPrefixBytesIndex/SyncCore-ContactPrefixBytesIndex-ic-116118-Data.db length 376785179
>   at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:167)
>   at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:83)
>   at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:69)
>   at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:180)
>   at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:155)
>   at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:142)
>   at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:38)
>   at org.apache.cassandra.db.compaction.LeveledCompactionStrategy$LeveledScanner.computeNext(LeveledCompactionStrategy.java:238)
>   at org.apache.cassandra.db.compaction.LeveledCompactionStrategy$LeveledScanner.computeNext(LeveledCompactionStrategy.java:207)
>   at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>   at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> --
> {code}
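The "dataSize ... would be larger than file ... length" message above comes from a sanity check: the row's serialized size is read from the file and rejected up front when it cannot possibly fit in the remaining file, so corrupt bytes surface as a descriptive error rather than a huge bogus read. A hedged sketch of that validate-before-trusting check (hypothetical helper; Cassandra itself wraps an IOException in CorruptSSTableException, a stand-in exception type is used here):

```java
// Hypothetical sketch of a length sanity check, mirroring the exception
// message above: any serialized size that is negative or extends past the
// end of the file is rejected before anything tries to read that many bytes.
public class RowSizeCheck {
    public static long checkedDataSize(long dataSize, long startOffset, long fileLength) {
        // Compare against (fileLength - startOffset) to avoid long overflow
        // when dataSize is a huge garbage value.
        if (dataSize < 0 || dataSize > fileLength - startOffset)
            throw new IllegalStateException(
                "dataSize of " + dataSize + " starting at " + startOffset
                + " would be larger than file length " + fileLength);
        return dataSize;
    }
}
```

Fed the exact numbers from the log (dataSize 7421941880990663551 at offset 257836699 in a 376785179-byte file), this check fails immediately, which is the intended behavior for a corrupt sstable.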





[jira] [Resolved] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-10-16 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson resolved CASSANDRA-7949.

Resolution: Duplicate

I think this is a problem that CASSANDRA-7409 will solve

If you have a system where you can test these things, we would love it if you 
could test the patch in that ticket.

> LCS compaction low performance, many pending compactions, nodes are almost 
> idle
> ---
>
> Key: CASSANDRA-7949
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE 4.5.1-1, Cassandra 2.0.8
>Reporter: Nikolai Grigoriev
> Attachments: iostats.txt, nodetool_compactionstats.txt, 
> nodetool_tpstats.txt, pending compactions 2day.png, system.log.gz, vmstat.txt
>
>
> I've been evaluating new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
> 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
> load similar to the load in our future product. Before running the simulator 
> I had to pre-generate enough data. This was done using Java code and DataStax 
> Java driver. To avoid going deep into details, two tables have been 
> generated. Each table currently has about 55M rows and between a few dozen and 
> a few thousand columns in each row.
> This data generation process was generating a massive amount of 
> non-overlapping data. Thus, the activity was write-only and highly parallel. 
> This is not the type of traffic that the system will ultimately have to deal 
> with; it will be a mix of reads and updates to the existing data in the 
> future. This is just to explain the choice of LCS, not to mention the 
> expensive SSD disk space.
> At some point while generating the data I noticed that the compactions 
> started to pile up. I knew that I was overloading the cluster, but I still 
> wanted the generation test to complete. I was expecting to give the cluster 
> enough time to finish the pending compactions and get ready for real traffic.
> However, after the storm of write requests had been stopped, I noticed 
> that the number of pending compactions remained constant (and even climbed up 
> a little bit) on all nodes. After trying to tune some parameters (like 
> setting the compaction bandwidth cap to 0) I noticed a strange pattern: 
> the nodes were compacting one of the CFs in a single stream using virtually 
> no CPU and no disk I/O. This process was taking hours. After that it would be 
> followed by a short burst of a few dozen compactions running in parallel 
> (CPU at 2000%, some disk I/O, up to 10-20%) and then getting stuck again for 
> many hours doing one compaction at a time. So it looks like this:
> # nodetool compactionstats
> pending tasks: 3351
>   compaction type   keyspace   table         completed     total           unit    progress
>        Compaction   myks       table_list1   66499295588   1910515889913   bytes   3.48%
> Active compaction remaining time :        n/a
> # df -h
> ...
> /dev/sdb1.5T  637G  854G  43% /cassandra-data/disk1
> /dev/sdc1.5T  425G  1.1T  29% /cassandra-data/disk2
> /dev/sdd1.5T  429G  1.1T  29% /cassandra-data/disk3
> # find . -name **table_list1**Data** | grep -v snapshot | wc -l
> 1310
> Among these files I see:
> 1043 files of 161Mb (my sstable size is 160Mb)
> 9 large files - 3 between 1 and 2Gb, 3 of 5-8Gb, 55Gb, 70Gb and 370Gb
> 263 files of various sizes - between a few dozen Kb and 160Mb
> I ran the heavy load for about 1.5 days, it has been close to 3 days since 
> then, and the number of pending compactions does not go down.
> I have applied one of the not-so-obvious recommendations, disabling 
> multithreaded compaction, and that seems to be helping a bit - I see some 
> nodes starting to have fewer pending compactions. About half of the cluster, 
> in fact. But even there I see they are sitting idle most of the time, lazily 
> compacting in one stream with CPU at ~140% and occasionally doing bursts of 
> compaction work for a few minutes.
> I am wondering if this is really a bug or something in the LCS logic that 
> would manifest itself only in such an edge case scenario where I have loaded 
> lots of unique data quickly.
> By the way, I see this pattern only for one of two tables - the one that has 
> about 4 times more data than another (space-wise, number of rows is the 
> same). Looks like all these pending compactions are really only for that 
> larger table.
> I'll be attaching the relevant logs shortly.





[jira] [Commented] (CASSANDRA-8132) Save or stream hints to a safe place in node replacement

2014-10-16 Thread Minh Do (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174729#comment-14174729
 ] 

Minh Do commented on CASSANDRA-8132:


Brandon, I mean it is the other way around: stream hints from the node about 
to be replaced to one of its neighbors. It is just like unbootstrap(), where 
we have to stream hints to the closest node prior to the shutdown.

We need to do this because we don't want to lose hints when shutting down a 
node and replacing it with a new instance or machine.

> Save or stream hints to a safe place in node replacement
> 
>
> Key: CASSANDRA-8132
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8132
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Minh Do
>Assignee: Minh Do
> Fix For: 2.1.1
>
>
> Often, we need to replace a node with a new instance in a cloud environment 
> where all nodes are still alive. To be safe and avoid losing data, we 
> usually make sure all hints are gone before we do this operation.
> Replacement means we just want to shut down the C* process on a node and 
> bring up another instance to take over that node's token.
> However, if the node to be replaced has a lot of stored hints, its 
> HintedHandOffManager seems very slow to send the hints to other nodes. In our 
> case, we tried to replace a node and had to wait several days before its 
> stored hints were cleared out. As mentioned above, we need all hints on this 
> node to be cleared out before we can terminate it and replace it with a new 
> instance/machine.
> Since this is not a decommission, I am proposing that we have the same 
> hints-streaming mechanism as in the decommission code. Furthermore, there 
> needs to be a NodeTool command to trigger this.





[jira] [Updated] (CASSANDRA-8132) Save or stream hints to a safe place in node replacement

2014-10-16 Thread Minh Do (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Minh Do updated CASSANDRA-8132:
---
Description: 
Often, we need to replace a node with a new instance in a cloud environment 
where all nodes are still alive. To be safe and avoid losing data, we usually 
make sure all hints are gone before we do this operation.

Replacement means we just want to shut down the C* process on a node and bring 
up another instance to take over that node's token.

However, if the node to be replaced has a lot of stored hints, its 
HintedHandOffManager seems very slow to send the hints to other nodes.  In our 
case, we tried to replace a node and had to wait several days before its 
stored hints were cleared out.  As mentioned above, we need all hints on this 
node to be cleared out before we can terminate it and replace it with a new 
instance/machine.

Since this is not a decommission, I am proposing that we have the same 
hints-streaming mechanism as in the decommission code.  Furthermore, there 
needs to be a NodeTool command to trigger this.


  was:
Often, we need to replace a node with a new instance in a cloud environment 
where all nodes are still alive. To be safe and avoid losing data, we usually 
make sure all hints are gone before we do this operation.

Replacement means we just want to shut down the C* process on a node and bring 
up another instance to take over that node's token.

However, if the node to be replaced has a lot of stored hints, its 
HintedHandOffManager seems very slow to send the hints to other nodes.  In our 
case, we tried to replace a node and had to wait several days before its 
stored hints were cleared out.  As mentioned above, we need all hints on this 
node to be cleared out before we can terminate it and replace it with a new 
node.

Since this is not a decommission, I am proposing that we have the same 
hints-streaming mechanism as in the decommission code.  Furthermore, there 
needs to be a NodeTool command to trigger this.



> Save or stream hints to a safe place in node replacement
> 
>
> Key: CASSANDRA-8132
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8132
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Minh Do
>Assignee: Minh Do
> Fix For: 2.1.1
>
>
> Often, we need to replace a node with a new instance in a cloud environment 
> where all nodes are still alive. To be safe and avoid losing data, we 
> usually make sure all hints are gone before we do this operation.
> Replacement means we just want to shut down the C* process on a node and 
> bring up another instance to take over that node's token.
> However, if the node to be replaced has a lot of stored hints, its 
> HintedHandOffManager seems very slow to send the hints to other nodes. In our 
> case, we tried to replace a node and had to wait several days before its 
> stored hints were cleared out. As mentioned above, we need all hints on this 
> node to be cleared out before we can terminate it and replace it with a new 
> instance/machine.
> Since this is not a decommission, I am proposing that we have the same 
> hints-streaming mechanism as in the decommission code. Furthermore, there 
> needs to be a NodeTool command to trigger this.





[jira] [Updated] (CASSANDRA-8132) Save or stream hints to a safe place in node replacement

2014-10-16 Thread Minh Do (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Minh Do updated CASSANDRA-8132:
---
Description: 
Often, we need to replace a node with a new instance in the cloud environment 
where we have all nodes are still alive. To be safe without losing data, we 
usually make sure all hints are gone before we do this operation.

Replacement means we just want to shutdown C* process on a node and bring up 
another instance to take over that node's token.

However, if a node to be replaced has a lot of stored hints, its 
HintedHandofManager seems very slow to send the hints to other nodes.  In our 
case, we tried to replace a node and had to wait for several days before its 
stored hints are clear out.  As mentioned above, we need all hints on this node 
to clear out before we can terminate it and replace it by a new node.

Since this is not a decommission, I am proposing that we have the same 
hints-streaming mechanism as in the decommission code.  Furthermore, there 
needs to be a cmd for NodeTool to trigger this.


  was:
Often, we need to replace a node with a new instance in a cloud environment 
where all nodes are still alive. To be safe and avoid losing data, we usually 
make sure all hints are gone before we do this operation.

Replacement means we just want to shut down the C* process on a node and bring 
up another instance to take over that node's token.

However, if a node has a lot of stored hints, HintedHandOffManager seems very 
slow to play the hints.  In our case, we tried to replace a node and had to 
wait several days.

Since this is not a decommission, I am proposing that we have the same 
hints-streaming mechanism as in the decommission code.  Furthermore, there 
needs to be a NodeTool command to trigger this.



> Save or stream hints to a safe place in node replacement
> 
>
> Key: CASSANDRA-8132
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8132
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Minh Do
>Assignee: Minh Do
> Fix For: 2.1.1
>
>
> Often, we need to replace a node with a new instance in a cloud environment 
> where all nodes are still alive. To be safe and avoid losing data, we 
> usually make sure all hints are gone before we do this operation.
> Replacement means we just want to shut down the C* process on a node and 
> bring up another instance to take over that node's token.
> However, if the node to be replaced has a lot of stored hints, its 
> HintedHandOffManager seems very slow to send the hints to other nodes. In our 
> case, we tried to replace a node and had to wait several days before its 
> stored hints were cleared out. As mentioned above, we need all hints on this 
> node to be cleared out before we can terminate it and replace it with a new 
> node.
> Since this is not a decommission, I am proposing that we have the same 
> hints-streaming mechanism as in the decommission code. Furthermore, there 
> needs to be a NodeTool command to trigger this.





[jira] [Comment Edited] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-10-16 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174702#comment-14174702
 ] 

Nikolai Grigoriev edited comment on CASSANDRA-7949 at 10/17/14 3:57 AM:


Update:

Using the property from CASSANDRA-6621 does help to get out of this state. My 
cluster is slowly digesting the large sstables and creating a bunch of nice 
small sstables from them. It is slower than using sstablesplit, I believe, 
because it actually does real compactions and thus processes and reprocesses 
different sets of sstables. My understanding is that every time I get a new 
bunch of L0 sstables there is a phase for updating the other levels, and it 
repeats and repeats.

With that property set I see that my total number of sstables grows, my number 
of "huge" sstables decreases, and the average sstable size decreases as a 
result.

My conclusions so far:

1. STCS fallback in LCS is a double-edged sword. It is needed to prevent 
flooding the node with tons of small sstables resulting from ongoing writes. 
These small ones are often much smaller than the configured target size and 
they need to be merged. But the use of STCS also results in the generation of 
super-sized sstables. These become a large headache when the fallback stops and 
LCS is supposed to resume normal operations.  It appears to me (my humble 
opinion) that the fallback should be to some kind of specialized "rescue" STCS 
flavor that merges the small sstables to approximately the LCS target sstable 
size BUT DOES NOT create sstables that are much larger than the target size. 
With this approach LCS will resume normal operations much faster once the 
cause for the fallback (abnormally high write load) is gone.

2. LCS has a major (performance?) issue when you have super-large sstables in 
the system. It often gets stuck with a single long (many hours) compaction 
stream that, by itself, will increase the probability of another STCS fallback 
even with a reasonable write load. As a possible workaround I was recommended 
to consider running multiple C* instances on our relatively powerful machines, 
to significantly reduce the amount of data per node and increase compaction 
throughput.

3. In existing systems, depending on the severity of the STCS fallback "work", 
the fix from CASSANDRA-6621 may help to recover while keeping the nodes up. It 
will take a very long time to recover, but the nodes will be online.

4. Recovery (see above) is very long, much longer than the duration of the 
"stress period" that causes the condition. In my case I was writing like crazy 
for about 4 days, it has been over a week of compactions since then, and I am 
still very far from 0 pending compactions. Considering this, it makes sense to 
artificially throttle the write speed when generating the data (as in the use 
case I described in previous comments). The extra time spent writing the data 
will still be significantly shorter than the time required to recover from the 
consequences of abusing the available write bandwidth.
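The "rescue" STCS flavor proposed in conclusion 1 could be sketched as a greedy grouping that merges small sstables up to roughly the LCS target size but never far beyond it. A hypothetical illustration of the idea (not Cassandra code; class and method names are made up):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the "rescue STCS" idea: greedily pack small sstables
// into merge groups whose combined size stays at or under the LCS target, so
// a fallback compaction never manufactures a super-sized sstable that LCS
// later chokes on. A file already larger than the target ends up alone in
// its own group.
public class RescueGrouping {
    public static List<List<Long>> groupForMerge(List<Long> sstableSizes, long targetBytes) {
        List<Long> sorted = new ArrayList<>(sstableSizes);
        Collections.sort(sorted);                 // smallest first: merge tiny files together
        List<List<Long>> groups = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long size : sorted) {
            if (!current.isEmpty() && currentBytes + size > targetBytes) {
                groups.add(current);              // close the group before it exceeds the target
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(size);
            currentBytes += size;
        }
        if (!current.isEmpty())
            groups.add(current);
        return groups;
    }
}
```

With a 160 Mb target, the thousand-plus ~161 Mb files from the report would each stay in small groups near the target, while the 55-370 Gb monsters would never be produced in the first place.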


was (Author: ngrigor...@gmail.com):
Update:

Using the property from CASSANDRA-6621 does help to get out of this state. My 
cluster is slowly digesting the large sstables and creating bunch of nice small 
sstables from them. It is slower than using sstablesplit, I believe, because it 
actually does real compactions and, thus, processes and reprocesses different 
sets of sstables. My understanding is that every time I get new bunch of L0 
sstables there is a phase for updating other levels and it repeats and repeats.

With that property set I see that my total number of sstables grows, my number 
of "huge" sstables decreases and the average size of the sstable decreases as 
result.

My conclusions so far:

1. STCS fallback in LCS is a double-edged sword. It is needed to prevent the 
flooding the node with tons of small sstables resulting from ongoing writes. 
These small ones are often much smaller than the configured target size and hey 
need to be merged. But also the use of STCS results in generation of the 
super-sized sstables. These become a large headache when the fallback stops and 
LCS is supposed to resume normal operations.  It appears to me (my humble 
opinion) that fallback should be done to some kind of specialized "rescue" STCS 
flavor that merges the small sstables to approximately the LCS target sstable 
size BUT DOES NOT create sstables that are much larger than the target size. 
With this approach the LCS will resume normal operations much faster than the 
cause for the fallback (abnormally high write load) is gone.

2. LCS has major (performance?) issue when you have super-large sstables in the 
system. It often gets stuck with single long (many hours) compaction stream 
that, by itself, will increase the probability of another STCS fallback even 
with reason

[jira] [Commented] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-10-16 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174702#comment-14174702
 ] 

Nikolai Grigoriev commented on CASSANDRA-7949:
--

Update:

Using the property from CASSANDRA-6621 does help to get out of this state. My 
cluster is slowly digesting the large sstables and creating bunch of nice small 
sstables from them. It is slower than using sstablesplit, I believe, because it 
actually does real compactions and, thus, processes and reprocesses different 
sets of sstables. My understanding is that every time I get new bunch of L0 
sstables there is a phase for updating other levels and it repeats and repeats.

With that property set I see that my total number of sstables grows, my number 
of "huge" sstables decreases and the average size of the sstable decreases as 
result.

My conclusions so far:

1. STCS fallback in LCS is a double-edged sword. It is needed to prevent 
flooding the node with tons of small sstables resulting from ongoing writes. 
These small ones are often much smaller than the configured target size and 
they need to be merged. But the use of STCS also generates super-sized 
sstables, which become a large headache when the fallback stops and LCS is 
supposed to resume normal operations. It appears to me (my humble opinion) 
that the fallback should be to some kind of specialized "rescue" STCS flavor 
that merges the small sstables to approximately the LCS target sstable size 
BUT DOES NOT create sstables that are much larger than the target size. With 
this approach LCS would resume normal operations much faster once the cause 
of the fallback (abnormally high write load) is gone.

2. LCS has a major (performance?) issue when you have super-large sstables in 
the system. It often gets stuck with a single long (many hours) compaction 
stream that, by itself, increases the probability of another STCS fallback 
even under a reasonable write load. As a possible workaround I was advised to 
consider running multiple C* instances on our relatively powerful machines, to 
significantly reduce the amount of data per node and increase compaction 
throughput.

3. In the existing systems, depending on the severity of the STCS fallback 
"work" the fix from CASSANDRA-6621 may help to recover while keeping the nodes 
up. It will take a very long time to recover but the nodes will be online.

4. Recovery (see above) is very long - much, much longer than the duration of 
the "stress period" that caused the condition. In my case I was writing like 
crazy for about 4 days, and it has now been over a week of compactions since 
then; I am still very far from 0 pending compactions. Considering this, it 
makes sense to artificially throttle the write speed when generating the data 
(as in the use case I described in previous comments). The extra time spent 
writing the data will still be significantly shorter than the time required to 
recover from the consequences of abusing the available write bandwidth.
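The "rescue" STCS flavor proposed in point 1 above can be sketched in plain Java. This is a hypothetical illustration, not Cassandra code: it greedily packs small sstables into merge buckets whose combined size approaches the LCS target without ever exceeding a cap, so no super-sized output is produced.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the "rescue" merge strategy proposed above: greedily
// group small sstables into merge buckets that approach the LCS target size
// without producing outputs much larger than it. Sizes are in arbitrary units;
// class and method names are illustrative, not Cassandra APIs.
public class RescueBucketer {
    // Pack sstable sizes (in input order) into buckets whose total size
    // never exceeds maxBucket.
    static List<List<Long>> buckets(List<Long> sizes, long maxBucket) {
        List<List<Long>> out = new ArrayList<>();
        List<Long> cur = new ArrayList<>();
        long curSize = 0;
        for (long s : sizes) {
            // Close the current bucket when adding s would overshoot the cap.
            if (curSize + s > maxBucket && !cur.isEmpty()) {
                out.add(cur);
                cur = new ArrayList<>();
                curSize = 0;
            }
            cur.add(s);
            curSize += s;
        }
        if (!cur.isEmpty()) out.add(cur);
        return out;
    }
}
```

Each resulting bucket would then be compacted into a single sstable of roughly the target size, unlike a plain STCS fallback that may merge everything into one huge file.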

> LCS compaction low performance, many pending compactions, nodes are almost 
> idle
> ---
>
> Key: CASSANDRA-7949
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE 4.5.1-1, Cassandra 2.0.8
>Reporter: Nikolai Grigoriev
> Attachments: iostats.txt, nodetool_compactionstats.txt, 
> nodetool_tpstats.txt, pending compactions 2day.png, system.log.gz, vmstat.txt
>
>
> I've been evaluating new cluster of 15 nodes (32 core, 6x800Gb SSD disks + 
> 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates the 
> load similar to the load in our future product. Before running the simulator 
> I had to pre-generate enough data. This was done using Java code and DataStax 
> Java driver. To avoid going deep into details, two tables have been 
> generated. Each table currently has about 55M rows and between few dozens and 
> few thousands of columns in each row.
> This data generation process was generating massive amount of non-overlapping 
> data. Thus, the activity was write-only and highly parallel. This is not the 
> type of the traffic that the system will have ultimately to deal with, it 
> will be mix of reads and updates to the existing data in the future. This is 
> just to explain the choice of LCS, not mentioning the expensive SSD disk 
> space.
> At some point while generating the data I have noticed that the compactions 
> started to pile up. I knew that I was overloading the cluster but I still 
> wanted the genration test to complete. I was expecting to give the cluster 
> enough time to finish the pending compactions and get ready for real traffic.
> However, after t

[jira] [Commented] (CASSANDRA-8129) Increase max heap for sstablesplit

2014-10-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174637#comment-14174637
 ] 

Jonathan Ellis commented on CASSANDRA-8129:
---

It shouldn't have to read large parts of the file into memory, though.  What 
makes it OOM?

> Increase max heap for sstablesplit
> --
>
> Key: CASSANDRA-8129
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8129
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Matt Stump
>Priority: Minor
>
> The max heap for sstablesplit is 256m. For large files that's too small and 
> it will OOM. We should increase the max heap to something like 2-4G with the 
> understanding that sstablesplit will most likely only be invoked to split 
> large files.
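Jonathan's question above is the right one: a split should not need heap proportional to file size. A minimal sketch (assumed names, not the sstablesplit implementation) of the streaming pattern, where memory is bounded by a fixed buffer regardless of input size:

```java
import java.io.*;

// Illustrative sketch only, not the sstablesplit implementation: splitting a
// large file by streaming fixed-size chunks keeps heap use bounded by the
// buffer, no matter how large the source file is.
public class StreamingSplit {
    // Copy src into part files of roughly chunkBytes each; returns part count.
    static int split(File src, File dstDir, long chunkBytes) throws IOException {
        byte[] buf = new byte[64 * 1024];   // the only sizable allocation
        int part = 0;
        try (InputStream in = new BufferedInputStream(new FileInputStream(src))) {
            int n = in.read(buf);
            while (n > 0) {
                File out = new File(dstDir, "part-" + part++);
                long written = 0;
                try (OutputStream os = new BufferedOutputStream(new FileOutputStream(out))) {
                    // Fill this part up to chunkBytes, then roll to the next.
                    while (n > 0 && written < chunkBytes) {
                        os.write(buf, 0, n);
                        written += n;
                        n = in.read(buf);
                    }
                }
            }
        }
        return part;
    }
}
```

If sstablesplit OOMs despite logic of this shape, the memory is presumably going somewhere else (e.g. index/metadata structures), which is what the comment is probing.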



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7713) CommitLogTest failure causes cascading unit test failures

2014-10-16 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174613#comment-14174613
 ] 

Michael Shuler commented on CASSANDRA-7713:
---

Without making a recommendation on CASSANDRA-7927, since I haven't had a good 
look at it: we don't want to close this ticket without 
fixing/replacing/removing whatever lets testCommitFailurePolicy_stop leave 
this non-writable dir behind. Not only do the following tests fail; things 
like `git clean -df` don't work if this test breaks.  :)

I'd be happy if the solution was to pass in a code trigger of some sort rather 
than actually setting the dir read-only - whatever tests the functionality 
properly.

> CommitLogTest failure causes cascading unit test failures
> -
>
> Key: CASSANDRA-7713
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7713
> Project: Cassandra
>  Issue Type: Test
>Reporter: Michael Shuler
>Assignee: Bogdan Kanivets
> Fix For: 2.0.11
>
> Attachments: CommitLogTest.system.log.txt
>
>
> When CommitLogTest.testCommitFailurePolicy_stop fails or times out, 
> {{commitDir.setWritable(true)}} is never reached, so the 
> build/test/cassandra/commitlog directory is left without write permissions, 
> causing cascading failure of all subsequent tests.





[jira] [Commented] (CASSANDRA-7713) CommitLogTest failure causes cascading unit test failures

2014-10-16 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174588#comment-14174588
 ] 

Joshua McKenzie commented on CASSANDRA-7713:


This came up while I was working on CASSANDRA-7927.  The _stop test was always 
returning true regardless of whether the stop case in 
CommitLog.handleCommitError actually stopped anything, and having that 
dangling directory without write permissions can cause problems (as evidenced 
here).

In that ticket I've updated the unit test to actually pass the stop error in, 
rather than trying to simulate it, to confirm the Gossiper is shutting down; 
if the general public is happy with that solution we can close this as a 
duplicate / not a problem after that ticket.

> CommitLogTest failure causes cascading unit test failures
> -
>
> Key: CASSANDRA-7713
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7713
> Project: Cassandra
>  Issue Type: Test
>Reporter: Michael Shuler
>Assignee: Bogdan Kanivets
> Fix For: 2.0.11
>
> Attachments: CommitLogTest.system.log.txt
>
>
> When CommitLogTest.testCommitFailurePolicy_stop fails or times out, 
> {{commitDir.setWritable(true)}} is never reached, so the 
> build/test/cassandra/commitlog directory is left without write permissions, 
> causing cascading failure of all subsequent tests.
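The cascading failure described above is the classic argument for putting permission restoration in a finally block rather than on the test's last line. A minimal sketch of that pattern (names and paths illustrative, not the actual CommitLogTest code):

```java
import java.io.File;

// Sketch of the cleanup pattern the ticket calls for: if a test flips a
// directory read-only, restoring the permission in a finally block runs even
// when the test body throws or fails partway, so later tests (and tools like
// `git clean`) are never left with a non-writable directory.
public class ReadOnlyDirCleanup {
    static void exerciseWithCleanup(File dir, Runnable testBody) {
        dir.setWritable(false);
        try {
            testBody.run();
        } finally {
            // Always executes, even on exception or assertion failure.
            dir.setWritable(true);
        }
    }
}
```

A timeout that kills the JVM mid-test would still skip this, which is why passing an error trigger into the commit log (as proposed in the comments) is a more robust fix than touching real filesystem permissions at all.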





[jira] [Updated] (CASSANDRA-8131) Short-circuited query results from collection index query

2014-10-16 Thread Michael Shuler (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Shuler updated CASSANDRA-8131:
--
Assignee: Sylvain Lebresne
  Labels: collections cql3 cqlsh query queryparser triaged  (was: 
collections cql3 cqlsh query queryparser)

> Short-circuited query results from collection index query
> -
>
> Key: CASSANDRA-8131
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8131
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Debian Wheezy, Oracle JDK, Cassandra 2.1
>Reporter: Catalin Alexandru Zamfir
>Assignee: Sylvain Lebresne
>  Labels: collections, cql3, cqlsh, query, queryparser, triaged
> Fix For: 2.1.0
>
>
> After watching Jonathan's 2014 summit video, I wanted to give collection 
> indexes a try as they seem to be a fit for a "search by key/values" usage 
> pattern we have in our setup. Doing some test queries that I expect users 
> would do against the table, a short-circuit behavior came up:
> Here's the whole transcript:
> {noformat}
> CREATE TABLE by_sets (id int PRIMARY KEY, datakeys set<text>, datavars 
> set<text>);
> CREATE INDEX by_sets_datakeys ON by_sets (datakeys);
> CREATE INDEX by_sets_datavars ON by_sets (datavars);
> INSERT INTO by_sets (id, datakeys, datavars) VALUES (1, {'a'}, {'b'});
> INSERT INTO by_sets (id, datakeys, datavars) VALUES (2, {'c'}, {'d'});
> INSERT INTO by_sets (id, datakeys, datavars) VALUES (3, {'e'}, {'f'});
> INSERT INTO by_sets (id, datakeys, datavars) VALUES (4, {'a'}, {'z'});
> SELECT * FROM by_sets;
>  id | datakeys | datavars
> +--+--
>   1 |{'a'} |{'b'}
>   2 |{'c'} |{'d'}
>   4 |{'a'} |{'z'}
>   3 |{'e'} |{'f'}
> {noformat}
> We then tried this query which short-circuited:
> {noformat}
> SELECT * FROM by_sets WHERE datakeys CONTAINS 'a' AND datakeys CONTAINS 'c';
>  id | datakeys | datavars
> +--+--
>   1 |{'a'} |{'b'}
>   4 |{'a'} |{'z'}
> (2 rows)
> {noformat}
> Instead of receiving 3 rows, which match the datakeys CONTAINS 'a' AND 
> datakeys CONTAINS 'c' we only got the first.
> Doing the same, but with CONTAINS 'c' first, ignores the second AND.
> {noformat}
> SELECT * FROM by_sets WHERE datakeys CONTAINS 'c' AND datakeys CONTAINS 'a' ;
>  id | datakeys | datavars
> +--+--
>   2 |{'c'} |{'d'}
> (1 rows)
> {noformat}
> Also, on a side-note, I have two indexes on both datakeys and datavars. But 
> when trying to run a query such as:
> {noformat}
> select * from by_sets WHERE datakeys CONTAINS 'a' AND datavars CONTAINS 'z';
> code=2200 [Invalid query] message="Cannot execute this query as it might 
> involve data filtering and thus may have unpredictable performance. 
> If you want to execute this query despite the performance unpredictability, 
> use ALLOW FILTERING"
> {noformat}
> The second column, after AND (even if I inverse the order) requires an "allow 
> filtering" clause, yet the column is indexed, and an in-memory "join" of the 
> primary keys of these sets on the coordinator could build up the result.
> Could anyone explain the short-circuit behavior?
> And the requirement for "allow-filtering" on a secondly indexed column?
> If these are not bugs but intended behavior, they should be documented 
> better, at least their limitations.





[jira] [Commented] (CASSANDRA-7713) CommitLogTest failure causes cascading unit test failures

2014-10-16 Thread Bogdan Kanivets (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174571#comment-14174571
 ] 

Bogdan Kanivets commented on CASSANDRA-7713:


Another way is to not use sleepUninterruptibly. It looks like this prevents 
the 'finally' block from executing in time. That was my original solution, but 
then I noticed that there is a more general problem with the test - when you 
remove 'logFile.setWritable(false)' it still passes.

I'll try to look into it this weekend to get a patch (I have been busy 
lately), but feel free to reassign.

> CommitLogTest failure causes cascading unit test failures
> -
>
> Key: CASSANDRA-7713
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7713
> Project: Cassandra
>  Issue Type: Test
>Reporter: Michael Shuler
>Assignee: Bogdan Kanivets
> Fix For: 2.0.11
>
> Attachments: CommitLogTest.system.log.txt
>
>
> When CommitLogTest.testCommitFailurePolicy_stop fails or times out, 
> {{commitDir.setWritable(true)}} is never reached, so the 
> build/test/cassandra/commitlog directory is left without write permissions, 
> causing cascading failure of all subsequent tests.





[jira] [Commented] (CASSANDRA-8131) Short-circuited query results from collection index query

2014-10-16 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174567#comment-14174567
 ] 

Michael Shuler commented on CASSANDRA-8131:
---

On the cassandra-2.1 branch, commit 440824c, I get some different behavior:
{noformat}
mshuler@hana:~$ cqlsh 
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 2.1.1-SNAPSHOT | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
cqlsh> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy' , 
'replication_factor': 1 };
cqlsh> USE test ;
cqlsh:test> CREATE TABLE by_sets (id int PRIMARY KEY, datakeys set<text>, 
datavars set<text>);
cqlsh:test> CREATE INDEX by_sets_datakeys ON by_sets (datakeys);
cqlsh:test> CREATE INDEX by_sets_datavars ON by_sets (datavars);
cqlsh:test> INSERT INTO by_sets (id, datakeys, datavars) VALUES (1, {'a'}, 
{'b'});
cqlsh:test> INSERT INTO by_sets (id, datakeys, datavars) VALUES (2, {'c'}, 
{'d'});
cqlsh:test> INSERT INTO by_sets (id, datakeys, datavars) VALUES (3, {'e'}, 
{'f'});
cqlsh:test> INSERT INTO by_sets (id, datakeys, datavars) VALUES (4, {'a'}, 
{'z'});
cqlsh:test> SELECT * FROM by_sets;

 id | datakeys | datavars
+--+--
  1 |{'a'} |{'b'}
  2 |{'c'} |{'d'}
  4 |{'a'} |{'z'}
  3 |{'e'} |{'f'}

(4 rows)
cqlsh:test> SELECT * FROM by_sets WHERE datakeys CONTAINS 'a' AND datakeys 
CONTAINS 'c';

 id | datakeys | datavars
+--+--

(0 rows)
cqlsh:test> SELECT * FROM by_sets WHERE datakeys CONTAINS 'c' AND datakeys 
CONTAINS 'a' ;

 id | datakeys | datavars
+--+--

(0 rows)
cqlsh:test> SELECT * FROM by_sets WHERE datakeys CONTAINS 'a';

 id | datakeys | datavars
+--+--
  1 |{'a'} |{'b'}
  4 |{'a'} |{'z'}

(2 rows)
cqlsh:test> SELECT * FROM by_sets WHERE datakeys CONTAINS 'c';

 id | datakeys | datavars
+--+--
  2 |{'c'} |{'d'}

(1 rows)
{noformat}
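The (0 rows) results in this transcript match strict AND semantics: a row qualifies only when its set contains every required element, and no sample row holds both 'a' and 'c'. A plain-Java model of that predicate over the sample data (an illustration of the semantics, not driver or server code):

```java
import java.util.*;

// Models "datakeys CONTAINS 'a' AND datakeys CONTAINS 'c'": a row matches only
// when its set holds every required element. With the by_sets sample rows, the
// two-element query matches nothing, agreeing with the (0 rows) output above.
public class ContainsAnd {
    static List<Integer> matching(Map<Integer, Set<String>> rows, String... required) {
        List<Integer> ids = new ArrayList<>();
        for (Map.Entry<Integer, Set<String>> e : rows.entrySet())
            if (e.getValue().containsAll(Arrays.asList(required)))
                ids.add(e.getKey());
        return ids;
    }
}
```

Under this reading, the 2.0-era outputs quoted in the issue (returning rows that match only the first CONTAINS) were the bug, and the empty results on the 2.1 branch are the correct intersection.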

> Short-circuited query results from collection index query
> -
>
> Key: CASSANDRA-8131
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8131
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Debian Wheezy, Oracle JDK, Cassandra 2.1
>Reporter: Catalin Alexandru Zamfir
>  Labels: collections, cql3, cqlsh, query, queryparser
> Fix For: 2.1.0
>
>
> After watching Jonathan's 2014 summit video, I wanted to give collection 
> indexes a try as they seem to be a fit for a "search by key/values" usage 
> pattern we have in our setup. Doing some test queries that I expect users 
> would do against the table, a short-circuit behavior came up:
> Here's the whole transcript:
> {noformat}
> CREATE TABLE by_sets (id int PRIMARY KEY, datakeys set<text>, datavars 
> set<text>);
> CREATE INDEX by_sets_datakeys ON by_sets (datakeys);
> CREATE INDEX by_sets_datavars ON by_sets (datavars);
> INSERT INTO by_sets (id, datakeys, datavars) VALUES (1, {'a'}, {'b'});
> INSERT INTO by_sets (id, datakeys, datavars) VALUES (2, {'c'}, {'d'});
> INSERT INTO by_sets (id, datakeys, datavars) VALUES (3, {'e'}, {'f'});
> INSERT INTO by_sets (id, datakeys, datavars) VALUES (4, {'a'}, {'z'});
> SELECT * FROM by_sets;
>  id | datakeys | datavars
> +--+--
>   1 |{'a'} |{'b'}
>   2 |{'c'} |{'d'}
>   4 |{'a'} |{'z'}
>   3 |{'e'} |{'f'}
> {noformat}
> We then tried this query which short-circuited:
> {noformat}
> SELECT * FROM by_sets WHERE datakeys CONTAINS 'a' AND datakeys CONTAINS 'c';
>  id | datakeys | datavars
> +--+--
>   1 |{'a'} |{'b'}
>   4 |{'a'} |{'z'}
> (2 rows)
> {noformat}
> Instead of receiving 3 rows, which match the datakeys CONTAINS 'a' AND 
> datakeys CONTAINS 'c' we only got the first.
> Doing the same, but with CONTAINS 'c' first, ignores the second AND.
> {noformat}
> SELECT * FROM by_sets WHERE datakeys CONTAINS 'c' AND datakeys CONTAINS 'a' ;
>  id | datakeys | datavars
> +--+--
>   2 |{'c'} |{'d'}
> (1 rows)
> {noformat}
> Also, on a side-note, I have two indexes on both datakeys and datavars. But 
> when trying to run a query such as:
> {noformat}
> select * from by_sets WHERE datakeys CONTAINS 'a' AND datavars CONTAINS 'z';
> code=2200 [Invalid query] message="Cannot execute this query as it might 
> involve data filtering and thus may have unpredictable performance. 
> If you want to execute this query despite the performance unpredictability, 
> use ALLOW FILTERING"
> {noformat}
> The second column, after AND (even if I inverse the order) requires an "allow 
> filtering" clause, yet the column is indexed, and an in-memory "join" of the 
> primary keys of these sets on the coordinator could build up the result.
> Could anyon

[jira] [Resolved] (CASSANDRA-7446) Batchlog should be streamed to a different node on decom

2014-10-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko resolved CASSANDRA-7446.
--
   Resolution: Fixed
Fix Version/s: (was: 2.1.2)
   2.1.1
   2.0.11

Committed, thanks

> Batchlog should be streamed to a different node on decom
> 
>
> Key: CASSANDRA-7446
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7446
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Aleksey Yeschenko
>Assignee: Branimir Lambov
> Fix For: 2.0.11, 2.1.1
>
> Attachments: 7446-2.0.txt
>
>
> Just like we stream hints on decom, we should also stream the contents of the 
> batchlog - even though we do replicate the batch to at least two nodes.





[2/3] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2014-10-16 Thread aleksey
Merge branch 'cassandra-2.0' into cassandra-2.1

Conflicts:
CHANGES.txt
src/java/org/apache/cassandra/db/BatchlogManager.java


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/440824c1
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/440824c1
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/440824c1

Branch: refs/heads/trunk
Commit: 440824c1a60a344bc3e8a5ad35ae2fac879bd61d
Parents: 014d328 e916dff
Author: Aleksey Yeschenko 
Authored: Fri Oct 17 03:40:17 2014 +0300
Committer: Aleksey Yeschenko 
Committed: Fri Oct 17 03:40:17 2014 +0300

--
 CHANGES.txt |  1 +
 .../apache/cassandra/db/BatchlogManager.java| 64 ++--
 .../cassandra/service/StorageService.java   | 25 ++--
 .../cassandra/db/BatchlogManagerTest.java   |  8 +--
 4 files changed, 57 insertions(+), 41 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/440824c1/CHANGES.txt
--
diff --cc CHANGES.txt
index b40e14b,73aaab0..d7a8904
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,89 -1,5 +1,90 @@@
 -2.0.11:
 +2.1.1
 + * Fix IllegalArgumentException when a list of IN values containing tuples
 +   is passed as a single arg to a prepared statement with the v1 or v2
 +   protocol (CASSANDRA-8062)
 + * Fix ClassCastException in DISTINCT query on static columns with
 +   query paging (CASSANDRA-8108)
 + * Fix NPE on null nested UDT inside a set (CASSANDRA-8105)
 + * Fix exception when querying secondary index on set items or map keys
 +   when some clustering columns are specified (CASSANDRA-8073)
 + * Send proper error response when there is an error during native
 +   protocol message decode (CASSANDRA-8118)
 + * Gossip should ignore generation numbers too far in the future 
(CASSANDRA-8113)
 + * Fix NPE when creating a table with frozen sets, lists (CASSANDRA-8104)
 + * Fix high memory use due to tracking reads on incrementally opened sstable
 +   readers (CASSANDRA-8066)
 + * Fix EXECUTE request with skipMetadata=false returning no metadata
 +   (CASSANDRA-8054)
 + * Allow concurrent use of CQLBulkOutputFormat (CASSANDRA-7776)
 + * Shutdown JVM on OOM (CASSANDRA-7507)
 + * Upgrade netty version and enable epoll event loop (CASSANDRA-7761)
 + * Don't duplicate sstables smaller than split size when using
 +   the sstablesplitter tool (CASSANDRA-7616)
 + * Avoid re-parsing already prepared statements (CASSANDRA-7923)
 + * Fix some Thrift slice deletions and updates of COMPACT STORAGE
 +   tables with some clustering columns omitted (CASSANDRA-7990)
 + * Fix filtering for CONTAINS on sets (CASSANDRA-8033)
 + * Properly track added size (CASSANDRA-7239)
 + * Allow compilation in java 8 (CASSANDRA-7208)
 + * Fix Assertion error on RangeTombstoneList diff (CASSANDRA-8013)
 + * Release references to overlapping sstables during compaction 
(CASSANDRA-7819)
 + * Send notification when opening compaction results early (CASSANDRA-8034)
 + * Make native server start block until properly bound (CASSANDRA-7885)
 + * (cqlsh) Fix IPv6 support (CASSANDRA-7988)
 + * Ignore fat clients when checking for endpoint collision (CASSANDRA-7939)
 + * Make sstablerepairedset take a list of files (CASSANDRA-7995)
 + * (cqlsh) Tab completeion for indexes on map keys (CASSANDRA-7972)
 + * (cqlsh) Fix UDT field selection in select clause (CASSANDRA-7891)
 + * Fix resource leak in event of corrupt sstable
 + * (cqlsh) Add command line option for cqlshrc file path (CASSANDRA-7131)
 + * Provide visibility into prepared statements churn (CASSANDRA-7921, 
CASSANDRA-7930)
 + * Invalidate prepared statements when their keyspace or table is
 +   dropped (CASSANDRA-7566)
 + * cassandra-stress: fix support for NetworkTopologyStrategy (CASSANDRA-7945)
 + * Fix saving caches when a table is dropped (CASSANDRA-7784)
 + * Add better error checking of new stress profile (CASSANDRA-7716)
 + * Use ThreadLocalRandom and remove FBUtilities.threadLocalRandom 
(CASSANDRA-7934)
 + * Prevent operator mistakes due to simultaneous bootstrap (CASSANDRA-7069)
 + * cassandra-stress supports whitelist mode for node config (CASSANDRA-7658)
 + * GCInspector more closely tracks GC; cassandra-stress and nodetool report 
it (CASSANDRA-7916)
 + * nodetool won't output bogus ownership info without a keyspace 
(CASSANDRA-7173)
 + * Add human readable option to nodetool commands (CASSANDRA-5433)
 + * Don't try to set repairedAt on old sstables (CASSANDRA-7913)
 + * Add metrics for tracking PreparedStatement use (CASSANDRA-7719)
 + * (cqlsh) tab-completion for triggers (CASSANDRA-7824)
 + * (cqlsh) Support for query paging (CASSANDRA-7514)
 + * (cqlsh) Show progress of COPY operations (CASSANDRA-7789)
 + * Add syntax to remove mul

[3/3] git commit: Merge branch 'cassandra-2.1' into trunk

2014-10-16 Thread aleksey
Merge branch 'cassandra-2.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0ca6beb6
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0ca6beb6
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0ca6beb6

Branch: refs/heads/trunk
Commit: 0ca6beb6823bcf6f31c69454da98c1f53cd88780
Parents: 543fbc3 440824c
Author: Aleksey Yeschenko 
Authored: Fri Oct 17 03:40:46 2014 +0300
Committer: Aleksey Yeschenko 
Committed: Fri Oct 17 03:40:46 2014 +0300

--
 CHANGES.txt |  1 +
 .../apache/cassandra/db/BatchlogManager.java| 63 ++--
 .../cassandra/service/StorageService.java   | 25 ++--
 .../cassandra/db/BatchlogManagerTest.java   |  8 +--
 4 files changed, 57 insertions(+), 40 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/0ca6beb6/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0ca6beb6/src/java/org/apache/cassandra/service/StorageService.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0ca6beb6/test/unit/org/apache/cassandra/db/BatchlogManagerTest.java
--



[1/3] git commit: Force batchlog replay before decommissioning a node

2014-10-16 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/trunk 543fbc374 -> 0ca6beb68


Force batchlog replay before decommissioning a node

patch by Branimir Lambov; reviewed by Aleksey Yeschenko for
CASSANDRA-7446


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e916dff8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e916dff8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e916dff8

Branch: refs/heads/trunk
Commit: e916dff8ba032d878ad4435eb7175c6a56f79ef4
Parents: 67db1bf
Author: Branimir Lambov 
Authored: Fri Oct 17 03:18:37 2014 +0300
Committer: Aleksey Yeschenko 
Committed: Fri Oct 17 03:18:37 2014 +0300

--
 CHANGES.txt |  1 +
 .../apache/cassandra/db/BatchlogManager.java| 63 ++--
 .../cassandra/service/StorageService.java   | 25 ++--
 .../cassandra/db/BatchlogManagerTest.java   |  8 +--
 4 files changed, 57 insertions(+), 40 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e916dff8/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index cd4b6bb..73aaab0 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.11:
+ * Force batchlog replay before decommissioning a node (CASSANDRA-7446)
  * Fix hint replay with many accumulated expired hints (CASSANDRA-6998)
  * Fix duplicate results in DISTINCT queries on static columns with query
paging (CASSANDRA-8108)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e916dff8/src/java/org/apache/cassandra/db/BatchlogManager.java
--
diff --git a/src/java/org/apache/cassandra/db/BatchlogManager.java 
b/src/java/org/apache/cassandra/db/BatchlogManager.java
index b92c217..48f4c3c 100644
--- a/src/java/org/apache/cassandra/db/BatchlogManager.java
+++ b/src/java/org/apache/cassandra/db/BatchlogManager.java
@@ -25,7 +25,6 @@ import java.net.InetAddress;
 import java.nio.ByteBuffer;
 import java.util.*;
 import java.util.concurrent.*;
-import java.util.concurrent.atomic.AtomicBoolean;
 import java.util.concurrent.atomic.AtomicLong;
 import javax.management.MBeanServer;
 import javax.management.ObjectName;
@@ -69,8 +68,8 @@ public class BatchlogManager implements BatchlogManagerMBean
 public static final BatchlogManager instance = new BatchlogManager();
 
 private final AtomicLong totalBatchesReplayed = new AtomicLong();
-private final AtomicBoolean isReplaying = new AtomicBoolean();
 
+// Single-thread executor service for scheduling and serializing log 
replay.
 public static final ScheduledExecutorService batchlogTasks = new 
DebuggableScheduledThreadPoolExecutor("BatchlogTasks");
 
 public void start()
@@ -108,6 +107,11 @@ public class BatchlogManager implements 
BatchlogManagerMBean
 
 public void forceBatchlogReplay()
 {
+startBatchlogReplay();
+}
+
+public Future startBatchlogReplay()
+{
 Runnable runnable = new WrappedRunnable()
 {
 public void runMayThrow() throws ExecutionException, 
InterruptedException
@@ -115,7 +119,8 @@ public class BatchlogManager implements BatchlogManagerMBean
 replayAllFailedBatches();
 }
 };
-batchlogTasks.execute(runnable);
+// If a replay is already in progress this request will be executed 
after it completes.
+return batchlogTasks.submit(runnable);
 }
 
 public static RowMutation getBatchlogMutationFor(Collection 
mutations, UUID uuid)
@@ -156,12 +161,8 @@ public class BatchlogManager implements 
BatchlogManagerMBean
 return ByteBuffer.wrap(bos.toByteArray());
 }
 
-@VisibleForTesting
-void replayAllFailedBatches() throws ExecutionException, 
InterruptedException
+private void replayAllFailedBatches() throws ExecutionException, 
InterruptedException
 {
-if (!isReplaying.compareAndSet(false, true))
-return;
-
 logger.debug("Started replayAllFailedBatches");
 
 // rate limit is in bytes per second. Uses Double.MAX_VALUE if 
disabled (set to 0 in cassandra.yaml).
@@ -169,34 +170,27 @@ public class BatchlogManager implements 
BatchlogManagerMBean
 int throttleInKB = DatabaseDescriptor.getBatchlogReplayThrottleInKB() 
/ StorageService.instance.getTokenMetadata().getAllEndpoints().size();
 RateLimiter rateLimiter = RateLimiter.create(throttleInKB == 0 ? 
Double.MAX_VALUE : throttleInKB * 1024);
 
-try
-{
-UntypedResultSet page = process("SELECT id, data, written_at, 
version FROM %s.%s LIMIT %d",
-Keyspace.SYSTEM_KS,
-SystemKey

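The diff above replaces the `AtomicBoolean isReplaying` guard with two simpler guarantees: the single-threaded `batchlogTasks` executor serializes replay runs (a request submitted during a replay just queues behind it), and `submit` returns a `Future` the decommission path can block on. A standalone sketch of that pattern, with illustrative names:

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the pattern in the patch above: a single-threaded executor both
// serializes replay runs (no overlapping replays, no CAS guard needed) and
// lets callers wait for completion through the returned Future.
public class SerializedReplay {
    private final ExecutorService tasks = Executors.newSingleThreadExecutor();
    final AtomicInteger replays = new AtomicInteger();   // stands in for real replay work

    Future<Integer> startReplay() {
        // If a replay is already running, this one executes after it finishes.
        Callable<Integer> replay = replays::incrementAndGet;
        return tasks.submit(replay);
    }

    void shutdown() { tasks.shutdown(); }
}
```

Blocking on the `Future` is what makes "force batchlog replay before decommissioning" possible: the decommissioning node can be sure the replay finished before it streams state away and leaves the ring.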
[1/2] git commit: Force batchlog replay before decommissioning a node

2014-10-16 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 014d328f4 -> 440824c1a


Force batchlog replay before decommissioning a node

patch by Branimir Lambov; reviewed by Aleksey Yeschenko for
CASSANDRA-7446


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e916dff8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e916dff8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e916dff8

Branch: refs/heads/cassandra-2.1
Commit: e916dff8ba032d878ad4435eb7175c6a56f79ef4
Parents: 67db1bf
Author: Branimir Lambov 
Authored: Fri Oct 17 03:18:37 2014 +0300
Committer: Aleksey Yeschenko 
Committed: Fri Oct 17 03:18:37 2014 +0300

--
 CHANGES.txt |  1 +
 .../apache/cassandra/db/BatchlogManager.java| 63 ++--
 .../cassandra/service/StorageService.java   | 25 ++--
 .../cassandra/db/BatchlogManagerTest.java   |  8 +--
 4 files changed, 57 insertions(+), 40 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e916dff8/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index cd4b6bb..73aaab0 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.11:
+ * Force batchlog replay before decommissioning a node (CASSANDRA-7446)
  * Fix hint replay with many accumulated expired hints (CASSANDRA-6998)
  * Fix duplicate results in DISTINCT queries on static columns with query
paging (CASSANDRA-8108)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e916dff8/src/java/org/apache/cassandra/db/BatchlogManager.java
--
diff --git a/src/java/org/apache/cassandra/db/BatchlogManager.java 
b/src/java/org/apache/cassandra/db/BatchlogManager.java
index b92c217..48f4c3c 100644
--- a/src/java/org/apache/cassandra/db/BatchlogManager.java
+++ b/src/java/org/apache/cassandra/db/BatchlogManager.java
@@ -25,7 +25,6 @@ import java.net.InetAddress;
 import java.nio.ByteBuffer;
 import java.util.*;
 import java.util.concurrent.*;
-import java.util.concurrent.atomic.AtomicBoolean;
 import java.util.concurrent.atomic.AtomicLong;
 import javax.management.MBeanServer;
 import javax.management.ObjectName;
@@ -69,8 +68,8 @@ public class BatchlogManager implements BatchlogManagerMBean
 public static final BatchlogManager instance = new BatchlogManager();
 
 private final AtomicLong totalBatchesReplayed = new AtomicLong();
-private final AtomicBoolean isReplaying = new AtomicBoolean();
 
+// Single-thread executor service for scheduling and serializing log 
replay.
 public static final ScheduledExecutorService batchlogTasks = new 
DebuggableScheduledThreadPoolExecutor("BatchlogTasks");
 
 public void start()
@@ -108,6 +107,11 @@ public class BatchlogManager implements 
BatchlogManagerMBean
 
 public void forceBatchlogReplay()
 {
+startBatchlogReplay();
+}
+
+public Future startBatchlogReplay()
+{
 Runnable runnable = new WrappedRunnable()
 {
 public void runMayThrow() throws ExecutionException, 
InterruptedException
@@ -115,7 +119,8 @@ public class BatchlogManager implements BatchlogManagerMBean
 replayAllFailedBatches();
 }
 };
-batchlogTasks.execute(runnable);
+// If a replay is already in progress this request will be executed 
after it completes.
+return batchlogTasks.submit(runnable);
 }
 
 public static RowMutation getBatchlogMutationFor(Collection 
mutations, UUID uuid)
@@ -156,12 +161,8 @@ public class BatchlogManager implements 
BatchlogManagerMBean
 return ByteBuffer.wrap(bos.toByteArray());
 }
 
-@VisibleForTesting
-void replayAllFailedBatches() throws ExecutionException, InterruptedException
+private void replayAllFailedBatches() throws ExecutionException, InterruptedException
 {
-if (!isReplaying.compareAndSet(false, true))
-return;
-
 logger.debug("Started replayAllFailedBatches");
 
 // rate limit is in bytes per second. Uses Double.MAX_VALUE if disabled (set to 0 in cassandra.yaml).
@@ -169,34 +170,27 @@ public class BatchlogManager implements BatchlogManagerMBean
 int throttleInKB = DatabaseDescriptor.getBatchlogReplayThrottleInKB() / StorageService.instance.getTokenMetadata().getAllEndpoints().size();
 RateLimiter rateLimiter = RateLimiter.create(throttleInKB == 0 ? Double.MAX_VALUE : throttleInKB * 1024);
 
-try
-{
-UntypedResultSet page = process("SELECT id, data, written_at, version FROM %s.%s LIMIT %d",
-Keyspace.SYSTEM_KS,
- 

[2/2] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2014-10-16 Thread aleksey
Merge branch 'cassandra-2.0' into cassandra-2.1

Conflicts:
CHANGES.txt
src/java/org/apache/cassandra/db/BatchlogManager.java


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/440824c1
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/440824c1
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/440824c1

Branch: refs/heads/cassandra-2.1
Commit: 440824c1a60a344bc3e8a5ad35ae2fac879bd61d
Parents: 014d328 e916dff
Author: Aleksey Yeschenko 
Authored: Fri Oct 17 03:40:17 2014 +0300
Committer: Aleksey Yeschenko 
Committed: Fri Oct 17 03:40:17 2014 +0300

--
 CHANGES.txt |  1 +
 .../apache/cassandra/db/BatchlogManager.java| 64 ++--
 .../cassandra/service/StorageService.java   | 25 ++--
 .../cassandra/db/BatchlogManagerTest.java   |  8 +--
 4 files changed, 57 insertions(+), 41 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/440824c1/CHANGES.txt
--
diff --cc CHANGES.txt
index b40e14b,73aaab0..d7a8904
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,89 -1,5 +1,90 @@@
 -2.0.11:
 +2.1.1
 + * Fix IllegalArgumentException when a list of IN values containing tuples
 +   is passed as a single arg to a prepared statement with the v1 or v2
 +   protocol (CASSANDRA-8062)
 + * Fix ClassCastException in DISTINCT query on static columns with
 +   query paging (CASSANDRA-8108)
 + * Fix NPE on null nested UDT inside a set (CASSANDRA-8105)
 + * Fix exception when querying secondary index on set items or map keys
 +   when some clustering columns are specified (CASSANDRA-8073)
 + * Send proper error response when there is an error during native
 +   protocol message decode (CASSANDRA-8118)
 + * Gossip should ignore generation numbers too far in the future (CASSANDRA-8113)
 + * Fix NPE when creating a table with frozen sets, lists (CASSANDRA-8104)
 + * Fix high memory use due to tracking reads on incrementally opened sstable
 +   readers (CASSANDRA-8066)
 + * Fix EXECUTE request with skipMetadata=false returning no metadata
 +   (CASSANDRA-8054)
 + * Allow concurrent use of CQLBulkOutputFormat (CASSANDRA-7776)
 + * Shutdown JVM on OOM (CASSANDRA-7507)
 + * Upgrade netty version and enable epoll event loop (CASSANDRA-7761)
 + * Don't duplicate sstables smaller than split size when using
 +   the sstablesplitter tool (CASSANDRA-7616)
 + * Avoid re-parsing already prepared statements (CASSANDRA-7923)
 + * Fix some Thrift slice deletions and updates of COMPACT STORAGE
 +   tables with some clustering columns omitted (CASSANDRA-7990)
 + * Fix filtering for CONTAINS on sets (CASSANDRA-8033)
 + * Properly track added size (CASSANDRA-7239)
 + * Allow compilation in java 8 (CASSANDRA-7208)
 + * Fix Assertion error on RangeTombstoneList diff (CASSANDRA-8013)
 + * Release references to overlapping sstables during compaction (CASSANDRA-7819)
 + * Send notification when opening compaction results early (CASSANDRA-8034)
 + * Make native server start block until properly bound (CASSANDRA-7885)
 + * (cqlsh) Fix IPv6 support (CASSANDRA-7988)
 + * Ignore fat clients when checking for endpoint collision (CASSANDRA-7939)
 + * Make sstablerepairedset take a list of files (CASSANDRA-7995)
 + * (cqlsh) Tab completion for indexes on map keys (CASSANDRA-7972)
 + * (cqlsh) Fix UDT field selection in select clause (CASSANDRA-7891)
 + * Fix resource leak in event of corrupt sstable
 + * (cqlsh) Add command line option for cqlshrc file path (CASSANDRA-7131)
 + * Provide visibility into prepared statements churn (CASSANDRA-7921, CASSANDRA-7930)
 + * Invalidate prepared statements when their keyspace or table is
 +   dropped (CASSANDRA-7566)
 + * cassandra-stress: fix support for NetworkTopologyStrategy (CASSANDRA-7945)
 + * Fix saving caches when a table is dropped (CASSANDRA-7784)
 + * Add better error checking of new stress profile (CASSANDRA-7716)
 + * Use ThreadLocalRandom and remove FBUtilities.threadLocalRandom (CASSANDRA-7934)
 + * Prevent operator mistakes due to simultaneous bootstrap (CASSANDRA-7069)
 + * cassandra-stress supports whitelist mode for node config (CASSANDRA-7658)
 + * GCInspector more closely tracks GC; cassandra-stress and nodetool report it (CASSANDRA-7916)
 + * nodetool won't output bogus ownership info without a keyspace (CASSANDRA-7173)
 + * Add human readable option to nodetool commands (CASSANDRA-5433)
 + * Don't try to set repairedAt on old sstables (CASSANDRA-7913)
 + * Add metrics for tracking PreparedStatement use (CASSANDRA-7719)
 + * (cqlsh) tab-completion for triggers (CASSANDRA-7824)
 + * (cqlsh) Support for query paging (CASSANDRA-7514)
 + * (cqlsh) Show progress of COPY operations (CASSANDRA-7789)
 + * Add syntax to re

git commit: Force batchlog replay before decommissioning a node

2014-10-16 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.0 67db1bf27 -> e916dff8b


Force batchlog replay before decommissioning a node

patch by Branimir Lambov; reviewed by Aleksey Yeschenko for
CASSANDRA-7446


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e916dff8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e916dff8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e916dff8

Branch: refs/heads/cassandra-2.0
Commit: e916dff8ba032d878ad4435eb7175c6a56f79ef4
Parents: 67db1bf
Author: Branimir Lambov 
Authored: Fri Oct 17 03:18:37 2014 +0300
Committer: Aleksey Yeschenko 
Committed: Fri Oct 17 03:18:37 2014 +0300

--
 CHANGES.txt |  1 +
 .../apache/cassandra/db/BatchlogManager.java| 63 ++--
 .../cassandra/service/StorageService.java   | 25 ++--
 .../cassandra/db/BatchlogManagerTest.java   |  8 +--
 4 files changed, 57 insertions(+), 40 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e916dff8/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index cd4b6bb..73aaab0 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.11:
+ * Force batchlog replay before decommissioning a node (CASSANDRA-7446)
  * Fix hint replay with many accumulated expired hints (CASSANDRA-6998)
  * Fix duplicate results in DISTINCT queries on static columns with query
paging (CASSANDRA-8108)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e916dff8/src/java/org/apache/cassandra/db/BatchlogManager.java
--
diff --git a/src/java/org/apache/cassandra/db/BatchlogManager.java b/src/java/org/apache/cassandra/db/BatchlogManager.java
index b92c217..48f4c3c 100644
--- a/src/java/org/apache/cassandra/db/BatchlogManager.java
+++ b/src/java/org/apache/cassandra/db/BatchlogManager.java
@@ -25,7 +25,6 @@ import java.net.InetAddress;
 import java.nio.ByteBuffer;
 import java.util.*;
 import java.util.concurrent.*;
-import java.util.concurrent.atomic.AtomicBoolean;
 import java.util.concurrent.atomic.AtomicLong;
 import javax.management.MBeanServer;
 import javax.management.ObjectName;
@@ -69,8 +68,8 @@ public class BatchlogManager implements BatchlogManagerMBean
 public static final BatchlogManager instance = new BatchlogManager();
 
 private final AtomicLong totalBatchesReplayed = new AtomicLong();
-private final AtomicBoolean isReplaying = new AtomicBoolean();
 
+// Single-thread executor service for scheduling and serializing log replay.
 public static final ScheduledExecutorService batchlogTasks = new DebuggableScheduledThreadPoolExecutor("BatchlogTasks");
 
 public void start()
@@ -108,6 +107,11 @@ public class BatchlogManager implements BatchlogManagerMBean
 
 public void forceBatchlogReplay()
 {
+startBatchlogReplay();
+}
+
+public Future<?> startBatchlogReplay()
+{
 Runnable runnable = new WrappedRunnable()
 {
 public void runMayThrow() throws ExecutionException, InterruptedException
@@ -115,7 +119,8 @@ public class BatchlogManager implements BatchlogManagerMBean
 replayAllFailedBatches();
 }
 };
-batchlogTasks.execute(runnable);
+// If a replay is already in progress this request will be executed after it completes.
+return batchlogTasks.submit(runnable);
 }
 
 public static RowMutation getBatchlogMutationFor(Collection<RowMutation> mutations, UUID uuid)
@@ -156,12 +161,8 @@ public class BatchlogManager implements BatchlogManagerMBean
 return ByteBuffer.wrap(bos.toByteArray());
 }
 
-@VisibleForTesting
-void replayAllFailedBatches() throws ExecutionException, InterruptedException
+private void replayAllFailedBatches() throws ExecutionException, InterruptedException
 {
-if (!isReplaying.compareAndSet(false, true))
-return;
-
 logger.debug("Started replayAllFailedBatches");
 
 // rate limit is in bytes per second. Uses Double.MAX_VALUE if disabled (set to 0 in cassandra.yaml).
@@ -169,34 +170,27 @@ public class BatchlogManager implements BatchlogManagerMBean
 int throttleInKB = DatabaseDescriptor.getBatchlogReplayThrottleInKB() / StorageService.instance.getTokenMetadata().getAllEndpoints().size();
 RateLimiter rateLimiter = RateLimiter.create(throttleInKB == 0 ? Double.MAX_VALUE : throttleInKB * 1024);
 
-try
-{
-UntypedResultSet page = process("SELECT id, data, written_at, version FROM %s.%s LIMIT %d",
-Keyspace.SYSTEM_KS,
- 
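The pattern this patch adopts can be sketched outside Cassandra: a single-threaded executor serializes replay tasks, and submit() hands back a Future, so a caller that waits on it is guaranteed that any previously queued replay has also finished. The class and field names below are illustrative only, not Cassandra's actual API.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch, not Cassandra's BatchlogManager: a single-threaded
// executor serializes "replay" tasks, and submit() returns a Future the
// caller can block on before proceeding (e.g. before a decommission).
public class SerializedReplay {
    final ScheduledExecutorService tasks = Executors.newSingleThreadScheduledExecutor();
    final AtomicInteger replays = new AtomicInteger();

    public Future<?> startReplay() {
        // If a replay is already in progress, this one queues behind it.
        return tasks.submit(() -> { replays.incrementAndGet(); });
    }

    public static void main(String[] args) throws Exception {
        SerializedReplay mgr = new SerializedReplay();
        mgr.startReplay();
        Future<?> second = mgr.startReplay();
        second.get();  // single thread => both queued replays ran, in order
        System.out.println(mgr.replays.get()); // 2
        mgr.tasks.shutdown();
    }
}
```

This is why a caller such as the decommission path can call the new startBatchlogReplay() and wait on the returned Future instead of relying on the old fire-and-forget execute().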

[jira] [Commented] (CASSANDRA-7713) CommitLogTest failure causes cascading unit test failures

2014-10-16 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174435#comment-14174435
 ] 

Michael Shuler commented on CASSANDRA-7713:
---

I'm halfway thinking that the way to handle this timeout problem is to add a test that runs immediately after CommitLogTest and simply resets 
{{commitDir.setWritable(true);}}

> CommitLogTest failure causes cascading unit test failures
> -
>
> Key: CASSANDRA-7713
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7713
> Project: Cassandra
>  Issue Type: Test
>Reporter: Michael Shuler
>Assignee: Bogdan Kanivets
> Fix For: 2.0.11
>
> Attachments: CommitLogTest.system.log.txt
>
>
> When CommitLogTest.testCommitFailurePolicy_stop fails or times out, 
> {{commitDir.setWritable(true)}} is never reached, so the 
> build/test/cassandra/commitlog directory is left without write permissions, 
> causing cascading failure of all subsequent tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7713) CommitLogTest failure causes cascading unit test failures

2014-10-16 Thread Michael Shuler (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Shuler updated CASSANDRA-7713:
--
Reproduced In: 2.1.0, 2.0.10
Since Version: 2.0.6
   Tester: Michael Shuler

> CommitLogTest failure causes cascading unit test failures
> -
>
> Key: CASSANDRA-7713
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7713
> Project: Cassandra
>  Issue Type: Test
>Reporter: Michael Shuler
>Assignee: Bogdan Kanivets
> Fix For: 2.0.11
>
> Attachments: CommitLogTest.system.log.txt
>
>
> When CommitLogTest.testCommitFailurePolicy_stop fails or times out, 
> {{commitDir.setWritable(true)}} is never reached, so the 
> build/test/cassandra/commitlog directory is left without write permissions, 
> causing cascading failure of all subsequent tests.





[jira] [Commented] (CASSANDRA-7713) CommitLogTest failure causes cascading unit test failures

2014-10-16 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174392#comment-14174392
 ] 

Michael Shuler commented on CASSANDRA-7713:
---

[~dankan] have you had any luck with this?

> CommitLogTest failure causes cascading unit test failures
> -
>
> Key: CASSANDRA-7713
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7713
> Project: Cassandra
>  Issue Type: Test
>Reporter: Michael Shuler
>Assignee: Bogdan Kanivets
> Fix For: 2.0.11
>
> Attachments: CommitLogTest.system.log.txt
>
>
> When CommitLogTest.testCommitFailurePolicy_stop fails or times out, 
> {{commitDir.setWritable(true)}} is never reached, so the 
> build/test/cassandra/commitlog directory is left without write permissions, 
> causing cascading failure of all subsequent tests.





[jira] [Commented] (CASSANDRA-7713) CommitLogTest failure causes cascading unit test failures

2014-10-16 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174391#comment-14174391
 ] 

Michael Shuler commented on CASSANDRA-7713:
---

An @AfterClass didn't help, either.
{noformat}
diff --git a/test/unit/org/apache/cassandra/db/CommitLogTest.java b/test/unit/org/apache/cassandra/db/CommitLogTest.java
index 1be29a6..19faace 100644
--- a/test/unit/org/apache/cassandra/db/CommitLogTest.java
+++ b/test/unit/org/apache/cassandra/db/CommitLogTest.java
@@ -29,6 +29,7 @@ import java.util.zip.Checksum;
 
 import com.google.common.util.concurrent.Uninterruptibles;
 import org.junit.Assert;
+import org.junit.AfterClass;
 import org.junit.Test;
 
 import org.apache.cassandra.SchemaLoader;
@@ -48,6 +49,15 @@ import static org.apache.cassandra.utils.ByteBufferUtil.bytes;
 
 public class CommitLogTest extends SchemaLoader
 {
+@AfterClass
+public static void resetCommitLogDir()
+{
+// junit timeout leaves commitDir non-writable - CASSANDRA-7713
+File commitDir = new File(DatabaseDescriptor.getCommitLogLocation());
+commitDir.setWritable(true);
+System.out.println("reset commitlogdir: " + commitDir);
+}
+
 @Test
 public void testRecoveryWithEmptyLog() throws Exception
 {
{noformat}
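A try/finally variant of the cleanup being discussed can also be sketched: restore the write bit in the same method that cleared it, so an ordinary assertion failure or exception cannot leave the directory read-only. A hard process kill (as with ant's junit task timeout) would still defeat this, and the directory name below is hypothetical, not Cassandra's real commitlog location.

```java
import java.io.File;

// Illustrative sketch: clear and restore a directory's write bit around
// a "test body", with the restore in finally so it runs on success,
// failure, or exception (but not if the JVM is killed outright).
public class CleanupSketch {
    static void exerciseReadOnly(File dir) {
        dir.setWritable(false);
        try {
            // ... the failure-policy test body would run here ...
        } finally {
            dir.setWritable(true);  // always restore the write bit
        }
    }

    public static void main(String[] args) {
        File dir = new File(System.getProperty("java.io.tmpdir"), "commitlog-sketch");
        dir.mkdirs();
        exerciseReadOnly(dir);
        System.out.println(dir.canWrite()); // true: the finally block restored it
    }
}
```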

> CommitLogTest failure causes cascading unit test failures
> -
>
> Key: CASSANDRA-7713
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7713
> Project: Cassandra
>  Issue Type: Test
>Reporter: Michael Shuler
>Assignee: Bogdan Kanivets
> Fix For: 2.0.11
>
> Attachments: CommitLogTest.system.log.txt
>
>
> When CommitLogTest.testCommitFailurePolicy_stop fails or times out, 
> {{commitDir.setWritable(true)}} is never reached, so the 
> build/test/cassandra/commitlog directory is left without write permissions, 
> causing cascading failure of all subsequent tests.





[jira] [Commented] (CASSANDRA-8028) Unable to compute when histogram overflowed

2014-10-16 Thread Cameron Hatfield (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174385#comment-14174385
 ] 

Cameron Hatfield commented on CASSANDRA-8028:
-

Looks like this doesn't fully resolve the issue. According to running sstablemetadata on a 2.1.0 sstable file, as well as MetadataCollector.java:
https://github.com/apache/cassandra/blob/8d8fed52242c34b477d0384ba1d1ce3978efbbe8/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java#L59

the sstable metadata persisted for these histograms is actually stored with a larger number of buckets than 90. The issue seems to be that both nodetool,
https://github.com/apache/cassandra/blob/810c2d5fe64333c0bcfe0b2ed3ea2c8f6aaf89b7/src/java/org/apache/cassandra/tools/NodeTool.java#L892,
as well as ColumnFamilyMetrics,
https://github.com/apache/cassandra/blob/ed1f39480606c95ff6595aad0aad9c1af7460f74/src/java/org/apache/cassandra/metrics/ColumnFamilyMetrics.java#L220,
have a hardcoded value of 90. If that were raised, we would be able to display the non-overflowed histograms stored in the metadata.

Example output from sstablemetadata (notice that the number of rows is 115 and 150, not 90 and 90):
[cameron@cass-db01 ]$ sstablemetadata --ka-33-Data.db
SSTable: ./--ka-33
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Bloom Filter FP chance: 0.01
Minimum timestamp: 1413408134518716
Maximum timestamp: 1413410874004562
SSTable max local deletion time: 2147483647
Compression ratio: 0.2157516194949938
Estimated droppable tombstones: 0.026257982293749805
SSTable Level: 0
Repaired at: 0
ReplayPosition(segmentId=1413409259260, position=15051162)
Estimated tombstone drop times:%n
1413408139:  1647
1413408151:  2451
1413408165:  3151
1413408180:  3400
1413408199:  3027
1413408214:  2769
1413408228:  2064
1413408244:  1779
1413408261:  3817
1413408280:  7265
1413408302:  1911
1413408319:  1512
1413408337:  1582
1413408354:  1712
1413408375:  1577
1413408393:  2507
1413408411:  1410
1413408431:   761
1413408447:   507
1413408466:  2593
1413408483:  3840
1413408503:  1557
1413408523:   819
1413409632:   742
1413409646:   641
1413409662:   473
1413409684:   704
1413409700:   762
1413409716:   601
1413409728:   125
1413409744:  1190
1413409763:  1181
1413409783:  1768
1413409800:  1730
1413409820:  1326
1413409837:  1273
1413409856:  1299
1413409871:  2663
1413409887:  2197
1413409901:  1776
1413409917:   871
1413409934:  1449
1413409952:  1700
1413409969:  1301
1413409984:  2100
1413410002:  2103
1413410021:  1208
1413410039:   923
1413410052:  1425
1413410068:  1796
1413410081:  2263
1413410095:  2664
1413410110:  3019
1413410128:  2823
1413410146:  3801
1413410160:  3864
1413410175:  3252
1413410188:  8337
1413410204:  9375
1413410219:  6125
1413410235:  7954
1413410254: 11019
1413410271: 12703
1413410287: 12274
1413410303: 12199
1413410317: 10751
1413410330: 11369
1413410343: 10552
1413410355:  8157
1413410369:  8776
1413410384:  7504
1413410400:  7312
1413410418:  7472
1413410434:  7032
1413410448:  6338
1413410465:  5335
1413410484:  6427
1413410504:  7897
1413410523:  8515
1413410539:  4886
1413410557:  4847
1413410576:  4987
1413410591:  7630
1413410611:  8553
1413410628: 12157
1413410645: 12740
1413410663: 13756
1413410679: 19249
1413410695: 19374
1413410713: 15390
1413410732: 13493
1413410746: 13793
1413410760: 16937
1413410775: 19841
1413410791: 16595
1413410808: 19050
1413410823: 18450
1413410840: 22497
1413410861: 34027
1413410872:16
Count   Row Size   Cell Count
1  0 0
2  0 0
3  0 0
4  016
5  0 0
6  0 0
7  0 0
8  035
10 0 0
12 017
14 0 0
17 029
20 014
24 021
29 012
35 013
42 040
50 019
60 030
72 032
86
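The mismatch described above — metadata written with more buckets than the display code reads — can be modeled with a toy check. The names and numbers below are illustrative only, not Cassandra's EstimatedHistogram API.

```java
// Toy model: the sstable metadata records more histogram buckets than the
// reader expects, so counts past the reader's last bucket look like overflow.
public class BucketMismatch {
    // Report whether counts exist past the reader's last bucket, i.e.
    // whether a reader limited to `limit` buckets would lose data.
    static boolean overflows(long[] recorded, int limit) {
        for (int i = limit; i < recorded.length; i++)
            if (recorded[i] > 0)
                return true;
        return false;
    }

    public static void main(String[] args) {
        long[] recorded = new long[115]; // metadata written with 115 buckets
        recorded[100] = 7;               // a count beyond bucket 90
        System.out.println(overflows(recorded, 90));  // true: a hardcoded 90 loses it
        System.out.println(overflows(recorded, 115)); // false: full width sees it
    }
}
```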

[jira] [Commented] (CASSANDRA-8133) BulkLoader does not use rpc_endpoints

2014-10-16 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174378#comment-14174378
 ] 

Brandon Williams commented on CASSANDRA-8133:
-

bq. sstableloader appears to stream to listen_addresses instead of rpc_addresses

It needs to, though.  It's streaming sstables straight into Cassandra, not 
sending rpc writes.  listen_address can't bind multiple interfaces, so it's the 
only possible choice.

> BulkLoader does not use rpc_endpoints
> -
>
> Key: CASSANDRA-8133
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8133
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Umair Mufti
>  Labels: lhf
> Attachments: 0001-BulkLoader-with-rpc_endpoints.patch
>
>
> sstableloader appears to stream to listen_addresses instead of rpc_addresses. 
> This causes sstableloader to fail when streaming to nodes which bind to 
> multiple interfaces.
>  
> The problem seems to stem from BulkLoader populating the endpointToRanges map 
> with incorrect values. BulkLoader always uses the TokenRange's endpoints 
> list. 
>  
> Attached is a patch which uses the rpc_endpoints list of the TokenRange if it 
> exists. Otherwise, it uses the standard endpoints list.





[jira] [Created] (CASSANDRA-8133) BulkLoader does not use rpc_endpoints

2014-10-16 Thread Umair Mufti (JIRA)
Umair Mufti created CASSANDRA-8133:
--

 Summary: BulkLoader does not use rpc_endpoints
 Key: CASSANDRA-8133
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8133
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Umair Mufti
 Attachments: 0001-BulkLoader-with-rpc_endpoints.patch

sstableloader appears to stream to listen_addresses instead of rpc_addresses. 
This causes sstableloader to fail when streaming to nodes which bind to 
multiple interfaces.
 
The problem seems to stem from BulkLoader populating the endpointToRanges map 
with incorrect values. BulkLoader always uses the TokenRange's endpoints list. 
 
Attached is a patch which uses the rpc_endpoints list of the TokenRange if it 
exists. Otherwise, it uses the standard endpoints list.





[jira] [Commented] (CASSANDRA-7713) CommitLogTest failure causes cascading unit test failures

2014-10-16 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174345#comment-14174345
 ] 

Michael Shuler commented on CASSANDRA-7713:
---

I tried adding an @After block to the test, which appears not to be run when the timeout occurs... :(
{noformat}
diff --git a/test/unit/org/apache/cassandra/db/CommitLogTest.java b/test/unit/org/apache/cassandra/db/CommitLogTest.java
index 1be29a6..f30d527 100644
--- a/test/unit/org/apache/cassandra/db/CommitLogTest.java
+++ b/test/unit/org/apache/cassandra/db/CommitLogTest.java
@@ -30,6 +30,7 @@ import java.util.zip.Checksum;
 import com.google.common.util.concurrent.Uninterruptibles;
 import org.junit.Assert;
 import org.junit.Test;
+import org.junit.After;
 
 import org.apache.cassandra.SchemaLoader;
 import org.apache.cassandra.Util;
@@ -318,4 +319,12 @@ public class CommitLogTest extends SchemaLoader
 row = command.getRow(notDurableKs);
 Assert.assertEquals(null, row.cf);
 }
+
+@After
+public void resetCommitLogDir()
+{
+File commitDir = new File(DatabaseDescriptor.getCommitLogLocation());
+commitDir.setWritable(true);
+System.out.println("reset commitlogdir: " + commitDir);
+}
 }
{noformat}

(System.out.println was just for debugging, and this is output during a 
successful run)

> CommitLogTest failure causes cascading unit test failures
> -
>
> Key: CASSANDRA-7713
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7713
> Project: Cassandra
>  Issue Type: Test
>Reporter: Michael Shuler
>Assignee: Bogdan Kanivets
> Fix For: 2.0.11
>
> Attachments: CommitLogTest.system.log.txt
>
>
> When CommitLogTest.testCommitFailurePolicy_stop fails or times out, 
> {{commitDir.setWritable(true)}} is never reached, so the 
> build/test/cassandra/commitlog directory is left without write permissions, 
> causing cascading failure of all subsequent tests.





[jira] [Commented] (CASSANDRA-8132) Save or stream hints to a safe place in node replacement

2014-10-16 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174327#comment-14174327
 ] 

Brandon Williams commented on CASSANDRA-8132:
-

I don't understand.  When you replace a node it has the same host id so the 
existing hints are replayed from wherever they exist.  Where are you proposing 
we stream them and to what end?

> Save or stream hints to a safe place in node replacement
> 
>
> Key: CASSANDRA-8132
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8132
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Minh Do
>Assignee: Minh Do
> Fix For: 2.1.1
>
>
> Often, we need to replace a node with a new instance in the cloud environment 
> where we have all nodes are still alive. To be safe without losing data, we 
> usually make sure all hints are gone before we do this operation.
> Replacement means we just want to shutdown C* process on a node and bring up 
> another instance to take over that node's token.
> However, if a node has a lot of stored hints, HintedHandofManager seems very 
> slow to play the hints.  In our case, we tried to replace a node and had to 
> wait for several days.
> Since this is not a decommission, I am proposing that we have the same 
> hints-streaming mechanism as in the decommission code.  Furthermore, there 
> needs to be a cmd for NodeTool to trigger this.





[jira] [Created] (CASSANDRA-8132) Save or stream hints to a safe place in node replacement

2014-10-16 Thread Minh Do (JIRA)
Minh Do created CASSANDRA-8132:
--

 Summary: Save or stream hints to a safe place in node replacement
 Key: CASSANDRA-8132
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8132
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Minh Do
Assignee: Minh Do
 Fix For: 2.1.1


Often, we need to replace a node with a new instance in a cloud environment where all nodes are still alive. To be safe and avoid losing data, we usually make sure all hints are gone before we do this operation.

Replacement means we just want to shut down the C* process on a node and bring up another instance to take over that node's token.

However, if a node has a lot of stored hints, HintedHandOffManager seems very slow to replay them. In our case, we tried to replace a node and had to wait for several days.

Since this is not a decommission, I am proposing that we have the same hints-streaming mechanism as in the decommission code. Furthermore, there needs to be a NodeTool command to trigger this.






[jira] [Updated] (CASSANDRA-8131) Short-circuited query results from collection index query

2014-10-16 Thread Catalin Alexandru Zamfir (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Catalin Alexandru Zamfir updated CASSANDRA-8131:

Description: 
After watching Jonathan's 2014 summit video, I wanted to give collection 
indexes a try as they seem to be a fit for a "search by key/values" usage 
pattern we have in our setup. Doing some test queries that I expect users would 
do against the table, a short-circuit behavior came up:

Here's the whole transcript:
{noformat}
CREATE TABLE by_sets (id int PRIMARY KEY, datakeys set<text>, datavars set<text>);
CREATE INDEX by_sets_datakeys ON by_sets (datakeys);
CREATE INDEX by_sets_datavars ON by_sets (datavars);
INSERT INTO by_sets (id, datakeys, datavars) VALUES (1, {'a'}, {'b'});
INSERT INTO by_sets (id, datakeys, datavars) VALUES (2, {'c'}, {'d'});
INSERT INTO by_sets (id, datakeys, datavars) VALUES (3, {'e'}, {'f'});
INSERT INTO by_sets (id, datakeys, datavars) VALUES (4, {'a'}, {'z'});
SELECT * FROM by_sets;

 id | datakeys | datavars
+--+--
  1 |{'a'} |{'b'}
  2 |{'c'} |{'d'}
  4 |{'a'} |{'z'}
  3 |{'e'} |{'f'}

{noformat}
We then tried this query which short-circuited:
{noformat}
SELECT * FROM by_sets WHERE datakeys CONTAINS 'a' AND datakeys CONTAINS 'c';

 id | datakeys | datavars
+--+--
  1 |{'a'} |{'b'}
  4 |{'a'} |{'z'}

(2 rows)
{noformat}
Instead of receiving 3 rows, which match datakeys CONTAINS 'a' AND datakeys CONTAINS 'c', we only got the first.

Doing the same, but with CONTAINS 'c' first, ignores the second AND.
{noformat}
SELECT * FROM by_sets WHERE datakeys CONTAINS 'c' AND datakeys CONTAINS 'a' ;

 id | datakeys | datavars
+--+--
  2 |{'c'} |{'d'}

(1 rows)
{noformat}
Also, on a side-note, I have two indexes on both datakeys and datavars. But 
when trying to run a query such as:
{noformat}
select * from by_sets WHERE datakeys CONTAINS 'a' AND datavars CONTAINS 'z';
code=2200 [Invalid query] message="Cannot execute this query as it might 
involve data filtering and thus may have unpredictable performance. 
If you want to execute this query despite the performance unpredictability, use 
ALLOW FILTERING"
{noformat}
The second column, after AND (even if I invert the order), requires an "allow filtering" clause, yet the column is indexed, and an in-memory "join" of the primary keys of these sets on the coordinator could build up the result.

Could anyone explain the short-circuit behavior?
And the requirement for "allow-filtering" on a second indexed column?

If these are not bugs but intended behavior, their limitations should at least be better documented.
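For what it's worth, the coordinator-side in-memory "join" suggested here can be sketched as a plain intersection of per-value primary-key sets. The postings map below is a hypothetical stand-in for a secondary index, not Cassandra's internals. Note that under strict AND semantics the transcript's two CONTAINS predicates intersect to the empty set, since no single row's datakeys contains both 'a' and 'c'.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch: evaluate each CONTAINS predicate against its index
// and intersect the resulting primary-key sets, as a coordinator-side join
// would. The index map is a made-up stand-in for a secondary index.
public class ContainsIntersection {
    static Set<Integer> rowsContaining(Map<String, Set<Integer>> index, String value) {
        return index.getOrDefault(value, Collections.emptySet());
    }

    public static void main(String[] args) {
        // datakeys index from the transcript: 'a' -> rows {1, 4}, 'c' -> {2}
        Map<String, Set<Integer>> datakeys = Map.of(
                "a", Set.of(1, 4),
                "c", Set.of(2));

        Set<Integer> result = new HashSet<>(rowsContaining(datakeys, "a"));
        result.retainAll(rowsContaining(datakeys, "c")); // AND = intersection
        System.out.println(result); // []
    }
}
```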

  was:
After watching Jonathan's 2014 summit video, I wanted to give collection 
indexes a try as they seem to be a fit for a "search by key/values" usage 
pattern we have in our setup. Doing some test queries that I expect users would 
do against the table, a short-circuit behavior came up:

Here's the whole transcript:
{noformat}
CREATE TABLE by_sets (id int PRIMARY KEY, datakeys set<text>, datavars set<text>);
CREATE INDEX by_sets_datakeys ON by_sets (datakeys);
CREATE INDEX by_sets_datavars ON by_sets (datavars);
INSERT INTO by_sets (id, datakeys, datavars) VALUES (1, {'a'}, {'b'});
INSERT INTO by_sets (id, datakeys, datavars) VALUES (2, {'c'}, {'d'});
INSERT INTO by_sets (id, datakeys, datavars) VALUES (3, {'e'}, {'f'});
INSERT INTO by_sets (id, datakeys, datavars) VALUES (4, {'a'}, {'z'});
SELECT * FROM by_sets;


 id | datakeys | datavars
+--+--
  1 |{'a'} |{'b'}
  2 |{'c'} |{'d'}
  4 |{'a'} |{'z'}
  3 |{'e'} |{'f'}

{noformat}
We then tried this query which short-circuited:
{noformat}
SELECT * FROM by_sets WHERE datakeys CONTAINS 'a' AND datakeys CONTAINS 'c';


 id | datakeys | datavars
+--+--
  1 |{'a'} |{'b'}
  4 |{'a'} |{'z'}

(2 rows)

{noformat}
Instead of receiving 3 rows, which match datakeys CONTAINS 'a' AND datakeys CONTAINS 'c', we only got the first.

Doing the same, but with CONTAINS 'c' first, ignores the second AND.
{noformat}
#> SELECT * FROM by_sets WHERE datakeys CONTAINS 'c' AND datakeys CONTAINS 'a' ;


 id | datakeys | datavars
+--+--
  2 |{'c'} |{'d'}

(1 rows)

{noformat}
Also, on a side-note, I have two indexes on both datakeys and datavars. But 
when trying to run a query such as:
{noformat}
#> select * from by_sets WHERE datakeys CONTAINS 'a' AND datavars CONTAINS 'z';
code=2200 [Invalid query] message="Cannot execute this query as it might 
involve data filtering and thus may have unpredictable performance. 
If you want to execute this query despite the performance unpredictability, use 
ALLOW FILTERING"
{noformat}
The second column, after AND (even if I inverse the order) requires an "allow 
filtering" clause yet the column 

[jira] [Updated] (CASSANDRA-8131) Short-circuited query results from collection index query

2014-10-16 Thread Catalin Alexandru Zamfir (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Catalin Alexandru Zamfir updated CASSANDRA-8131:

Description: 
After watching Jonathan's 2014 summit video, I wanted to give collection 
indexes a try as they seem to be a fit for a "search by key/values" usage 
pattern we have in our setup. Doing some test queries that I expect users would 
do against the table, a short-circuit behavior came up:

Here's the whole transcript:
{noformat}
CREATE TABLE by_sets (id int PRIMARY KEY, datakeys set<text>, datavars set<text>);
CREATE INDEX by_sets_datakeys ON by_sets (datakeys);
CREATE INDEX by_sets_datavars ON by_sets (datavars);
INSERT INTO by_sets (id, datakeys, datavars) VALUES (1, {'a'}, {'b'});
INSERT INTO by_sets (id, datakeys, datavars) VALUES (2, {'c'}, {'d'});
INSERT INTO by_sets (id, datakeys, datavars) VALUES (3, {'e'}, {'f'});
INSERT INTO by_sets (id, datakeys, datavars) VALUES (4, {'a'}, {'z'});
SELECT * FROM by_sets;


 id | datakeys | datavars
+--+--
  1 |{'a'} |{'b'}
  2 |{'c'} |{'d'}
  4 |{'a'} |{'z'}
  3 |{'e'} |{'f'}

{noformat}
We then tried this query which short-circuited:
{noformat}
SELECT * FROM by_sets WHERE datakeys CONTAINS 'a' AND datakeys CONTAINS 'c';


 id | datakeys | datavars
+--+--
  1 |{'a'} |{'b'}
  4 |{'a'} |{'z'}

(2 rows)

{noformat}
Instead of receiving 3 rows matching datakeys CONTAINS 'a' AND datakeys 
CONTAINS 'c', we only got the rows matching the first predicate.

Doing the same, but with CONTAINS 'c' first, ignores the second AND.
{noformat}
#> SELECT * FROM by_sets WHERE datakeys CONTAINS 'c' AND datakeys CONTAINS 'a' ;


 id | datakeys | datavars
+--+--
  2 |{'c'} |{'d'}

(1 rows)

{noformat}
Also, as a side note, I have indexes on both datakeys and datavars. But 
when trying to run a query such as:
{noformat}
#> select * from by_sets WHERE datakeys CONTAINS 'a' AND datavars CONTAINS 'z';
code=2200 [Invalid query] message="Cannot execute this query as it might 
involve data filtering and thus may have unpredictable performance. 
If you want to execute this query despite the performance unpredictability, use 
ALLOW FILTERING"
{noformat}
The second column after AND (even if I reverse the order) requires an "allow 
filtering" clause, yet the column is indexed, and an in-memory "join" of the 
primary keys of these sets on the coordinator could build up the result.

Could anyone explain the short-circuit behavior?
And why is an "allow filtering" clause required on a second indexed column?

If these are not bugs but intended behavior, they should be documented better, 
at least their limitations.
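The "in-memory join" workaround suggested above can be sketched on the client 
side. The following Python sketch is illustrative only, not Cassandra code: the 
`rows` data mirrors the transcript's sample rows, and `ids_containing` stands 
in for one indexed `SELECT ... WHERE <column> CONTAINS <value>` query per 
predicate.

```python
# Client-side sketch of the suggested in-memory "join": run one indexed
# query per CONTAINS predicate, then intersect the returned primary keys.
# Sample data mirrors the transcript above; names are hypothetical.
rows = {
    1: {"datakeys": {"a"}, "datavars": {"b"}},
    2: {"datakeys": {"c"}, "datavars": {"d"}},
    3: {"datakeys": {"e"}, "datavars": {"f"}},
    4: {"datakeys": {"a"}, "datavars": {"z"}},
}

def ids_containing(column, value):
    """Stand-in for: SELECT id FROM by_sets WHERE <column> CONTAINS <value>."""
    return {rid for rid, r in rows.items() if value in r[column]}

# datakeys CONTAINS 'a' AND datavars CONTAINS 'z': intersect the id sets.
result = ids_containing("datakeys", "a") & ids_containing("datavars", "z")
print(sorted(result))  # [4]
```

Note that under true AND semantics the transcript's first query, datakeys 
CONTAINS 'a' AND datakeys CONTAINS 'c', would intersect to the empty set, 
since each sample row's datakeys holds a single element.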

  was:
After watching Jonathan's 2014 summit video, I wanted to give collection 
indexes a try as they seem to be a fit for a "search by key/values" usage 
pattern we have in our setup. Doing some test queries that I expect users would 
do against the table, a short-circuit behavior came up:

Here's the whole transcript:
{noformat}
create table by_sets (id int PRIMARY KEY, datakeys set&lt;text&gt;, datavars 
set&lt;text&gt;);
CREATE INDEX by_sets_datakeys ON by_sets (datakeys);
CREATE INDEX by_sets_datavars ON by_sets (datavars);
insert into by_sets (id, datakeys, datavars) values (1, {'a'}, {'b'});
insert into by_sets (id, datakeys, datavars) values (2, {'c'}, {'d'});
insert into by_sets (id, datakeys, datavars) values (3, {'e'}, {'f'});
insert into by_sets (id, datakeys, datavars) values (4, {'a'}, {'z'});
select * from by_sets;


 id | datakeys | datavars
+--+--
  1 |{'a'} |{'b'}
  2 |{'c'} |{'d'}
  4 |{'a'} |{'z'}
  3 |{'e'} |{'f'}

{noformat}
We then tried this query which short-circuited:
{noformat}
select * from by_sets WHERE datakeys CONTAINS 'a' AND datakeys CONTAINS 'c';


 id | datakeys | datavars
+--+--
  1 |{'a'} |{'b'}
  4 |{'a'} |{'z'}

(2 rows)

{noformat}
Instead of receiving 3 rows matching datakeys CONTAINS 'a' AND datakeys 
CONTAINS 'c', we only got the rows matching the first predicate.

Doing the same, but with CONTAINS 'c' first, ignores the second AND.
{noformat}
#> select * from by_sets WHERE datakeys CONTAINS 'c' AND datakeys CONTAINS 'a';


 id | datakeys | datavars
+--+--
  2 |{'c'} |{'d'}

(1 rows)

{noformat}
Also, as a side note, I have indexes on both datakeys and datavars. But 
when trying to run a query such as:
{noformat}
#> select * from by_sets WHERE datakeys CONTAINS 'a' AND datavars CONTAINS 'z';
code=2200 [Invalid query] message="Cannot execute this query as it might 
involve data filtering and thus may have unpredictable performance. If you want 
to execute this query despite the performance unpredictability, use ALLOW 
FILTERING"
{noformat}
The second column after AND (even if I reverse the order) requires an "allow 
filtering" clause, yet the column is indexed, and an in-memory "join" of the 
primary keys of these sets on the coordinator could build up the result.

[jira] [Updated] (CASSANDRA-8131) Short-circuited query results from collection index query

2014-10-16 Thread Catalin Alexandru Zamfir (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Catalin Alexandru Zamfir updated CASSANDRA-8131:

Description: 
After watching Jonathan's 2014 summit video, I wanted to give collection 
indexes a try as they seem to be a fit for a "search by key/values" usage 
pattern we have in our setup. Doing some test queries that I expect users would 
do against the table, a short-circuit behavior came up:

Here's the whole transcript:
{noformat}
create table by_sets (id int PRIMARY KEY, datakeys set&lt;text&gt;, datavars 
set&lt;text&gt;);
CREATE INDEX by_sets_datakeys ON by_sets (datakeys);
CREATE INDEX by_sets_datavars ON by_sets (datavars);
insert into by_sets (id, datakeys, datavars) values (1, {'a'}, {'b'});
insert into by_sets (id, datakeys, datavars) values (2, {'c'}, {'d'});
insert into by_sets (id, datakeys, datavars) values (3, {'e'}, {'f'});
insert into by_sets (id, datakeys, datavars) values (4, {'a'}, {'z'});
select * from by_sets;


 id | datakeys | datavars
+--+--
  1 |{'a'} |{'b'}
  2 |{'c'} |{'d'}
  4 |{'a'} |{'z'}
  3 |{'e'} |{'f'}

{noformat}
We then tried this query which short-circuited:
{noformat}
select * from by_sets WHERE datakeys CONTAINS 'a' AND datakeys CONTAINS 'c';


 id | datakeys | datavars
+--+--
  1 |{'a'} |{'b'}
  4 |{'a'} |{'z'}

(2 rows)

{noformat}
Instead of receiving 3 rows matching datakeys CONTAINS 'a' AND datakeys 
CONTAINS 'c', we only got the rows matching the first predicate.

Doing the same, but with CONTAINS 'c' first, ignores the second AND.
{noformat}
#> select * from by_sets WHERE datakeys CONTAINS 'c' AND datakeys CONTAINS 'a';


 id | datakeys | datavars
+--+--
  2 |{'c'} |{'d'}

(1 rows)

{noformat}
Also, as a side note, I have indexes on both datakeys and datavars. But 
when trying to run a query such as:
{noformat}
#> select * from by_sets WHERE datakeys CONTAINS 'a' AND datavars CONTAINS 'z';
code=2200 [Invalid query] message="Cannot execute this query as it might 
involve data filtering and thus may have unpredictable performance. If you want 
to execute this query despite the performance unpredictability, use ALLOW 
FILTERING"
{noformat}
The second column after AND (even if I reverse the order) requires an "allow 
filtering" clause, yet the column is indexed, and an in-memory "join" of the 
primary keys of these sets on the coordinator could build up the result.

Could anyone explain the short-circuit behavior?
And why is an "allow filtering" clause required on a second indexed column?

If these are not bugs but intended behavior, they should be documented better, 
at least their limitations.

  was:
After watching Jonathan's 2014 summit video, I wanted to give collection 
indexes a try as they seem to be a fit for a "search by key/values" usage 
pattern we have in our setup. Doing some test queries that I expect users would 
do against the table, a short-circuit behavior came up:

Here's the whole transcript (cqlsh):
create table by_sets (id int PRIMARY KEY, datakeys set&lt;text&gt;, datavars 
set&lt;text&gt;);
CREATE INDEX by_sets_datakeys ON by_sets (datakeys);
CREATE INDEX by_sets_datavars ON by_sets (datavars);
insert into by_sets (id, datakeys, datavars) values (1, {'a'}, {'b'});
insert into by_sets (id, datakeys, datavars) values (2, {'c'}, {'d'});
insert into by_sets (id, datakeys, datavars) values (3, {'e'}, {'f'});
insert into by_sets (id, datakeys, datavars) values (4, {'a'}, {'z'});
select * from by_sets;


 id | datakeys | datavars
+--+--
  1 |{'a'} |{'b'}
  2 |{'c'} |{'d'}
  4 |{'a'} |{'z'}
  3 |{'e'} |{'f'}


We then tried this query which short-circuited:

select * from by_sets WHERE datakeys CONTAINS 'a' AND datakeys CONTAINS 'c';


 id | datakeys | datavars
+--+--
  1 |{'a'} |{'b'}
  4 |{'a'} |{'z'}

(2 rows)


Instead of receiving 3 rows matching datakeys CONTAINS 'a' AND datakeys 
CONTAINS 'c', we only got the rows matching the first predicate.

Doing the same, but with CONTAINS 'c' first, ignores the second AND.

#> select * from by_sets WHERE datakeys CONTAINS 'c' AND datakeys CONTAINS 'a';


 id | datakeys | datavars
+--+--
  2 |{'c'} |{'d'}

(1 rows)


Also, as a side note, I have indexes on both datakeys and datavars. But 
when trying to run a query such as:

#> select * from by_sets WHERE datakeys CONTAINS 'a' AND datavars CONTAINS 'z';
code=2200 [Invalid query] message="Cannot execute this query as it might 
involve data filtering and thus may have unpredictable performance. If you want 
to execute this query despite the performance unpredictability, use ALLOW 
FILTERING"

The second column after AND (even if I reverse the order) requires an "allow 
filtering" clause, yet the column is indexed, and an in-memory "join" of the 
primary keys of these sets on the coordinator could build up the result.

[jira] [Commented] (CASSANDRA-7623) Altering keyspace truncates DESCRIBE output until you reconnect.

2014-10-16 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174294#comment-14174294
 ] 

Tyler Hobbs commented on CASSANDRA-7623:


[PYTHON-173|https://datastax-oss.atlassian.net/browse/PYTHON-173] is the cause. 
 I'm guessing this is not related to CASSANDRA-8012.

> Altering keyspace truncates DESCRIBE output until you reconnect.
> 
>
> Key: CASSANDRA-7623
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7623
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ryan McGuire
>Assignee: Tyler Hobbs
>Priority: Minor
>  Labels: cqlsh
> Fix For: 2.1.2
>
>
> Run DESCRIBE on a keyspace:
> {code}
> cqlsh> DESCRIBE KEYSPACE system_traces ;
> CREATE KEYSPACE system_traces WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': '2'}  AND durable_writes = true;
> CREATE TABLE system_traces.events (
> session_id uuid,
> event_id timeuuid,
> activity text,
> source inet,
> source_elapsed int,
> thread text,
> PRIMARY KEY (session_id, event_id)
> ) WITH CLUSTERING ORDER BY (event_id ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
> AND comment = ''
> AND compaction = {'min_threshold': '4', 'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32'}
> AND compression = {'sstable_compression': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND dclocal_read_repair_chance = 0.0
> AND default_time_to_live = 0
> AND gc_grace_seconds = 0
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 360
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99.0PERCENTILE';
> CREATE TABLE system_traces.sessions (
> session_id uuid PRIMARY KEY,
> coordinator inet,
> duration int,
> parameters map&lt;text, text&gt;,
> request text,
> started_at timestamp
> ) WITH bloom_filter_fp_chance = 0.01
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
> AND comment = 'traced sessions'
> AND compaction = {'min_threshold': '4', 'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32'}
> AND compression = {'sstable_compression': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND dclocal_read_repair_chance = 0.0
> AND default_time_to_live = 0
> AND gc_grace_seconds = 0
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 360
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99.0PERCENTILE';
> {code}
> Alter it and run DESCRIBE again: 
> {code}
> cqlsh> ALTER KEYSPACE system_traces WITH durable_writes = false;
> cqlsh> DESCRIBE KEYSPACE system_traces ;
> CREATE KEYSPACE system_traces WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': '2'}  AND durable_writes = false;
> {code}
> You can issue the DESCRIBE command multiple times and get the same output. 
> You have to disconnect and reconnect to get the table definition output to 
> show again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8131) Short-circuited query results from collection index query

2014-10-16 Thread Catalin Alexandru Zamfir (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Catalin Alexandru Zamfir updated CASSANDRA-8131:

Description: 
After watching Jonathan's 2014 summit video, I wanted to give collection 
indexes a try as they seem to be a fit for a "search by key/values" usage 
pattern we have in our setup. Doing some test queries that I expect users would 
do against the table, a short-circuit behavior came up:

Here's the whole transcript (cqlsh):
create table by_sets (id int PRIMARY KEY, datakeys set&lt;text&gt;, datavars 
set&lt;text&gt;);
CREATE INDEX by_sets_datakeys ON by_sets (datakeys);
CREATE INDEX by_sets_datavars ON by_sets (datavars);
insert into by_sets (id, datakeys, datavars) values (1, {'a'}, {'b'});
insert into by_sets (id, datakeys, datavars) values (2, {'c'}, {'d'});
insert into by_sets (id, datakeys, datavars) values (3, {'e'}, {'f'});
insert into by_sets (id, datakeys, datavars) values (4, {'a'}, {'z'});
select * from by_sets;


 id | datakeys | datavars
+--+--
  1 |{'a'} |{'b'}
  2 |{'c'} |{'d'}
  4 |{'a'} |{'z'}
  3 |{'e'} |{'f'}


We then tried this query which short-circuited:

select * from by_sets WHERE datakeys CONTAINS 'a' AND datakeys CONTAINS 'c';


 id | datakeys | datavars
+--+--
  1 |{'a'} |{'b'}
  4 |{'a'} |{'z'}

(2 rows)


Instead of receiving 3 rows matching datakeys CONTAINS 'a' AND datakeys 
CONTAINS 'c', we only got the rows matching the first predicate.

Doing the same, but with CONTAINS 'c' first, ignores the second AND.

#> select * from by_sets WHERE datakeys CONTAINS 'c' AND datakeys CONTAINS 'a';


 id | datakeys | datavars
+--+--
  2 |{'c'} |{'d'}

(1 rows)


Also, as a side note, I have indexes on both datakeys and datavars. But 
when trying to run a query such as:

#> select * from by_sets WHERE datakeys CONTAINS 'a' AND datavars CONTAINS 'z';
code=2200 [Invalid query] message="Cannot execute this query as it might 
involve data filtering and thus may have unpredictable performance. If you want 
to execute this query despite the performance unpredictability, use ALLOW 
FILTERING"

The second column after AND (even if I reverse the order) requires an "allow 
filtering" clause, yet the column is indexed, and an in-memory "join" of the 
primary keys of these sets on the coordinator could build up the result.

Could anyone explain the short-circuit behavior?
And why is an "allow filtering" clause required on a second indexed column?

If these are not bugs but intended behavior, they should be documented better, 
at least their limitations.

  was:
After watching Jonathan's 2014 summit video, I wanted to give collection 
indexes a try as they seem to be a fit for a "search by key/values" usage 
pattern we have in our setup. Doing some test queries that I expect users would 
do against the table, a short-circuit behavior came up:

Here's the whole transcript (cqlsh):
create table by_sets (id int PRIMARY KEY, datakeys set&lt;text&gt;, datavars 
set&lt;text&gt;);
CREATE INDEX by_sets_datakeys ON by_sets (datakeys);
CREATE INDEX by_sets_datavars ON by_sets (datavars);
insert into by_sets (id, datakeys, datavars) values (1, {'a'}, {'b'});
insert into by_sets (id, datakeys, datavars) values (2, {'c'}, {'d'});
insert into by_sets (id, datakeys, datavars) values (3, {'e'}, {'f'});
insert into by_sets (id, datakeys, datavars) values (4, {'a'}, {'z'});
select * from by_sets;


 id | datakeys | datavars
+--+--
  1 |{'a'} |{'b'}
  2 |{'c'} |{'d'}
  4 |{'a'} |{'z'}
  3 |{'e'} |{'f'}


We then tried this query which short-circuited:

select * from by_sets WHERE datakeys CONTAINS 'a' AND datakeys CONTAINS 'c';


 id | datakeys | datavars
+--+--
  1 |{'a'} |{'b'}
  4 |{'a'} |{'z'}

(2 rows)


Instead of receiving 3 rows matching datakeys CONTAINS 'a' AND datakeys 
CONTAINS 'c', we only got the rows matching the first predicate.

Doing the same, but with CONTAINS 'c' first, ignores the second AND.

#> select * from by_sets WHERE datakeys CONTAINS 'c' AND datakeys CONTAINS 'a';


 id | datakeys | datavars
+--+--
  2 |{'c'} |{'d'}

(1 rows)


Also, as a side note, I have indexes on both datakeys and datavars. But 
when trying to run a query such as:

cqlsh:etsv2> select * from by_sets WHERE datakeys CONTAINS 'a' AND datavars 
CONTAINS 'z';
code=2200 [Invalid query] message="Cannot execute this query as it might 
involve data filtering and thus may have unpredictable performance. If you want 
to execute this query despite the performance unpredictability, use ALLOW 
FILTERING"

The second column after AND (even if I reverse the order) requires an "allow 
filtering" clause, yet the column is indexed, and an in-memory "join" of the 
primary keys of these sets on the coordinator could build up the result.

[jira] [Created] (CASSANDRA-8131) Short-circuited query results from collection index query

2014-10-16 Thread Catalin Alexandru Zamfir (JIRA)
Catalin Alexandru Zamfir created CASSANDRA-8131:
---

 Summary: Short-circuited query results from collection index query
 Key: CASSANDRA-8131
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8131
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Debian Wheezy, Oracle JDK, Cassandra 2.1
Reporter: Catalin Alexandru Zamfir
 Fix For: 2.1.0


After watching Jonathan's 2014 summit video, I wanted to give collection 
indexes a try as they seem to be a fit for a "search by key/values" usage 
pattern we have in our setup. Doing some test queries that I expect users would 
do against the table, a short-circuit behavior came up:

Here's the whole transcript (cqlsh):
create table by_sets (id int PRIMARY KEY, datakeys set&lt;text&gt;, datavars 
set&lt;text&gt;);
CREATE INDEX by_sets_datakeys ON by_sets (datakeys);
CREATE INDEX by_sets_datavars ON by_sets (datavars);
insert into by_sets (id, datakeys, datavars) values (1, {'a'}, {'b'});
insert into by_sets (id, datakeys, datavars) values (2, {'c'}, {'d'});
insert into by_sets (id, datakeys, datavars) values (3, {'e'}, {'f'});
insert into by_sets (id, datakeys, datavars) values (4, {'a'}, {'z'});
select * from by_sets;


 id | datakeys | datavars
+--+--
  1 |{'a'} |{'b'}
  2 |{'c'} |{'d'}
  4 |{'a'} |{'z'}
  3 |{'e'} |{'f'}


We then tried this query which short-circuited:

select * from by_sets WHERE datakeys CONTAINS 'a' AND datakeys CONTAINS 'c';


 id | datakeys | datavars
+--+--
  1 |{'a'} |{'b'}
  4 |{'a'} |{'z'}

(2 rows)


Instead of receiving 3 rows matching datakeys CONTAINS 'a' AND datakeys 
CONTAINS 'c', we only got the rows matching the first predicate.

Doing the same, but with CONTAINS 'c' first, ignores the second AND.

#> select * from by_sets WHERE datakeys CONTAINS 'c' AND datakeys CONTAINS 'a';


 id | datakeys | datavars
+--+--
  2 |{'c'} |{'d'}

(1 rows)


Also, as a side note, I have indexes on both datakeys and datavars. But 
when trying to run a query such as:

cqlsh:etsv2> select * from by_sets WHERE datakeys CONTAINS 'a' AND datavars 
CONTAINS 'z';
code=2200 [Invalid query] message="Cannot execute this query as it might 
involve data filtering and thus may have unpredictable performance. If you want 
to execute this query despite the performance unpredictability, use ALLOW 
FILTERING"

The second column after AND (even if I reverse the order) requires an "allow 
filtering" clause, yet the column is indexed, and an in-memory "join" of the 
primary keys of these sets on the coordinator could build up the result.

Could anyone explain the short-circuit behavior?
And why is an "allow filtering" clause required on a second indexed column?

If these are not bugs but intended behavior, they should be documented better, 
at least their limitations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-7712) temporary files need to be cleaned by unit tests

2014-10-16 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174262#comment-14174262
 ] 

Michael Shuler edited comment on CASSANDRA-7712 at 10/16/14 9:27 PM:
-

cassandra-2.0 ant test = BUILD SUCCESSFUL - I do still have a good number of 
unit test related /tmp/ data dirs and files:

{noformat}
mshuler@hana:~$ diff utest-prerun-tmp-ls.txt utest-postrun-tmp-ls.txt | grep '>'
> total 13124
> drwxrwxrwt 43 rootroot1089536 Oct 16 16:25 .
> drwxr-xr-x  2 mshuler mshuler4096 Oct 16 15:59 1413493171580-0
> drwxr-xr-x  3 mshuler mshuler4096 Oct 16 15:59 1413493171699-0
> drwxr-xr-x  3 mshuler mshuler4096 Oct 16 15:59 1413493193793-0
> drwxr-xr-x  3 mshuler mshuler4096 Oct 16 16:05 
> Keyspace11151002594308015529Counter1
> drwxr-xr-x  3 mshuler mshuler4096 Oct 16 16:05 
> Keyspace11599882418925729405Standard1
> drwxr-xr-x  3 mshuler mshuler4096 Oct 16 16:04 
> Keyspace11629141255351575538Counter1
> drwxr-xr-x  3 mshuler mshuler4096 Oct 16 16:05 
> Keyspace11707160731940439205Standard1
> drwxr-xr-x  3 mshuler mshuler4096 Oct 16 16:05 
> Keyspace12479573729045096761Counter1
> drwxr-xr-x  3 mshuler mshuler4096 Oct 16 16:05 
> Keyspace13188027533114189105Standard1
> drwxr-xr-x  3 mshuler mshuler4096 Oct 16 16:05 
> Keyspace13537085006164740065Super4
> drwxr-xr-x  3 mshuler mshuler4096 Oct 16 16:05 
> Keyspace13552747847152420085Standard1
> drwxr-xr-x  3 mshuler mshuler4096 Oct 16 16:05 
> Keyspace13765397797228357038Standard1
> drwxr-xr-x  3 mshuler mshuler4096 Oct 16 16:05 
> Keyspace13850859261864741561ValuesWithQuotes
> drwxr-xr-x  3 mshuler mshuler4096 Oct 16 16:05 
> Keyspace13958837569092782804Standard1
> drwxr-xr-x  3 mshuler mshuler4096 Oct 16 16:05 
> Keyspace14244544246771294094Super4
> drwxr-xr-x  3 mshuler mshuler4096 Oct 16 16:04 
> Keyspace14896282513423138739Counter1
> drwxr-xr-x  3 mshuler mshuler4096 Oct 16 16:05 
> Keyspace16033616767223568836Standard1
> drwxr-xr-x  3 mshuler mshuler4096 Oct 16 16:05 
> Keyspace17379306356946300363Standard1
> drwxr-xr-x  3 mshuler mshuler4096 Oct 16 16:04 
> Keyspace17392859057498785197Counter1
> drwxr-xr-x  3 mshuler mshuler4096 Oct 16 16:05 
> Keyspace17734442112141636779UUIDKeys
> drwxr-xr-x  3 mshuler mshuler4096 Oct 16 16:05 
> Keyspace18327561120308012372Standard1
> drwxr-xr-x  2 mshuler mshuler4096 Oct 16 16:06 hsperfdata_mshuler
> -rw-r--r--  1 mshuler mshuler  97 Oct 16 16:05 
> CFWithColumnNameEqualToDefaultKeyAlias1352052296559090082.json
> -rw-r--r--  1 mshuler mshuler 202 Oct 16 16:05 
> CFWithDeletionInfo6821352692246723039.json
> -rw-r--r--  1 mshuler mshuler 164 Oct 16 16:05 
> Counter16133384777473446457.json
> -rw-r--r--  1 mshuler mshuler  73 Oct 16 16:05 
> Standard13497963750024141166.json
> -rw-r--r--  1 mshuler mshuler 212 Oct 16 16:05 
> Standard16498036137584406114.json
> -rw-r--r--  1 mshuler mshuler  18 Oct 16 16:05 
> Standard17653206436081170668.txt
> -rw-r--r--  1 mshuler mshuler  72 Oct 16 16:05 
> ValuesWithQuotes5950169076917583577.json
> -rw-r--r--  1 mshuler mshuler   0 Oct 16 16:04 
> cassandra3334695142197245525unittest
> -rw-r--r--  1 mshuler mshuler   0 Oct 16 16:04 
> cassandra501609141610521005unittest
> -rw-r--r--  1 mshuler mshuler   0 Oct 16 16:04 
> cassandra5466512109259350063unittest
> -rw-r--r--  1 mshuler mshuler   0 Oct 16 16:04 
> cassandra6680141275408996567unittest
> -rw-r--r--  1 mshuler mshuler   0 Oct 16 16:04 
> cassandra8802165868016413651unittest
> -rw-r--r--  1 mshuler mshuler2038 Oct 16 16:04 
> ks-cf-ib-1-CompressionInfo.db
> -rw-r--r--  1 mshuler mshuler7388 Oct 16 16:04 ks-cf-ib-1-Data.db
> -rw-r--r--  1 mshuler mshuler  163840 Oct 16 16:00 
> lengthtest2172098199453929105bin
> -rw-r--r--  1 mshuler mshuler   65536 Oct 16 16:00 
> readtest5611485117038557908bin
> -rw-r--r--  1 mshuler mshuler   0 Oct 16 16:00 
> set_length_during_read_mode615047735510069936bin
> -rw-r--r--  1 mshuler mshuler   0 Oct 16 16:00 
> set_negative_length7812195015686662874bin
> -rwxr-xr-x  1 mshuler mshuler   48432 Oct 16 15:48 
> snappy-1.0.5-libsnappyjava.so
{noformat}


was (Author: mshuler):
cassandra-2.0 ant test = BUILD SUCCESSFUL

lgtm!

> temporary files need to be cleaned by unit tests
> 
>
> Key: CASSANDRA-7712
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7712
> Project: Cassandra
>  Issue Type: Test
>  Components: Tests
>Reporter: Michael Shuler
>Priority: Minor
>  Labels: bootcamp, lhf
> Fix For: 2.0.11
>
> Attachments: 7712-hung-CliTest_system.log.gz, 7712-v2.txt, 
> 7712-v3.txt, CASSANDRA-7712_apache_cassandra_2.0.txt
>
>
> There are many unit test temporary files left behind after test runs.

[jira] [Updated] (CASSANDRA-8109) Avoid constant boxing in ColumnStats.{Min/Max}Tracker

2014-10-16 Thread Rajanarayanan Thottuvaikkatumana (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajanarayanan Thottuvaikkatumana updated CASSANDRA-8109:

Attachment: cassandra-trunk-8109.txt

Please find attached the patch for CASSANDRA-8109 on the trunk

> Avoid constant boxing in ColumnStats.{Min/Max}Tracker
> -
>
> Key: CASSANDRA-8109
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8109
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Rajanarayanan Thottuvaikkatumana
>Priority: Minor
>  Labels: lhf
> Fix For: 3.0
>
> Attachments: cassandra-trunk-8109.txt
>
>
> We use the {{ColumnStats.MinTracker}} and {{ColumnStats.MaxTracker}} to track 
> timestamps and deletion times in sstables. Those classes are generic, but we 
> only ever use them for longs and integers. The consequence is that every 
> call to their {{update}} method (called for every cell during an sstable 
> write) boxes its argument (since we don't store the cell timestamps and 
> deletion times boxed). That feels like a waste that is easy to fix: we could 
> just make them work on longs only, for instance, and convert back to int at 
> the end when that's what we need.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7712) temporary files need to be cleaned by unit tests

2014-10-16 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174262#comment-14174262
 ] 

Michael Shuler commented on CASSANDRA-7712:
---

cassandra-2.0 ant test = BUILD SUCCESSFUL

lgtm!

> temporary files need to be cleaned by unit tests
> 
>
> Key: CASSANDRA-7712
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7712
> Project: Cassandra
>  Issue Type: Test
>  Components: Tests
>Reporter: Michael Shuler
>Priority: Minor
>  Labels: bootcamp, lhf
> Fix For: 2.0.11
>
> Attachments: 7712-hung-CliTest_system.log.gz, 7712-v2.txt, 
> 7712-v3.txt, CASSANDRA-7712_apache_cassandra_2.0.txt
>
>
> There are many unit test temporary files left behind after test runs. In the 
> case of CI servers, I have seen >70,000 files accumulate in /tmp over a 
> period of time. Each unit test should make an effort to remove its temporary 
> files when the test is completed.
> My current unit test cleanup block:
> {noformat}
> # clean up after unit tests..
> rm -rf  /tmp/140*-0 /tmp/CFWith* /tmp/Counter1* /tmp/DescriptorTest* 
> /tmp/Keyspace1* \
> /tmp/KeyStreamingTransferTestSpace* /tmp/SSTableExportTest* 
> /tmp/SSTableImportTest* \
> /tmp/Standard1* /tmp/Statistics.db* /tmp/StreamingTransferTest* 
> /tmp/ValuesWithQuotes* \
> /tmp/cassandra* /tmp/jna-* /tmp/ks-cf-ib-1-* /tmp/lengthtest* 
> /tmp/liblz4-java*.so /tmp/readtest* \
> /tmp/set_length_during_read_mode* /tmp/set_negative_length* 
> /tmp/snappy-*.so
> {noformat}
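Beyond pattern-based cleanup after the fact, a test can keep all of its scratch 
files under a self-removing directory. A minimal Python sketch of the idea 
(illustrative only; Cassandra's unit tests are Java, and the file name below is 
made up):

```python
import os
import tempfile

def run_test_with_scratch_dir():
    # TemporaryDirectory deletes the directory and everything in it
    # when the context exits, even if the test body raises.
    with tempfile.TemporaryDirectory(prefix="cassandra-utest-") as d:
        path = os.path.join(d, "Standard1.json")  # hypothetical scratch file
        with open(path, "w") as f:
            f.write("{}")
        assert os.path.exists(path)
        return d  # returned so the caller can verify cleanup happened

scratch = run_test_with_scratch_dir()
print(os.path.exists(scratch))  # False: removed automatically
```

The same pattern exists in Java as JUnit's TemporaryFolder rule, which is the 
natural fit for Cassandra's test suite.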



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8130) upgrade tests should run upgradesstables less eagerly

2014-10-16 Thread Russ Hatch (JIRA)
Russ Hatch created CASSANDRA-8130:
-

 Summary: upgrade tests should run upgradesstables less eagerly
 Key: CASSANDRA-8130
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8130
 Project: Cassandra
  Issue Type: Test
Reporter: Russ Hatch


Currently the cassandra upgrade tests in cassandra-dtest are doing an upgrade 
then running upgradesstables soon after. We are missing some potential coverage 
with the current approach.

Instead of upgrading sstables "early", we should be waiting until absolutely 
necessary to run upgradesstables to test the guarantee that a version can read 
sstables from the previous version. This will give tests more time to interact 
with reading previous version sstables and hopefully increase chances of 
catching a bug.

Each version of cassandra should be able to read sstables from the previous 
version (so 2.1.x can read 2.0.x, but is not guaranteed to read 1.2.x), and 
therefore can work just fine reading and writing data. Writes of course will 
happen in the current sstable version, so old sstables may be supplanted by 
current ones over time (subject to write volume and compaction), potentially 
obviating the need for an sstable upgrade.

upgradesstables must be run before upgrading when the system could contain 
sstables from two versions back, since read compatibility is not guaranteed 
that far (for example, we must run upgradesstables before upgrading to 2.1.x 
if there is any chance that sstables from 1.2.x still exist, as that is two 
versions back).

This is all a long-winded way of saying that we should keep track in the dtests 
if we are about to be 2 versions behind for an impending upgrade, and only run 
upgradesstables at that point.
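The bookkeeping described above (only run upgradesstables when the impending 
hop would leave sstables two releases behind) might look like the following 
Python sketch; the linear upgrade path and the helper name are assumptions for 
illustration, not actual dtest code:

```python
# Assumed linear upgrade path, for illustration only.
UPGRADE_PATH = ["1.2", "2.0", "2.1"]

def needs_upgradesstables(oldest_sstable_version, target_version):
    """True when upgrading to target_version would leave sstables two or
    more releases behind; read compatibility only covers one release back."""
    gap = (UPGRADE_PATH.index(target_version)
           - UPGRADE_PATH.index(oldest_sstable_version))
    return gap >= 2

print(needs_upgradesstables("1.2", "2.1"))  # True: 2.1 cannot read 1.2 sstables
print(needs_upgradesstables("2.0", "2.1"))  # False: one release back is fine
```

A dtest would track the oldest sstable version still on disk and call such a 
check before each upgrade step, running upgradesstables only when it returns 
True.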



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7623) Altering keyspace truncates DESCRIBE output until you reconnect.

2014-10-16 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174248#comment-14174248
 ] 

Philip Thompson commented on CASSANDRA-7623:


Still happening on 2.1.1. Might be related to 8012.

> Altering keyspace truncates DESCRIBE output until you reconnect.
> 
>
> Key: CASSANDRA-7623
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7623
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ryan McGuire
>Priority: Minor
>  Labels: cqlsh
> Fix For: 2.1.2
>
>
> Run DESCRIBE on a keyspace:
> {code}
> cqlsh> DESCRIBE KEYSPACE system_traces ;
> CREATE KEYSPACE system_traces WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': '2'}  AND durable_writes = true;
> CREATE TABLE system_traces.events (
> session_id uuid,
> event_id timeuuid,
> activity text,
> source inet,
> source_elapsed int,
> thread text,
> PRIMARY KEY (session_id, event_id)
> ) WITH CLUSTERING ORDER BY (event_id ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
> AND comment = ''
> AND compaction = {'min_threshold': '4', 'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32'}
> AND compression = {'sstable_compression': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND dclocal_read_repair_chance = 0.0
> AND default_time_to_live = 0
> AND gc_grace_seconds = 0
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 360
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99.0PERCENTILE';
> CREATE TABLE system_traces.sessions (
> session_id uuid PRIMARY KEY,
> coordinator inet,
> duration int,
> parameters map&lt;text, text&gt;,
> request text,
> started_at timestamp
> ) WITH bloom_filter_fp_chance = 0.01
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
> AND comment = 'traced sessions'
> AND compaction = {'min_threshold': '4', 'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32'}
> AND compression = {'sstable_compression': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND dclocal_read_repair_chance = 0.0
> AND default_time_to_live = 0
> AND gc_grace_seconds = 0
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 360
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99.0PERCENTILE';
> {code}
> Alter it and run DESCRIBE again: 
> {code}
> cqlsh> ALTER KEYSPACE system_traces WITH durable_writes = false;
> cqlsh> DESCRIBE KEYSPACE system_traces ;
> CREATE KEYSPACE system_traces WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': '2'}  AND durable_writes = false;
> {code}
> You can issue the DESCRIBE command multiple times and get the same output. 
> You have to disconnect and reconnect to get the table definition output to 
> show again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7623) Altering keyspace truncates DESCRIBE output until you reconnect.

2014-10-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-7623:
---
Assignee: Tyler Hobbs

> Altering keyspace truncates DESCRIBE output until you reconnect.
> 
>
> Key: CASSANDRA-7623
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7623
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ryan McGuire
>Assignee: Tyler Hobbs
>Priority: Minor
>  Labels: cqlsh
> Fix For: 2.1.2
>
>
> Run DESCRIBE on a keyspace:
> {code}
> cqlsh> DESCRIBE KEYSPACE system_traces ;
> CREATE KEYSPACE system_traces WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': '2'}  AND durable_writes = true;
> CREATE TABLE system_traces.events (
> session_id uuid,
> event_id timeuuid,
> activity text,
> source inet,
> source_elapsed int,
> thread text,
> PRIMARY KEY (session_id, event_id)
> ) WITH CLUSTERING ORDER BY (event_id ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
> AND comment = ''
> AND compaction = {'min_threshold': '4', 'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32'}
> AND compression = {'sstable_compression': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND dclocal_read_repair_chance = 0.0
> AND default_time_to_live = 0
> AND gc_grace_seconds = 0
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 360
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99.0PERCENTILE';
> CREATE TABLE system_traces.sessions (
> session_id uuid PRIMARY KEY,
> coordinator inet,
> duration int,
> parameters map<text, text>,
> request text,
> started_at timestamp
> ) WITH bloom_filter_fp_chance = 0.01
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
> AND comment = 'traced sessions'
> AND compaction = {'min_threshold': '4', 'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32'}
> AND compression = {'sstable_compression': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND dclocal_read_repair_chance = 0.0
> AND default_time_to_live = 0
> AND gc_grace_seconds = 0
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 360
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99.0PERCENTILE';
> {code}
> Alter it and run DESCRIBE again: 
> {code}
> cqlsh> ALTER KEYSPACE system_traces WITH durable_writes = false;
> cqlsh> DESCRIBE KEYSPACE system_traces ;
> CREATE KEYSPACE system_traces WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': '2'}  AND durable_writes = false;
> {code}
> You can issue the DESCRIBE command multiple times and get the same output. 
> You have to disconnect and reconnect to get the table definition output to 
> show again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7623) Altering keyspace truncates DESCRIBE output until you reconnect.

2014-10-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-7623:
---
Labels: cqlsh  (was: )

> Altering keyspace truncates DESCRIBE output until you reconnect.
> 
>
> Key: CASSANDRA-7623
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7623
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ryan McGuire
>Priority: Minor
>  Labels: cqlsh
> Fix For: 2.1.2
>
>
> Run DESCRIBE on a keyspace:
> {code}
> cqlsh> DESCRIBE KEYSPACE system_traces ;
> CREATE KEYSPACE system_traces WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': '2'}  AND durable_writes = true;
> CREATE TABLE system_traces.events (
> session_id uuid,
> event_id timeuuid,
> activity text,
> source inet,
> source_elapsed int,
> thread text,
> PRIMARY KEY (session_id, event_id)
> ) WITH CLUSTERING ORDER BY (event_id ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
> AND comment = ''
> AND compaction = {'min_threshold': '4', 'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32'}
> AND compression = {'sstable_compression': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND dclocal_read_repair_chance = 0.0
> AND default_time_to_live = 0
> AND gc_grace_seconds = 0
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 360
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99.0PERCENTILE';
> CREATE TABLE system_traces.sessions (
> session_id uuid PRIMARY KEY,
> coordinator inet,
> duration int,
> parameters map<text, text>,
> request text,
> started_at timestamp
> ) WITH bloom_filter_fp_chance = 0.01
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
> AND comment = 'traced sessions'
> AND compaction = {'min_threshold': '4', 'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32'}
> AND compression = {'sstable_compression': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND dclocal_read_repair_chance = 0.0
> AND default_time_to_live = 0
> AND gc_grace_seconds = 0
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 360
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99.0PERCENTILE';
> {code}
> Alter it and run DESCRIBE again: 
> {code}
> cqlsh> ALTER KEYSPACE system_traces WITH durable_writes = false;
> cqlsh> DESCRIBE KEYSPACE system_traces ;
> CREATE KEYSPACE system_traces WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': '2'}  AND durable_writes = false;
> {code}
> You can issue the DESCRIBE command multiple times and get the same output. 
> You have to disconnect and reconnect to get the table definition output to 
> show again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8109) Avoid constant boxing in ColumnStats.{Min/Max}Tracker

2014-10-16 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174235#comment-14174235
 ] 

Tyler Hobbs commented on CASSANDRA-8109:


[~rnamboodiri] trunk isn't completely stable yet, so some test failures are 
expected.  In the case of this ticket, the compiler should be able to guarantee 
the safety of the changes, so go ahead and attach your patch.

> Avoid constant boxing in ColumnStats.{Min/Max}Tracker
> -
>
> Key: CASSANDRA-8109
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8109
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Rajanarayanan Thottuvaikkatumana
>Priority: Minor
>  Labels: lhf
> Fix For: 3.0
>
>
> We use the {{ColumnStats.MinTracker}} and {{ColumnStats.MaxTracker}} to track 
> timestamps and deletion times in sstables. Those classes are generic, but we 
> only ever use them for longs and ints. The consequence is that every call to 
> their {{update}} method (called for every cell during sstable writes) boxes 
> its argument, since we don't store the cell timestamps and deletion times 
> boxed. That is a waste that is easy to fix: for instance, we could make those 
> trackers work on primitive longs only and convert back to int at the end when 
> that's what we need.
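The suggestion above can be sketched as a tracker that takes a primitive long, so no boxing happens per cell. This is an illustrative sketch only; `LongMinTracker` and its methods are hypothetical names, not the actual Cassandra classes.

```java
// Hypothetical primitive-long tracker, following the ticket's suggestion:
// track on longs only and narrow to int at the end when an int is needed.
final class LongMinTracker {
    private final long defaultValue;
    private long min;
    private boolean set = false;

    LongMinTracker(long defaultValue) { this.defaultValue = defaultValue; }

    // Called once per cell during sstable write; primitive argument, no boxing.
    void update(long candidate) {
        if (!set || candidate < min) { min = candidate; set = true; }
    }

    long get() { return set ? min : defaultValue; }

    // Narrowing accessor for callers that need an int (e.g. deletion times).
    int getAsInt() { return (int) get(); }
}
```

A symmetric `LongMaxTracker` would differ only in the comparison direction.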



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-6430) DELETE with IF = clause doesn't work properly if more than one row is going to be deleted

2014-10-16 Thread Tyler Hobbs (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tyler Hobbs updated CASSANDRA-6430:
---
Attachment: 6430-2.0.txt

6430-2.0.txt validates that all PK columns are restricted when performing 
conditional deletes.  I've also pushed a 
[dtest|https://github.com/thobbs/cassandra-dtest/tree/CASSANDRA-6430] that 
covers this.

> DELETE with IF = clause doesn't work properly if more than one 
> row is going to be deleted
> 
>
> Key: CASSANDRA-6430
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6430
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Dmitriy Ukhlov
>Assignee: Tyler Hobbs
> Fix For: 2.0.11
>
> Attachments: 6430-2.0.txt
>
>
> CREATE TABLE test(key int, sub_key int, value text, PRIMARY KEY(key, sub_key) 
> );
> INSERT INTO test(key, sub_key, value) VALUES(1,1, '1.1');
> INSERT INTO test(key, sub_key, value) VALUES(1,2, '1.2');
> INSERT INTO test(key, sub_key, value) VALUES(1,3, '1.3');
> SELECT * from test;
>  key | sub_key | value
> -+-+---
>1 |   1 |   1.1
>1 |   2 |   1.2
>1 |   3 |   1.3
> DELETE FROM test WHERE key=1 IF value='1.2';
>  [applied]
> ---
>  False <=== I guess second row should be removed
> SELECT * from test;
>  key | sub_key | value
> -+-+---
>1 |   1 |   1.1
>1 |   2 |   1.2
>1 |   3 |   1.3
> (3 rows) 
> DELETE FROM test WHERE key=1;
> SELECT * from test;
> (0 rows)  <=== all rows were removed: OK
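The attached patch validates that all primary-key columns are restricted for conditional deletes. A minimal sketch of that validation idea, assuming hypothetical names (`ConditionalDeleteCheck` is not the actual patch code):

```java
import java.util.*;

// Hypothetical sketch: a conditional DELETE (with an IF clause) must restrict
// every primary-key column; otherwise reject the statement at validation time.
final class ConditionalDeleteCheck {
    static void validate(List<String> primaryKeyColumns,
                         Set<String> restrictedColumns,
                         boolean hasConditions) {
        if (!hasConditions)
            return; // unconditional deletes may target a whole partition
        for (String col : primaryKeyColumns)
            if (!restrictedColumns.contains(col))
                throw new IllegalArgumentException(
                    "DELETE ... IF requires a restriction on primary key column " + col);
    }
}
```

Under this check, `DELETE FROM test WHERE key=1 IF value='1.2'` would be rejected because `sub_key` is unrestricted, rather than silently returning `[applied] False`.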



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7712) temporary files need to be cleaned by unit tests

2014-10-16 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-7712:

Attachment: 7712-v3.txt

v3 passes CliTest. I had to remove the @AfterClass on cleanup, but I still 
don't see anything piling up in my tmp dir.

> temporary files need to be cleaned by unit tests
> 
>
> Key: CASSANDRA-7712
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7712
> Project: Cassandra
>  Issue Type: Test
>  Components: Tests
>Reporter: Michael Shuler
>Priority: Minor
>  Labels: bootcamp, lhf
> Fix For: 2.0.11
>
> Attachments: 7712-hung-CliTest_system.log.gz, 7712-v2.txt, 
> 7712-v3.txt, CASSANDRA-7712_apache_cassandra_2.0.txt
>
>
> There are many unit test temporary files left behind after test runs. In the 
> case of CI servers, I have seen >70,000 files accumulate in /tmp over a 
> period of time. Each unit test should make an effort to remove its temporary 
> files when the test is completed.
> My current unit test cleanup block:
> {noformat}
> # clean up after unit tests..
> rm -rf  /tmp/140*-0 /tmp/CFWith* /tmp/Counter1* /tmp/DescriptorTest* 
> /tmp/Keyspace1* \
> /tmp/KeyStreamingTransferTestSpace* /tmp/SSTableExportTest* 
> /tmp/SSTableImportTest* \
> /tmp/Standard1* /tmp/Statistics.db* /tmp/StreamingTransferTest* 
> /tmp/ValuesWithQuotes* \
> /tmp/cassandra* /tmp/jna-* /tmp/ks-cf-ib-1-* /tmp/lengthtest* 
> /tmp/liblz4-java*.so /tmp/readtest* \
> /tmp/set_length_during_read_mode* /tmp/set_negative_length* 
> /tmp/snappy-*.so
> {noformat}
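Instead of cleaning up with an external rm pass like the one above, each test can track the temp files it creates and delete them when it finishes. A minimal sketch of that pattern, with hypothetical names (this is not the attached patch):

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: create temp files through one method that remembers
// them, then delete them all in a single cleanup call after the test run
// (e.g. from an @AfterClass hook).
final class TempFileTracker {
    private final List<File> created = new ArrayList<>();

    File createTempFile(String prefix, String suffix) throws IOException {
        File f = File.createTempFile(prefix, suffix);
        created.add(f);
        return f;
    }

    // Returns the number of files actually removed.
    int cleanup() {
        int removed = 0;
        for (File f : created)
            if (f.delete())
                removed++;
        created.clear();
        return removed;
    }
}
```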



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7712) temporary files need to be cleaned by unit tests

2014-10-16 Thread Michael Shuler (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Shuler updated CASSANDRA-7712:
--
Attachment: 7712-hung-CliTest_system.log.gz

I didn't get terribly far, due to CliTest hanging with this patch applied to 
the cassandra-2.0 branch. It appears to go through the test, then "Announcing 
shutdown", then a series of java.io.FileNotFoundException errors for .db files 
in TestKeySpace, and it never completes shutting down. Same result on several 
different systems; the debug system.log for just the CliTest is attached.

> temporary files need to be cleaned by unit tests
> 
>
> Key: CASSANDRA-7712
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7712
> Project: Cassandra
>  Issue Type: Test
>  Components: Tests
>Reporter: Michael Shuler
>Priority: Minor
>  Labels: bootcamp, lhf
> Fix For: 2.0.11
>
> Attachments: 7712-hung-CliTest_system.log.gz, 7712-v2.txt, 
> CASSANDRA-7712_apache_cassandra_2.0.txt
>
>
> There are many unit test temporary files left behind after test runs. In the 
> case of CI servers, I have seen >70,000 files accumulate in /tmp over a 
> period of time. Each unit test should make an effort to remove its temporary 
> files when the test is completed.
> My current unit test cleanup block:
> {noformat}
> # clean up after unit tests..
> rm -rf  /tmp/140*-0 /tmp/CFWith* /tmp/Counter1* /tmp/DescriptorTest* 
> /tmp/Keyspace1* \
> /tmp/KeyStreamingTransferTestSpace* /tmp/SSTableExportTest* 
> /tmp/SSTableImportTest* \
> /tmp/Standard1* /tmp/Statistics.db* /tmp/StreamingTransferTest* 
> /tmp/ValuesWithQuotes* \
> /tmp/cassandra* /tmp/jna-* /tmp/ks-cf-ib-1-* /tmp/lengthtest* 
> /tmp/liblz4-java*.so /tmp/readtest* \
> /tmp/set_length_during_read_mode* /tmp/set_negative_length* 
> /tmp/snappy-*.so
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7579) File descriptor exhaustion can lead to unreliable state in exception condition

2014-10-16 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174125#comment-14174125
 ] 

Joshua McKenzie commented on CASSANDRA-7579:


Depends on CASSANDRA-7927 - rebased 
[here|https://github.com/josh-mckenzie/cassandra/compare/7579].  We should wait 
on reviewing this until that stabilizes / gets committed.

> File descriptor exhaustion can lead to unreliable state in exception condition
> --
>
> Key: CASSANDRA-7579
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7579
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Joshua McKenzie
>Assignee: Joshua McKenzie
>Priority: Minor
> Fix For: 2.1.2
>
> Attachments: 7579_v1.txt
>
>
> If the JVM runs out of file descriptors we can get into an unreliable state 
> (similar to CASSANDRA-7507 on OOM) where we cannot trust our shutdown hook to 
> run successfully to completion.  We need to check IOExceptions and other 
> appropriate Throwables to see if we have a FileNotFoundException with the 
> message "Too many open files" and forcefully shut down the daemon in these 
> cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7927) Kill daemon on any disk error

2014-10-16 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174120#comment-14174120
 ] 

Joshua McKenzie commented on CASSANDRA-7927:


Another update pushed to the branch 
[here|https://github.com/josh-mckenzie/cassandra/compare/7927].

While working on CASSANDRA-7579 I noticed that the _die unit test was failing 
on linux (for entirely different reasons than the Windows failure). Digging 
into it a bit shows that the unit test it was based on, 
testCommitFailurePolicy_stop(), didn't actually do what it was intended to do: 
StorageService isn't initialized by SchemaLoader, so the assertions in the 
_stop test always passed. Also, changing a directory to write-only doesn't 
make its contents write-only, so flushes would keep working even if 
StorageService had been started.

I've made CommitLog.handleCommitError public, marked it VisibleForTesting, 
and updated those two unit tests to specifically check how our CommitLog 
system handles throwables under the stop and die policy settings.  Tests now 
pass on both Windows and linux.

> Kill daemon on any disk error
> -
>
> Key: CASSANDRA-7927
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7927
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
> Environment: aws, stock cassandra or dse
>Reporter: John Sumsion
>Assignee: John Sumsion
>  Labels: bootcamp, lhf
> Fix For: 2.1.1
>
> Attachments: 7927-v1-die.patch
>
>
> We got a disk read error on 1.2.13 that didn't trigger the disk failure 
> policy, and I'm trying to hunt down why. In doing so, I saw that there is 
> no disk_failure_policy option for simply killing the daemon.
> If we ever get a corrupt sstable, we want to replace the node anyway, because 
> some aws instance store disks just go bad.
> I want to use the JVMStabilityInspector from CASSANDRA-7507 to do the kill so 
> that the behavior remains standard, so I will base my patch on CASSANDRA-7507.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7966) 1.2.18 -> 2.0.10 upgrade compactions_in_progress: java.lang.IllegalArgumentException

2014-10-16 Thread Karl Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174121#comment-14174121
 ] 

Karl Mueller commented on CASSANDRA-7966:
-

OK thanks - let me know if there's more info needed :)

> 1.2.18 -> 2.0.10 upgrade compactions_in_progress: 
> java.lang.IllegalArgumentException
> 
>
> Key: CASSANDRA-7966
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7966
> Project: Cassandra
>  Issue Type: Bug
> Environment: JDK 1.7
>Reporter: Karl Mueller
>Assignee: Marcus Eriksson
>Priority: Minor
>
> This happened on a new node when starting 2.0.10 after 1.2.18 with complete 
> upgradesstables run:
> {noformat}
>  INFO 15:31:11,532 Enqueuing flush of 
> Memtable-compactions_in_progress@1366724594(0/0 serialized/live bytes, 1 ops)
>  INFO 15:31:11,532 Writing Memtable-compactions_in_progress@1366724594(0/0 
> serialized/live bytes, 1 ops)
>  INFO 15:31:11,547 Completed flushing 
> /data2/data-cassandra/system/compactions_in_progress/system-compactions_in_progress-jb-10-Data.db
>  (42 bytes) for commitlog position ReplayPosition(segmentId=1410993002452, 
> position=164409)
> ERROR 15:31:11,550 Exception in thread Thread[CompactionExecutor:36,1,main]
> java.lang.IllegalArgumentException
> at java.nio.Buffer.limit(Buffer.java:267)
> at 
> org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:587)
> at 
> org.apache.cassandra.utils.ByteBufferUtil.readBytesWithShortLength(ByteBufferUtil.java:596)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:61)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:36)
> at 
> org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(ArrayBackedSortedColumns.java:112)
> at 
> org.apache.cassandra.db.ColumnFamily.addColumn(ColumnFamily.java:116)
> at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:150)
> at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:186)
> at 
> org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:98)
> at 
> org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:85)
> at 
> org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:196)
> at 
> org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:74)
> at 
> org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:55)
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:115)
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98)
> at 
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
> at 
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:143)
> at 
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
> at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:724)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8129) Increase max heap for sstablesplit

2014-10-16 Thread Matt Stump (JIRA)
Matt Stump created CASSANDRA-8129:
-

 Summary: Increase max heap for sstablesplit
 Key: CASSANDRA-8129
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8129
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Matt Stump
Priority: Minor


The max heap for sstablesplit is 256m. For large files that's too small and it 
will OOM. We should increase the max heap to something like 2-4G, with the 
understanding that sstablesplit will most likely only be invoked to split large 
files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7966) 1.2.18 -> 2.0.10 upgrade compactions_in_progress: java.lang.IllegalArgumentException

2014-10-16 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174055#comment-14174055
 ] 

Philip Thompson commented on CASSANDRA-7966:


[~kmueller] I have been informed I am in fact wrong and this is totally 
unnecessary.

> 1.2.18 -> 2.0.10 upgrade compactions_in_progress: 
> java.lang.IllegalArgumentException
> 
>
> Key: CASSANDRA-7966
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7966
> Project: Cassandra
>  Issue Type: Bug
> Environment: JDK 1.7
>Reporter: Karl Mueller
>Assignee: Marcus Eriksson
>Priority: Minor
>
> This happened on a new node when starting 2.0.10 after 1.2.18 with complete 
> upgradesstables run:
> {noformat}
>  INFO 15:31:11,532 Enqueuing flush of 
> Memtable-compactions_in_progress@1366724594(0/0 serialized/live bytes, 1 ops)
>  INFO 15:31:11,532 Writing Memtable-compactions_in_progress@1366724594(0/0 
> serialized/live bytes, 1 ops)
>  INFO 15:31:11,547 Completed flushing 
> /data2/data-cassandra/system/compactions_in_progress/system-compactions_in_progress-jb-10-Data.db
>  (42 bytes) for commitlog position ReplayPosition(segmentId=1410993002452, 
> position=164409)
> ERROR 15:31:11,550 Exception in thread Thread[CompactionExecutor:36,1,main]
> java.lang.IllegalArgumentException
> at java.nio.Buffer.limit(Buffer.java:267)
> at 
> org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:587)
> at 
> org.apache.cassandra.utils.ByteBufferUtil.readBytesWithShortLength(ByteBufferUtil.java:596)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:61)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:36)
> at 
> org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(ArrayBackedSortedColumns.java:112)
> at 
> org.apache.cassandra.db.ColumnFamily.addColumn(ColumnFamily.java:116)
> at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:150)
> at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:186)
> at 
> org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:98)
> at 
> org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:85)
> at 
> org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:196)
> at 
> org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:74)
> at 
> org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:55)
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:115)
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98)
> at 
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
> at 
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:143)
> at 
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
> at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:724)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7966) 1.2.18 -> 2.0.10 upgrade compactions_in_progress: java.lang.IllegalArgumentException

2014-10-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-7966:
---
Assignee: Marcus Eriksson

> 1.2.18 -> 2.0.10 upgrade compactions_in_progress: 
> java.lang.IllegalArgumentException
> 
>
> Key: CASSANDRA-7966
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7966
> Project: Cassandra
>  Issue Type: Bug
> Environment: JDK 1.7
>Reporter: Karl Mueller
>Assignee: Marcus Eriksson
>Priority: Minor
>
> This happened on a new node when starting 2.0.10 after 1.2.18 with complete 
> upgradesstables run:
> {noformat}
>  INFO 15:31:11,532 Enqueuing flush of 
> Memtable-compactions_in_progress@1366724594(0/0 serialized/live bytes, 1 ops)
>  INFO 15:31:11,532 Writing Memtable-compactions_in_progress@1366724594(0/0 
> serialized/live bytes, 1 ops)
>  INFO 15:31:11,547 Completed flushing 
> /data2/data-cassandra/system/compactions_in_progress/system-compactions_in_progress-jb-10-Data.db
>  (42 bytes) for commitlog position ReplayPosition(segmentId=1410993002452, 
> position=164409)
> ERROR 15:31:11,550 Exception in thread Thread[CompactionExecutor:36,1,main]
> java.lang.IllegalArgumentException
> at java.nio.Buffer.limit(Buffer.java:267)
> at 
> org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:587)
> at 
> org.apache.cassandra.utils.ByteBufferUtil.readBytesWithShortLength(ByteBufferUtil.java:596)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:61)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:36)
> at 
> org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(ArrayBackedSortedColumns.java:112)
> at 
> org.apache.cassandra.db.ColumnFamily.addColumn(ColumnFamily.java:116)
> at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:150)
> at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:186)
> at 
> org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:98)
> at 
> org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:85)
> at 
> org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:196)
> at 
> org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:74)
> at 
> org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:55)
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:115)
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98)
> at 
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
> at 
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:143)
> at 
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
> at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:724)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7966) 1.2.18 -> 2.0.10 upgrade compactions_in_progress: java.lang.IllegalArgumentException

2014-10-16 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174047#comment-14174047
 ] 

Philip Thompson commented on CASSANDRA-7966:


I don't believe it is required, but it may help with your problem. Regardless 
I'm going to assign someone to look at this.

> 1.2.18 -> 2.0.10 upgrade compactions_in_progress: 
> java.lang.IllegalArgumentException
> 
>
> Key: CASSANDRA-7966
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7966
> Project: Cassandra
>  Issue Type: Bug
> Environment: JDK 1.7
>Reporter: Karl Mueller
>Priority: Minor
>
> This happened on a new node when starting 2.0.10 after 1.2.18 with complete 
> upgradesstables run:
> {noformat}
>  INFO 15:31:11,532 Enqueuing flush of 
> Memtable-compactions_in_progress@1366724594(0/0 serialized/live bytes, 1 ops)
>  INFO 15:31:11,532 Writing Memtable-compactions_in_progress@1366724594(0/0 
> serialized/live bytes, 1 ops)
>  INFO 15:31:11,547 Completed flushing 
> /data2/data-cassandra/system/compactions_in_progress/system-compactions_in_progress-jb-10-Data.db
>  (42 bytes) for commitlog position ReplayPosition(segmentId=1410993002452, 
> position=164409)
> ERROR 15:31:11,550 Exception in thread Thread[CompactionExecutor:36,1,main]
> java.lang.IllegalArgumentException
> at java.nio.Buffer.limit(Buffer.java:267)
> at 
> org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:587)
> at 
> org.apache.cassandra.utils.ByteBufferUtil.readBytesWithShortLength(ByteBufferUtil.java:596)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:61)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:36)
> at 
> org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(ArrayBackedSortedColumns.java:112)
> at 
> org.apache.cassandra.db.ColumnFamily.addColumn(ColumnFamily.java:116)
> at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:150)
> at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:186)
> at 
> org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:98)
> at 
> org.apache.cassandra.db.compaction.PrecompactedRow.(PrecompactedRow.java:85)
> at 
> org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:196)
> at 
> org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:74)
> at 
> org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:55)
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:115)
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98)
> at 
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
> at 
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:143)
> at 
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
> at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:724)
> {noformat}
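The {{IllegalArgumentException}} at {{Buffer.limit}} in the trace above is what {{ByteBuffer.limit}} throws when asked to set a limit beyond the buffer's capacity, which is what happens when a length prefix read from a corrupt or incompatible cell name exceeds the bytes actually present. A minimal sketch; the reader method here is a simplified, hypothetical analogue of {{ByteBufferUtil.readBytesWithShortLength}}, not Cassandra's actual code:

```java
import java.nio.ByteBuffer;

public class ShortLengthDemo {
    // Simplified, hypothetical analogue of readBytesWithShortLength:
    // read an unsigned 16-bit length prefix, then slice off that many bytes.
    static ByteBuffer readBytesWithShortLength(ByteBuffer bb) {
        int length = bb.getShort() & 0xFFFF;     // unsigned short length prefix
        ByteBuffer copy = bb.duplicate();
        // Throws IllegalArgumentException when the claimed length runs past
        // the buffer's capacity -- the failure mode seen in the stack trace.
        copy.limit(copy.position() + length);
        bb.position(bb.position() + length);
        return copy;
    }

    public static void main(String[] args) {
        // A "corrupt" value: the prefix claims 512 bytes but only 4 follow.
        ByteBuffer corrupt = ByteBuffer.allocate(6);
        corrupt.putShort((short) 512).putInt(42).flip();
        try {
            readBytesWithShortLength(corrupt);
            System.out.println("no exception");
        } catch (IllegalArgumentException e) {
            System.out.println("IllegalArgumentException, as in the trace");
        }
    }
}
```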





[jira] [Commented] (CASSANDRA-7966) 1.2.18 -> 2.0.10 upgrade compactions_in_progress: java.lang.IllegalArgumentException

2014-10-16 Thread Karl Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174044#comment-14174044
 ] 

Karl Mueller commented on CASSANDRA-7966:
-

No, I haven't done it. I wasn't aware running upgradesstables *after* an 
upgrade was standard practice :)






[jira] [Commented] (CASSANDRA-8058) local consistency level during boostrap (may cause a write timeout on each write request)

2014-10-16 Thread Nicolas DOUILLET (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174043#comment-14174043
 ] 

Nicolas DOUILLET commented on CASSANDRA-8058:
-

Hey!

Thanks, I'm glad to contribute to this huge product!
Yes, you're right; I mean, I did not dare change the visibility of those 
methods :)
 
About the first comment, concerning {{assureSufficientLiveNodes}}: should I 
create a new issue?



> local consistency level during boostrap (may cause a write timeout on each 
> write request)
> -
>
> Key: CASSANDRA-8058
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8058
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Nicolas DOUILLET
>Assignee: Nicolas DOUILLET
> Fix For: 2.0.11, 2.1.1
>
> Attachments: 
> 0001-during-boostrap-block-only-for-local-pending-endpoin.patch.txt, 
> 0001-during-boostrap-block-only-for-local-pending-endpoint-v2.patch, 
> 0001-during-boostrap-block-only-for-local-pending-endpoints-v2-1.patch
>
>
> Hi, 
> During bootstrap, for {{LOCAL_QUORUM}} and {{LOCAL_ONE}} consistencies, the 
> {{DatacenterWriteResponseHandler}} was waiting for pending remote endpoints.
> I think that's a regression, because it seems this was correctly implemented 
> in CASSANDRA-833 but removed later.
> It was especially painful in the case of {{RF=2}} and {{cl=LOCAL_QUORUM}}: 
> during the bootstrap of a remote node, all requests ended in 
> {{WriteTimeout}}, because they were waiting for a response that would never 
> arrive.
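The fix described above amounts to counting only local pending endpoints when deciding how many acks a datacenter-local write must block for. A hedged sketch; the method and parameter names are illustrative, not Cassandra's actual API:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LocalPendingDemo {
    // Under LOCAL_QUORUM / LOCAL_ONE, only pending (bootstrapping) endpoints
    // in the coordinator's own datacenter should raise the number of acks
    // the write handler blocks for; remote pending nodes are ignored.
    static int localBlockFor(int localRf, List<String> pendingEndpoints,
                             Map<String, String> dcOfEndpoint, String localDc) {
        int quorum = localRf / 2 + 1;              // quorum within the local DC
        int localPending = 0;
        for (String e : pendingEndpoints)
            if (localDc.equals(dcOfEndpoint.get(e)))
                localPending++;                    // count only local pending
        return quorum + localPending;
    }

    public static void main(String[] args) {
        Map<String, String> dcs = new HashMap<>();
        dcs.put("10.0.0.1", "DC2");                // remote bootstrapping node
        dcs.put("10.0.0.2", "DC1");                // local bootstrapping node
        System.out.println(localBlockFor(2, Arrays.asList("10.0.0.1"), dcs, "DC1"));
        System.out.println(localBlockFor(2, Arrays.asList("10.0.0.2"), dcs, "DC1"));
    }
}
```

With RF=2 per DC the local quorum is 2; in the buggy behavior a remote pending endpoint pushed the handler to wait for 3 acks while only 2 local replicas could ever answer, so every write timed out.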





[jira] [Commented] (CASSANDRA-7966) 1.2.18 -> 2.0.10 upgrade compactions_in_progress: java.lang.IllegalArgumentException

2014-10-16 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174039#comment-14174039
 ] 

Philip Thompson commented on CASSANDRA-7966:


Not before, but after the upgrade. It is unclear if you did that.






[jira] [Commented] (CASSANDRA-7966) 1.2.18 -> 2.0.10 upgrade compactions_in_progress: java.lang.IllegalArgumentException

2014-10-16 Thread Karl Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174036#comment-14174036
 ] 

Karl Mueller commented on CASSANDRA-7966:
-

Yes, I run upgradesstables before every x.y upgrade.






[jira] [Updated] (CASSANDRA-7966) 1.2.18 -> 2.0.10 upgrade compactions_in_progress: java.lang.IllegalArgumentException

2014-10-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-7966:
---
Reproduced In: 2.0.10






[jira] [Commented] (CASSANDRA-8076) Expose an mbean method to poll for repair job status

2014-10-16 Thread Philip S Doctor (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174020#comment-14174020
 ] 

Philip S Doctor commented on CASSANDRA-8076:


Just trying to keep my eyes on the prize. The current issue is this: if a 
notification is lost, and that notification happens to be the completion 
message, then we don't know the outcome of the repair, so we told the user to 
check the Cassandra logs for the outcome.

If we use this API, we can now tell for certain whether the repair is still 
running (in which case a lost notification is not a problem; we're still 
waiting), so that's a partial solution. If we *did* lose a completion 
notification, however, is there a way for me to find out whether the repair 
succeeded or failed, so I can retry/alert the customer?
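The polling pattern being discussed can be sketched as follows. The status source interface stands in for the proposed mbean method (its real signature is not yet defined in this ticket, so treat every name here as hypothetical); the point is that a pulled terminal status survives a dropped JMX notification:

```java
public class RepairPoller {
    enum Status { RUNNING, SUCCEEDED, FAILED }

    // Hypothetical stand-in for the mbean method this ticket proposes:
    // map a repair command id to its current status.
    interface RepairStatusSource {
        Status statusOf(int commandId);
    }

    // Poll until the repair leaves RUNNING or the deadline passes. Because
    // the terminal status is pulled rather than pushed, losing a completion
    // notification no longer hides whether the repair succeeded or failed.
    static Status awaitCompletion(RepairStatusSource mbean, int commandId,
                                  long pollMillis, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        Status s = mbean.statusOf(commandId);
        while (s == Status.RUNNING && System.nanoTime() < deadline) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                return s;                          // give up on interrupt
            }
            s = mbean.statusOf(commandId);
        }
        return s;
    }
}
```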

> Expose an mbean method to poll for repair job status
> 
>
> Key: CASSANDRA-8076
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8076
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Philip S Doctor
>Assignee: Yuki Morishita
>
> Given the int reply-id from forceRepairAsync, allow a client to request the 
> status of this ID via jmx.





[jira] [Commented] (CASSANDRA-7966) 1.2.18 -> 2.0.10 upgrade compactions_in_progress: java.lang.IllegalArgumentException

2014-10-16 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174022#comment-14174022
 ] 

Philip Thompson commented on CASSANDRA-7966:


Did you also run upgradesstables on every node after reaching 2.0.10?






[jira] [Updated] (CASSANDRA-7989) "nodetool repair" goes to infinite repair, continue beyond the number of tokens handled in a single repair.

2014-10-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-7989:
---
Reproduced In: 2.0.9

> "nodetool repair" goes to infinite repair, continue beyond the number of 
> tokens handled in a single repair. 
> 
>
> Key: CASSANDRA-7989
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7989
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Andrew Johnson
>Priority: Minor
> Fix For: 2.0.11
>
>
> DSE 4.0.3 with patch (Cassandra 2.0.9.61)
> The reported percentage stays at 99.999%.
> We compute % complete as
> [ (current token processed - initial token processed) / (# of tokens handled 
> by this node) ] * 100 = xx.xx %
> In this case it goes beyond 100%, meaning the numerator (number of tokens 
> repaired in this session) is greater than the number of tokens handled by 
> this node (5624 on this node); we catch this and report 99.999%.
> The AntiEntropySession count keeps incrementing, there are no visible errors 
> or exceptions in the log once it stabilized, and the sub process neither 
> terminated nor finished. 
> (Note: when this session started there were many exceptions - snapshot 
> creation - causing the sub process to be terminated and restarted about 5 
> times within an hour, but once it stabilized it has kept going since Aug 22.)
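The percentage computation the reporter describes, including the clamp to 99.999% when the repaired-token count overruns the node's token count, can be sketched as follows (the method and parameter names are illustrative; only the formula and the clamp value come from the report):

```java
public class RepairProgress {
    // % complete = (current - initial) / tokensHandled * 100, clamped to
    // 99.999 when the repaired-token count overruns the tokens this node
    // handles -- the overrun is the symptom reported in this ticket.
    static double percentComplete(long currentToken, long initialToken,
                                  long tokensHandled) {
        double pct = 100.0 * (currentToken - initialToken) / tokensHandled;
        return pct > 100.0 ? 99.999 : pct;
    }
}
```

With 5624 tokens handled, a repair that has processed 10000 tokens since its initial token reports 99.999% forever, which is exactly the stuck display described above.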





[jira] [Resolved] (CASSANDRA-5914) Failed replace_node bootstrap leaves gossip in weird state ; possible perf problem

2014-10-16 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams resolved CASSANDRA-5914.
-
Resolution: Not a Problem

> Failed replace_node bootstrap leaves gossip in weird state ; possible perf 
> problem
> --
>
> Key: CASSANDRA-5914
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5914
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 1.2.8
>Reporter: Chris Burroughs
>
> A node was down for a week or two due to hardware disk failure.  I tried to 
> use replace_node to bring up a new node on the same physical host with the 
> same IPs.  (rbranson suspected that using the same IP may be issue prone.)  
> This failed due to "unable to find sufficient sources for streaming range".  
> However, gossip for the to-be-replaced node was left in a funky state:
> {noformat}
> /64.215.255.182
>   RACK:NOP
>   NET_VERSION:6
>   HOST_ID:4f3b214b-b03e-46eb-8214-5fab2662a06b
>   RELEASE_VERSION:1.2.8
>   DC:IAD
>   INTERNAL_IP:10.15.2.182
>   SCHEMA:59adb24e-f3cd-3e02-97f0-5b395827453f
>   RPC_ADDRESS:0.0.0.0
> {noformat}
> (See CASSANDRA-5913 for cosmetic issue with nt:status.)
> This seems (A) confusing, and (B) the failed replace_token correlated with 
> 95th percentile read latency for this cluster going from 8k microseconds to 
> around 200k microseconds (in both DCs of a multi-dc cluster reading at 
> CL.ONE).  I don't have a good theory for the correlation, but performance 
> was bad for over an hour and returned to normal once a successful 
> replace_token was performed.





[jira] [Updated] (CASSANDRA-8015) nodetool exception for users with read only permissions on jmx authentication

2014-10-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8015:
---
Assignee: Joshua McKenzie

> nodetool exception for users with read only permissions on jmx authentication 
> --
>
> Key: CASSANDRA-8015
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8015
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Cassandra 2.0.8.39
>Reporter: Jose Martinez Poblete
>Assignee: Joshua McKenzie
>
> nodetool will throw exception for a read only user when JMX authentication is 
> enabled.
> {noformat}
> [automaton@i-0212b8098 ~]$ nodetool -u jose -pw JoseManuel status
> Exception in thread "main" java.lang.SecurityException: Access denied! 
> Invalid access level for requested MBeanServer operation.
> at 
> com.sun.jmx.remote.security.MBeanServerFileAccessController.checkAccess(MBeanServerFileAccessController.java:344)
> at 
> com.sun.jmx.remote.security.MBeanServerFileAccessController.checkWrite(MBeanServerFileAccessController.java:240)
> at 
> com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:466)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
> at 
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1427)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
> at sun.rmi.transport.Transport$1.run(Transport.java:177)
> at sun.rmi.transport.Transport$1.run(Transport.java:174)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
> at 
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> at 
> sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:275)
> at 
> sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:252)
> at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161)
> at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
> at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown 
> Source)
> at 
> javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1029)
> at 
> javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:292)
> at com.sun.proxy.$Proxy0.effectiveOwnership(Unknown Source)
> at 
> org.apache.cassandra.tools.NodeProbe.effectiveOwnership(NodeProbe.java:335)
> at 
> org.apache.cassandra.tools.NodeCmd$ClusterStatus.print(NodeCmd.java:480)
> at 
> org.apache.cassandra.tools.NodeCmd.printClusterStatus(NodeCmd.java:590)
> at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:1263)
> [automaton@i-0212b8098 ~]$ dse -v
> 4.5.1
> [automaton@i-0212b8098 ~]$ cqlsh -u jose -p JoseManuel 
> Connected to Spark at localhost:9160.
> [cqlsh 4.1.1 | Cassandra 2.0.8.39 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
> Use HELP for help.
> cqlsh> exit;
> [automaton@i-0212b8098 ~]$ 
> {noformat}
> Nodetool runs fine for cassandra user:
> {noformat}
> [automaton@i-0212b8098 ~]$ nodetool -u cassandra -pw cassandra status
> Note: Ownership information does not include topology; for complete 
> information, specify a keyspace
> Datacenter: Cassandra
> =
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  AddressLoad   Owns   Host ID  
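The failure above is a consequence of JMX's file-based access control rather than a Cassandra bug per se: `nodetool status` ends up invoking an MBean operation (`effectiveOwnership`), and the JDK's MBeanServerFileAccessController denies any invoke() from a `readonly` user. A sketch of the relevant `jmxremote.access` entries (user names taken from the report; granting `readwrite` is the usual workaround):

```
# jmxremote.access -- JDK file-based JMX access control.
# "readonly" users may read attributes; any MBean invoke() is denied.
# "readwrite" is required for operations such as effectiveOwnership.
jose       readonly
cassandra  readwrite
```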

[jira] [Reopened] (CASSANDRA-5914) Failed replace_node bootstrap leaves gossip in weird state ; possible perf problem

2014-10-16 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reopened CASSANDRA-5914:
-

> Failed replace_node bootstrap leaves gossip in weird state ; possible perf 
> problem
> --
>
> Key: CASSANDRA-5914
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5914
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 1.2.8
>Reporter: Chris Burroughs
>
> A node was down for a week or two due to hardware disk failure.  I tried to 
> use replace_node to bring up a new node on the same physical host with the 
> same IPs.  (rbranson suspected that using the same IP may be issue prone.)  
> This failed due to "unable to find sufficient sources for streaming range".  
> However, gossip for the to-be-replaced node was left in a funky state:
> {noformat}
> /64.215.255.182
>   RACK:NOP
>   NET_VERSION:6
>   HOST_ID:4f3b214b-b03e-46eb-8214-5fab2662a06b
>   RELEASE_VERSION:1.2.8
>   DC:IAD
>   INTERNAL_IP:10.15.2.182
>   SCHEMA:59adb24e-f3cd-3e02-97f0-5b395827453f
>   RPC_ADDRESS:0.0.0.0
> {noformat}
> (See CASSANDRA-5913 for cosmetic issue with nt:status.)
> This seems (A) confusing and (B) the failed replace_token correlated with 
> 95th percentile read latency for this cluster going from 8k microseconds to 
> around 200k microseconds (on both DCs in a multi-dc cluster reading at 
> CL.ONE).  I don't have a good theory for the correlation but performance was 
> bad for over an hour and returned to normal once a successful replace_token 
> was performed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-5914) Failed replace_node bootstrap leaves gossip in weird state ; possible perf problem

2014-10-16 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams resolved CASSANDRA-5914.
-
Resolution: Fixed

Reopen if you have this problem with replace_address on 2.0






[jira] [Updated] (CASSANDRA-8124) Stopping a node during compaction can make already written files stay around

2014-10-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8124:
---
Labels: triaged  (was: )

> Stopping a node during compaction can make already written files stay around
> 
>
> Key: CASSANDRA-8124
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8124
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>  Labels: triaged
> Fix For: 2.1.2
>
>
> In leveled compaction we generally create many files during compaction, in 
> 2.0 we left the ones we had written as -tmp- files, in 2.1 we close and open 
> the readers, removing the -tmp- markers.
> This means that any ongoing compactions will leave the resulting files around 
> if we restart. Note that stopping the compaction will cause an exception and 
> that makes us call abort() on the SSTableRewriter which removes the files.
> Guess a fix could be to keep the -tmp- marker and make -tmplink- files until 
> we are actually done with the compaction.
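The -tmp- marker suggested above is essentially a commit/abort protocol for files: writers create files under a temporary name, rename on success, and a restart sweep deletes anything still carrying the marker. A minimal Python sketch of that idea (helper names are hypothetical, not Cassandra's implementation):

```python
import os

def write_sstable(directory, name, data):
    """Write data under a -tmp- marker name; the final name only appears
    at commit time (hypothetical helper, not Cassandra's code)."""
    tmp_path = os.path.join(directory, name + "-tmp-Data.db")
    final_path = os.path.join(directory, name + "-Data.db")
    with open(tmp_path, "wb") as f:
        f.write(data)
    return tmp_path, final_path

def commit(tmp_path, final_path):
    # success: atomically drop the marker
    os.rename(tmp_path, final_path)

def abort(tmp_path):
    # failure (e.g. the stopped-compaction exception): delete the partial file
    os.remove(tmp_path)

def recover(directory):
    # on restart, anything still carrying the marker belongs to a
    # compaction that never finished and can be swept away
    for fname in os.listdir(directory):
        if "-tmp-" in fname:
            os.remove(os.path.join(directory, fname))
```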



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2014-10-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-7217:
---
Labels: performance triaged  (was: performance)

> Native transport performance (with cassandra-stress) drops precipitously past 
> around 1000 threads
> -
>
> Key: CASSANDRA-7217
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7217
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Benedict
>  Labels: performance, triaged
> Fix For: 2.1.2
>
>
> This is obviously bad. Let's figure out why it's happening and put a stop to 
> it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-4763) SSTableLoader shouldn't get keyspace from path

2014-10-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-4763:
---
Issue Type: Improvement  (was: Bug)

> SSTableLoader shouldn't get keyspace from path
> --
>
> Key: CASSANDRA-4763
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4763
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Affects Versions: 1.2.0 beta 1
>Reporter: Nick Bailey
>Priority: Minor
> Fix For: 2.1.2
>
>
> SSTableLoader currently gets the keyspace it is going to load to from the 
> path of the directory of sstables it is loading. This isn't really documented 
> (or I didn't see it), and it isn't really a good way of doing it in general.
> {noformat}
> this.keyspace = directory.getParentFile().getName();
> {noformat}
> We should probably just let users pass the name in. If you are loading a 
> snapshot the file names will have the keyspace which is slightly better but 
> people manually creating their own sstables might not format them the same.
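For illustration, the quoted Java line and the proposed fix side by side, as a small Python sketch (function names hypothetical): deriving the keyspace from the directory layout versus letting the user pass it explicitly.

```python
import os

def keyspace_from_path(directory):
    # mirrors the quoted Java: directory.getParentFile().getName()
    return os.path.basename(os.path.dirname(os.path.abspath(directory)))

def keyspace_for_load(directory, explicit=None):
    # proposed behaviour: an explicit name wins; the path is only a fallback
    return explicit if explicit is not None else keyspace_from_path(directory)
```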



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-5914) Failed replace_node bootstrap leaves gossip in weird state ; possible perf problem

2014-10-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-5914:
---
Reproduced In: 1.2.8






[jira] [Updated] (CASSANDRA-6548) Order nodetool ring output by token when vnodes aren't in use

2014-10-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-6548:
---
Issue Type: Improvement  (was: Bug)

> Order nodetool ring output by token when vnodes aren't in use
> -
>
> Key: CASSANDRA-6548
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6548
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: J.B. Langston
>  Labels: lhf
>
> It is confusing to order the nodes by hostId in nodetool ring when vnodes 
> aren't in use. This happens in 1.2 when providing a keyspace name:
> {code}
> Datacenter: DC1
> ==
> Replicas: 2
> Address RackStatus State   LoadOwns   
>  Token
>   
>  42535295865117307932921825928971026432
> xxx.xxx.xxx.48   RAC2Up Normal  324.26 GB   25.00%
>   85070591730234615865843651857942052864
> xxx.xxx.xxx.42   RAC1Up Normal  284.39 GB   25.00%
>   0
> xxx.xxx.xxx.44   RAC1Up Normal  931.07 GB   75.00%
>   127605887595351923798765477786913079296
> xxx.xxx.xxx.46   RAC2Up Normal  881.93 GB   75.00%
>   42535295865117307932921825928971026432
> Datacenter: DC2
> ==
> Replicas: 2
> Address RackStatus State   LoadOwns   
>  Token
>   
>  148873535527910577765226390751398592512
> xxx.xxx.xxx.19  RAC2Up Normal  568.22 GB   50.00% 
>  63802943797675961899382738893456539648
> xxx.xxx.xxx.17  RAC1Up Normal  621.58 GB   50.00% 
>  106338239662793269832304564822427566080
> xxx.xxx.xxx.15  RAC1Up Normal  566.99 GB   50.00% 
>  21267647932558653966460912964485513216
> xxx.xxx.xxx.21  RAC2Up Normal  619.41 GB   50.00% 
>  148873535527910577765226390751398592512
> {code}
> Among other things, this makes it hard to spot rack imbalances.  In the above 
> output, the racks in DC1 are actually incorrectly ordered and those in DC2 
> are correctly ordered, but it's not obvious until you manually sort the nodes 
> by token.
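The requested ordering is a straightforward numeric sort on the token column; a Python sketch using tokens from the output above (addresses abbreviated):

```python
def order_ring_by_token(rows):
    # sort numerically by token instead of the host-id order nodetool uses
    return sorted(rows, key=lambda row: int(row["token"]))

dc1 = [
    {"address": "xxx.48", "rack": "RAC2", "token": "85070591730234615865843651857942052864"},
    {"address": "xxx.42", "rack": "RAC1", "token": "0"},
    {"address": "xxx.44", "rack": "RAC1", "token": "127605887595351923798765477786913079296"},
    {"address": "xxx.46", "rack": "RAC2", "token": "42535295865117307932921825928971026432"},
]
ordered = order_ring_by_token(dc1)
# token order makes the rack sequence visible at a glance
racks = [row["rack"] for row in ordered]
```

Sorted this way, DC1's racks come out RAC1, RAC2, RAC2, RAC1 rather than alternating, which is exactly the imbalance the report says is hard to spot in host-id order.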





[jira] [Updated] (CASSANDRA-6819) The command "nodetool setcompactionthroughput" sometimes takes a long time to take effect

2014-10-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-6819:
---
Fix Version/s: 2.1.2
   2.0.11
   Issue Type: Improvement  (was: Bug)

> The command "nodetool setcompactionthroughput" sometimes takes a long time to 
> take effect
> -
>
> Key: CASSANDRA-6819
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6819
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Steven Lowenthal
> Fix For: 2.0.11, 2.1.2
>
>
> It's often necessary to throttle a large compaction.   When the nodetool 
> setcompactionthroughput command is issued, the setting doesn't seem to take 
> hold until another compaction starts, which may be some time on a bungled 
> node.   This command should take hold immediately.
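A likely explanation, sketched below, is a limit that is captured once when a compaction starts instead of being re-read while it runs; the fix is to consult the current value per unit of work. Hypothetical Python illustration, not Cassandra's actual throttling code:

```python
class Throttle:
    def __init__(self, mb_per_sec):
        self.mb_per_sec = mb_per_sec

throttle = Throttle(16)  # what "nodetool setcompactionthroughput" would mutate

def compact_cached(chunks):
    limit = throttle.mb_per_sec       # read once at compaction start
    for _ in chunks:
        yield limit                   # a later change is never seen

def compact_live(chunks):
    for _ in chunks:
        yield throttle.mb_per_sec     # re-read per chunk: change applies now
```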





[jira] [Updated] (CASSANDRA-7368) Compaction stops after org.apache.cassandra.io.sstable.CorruptSSTableException

2014-10-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-7368:
---
Assignee: Marcus Eriksson

> Compaction stops after org.apache.cassandra.io.sstable.CorruptSSTableException
> --
>
> Key: CASSANDRA-7368
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7368
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: OS: RHEL 6.5
> Cassandra version: 1.2.16
>Reporter: Francois Richard
>Assignee: Marcus Eriksson
>
> Hi,
> We are getting a case where compaction stops totally on a node after an 
> exception related to: org.apache.cassandra.io.sstable.CorruptSSTableException.
> nodetool compactionstats remains at the same level for hours:
> {code}
> pending tasks: 1451
>   compaction typekeyspace   column family   completed 
>   total  unit  progress
>CompactionSyncCoreContactPrefixBytesIndex   
> 257799931   376785179 bytes68.42%
> Active compaction remaining time :n/a
> {code}
> Here is the exception log:
> {code}
> ERROR [Deserialize 
> SSTableReader(path='/home/y/var/cassandra/data/SyncCore/ContactPrefixBytesIndex/SyncCore-ContactPrefixBytesIndex-ic-116118-Data.db')]
>  2014-06-09 06:39:37,570 CassandraDaemon.java (line 191) Exception in thread 
> Thread[Deserialize 
> SSTableReader(path='/home/y/var/cassandra/data/SyncCore/ContactPrefixBytesIndex/SyncCore-ContactPrefixBytesIndex-ic-116118-Data.db'),1,main]
> org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: 
> dataSize of 7421941880990663551 starting at 257836699 would be larger than 
> file 
> /home/y/var/cassandra/data/SyncCore/ContactPrefixBytesIndex/SyncCore-ContactPrefixBytesIndex-ic-116118-Data.db
>  length 376785179
>   at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:167)
>   at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:83)
>   at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:69)
>   at 
> org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:180)
>   at 
> org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:155)
>   at 
> org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:142)
>   at 
> org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:38)
>   at 
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy$LeveledScanner.computeNext(LeveledCompactionStrategy.java:238)
>   at 
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy$LeveledScanner.computeNext(LeveledCompactionStrategy.java:207)
>   at 
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>   at 
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> --
> {code}
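The guard that produces this error is a simple bounds check on the deserialized row size; a Python rendering of the check (illustrative, matching the message format above):

```python
def check_data_size(data_size, start_offset, file_length):
    # a declared row size must fit in what remains of the sstable file;
    # the giant value in the log above fails this immediately
    if data_size < 0 or start_offset + data_size > file_length:
        raise ValueError(
            "dataSize of %d starting at %d would be larger than file length %d"
            % (data_size, start_offset, file_length))
    return True
```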





[jira] [Commented] (CASSANDRA-8056) nodetool snapshot -cf -t does not work on multiple tables of the same keyspace

2014-10-16 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173999#comment-14173999
 ] 

Brandon Williams commented on CASSANDRA-8056:
-

nodetool could take multiple CF args though so that this would be possible.

> nodetool snapshot  -cf  -t  does not work on 
> multiple tables of the same keyspace
> --
>
> Key: CASSANDRA-8056
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8056
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cassandra 2.0.6 debian wheezy and squeeze
>Reporter: Esha Pathak
>Assignee: Tyler Hobbs
>  Labels: lhf
> Fix For: 2.0.11
>
>
> 
> keyspace  has tables : thing:user , thing:object, thing:user_details
> steps to reproduce :
> 1. nodetool snapshot thing --column-family user --tag tagname
>   Requested creating snapshot for: thing and table: user
>   Snapshot directory: tagname
> 2.nodetool snapshot thing --column-family object --tag tagname
> Requested creating snapshot for: thing and table: object
> Exception in thread "main" java.io.IOException: Snapshot tagname already 
> exists.
>   at 
> org.apache.cassandra.service.StorageService.takeColumnFamilySnapshot(StorageService.java:2274)
>   at sun.reflect.GeneratedMethodAccessor129.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
>   at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
>   at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
>   at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
>   at 
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
>   at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
>   at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
>   at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
>   at 
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
>   at 
> javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
>   at 
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
>   at 
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
>   at 
> javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
>   at sun.reflect.GeneratedMethodAccessor39.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
>   at sun.rmi.transport.Transport$1.run(Transport.java:177)
>   at sun.rmi.transport.Transport$1.run(Transport.java:174)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
>   at 
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
>   at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
>   at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
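Brandon's suggestion amounts to taking all requested column families under one tag in a single call, so the duplicate-tag check fires only across invocations. A hypothetical Python sketch of that API:

```python
snapshots = {}  # tag -> set of (keyspace, table) pairs

def take_snapshot(tag, keyspace, *tables):
    # one tag, many column families, checked once; re-using a tag in a
    # later call still fails, matching today's behaviour
    if tag in snapshots:
        raise IOError("Snapshot %s already exists." % tag)
    snapshots[tag] = {(keyspace, table) for table in tables}
```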





[jira] [Updated] (CASSANDRA-8056) nodetool snapshot -cf -t does not work on multiple tables of the same keyspace

2014-10-16 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-8056:

Priority: Trivial  (was: Major)






[jira] [Updated] (CASSANDRA-8056) nodetool snapshot -cf -t does not work on multiple tables of the same keyspace

2014-10-16 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-8056:

Assignee: (was: Tyler Hobbs)






[jira] [Updated] (CASSANDRA-8056) nodetool snapshot -cf -t does not work on multiple tables of the same keyspace

2014-10-16 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-8056:

Labels: lhf  (was: newbie)




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8056) nodetool snapshot -cf -t does not work on multiple tables of the same keyspace

2014-10-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8056:
---
Fix Version/s: 2.0.11
   Issue Type: Improvement  (was: Bug)

We don't currently support reusing a snapshot tag like this, so I'm marking this 
as an improvement request.

> nodetool snapshot  -cf  -t  does not work on 
> multiple tables of the same keyspace
> --
>
> Key: CASSANDRA-8056
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8056
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cassandra 2.0.6 debian wheezy and squeeze
>Reporter: Esha Pathak
>Assignee: Tyler Hobbs
>  Labels: newbie
> Fix For: 2.0.11
>
>
> 
> keyspace  has tables : thing:user , thing:object, thing:user_details
> steps to reproduce :
> 1. nodetool snapshot thing --column-family user --tag tagname
>   Requested creating snapshot for: thing and table: user
>   Snapshot directory: tagname
> 2.nodetool snapshot thing --column-family object --tag tagname
> Requested creating snapshot for: thing and table: object
> Exception in thread "main" java.io.IOException: Snapshot tagname already 
> exists.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8056) nodetool snapshot -cf -t does not work on multiple tables of the same keyspace

2014-10-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8056:
---
Assignee: Tyler Hobbs

> nodetool snapshot  -cf  -t  does not work on 
> multiple tables of the same keyspace
> --
>
> Key: CASSANDRA-8056
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8056
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Cassandra 2.0.6 debian wheezy and squeeze
>Reporter: Esha Pathak
>Assignee: Tyler Hobbs
>  Labels: newbie
>
> 
> keyspace  has tables : thing:user , thing:object, thing:user_details
> steps to reproduce :
> 1. nodetool snapshot thing --column-family user --tag tagname
>   Requested creating snapshot for: thing and table: user
>   Snapshot directory: tagname
> 2.nodetool snapshot thing --column-family object --tag tagname
> Requested creating snapshot for: thing and table: object
> Exception in thread "main" java.io.IOException: Snapshot tagname already 
> exists.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7712) temporary files need to be cleaned by unit tests

2014-10-16 Thread Michael Shuler (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Shuler updated CASSANDRA-7712:
--
Tester: Michael Shuler  (was: Sravan Ananthula)

> temporary files need to be cleaned by unit tests
> 
>
> Key: CASSANDRA-7712
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7712
> Project: Cassandra
>  Issue Type: Test
>  Components: Tests
>Reporter: Michael Shuler
>Priority: Minor
>  Labels: bootcamp, lhf
> Fix For: 2.0.11
>
> Attachments: 7712-v2.txt, CASSANDRA-7712_apache_cassandra_2.0.txt
>
>
> There are many unit test temporary files left behind after test runs. In the 
> case of CI servers, I have seen >70,000 files accumulate in /tmp over a 
> period of time. Each unit test should make an effort to remove its temporary 
> files when the test is completed.
> My current unit test cleanup block:
> {noformat}
> # clean up after unit tests..
> rm -rf  /tmp/140*-0 /tmp/CFWith* /tmp/Counter1* /tmp/DescriptorTest* 
> /tmp/Keyspace1* \
> /tmp/KeyStreamingTransferTestSpace* /tmp/SSTableExportTest* 
> /tmp/SSTableImportTest* \
> /tmp/Standard1* /tmp/Statistics.db* /tmp/StreamingTransferTest* 
> /tmp/ValuesWithQuotes* \
> /tmp/cassandra* /tmp/jna-* /tmp/ks-cf-ib-1-* /tmp/lengthtest* 
> /tmp/liblz4-java*.so /tmp/readtest* \
> /tmp/set_length_during_read_mode* /tmp/set_negative_length* 
> /tmp/snappy-*.so
> {noformat}
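A more robust alternative to a post-hoc {{rm}} pattern list is for each test run to write its artifacts under one per-run directory that is removed on exit. A minimal sketch of that idea (a hypothetical helper, not part of any attached patch):

```python
import atexit
import os
import shutil
import tempfile

# One temporary directory per test run; everything a test writes under
# it is removed by a single cleanup hook at process exit.
RUN_DIR = tempfile.mkdtemp(prefix="cassandra-test-")
atexit.register(shutil.rmtree, RUN_DIR, ignore_errors=True)

def temp_path(name: str) -> str:
    """Return the path of a fresh temp file inside the per-run directory."""
    fd, path = tempfile.mkstemp(prefix=name, dir=RUN_DIR)
    os.close(fd)  # caller reopens as needed; we only hand out the path
    return path
```

With this in place no pattern list has to be maintained: new tests get cleanup for free as long as they ask `temp_path()` for their files.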



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7712) temporary files need to be cleaned by unit tests

2014-10-16 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-7712:

Attachment: 7712-v2.txt

Patch refactoring as I outlined.

> temporary files need to be cleaned by unit tests
> 
>
> Key: CASSANDRA-7712
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7712
> Project: Cassandra
>  Issue Type: Test
>  Components: Tests
>Reporter: Michael Shuler
>Priority: Minor
>  Labels: bootcamp, lhf
> Fix For: 2.0.11
>
> Attachments: 7712-v2.txt, CASSANDRA-7712_apache_cassandra_2.0.txt
>
>
> There are many unit test temporary files left behind after test runs. In the 
> case of CI servers, I have seen >70,000 files accumulate in /tmp over a 
> period of time. Each unit test should make an effort to remove its temporary 
> files when the test is completed.
> My current unit test cleanup block:
> {noformat}
> # clean up after unit tests..
> rm -rf  /tmp/140*-0 /tmp/CFWith* /tmp/Counter1* /tmp/DescriptorTest* 
> /tmp/Keyspace1* \
> /tmp/KeyStreamingTransferTestSpace* /tmp/SSTableExportTest* 
> /tmp/SSTableImportTest* \
> /tmp/Standard1* /tmp/Statistics.db* /tmp/StreamingTransferTest* 
> /tmp/ValuesWithQuotes* \
> /tmp/cassandra* /tmp/jna-* /tmp/ks-cf-ib-1-* /tmp/lengthtest* 
> /tmp/liblz4-java*.so /tmp/readtest* \
> /tmp/set_length_during_read_mode* /tmp/set_negative_length* 
> /tmp/snappy-*.so
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8112) nodetool compactionhistory can allocate memory unbounded

2014-10-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8112:
---
Labels: triaged  (was: )

> nodetool compactionhistory can allocate memory unbounded
> 
>
> Key: CASSANDRA-8112
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8112
> Project: Cassandra
>  Issue Type: Bug
>Reporter: T Jake Luciani
>Priority: Minor
>  Labels: triaged
> Fix For: 2.1.2
>
>
> nodetool compactionhistory keeps data for 1 week by default and creates a 
> table from the result set in memory.
> For many systems, a week can generate tens of thousands of compactions in that 
> time (especially with LCS), so we should guard against this command allocating 
> too much memory.
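One simple guard is to cap how many rows the client materializes, keeping only the most recent N entries. A rough sketch of the idea (the cap value and row shape are invented for illustration, not an actual nodetool setting):

```python
from collections import deque

MAX_ROWS = 10_000  # illustrative cap, not a real nodetool option

def collect_history(rows, max_rows=MAX_ROWS):
    """Keep only the most recent max_rows compaction-history entries.

    A deque with maxlen evicts the oldest entry as new ones arrive, so
    memory stays bounded no matter how long the history table is.
    """
    recent = deque(maxlen=max_rows)
    for row in rows:
        recent.append(row)
    return list(recent)
```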



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6430) DELETE with IF = clause doesn't work properly if more than one row is going to be deleted

2014-10-16 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173991#comment-14173991
 ] 

Tyler Hobbs commented on CASSANDRA-6430:


It sounds like the best option for now is to require that the full primary key 
be specified when IF conditions are used in DELETEs.  If we want to reconsider 
allowing the statement in the description, we can always do that later.
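That restriction amounts to a simple statement-validation check: if the DELETE carries IF conditions, its WHERE clause must pin down exactly one row by naming every primary-key column. A toy sketch of the rule (hypothetical helper, not Cassandra's actual validation code):

```python
def validate_conditional_delete(where_columns, primary_key, has_conditions):
    """Reject a conditional DELETE that does not name the full primary key.

    IF conditions apply per row, so a conditional DELETE must identify a
    single row: every primary-key column has to appear in the WHERE clause.
    """
    if has_conditions and not set(primary_key) <= set(where_columns):
        missing = [c for c in primary_key if c not in where_columns]
        raise ValueError("DELETE with IF requires the full primary key; "
                         "missing: " + ", ".join(missing))
```

Under this rule the statement from the description ({{DELETE ... WHERE key=1 IF value='1.2'}}) is rejected outright instead of silently returning {{[applied] False}}.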

> DELETE with IF = clause doesn't work properly if more than one 
> row is going to be deleted
> 
>
> Key: CASSANDRA-6430
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6430
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Dmitriy Ukhlov
>Assignee: Tyler Hobbs
> Fix For: 2.0.11
>
>
> CREATE TABLE test(key int, sub_key int, value text, PRIMARY KEY(key, sub_key) 
> );
> INSERT INTO test(key, sub_key, value) VALUES(1,1, '1.1');
> INSERT INTO test(key, sub_key, value) VALUES(1,2, '1.2');
> INSERT INTO test(key, sub_key, value) VALUES(1,3, '1.3');
> SELECT * from test;
>  key | sub_key | value
> -+-+---
>1 |   1 |   1.1
>1 |   2 |   1.2
>1 |   3 |   1.3
> DELETE FROM test WHERE key=1 IF value='1.2';
>  [applied]
> ---
>  False <=== I guess second row should be removed
> SELECT * from test;
>  key | sub_key | value
> -+-+---
>1 |   1 |   1.1
>1 |   2 |   1.2
>1 |   3 |   1.3
> (3 rows) 
> DELETE FROM test WHERE key=1;
> SELECT * from test;
> (0 rows)  <=== all rows were removed: OK



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-5713) CQL3 token() <= token() does not work as expected

2014-10-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson resolved CASSANDRA-5713.

Resolution: Cannot Reproduce

I also can't reproduce on 2.0-HEAD.

> CQL3 token() <= token() does not work as expected
> -
>
> Key: CASSANDRA-5713
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5713
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Colin B.
>Priority: Minor
>
> Using tokens to go backward through a table in CQL3 does not work as expected.
> Say there is some data available:
> {code}
> > SELECT key FROM data_list where token(key) >= token('6') limit 10;
>  key
> -
>6
>7
>9
>4
>3
>A
>5
>8
>2
>1
> {code}
> I expect:
> {code}
> > SELECT key FROM data_list where token(key) <= token('1') limit 5;
>  key
> -
>A
>5
>8
>2
>1
> {code}
> However the following occurs:
> {code}
> > SELECT key FROM data_list where token(key) <= token('1') limit 5;
>  key
> -
>6
>7
>9
>4
>3
> {code}
> The '<' operator has similar unexpected behavior.
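The observed behavior follows from how range scans work: rows are stored in token order, {{token(key) <= t}} keeps a prefix of the ring, and LIMIT takes rows from the start of that prefix — not the rows nearest the boundary. A minimal illustrative model (toy token function, not the real Murmur3/MD5 hash):

```python
# Keys listed in token order, matching the first query in the report.
ring = ["6", "7", "9", "4", "3", "A", "5", "8", "2", "1"]
token = {k: i for i, k in enumerate(ring)}  # toy token function

def select_le(key, limit):
    """Simulate SELECT ... WHERE token(k) <= token(key) LIMIT limit."""
    matching = [k for k in ring if token[k] <= token[key]]
    return matching[:limit]  # LIMIT cuts from the start of the scan
```

Since '1' has the largest token here, the predicate matches the whole ring and LIMIT 5 returns the first five rows in token order — exactly the "unexpected" result in the report. Going backward would require ordering the scan descending, which token ranges alone don't express.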



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[2/6] git commit: add missing file

2014-10-16 Thread brandonwilliams
add missing file


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/67db1bf2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/67db1bf2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/67db1bf2

Branch: refs/heads/cassandra-2.1
Commit: 67db1bf2786679b71dad8a32894346abdf59206e
Parents: e938839
Author: Brandon Williams 
Authored: Thu Oct 16 12:16:16 2014 -0500
Committer: Brandon Williams 
Committed: Thu Oct 16 12:16:16 2014 -0500

--
 .../metrics/CASClientRequestMetrics.java| 47 
 1 file changed, 47 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/67db1bf2/src/java/org/apache/cassandra/metrics/CASClientRequestMetrics.java
--
diff --git a/src/java/org/apache/cassandra/metrics/CASClientRequestMetrics.java 
b/src/java/org/apache/cassandra/metrics/CASClientRequestMetrics.java
new file mode 100644
index 000..3210d45
--- /dev/null
+++ b/src/java/org/apache/cassandra/metrics/CASClientRequestMetrics.java
@@ -0,0 +1,47 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.cassandra.metrics;
+
+import com.yammer.metrics.*;
+import com.yammer.metrics.core.*;
+
+public class CASClientRequestMetrics extends ClientRequestMetrics
+{
+
+public final Histogram contention;
+/* Used only for write  */
+public final Counter conditionNotMet;
+
+public final Counter unfinishedCommit;
+
+    public CASClientRequestMetrics(String scope) {
+        super(scope);
+        contention = Metrics.newHistogram(factory.createMetricName("ContentionHistogram"), true);
+        conditionNotMet = Metrics.newCounter(factory.createMetricName("ConditionNotMet"));
+        unfinishedCommit = Metrics.newCounter(factory.createMetricName("UnfinishedCommit"));
+    }
+
+    public void release()
+    {
+        super.release();
+        Metrics.defaultRegistry().removeMetric(factory.createMetricName("ContentionHistogram"));
+        Metrics.defaultRegistry().removeMetric(factory.createMetricName("ConditionNotMet"));
+        Metrics.defaultRegistry().removeMetric(factory.createMetricName("UnfinishedCommit"));
+    }
+}



[5/6] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2014-10-16 Thread brandonwilliams
Merge branch 'cassandra-2.0' into cassandra-2.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/014d328f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/014d328f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/014d328f

Branch: refs/heads/cassandra-2.1
Commit: 014d328f4b370ae72360a5dd4aa34a9010bee0d2
Parents: 46c4f0b 67db1bf
Author: Brandon Williams 
Authored: Thu Oct 16 12:16:23 2014 -0500
Committer: Brandon Williams 
Committed: Thu Oct 16 12:16:23 2014 -0500

--
 .../metrics/CASClientRequestMetrics.java| 47 
 1 file changed, 47 insertions(+)
--




[6/6] git commit: Merge branch 'cassandra-2.1' into trunk

2014-10-16 Thread brandonwilliams
Merge branch 'cassandra-2.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/543fbc37
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/543fbc37
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/543fbc37

Branch: refs/heads/trunk
Commit: 543fbc37460dee8cb6a5fee82e2c88f574091136
Parents: 1d05e8e 014d328
Author: Brandon Williams 
Authored: Thu Oct 16 12:16:31 2014 -0500
Committer: Brandon Williams 
Committed: Thu Oct 16 12:16:31 2014 -0500

--
 .../metrics/CASClientRequestMetrics.java| 47 
 1 file changed, 47 insertions(+)
--




[4/6] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2014-10-16 Thread brandonwilliams
Merge branch 'cassandra-2.0' into cassandra-2.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/014d328f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/014d328f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/014d328f

Branch: refs/heads/trunk
Commit: 014d328f4b370ae72360a5dd4aa34a9010bee0d2
Parents: 46c4f0b 67db1bf
Author: Brandon Williams 
Authored: Thu Oct 16 12:16:23 2014 -0500
Committer: Brandon Williams 
Committed: Thu Oct 16 12:16:23 2014 -0500

--
 .../metrics/CASClientRequestMetrics.java| 47 
 1 file changed, 47 insertions(+)
--




[3/6] git commit: add missing file

2014-10-16 Thread brandonwilliams
add missing file


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/67db1bf2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/67db1bf2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/67db1bf2

Branch: refs/heads/trunk
Commit: 67db1bf2786679b71dad8a32894346abdf59206e
Parents: e938839
Author: Brandon Williams 
Authored: Thu Oct 16 12:16:16 2014 -0500
Committer: Brandon Williams 
Committed: Thu Oct 16 12:16:16 2014 -0500

--
 .../metrics/CASClientRequestMetrics.java| 47 
 1 file changed, 47 insertions(+)
--





[1/6] git commit: add missing file

2014-10-16 Thread brandonwilliams
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.0 e93883988 -> 67db1bf27
  refs/heads/cassandra-2.1 46c4f0b58 -> 014d328f4
  refs/heads/trunk 1d05e8ea7 -> 543fbc374


add missing file


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/67db1bf2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/67db1bf2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/67db1bf2

Branch: refs/heads/cassandra-2.0
Commit: 67db1bf2786679b71dad8a32894346abdf59206e
Parents: e938839
Author: Brandon Williams 
Authored: Thu Oct 16 12:16:16 2014 -0500
Committer: Brandon Williams 
Committed: Thu Oct 16 12:16:16 2014 -0500

--
 .../metrics/CASClientRequestMetrics.java| 47 
 1 file changed, 47 insertions(+)
--





[jira] [Commented] (CASSANDRA-8110) Make streaming backwards compatible

2014-10-16 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173962#comment-14173962
 ] 

Aleksey Yeschenko commented on CASSANDRA-8110:
--

Yup. And even if 7460 hadn't happened, we still can't stream new sstables to 
the older nodes that cannot read those sstables. This ticket covers that too.

> Make streaming backwards compatible
> ---
>
> Key: CASSANDRA-8110
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8110
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Eriksson
> Fix For: 3.0
>
>
> To be able to seamlessly upgrade clusters we need to make it possible to 
> stream files between nodes with different StreamMessage.CURRENT_VERSION



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8110) Make streaming backwards compatible

2014-10-16 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173957#comment-14173957
 ] 

Marcus Eriksson commented on CASSANDRA-8110:


We version the stream protocol separately, meaning we can have the same 
streaming protocol across different major versions. This ticket should make it 
possible to stream between different stream message versions whenever we add 
new stuff there (the thing that broke it between 2.1 and 3.0 was adding the 
SSTable level in the FileMessageHeader, CASSANDRA-7460 ).
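Backwards-compatible streaming then comes down to serializing per peer: write only the fields the remote stream version understands, and fill defaults when reading from an older peer. A toy sketch of that pattern (field names borrowed from the discussion above; version numbers and the JSON encoding are invented for illustration):

```python
import json

# Illustrative stream-message versions (numbers are made up).
VERSION_20 = 2   # header without an sstable level
VERSION_30 = 3   # header gained sstableLevel (cf. CASSANDRA-7460)

def serialize_header(header, version):
    """Emit only the fields the peer's stream version understands."""
    out = {"cfId": header["cfId"],
           "sequenceNumber": header["sequenceNumber"]}
    if version >= VERSION_30:
        out["sstableLevel"] = header.get("sstableLevel", 0)
    return json.dumps(out)

def deserialize_header(data, version):
    """Fill a sane default for fields the older peer never sent."""
    header = json.loads(data)
    if version < VERSION_30:
        header["sstableLevel"] = 0  # default when streaming from old nodes
    return header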

> Make streaming backwards compatible
> ---
>
> Key: CASSANDRA-8110
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8110
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Eriksson
> Fix For: 3.0
>
>
> To be able to seamlessly upgrade clusters we need to make it possible to 
> stream files between nodes with different StreamMessage.CURRENT_VERSION



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7510) Notify clients that bootstrap is finished over binary protocol

2014-10-16 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173956#comment-14173956
 ] 

Brandon Williams commented on CASSANDRA-7510:
-

bq. Does the proposed change of sending the STATUS_NORMAL state later also mean 
that until that time the node does not show up in the system.peers table?

Yes.

> Notify clients that bootstrap is finished over binary protocol
> --
>
> Key: CASSANDRA-7510
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7510
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Joost Reuzel
>Assignee: Brandon Williams
>Priority: Minor
> Fix For: 2.0.11
>
> Attachments: 7510.txt
>
>
> Currently, Cassandra will notify clients when a new node is added to a 
> cluster. However, that node is typically not usable yet. It first needs to 
> gossip its key range and finish loading all its assigned data before it 
> allows clients to connect. Depending on the amount of data this may take 
> quite a while. In the meantime, the clients have no clue about the bootstrap 
> status of that node. The only thing they can do is periodically check whether it 
> will accept a connection. 
> My proposal would be to send an additional UP event when the bootstrap is 
> done, this allows clients to mark the node initially as down/unavailable and 
> simply wait for the UP event to arrive.
> Kind regards,
> Joost
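On the client side, the proposal amounts to tracking membership and availability separately: a newly announced node is known but unusable until the UP event arrives. A toy sketch of that client logic (hypothetical event names modeled on the binary-protocol topology/status events):

```python
class NodeTracker:
    """Toy client-side cluster view per the proposal: a newly announced
    node stays unavailable until an UP event arrives for it."""

    def __init__(self):
        self.available = {}

    def on_event(self, event, node):
        if event == "NEW_NODE":
            self.available[node] = False  # known, but still bootstrapping
        elif event == "UP":
            self.available[node] = True   # bootstrap done, safe to connect
        elif event == "DOWN":
            self.available[node] = False

    def usable_nodes(self):
        return [n for n, up in self.available.items() if up]
```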



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-6430) DELETE with IF = clause doesn't work properly if more then one row are going to be deleted

2014-10-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-6430:
---
Assignee: Tyler Hobbs

> DELETE with IF = clause doesn't work properly if more then one 
> row are going to be deleted
> 
>
> Key: CASSANDRA-6430
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6430
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Dmitriy Ukhlov
>Assignee: Tyler Hobbs
> Fix For: 2.0.11
>
>
> CREATE TABLE test(key int, sub_key int, value text, PRIMARY KEY(key, sub_key) 
> );
> INSERT INTO test(key, sub_key, value) VALUES(1,1, '1.1');
> INSERT INTO test(key, sub_key, value) VALUES(1,2, '1.2');
> INSERT INTO test(key, sub_key, value) VALUES(1,3, '1.3');
> SELECT * from test;
>  key | sub_key | value
> -+-+---
>1 |   1 |   1.1
>1 |   2 |   1.2
>1 |   3 |   1.3
> DELETE FROM test WHERE key=1 IF value='1.2';
>  [applied]
> ---
>  False <=== I guess second row should be removed
> SELECT * from test;
>  key | sub_key | value
> -+-+---
>1 |   1 |   1.1
>1 |   2 |   1.2
>1 |   3 |   1.3
> (3 rows) 
> DELETE FROM test WHERE key=1;
> SELECT * from test;
> (0 rows)  <=== all rows were removed: OK
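
The semantics the reporter expects can be sketched as a toy in-memory model. This illustrates only the *expected* behavior from the transcript above (delete exactly the rows whose value matches, and report whether any did); it is not a model of Cassandra's actual lightweight-transaction implementation, and `delete_if` is a hypothetical helper.

```python
# Toy model of the reporter's expectation: DELETE ... IF value='1.2'
# should remove only the rows whose value matches, and [applied] should
# reflect whether any row matched. Not Cassandra's actual LWT behavior.

rows = [
    {"key": 1, "sub_key": 1, "value": "1.1"},
    {"key": 1, "sub_key": 2, "value": "1.2"},
    {"key": 1, "sub_key": 3, "value": "1.3"},
]

def delete_if(rows, key, expected_value):
    """Delete rows in partition `key` where value == expected_value.
    Returns (applied, remaining_rows)."""
    matching = [r for r in rows
                if r["key"] == key and r["value"] == expected_value]
    remaining = [r for r in rows if r not in matching]
    return bool(matching), remaining

applied, remaining = delete_if(rows, 1, "1.2")
print(applied)                             # True: one row matched
print([r["sub_key"] for r in remaining])   # [1, 3]
```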



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-6568) sstables incorrectly getting marked as "not live"

2014-10-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-6568.
---
   Resolution: Cannot Reproduce
Fix Version/s: (was: 2.0.11)

I'm going to resolve as Cannot Reproduce for now, then.

> sstables incorrectly getting marked as "not live"
> -
>
> Key: CASSANDRA-6568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6568
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 1.2.12 with several 1.2.13 patches
>Reporter: Chris Burroughs
>Assignee: Marcus Eriksson
> Attachments: 0001-add-jmx-method-to-get-non-active-sstables.patch
>
>
> {noformat}
> -rw-rw-r-- 14 cassandra cassandra 1.4G Nov 25 19:46 
> /data/sstables/data/ks/cf/ks-cf-ic-402383-Data.db
> -rw-rw-r-- 14 cassandra cassandra  13G Nov 26 00:04 
> /data/sstables/data/ks/cf/ks-cf-ic-402430-Data.db
> -rw-rw-r-- 14 cassandra cassandra  13G Nov 26 05:03 
> /data/sstables/data/ks/cf/ks-cf-ic-405231-Data.db
> -rw-rw-r-- 31 cassandra cassandra  21G Nov 26 08:38 
> /data/sstables/data/ks/cf/ks-cf-ic-405232-Data.db
> -rw-rw-r--  2 cassandra cassandra 2.6G Dec  3 13:44 
> /data/sstables/data/ks/cf/ks-cf-ic-434662-Data.db
> -rw-rw-r-- 14 cassandra cassandra 1.5G Dec  5 09:05 
> /data/sstables/data/ks/cf/ks-cf-ic-438698-Data.db
> -rw-rw-r--  2 cassandra cassandra 3.1G Dec  6 12:10 
> /data/sstables/data/ks/cf/ks-cf-ic-440983-Data.db
> -rw-rw-r--  2 cassandra cassandra  96M Dec  8 01:52 
> /data/sstables/data/ks/cf/ks-cf-ic-444041-Data.db
> -rw-rw-r--  2 cassandra cassandra 3.3G Dec  9 16:37 
> /data/sstables/data/ks/cf/ks-cf-ic-451116-Data.db
> -rw-rw-r--  2 cassandra cassandra 876M Dec 10 11:23 
> /data/sstables/data/ks/cf/ks-cf-ic-453552-Data.db
> -rw-rw-r--  2 cassandra cassandra 891M Dec 11 03:21 
> /data/sstables/data/ks/cf/ks-cf-ic-454518-Data.db
> -rw-rw-r--  2 cassandra cassandra 102M Dec 11 12:27 
> /data/sstables/data/ks/cf/ks-cf-ic-455429-Data.db
> -rw-rw-r--  2 cassandra cassandra 906M Dec 11 23:54 
> /data/sstables/data/ks/cf/ks-cf-ic-455533-Data.db
> -rw-rw-r--  1 cassandra cassandra 214M Dec 12 05:02 
> /data/sstables/data/ks/cf/ks-cf-ic-456426-Data.db
> -rw-rw-r--  1 cassandra cassandra 203M Dec 12 10:49 
> /data/sstables/data/ks/cf/ks-cf-ic-456879-Data.db
> -rw-rw-r--  1 cassandra cassandra  49M Dec 12 12:03 
> /data/sstables/data/ks/cf/ks-cf-ic-456963-Data.db
> -rw-rw-r-- 18 cassandra cassandra  20G Dec 25 01:09 
> /data/sstables/data/ks/cf/ks-cf-ic-507770-Data.db
> -rw-rw-r--  3 cassandra cassandra  12G Jan  8 04:22 
> /data/sstables/data/ks/cf/ks-cf-ic-567100-Data.db
> -rw-rw-r--  3 cassandra cassandra 957M Jan  8 22:51 
> /data/sstables/data/ks/cf/ks-cf-ic-569015-Data.db
> -rw-rw-r--  2 cassandra cassandra 923M Jan  9 17:04 
> /data/sstables/data/ks/cf/ks-cf-ic-571303-Data.db
> -rw-rw-r--  1 cassandra cassandra 821M Jan 10 08:20 
> /data/sstables/data/ks/cf/ks-cf-ic-574642-Data.db
> -rw-rw-r--  1 cassandra cassandra  18M Jan 10 08:48 
> /data/sstables/data/ks/cf/ks-cf-ic-574723-Data.db
> {noformat}
> I tried to do a user defined compaction on sstables from November and got "it 
> is not an active sstable".  Live sstable count from jmx was about 7 while on 
> disk there were over 20.  Live vs total size showed about a ~50 GiB 
> difference.
> Forcing a gc from jconsole had no effect.  However, restarting the node 
> resulted in live sstables/bytes *increasing* to match what was on disk.  User 
> compaction could now compact the November sstables.  This cluster was last 
> restarted in mid December.
> I'm not sure what effect "not live" had on other operations of the cluster.  
> From the logs it seems that the files were sent at least at some point as 
> part of repair, but I don't know if they were being used for read 
> requests or not.  Because the problem that got me looking in the first place 
> was poor performance, I suspect they were used for reads (and the reads were 
> slow because so many sstables were being read).  I presume, based on their age, 
> that at the least they were being excluded from compaction.
> I'm not aware of any isLive() or getRefCount() to programmatically confirm 
> which nodes have this problem.  In this cluster almost all columns have a 14 
> day TTL; based on the number of nodes with November sstables, it appears to be 
> occurring on a significant fraction of the nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8110) Make streaming backwards compatible

2014-10-16 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173953#comment-14173953
 ] 

Aleksey Yeschenko commented on CASSANDRA-8110:
--

[~jbellis] this ticket is a superset of CASSANDRA-5772. 5772 allows streaming 
previous-version sstables between two current-version nodes, but not streaming 
(of anything) between two differently versioned nodes.

> Make streaming backwards compatible
> ---
>
> Key: CASSANDRA-8110
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8110
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Eriksson
> Fix For: 3.0
>
>
> To be able to seamlessly upgrade clusters we need to make it possible to 
> stream files between nodes with different StreamMessage.CURRENT_VERSION



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-6430) DELETE with IF = clause doesn't work properly if more then one row are going to be deleted

2014-10-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-6430:
---
Fix Version/s: 2.0.11

> DELETE with IF = clause doesn't work properly if more then one 
> row are going to be deleted
> 
>
> Key: CASSANDRA-6430
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6430
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Dmitriy Ukhlov
> Fix For: 2.0.11
>
>
> CREATE TABLE test(key int, sub_key int, value text, PRIMARY KEY(key, sub_key) 
> );
> INSERT INTO test(key, sub_key, value) VALUES(1,1, '1.1');
> INSERT INTO test(key, sub_key, value) VALUES(1,2, '1.2');
> INSERT INTO test(key, sub_key, value) VALUES(1,3, '1.3');
> SELECT * from test;
>  key | sub_key | value
> -+-+---
>1 |   1 |   1.1
>1 |   2 |   1.2
>1 |   3 |   1.3
> DELETE FROM test WHERE key=1 IF value='1.2';
>  [applied]
> ---
>  False <=== I guess second row should be removed
> SELECT * from test;
>  key | sub_key | value
> -+-+---
>1 |   1 |   1.1
>1 |   2 |   1.2
>1 |   3 |   1.3
> (3 rows) 
> DELETE FROM test WHERE key=1;
> SELECT * from test;
> (0 rows)  <=== all rows were removed: OK



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-7292) Can't seed new node into ring with (public) ip of an old node

2014-10-16 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams resolved CASSANDRA-7292.
-
Resolution: Duplicate

Closing in favor of 8072 since Ryan can reproduce reliably with a clean setup.

> Can't seed new node into ring with (public) ip of an old node
> -
>
> Key: CASSANDRA-7292
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7292
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Cassandra 2.0.7, Ec2MultiRegionSnitch
>Reporter: Juho Mäkinen
>Assignee: Brandon Williams
>  Labels: bootstrap, gossip
> Attachments: cassandra-replace-address.log
>
>
> This bug prevents a node from bootstrapping back into the cluster with its 
> old IP.
> Scenario: a five-node EC2 cluster spread across three AZs, all in one region. 
> I'm using Ec2MultiRegionSnitch. Nodes are reported with their public IPs (as 
> Ec2MultiRegionSnitch requires).
> I simulated the loss of one node by terminating one instance. nodetool status 
> correctly reported that the node was down. Then I launched a new instance with 
> the old public IP (I'm using elastic IPs) with 
> "-Dcassandra.replace_address=IP_ADDRESS", but the new node can't join the 
> cluster:
>  INFO 07:20:43,424 Gathering node replacement information for /54.86.191.30
>  INFO 07:20:43,428 Starting Messaging Service on port 9043
>  INFO 07:20:43,489 Handshaking version with /54.86.171.10
>  INFO 07:20:43,491 Handshaking version with /54.86.187.245
> (some delay)
> ERROR 07:21:14,445 Exception encountered during startup
> java.lang.RuntimeException: Unable to gossip with any seeds
>   at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1193)
>   at 
> org.apache.cassandra.service.StorageService.prepareReplacementInfo(StorageService.java:419)
>   at 
> org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:650)
>   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:612)
>   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:505)
>   at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:362)
>   at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:480)
>   at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:569)
> It does not help if I remove the "-Dcassandra.replace_address=IP_ADDRESS" 
> system property. 
> Also it does not help to remove the node with "nodetool removenode", with or 
> without the cassandra.replace_address property.
> I think this is because the node information is preserved in the gossip info, 
> as seen in this output of "nodetool gossipinfo":
> /54.86.191.30
>   INTERNAL_IP:172.16.1.231
>   DC:us-east
>   REMOVAL_COORDINATOR:REMOVER,d581309a-8610-40d4-ba30-cb250eda22a8
>   STATUS:removed,19311925-46b5-4fe4-928a-321e8adb731d,1401089960664
>   HOST_ID:19311925-46b5-4fe4-928a-321e8adb731d
>   RPC_ADDRESS:0.0.0.0
>   NET_VERSION:7
>   SCHEMA:226f9315-b4b2-32c1-bfe1-f4bb49fccfd5
>   RACK:1b
>   LOAD:7.075290515E9
>   SEVERITY:0.0
>   RELEASE_VERSION:2.0.7
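
For anyone scripting against output like the gossipinfo block above, it can be parsed with a few lines. This is a rough sketch based only on the shape shown in the report (endpoint lines start with "/", state lines are KEY:VALUE); real `nodetool gossipinfo` output may vary across versions and include multiple endpoint sections.

```python
def parse_gossipinfo(text):
    """Parse gossipinfo-style output into {endpoint: {key: value}}.

    Sketch based on the format in the report: endpoint lines start with
    '/', state lines are KEY:VALUE. Values may themselves contain ':'
    (e.g. STATUS), so split only on the first colon.
    """
    info, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("/"):
            current = line
            info[current] = {}
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            info[current][key] = value
    return info


sample = """/54.86.191.30
  INTERNAL_IP:172.16.1.231
  DC:us-east
  STATUS:removed,19311925-46b5-4fe4-928a-321e8adb731d,1401089960664
  RELEASE_VERSION:2.0.7
"""
parsed = parse_gossipinfo(sample)
print(parsed["/54.86.191.30"]["DC"])  # us-east
```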



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

