[jira] [Commented] (CASSANDRA-10068) Batchlog replay fails with exception after a node is decommissioned

2015-08-17 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700737#comment-14700737
 ] 

Joel Knighton commented on CASSANDRA-10068:
---

[~krummas] Instructions on setting up the environment are available at 
https://github.com/riptano/jepsen/tree/cassandra/cassandra.

Specifically, the test under consideration can be run as 
{code}
lein with-profile +trunk test :only 
cassandra.mv-test/mv-crash-subset-decommission
{code}

That said, I understand the environment setup is a bit laborious, and I'm still 
working on reproducing this with the provided dtest.

> Batchlog replay fails with exception after a node is decommissioned
> ---
>
> Key: CASSANDRA-10068
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10068
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Joel Knighton
>Assignee: Marcus Eriksson
> Fix For: 3.0 beta 2
>
> Attachments: n1.log, n2.log, n3.log, n4.log, n5.log
>
>
> This issue is reproducible through a Jepsen test of materialized views that 
> crashes and decommissions nodes throughout the test.
> At the conclusion of the test, a batchlog replay is initiated through 
> nodetool and hits the following assertion due to a missing host ID: 
> https://github.com/apache/cassandra/blob/3413e557b95d9448b0311954e9b4f53eaf4758cd/src/java/org/apache/cassandra/service/StorageProxy.java#L1197
> A nodetool status on the node with failed batchlog replay shows the following 
> entry for the decommissioned node:
> {noformat}
> DN  10.0.0.5  ?  256  ?   null  rack1
> {noformat}
> On the unaffected nodes, there is no entry for the decommissioned node as 
> expected.
> There are occasional hits of the same assertion in the logs of other nodes; 
> it looks like the issue might occasionally resolve itself, but one node seems 
> to have the errant null entry indefinitely.
> In the logs of the nodes, this possibly unrelated exception also appears:
> {noformat}
> java.lang.RuntimeException: Trying to get the view natural endpoint on a 
> non-data replica
>   at 
> org.apache.cassandra.db.view.MaterializedViewUtils.getViewNaturalEndpoint(MaterializedViewUtils.java:91)
>  ~[apache-cassandra-3.0.0-alpha1-SNAPSHOT.jar:3.0.0-alpha1-SNAPSHOT]
> {noformat}
> I have a running cluster with the issue on my machine; it is also repeatable.
> Nothing stands out in the logs of the decommissioned node (n4) for me. The 
> logs of each node in the cluster are attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9434) If a node loses schema_columns SSTables it could delete all secondary indexes from the schema

2015-08-17 Thread Richard Low (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700697#comment-14700697
 ] 

Richard Low commented on CASSANDRA-9434:


Thanks Aleksey. So it sounds like we should close this as "behaves correctly"?

> If a node loses schema_columns SSTables it could delete all secondary indexes 
> from the schema
> -
>
> Key: CASSANDRA-9434
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9434
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Richard Low
>Assignee: Aleksey Yeschenko
> Fix For: 2.0.x
>
>
> It is possible that a single bad node can delete all secondary indexes if it 
> restarts and cannot read its schema_columns SSTables. Here's a reproduction:
> * Create a 2 node cluster (we saw it on 2.0.11)
> * Create the schema:
> {code}
> create keyspace myks with replication = {'class':'SimpleStrategy', 
> 'replication_factor':1};
> use myks;
> create table mytable (a text, b text, c text, PRIMARY KEY (a, b) );
> create index myindex on mytable(b);
> {code}
> NB: the index must be on a clustering column to reproduce.
> * Kill one node
> * Wipe its commitlog and system/schema_columns sstables.
> * Start it again
> * Run on this node:
> {code}
> select index_name from system.schema_columns where keyspace_name = 'myks' and 
> columnfamily_name = 'mytable' and column_name = 'b';
> {code}
> and you'll see the index is null.
> * Run 'describe schema' on the other node. Sometimes it will not show the 
> index, but you might need to bounce for it to disappear.
> I think the culprit is SystemKeyspace.copyAllAliasesToColumnsProper.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8630) Faster sequential IO (on compaction, streaming, etc)

2015-08-17 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700684#comment-14700684
 ] 

Stefania commented on CASSANDRA-8630:
-

Thanks for your analysis.

I repeated the tests, 3 identical runs each time, albeit with a smaller data 
set. They still indicate that it is the uncompressed case where something has 
gone wrong, not the compressed case. More specifically, I traced the slowness 
to mmap disk access.

Here are the results. Because I am on a 64-bit machine, 
{{disk_access_mode=auto}} resolves to {{mmap}} (although I am not sure in which 
version this behavior started, so it may not hold for all versions). In the 
'uncomp-std' test I forced the disk access mode to standard.

||Version||Run 1||Run 2||Run 3||Rounded AVG||
|8630 comp|17.91|18.31|17.94|18|
|8630 uncomp|28.06|28.95|28.02|28|
|8630 uncomp-std|19.31|18.09|18.9|19|
|TRUNK comp|17.95|17.64|17.72|18|
|TRUNK uncomp|20.81|20.01|18.81|20|
|2.2 comp|19.95|20.33|19.97|20|
|2.2 uncomp|19.14|19.18|20.1|19|
|2.1 comp|21.61|20.43|20.43|21|
|2.1 uncomp|20.4|19.67|19.71|20|
|2.0 comp|18.8|19.42|19.66|19|
|2.0 uncomp|19.48|19.55|19.68|20|

Notes:
* Reduced the data to 1M entries, which corresponds to approximately 220 MB of 
data. This allowed me to keep the machine _more or less_ idle during the tests.
* All tests were done with Java 8 update 51, except for 2.0, which was done 
with Java 7 update 80.
* Tests were performed on a 64-bit Linux laptop with an SSD.
* The compaction strategy was the default used by the stress tool: 
SizeTieredCompactionStrategy.
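For reference, the 'uncomp-std' runs correspond to forcing the access mode via 
{{disk_access_mode}} in cassandra.yaml (a sketch; note this setting is absent 
or commented out in the default yaml of some versions):

```yaml
# cassandra.yaml (sketch): force standard (non-mmap) reads for data files.
# 'auto' resolves to mmap on 64-bit JVMs; other accepted values include
# mmap and mmap_index_only.
disk_access_mode: standard
```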

Next I need to understand why mmap is so slow; I suspect I broke something 
when I moved the segments to the RAR.

bq. I usually set the file read and write and contention thresholds to one 
millisecond.

What parameters do you use to achieve this?

> Faster sequential IO (on compaction, streaming, etc)
> 
>
> Key: CASSANDRA-8630
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8630
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core, Tools
>Reporter: Oleg Anastasyev
>Assignee: Stefania
>  Labels: compaction, performance
> Fix For: 3.x
>
> Attachments: 8630-FasterSequencialReadsAndWrites.txt, cpu_load.png, 
> flight_recorder_001_files.tar.gz, flight_recorder_002_files.tar.gz
>
>
> When a node is doing a lot of sequential IO (streaming, compacting, etc.), a 
> lot of CPU is lost in calls to RAF's int read() and DataOutputStream's 
> write(int).
> This is because the default implementations of readShort, readLong, etc., as 
> well as their matching write* methods, are built from numerous byte-by-byte 
> reads and writes.
> This makes a lot of syscalls as well.
> A quick microbenchmark shows that just reimplementing these methods gives an 
> 8x speed increase.
> The attached patch implements the RandomAccessReader.read and 
> SequentialWriter.write methods more efficiently.
> I also eliminated some extra byte copies in CompositeType.split and 
> ColumnNameHelper.maxComponents, which were on my profiler's hotspot method 
> list during tests.
> Stress tests on my laptop show that this patch makes compaction 25-30% 
> faster on uncompressed sstables and 15% faster on compressed ones.
> A deployment to production shows much less CPU load for compaction. 
> (I attached a CPU load graph from one of our production clusters; orange is 
> niced CPU load, i.e. compaction; yellow is user, i.e. tasks not related to 
> compaction.)
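The byte-by-byte pattern described above, and the bulk alternative, can be 
sketched as follows (illustrative code only, not the attached patch):

```java
import java.nio.ByteBuffer;

// Sketch: the generic DataInput.readLong() contract assembles a long from
// eight single-byte reads, while a buffered reader can decode it from an
// in-memory buffer in one step, avoiding per-byte call overhead.
public class ReadLongSketch {
    // Byte-by-byte assembly, as the default DataInput implementations do.
    static long readLongByteByByte(byte[] src, int off) {
        long v = 0;
        for (int i = 0; i < 8; i++)
            v = (v << 8) | (src[off + i] & 0xFFL);
        return v;
    }

    // Bulk decode from an in-memory buffer, as a buffered reader can.
    static long readLongBuffered(byte[] src, int off) {
        return ByteBuffer.wrap(src, off, 8).getLong();
    }

    public static void main(String[] args) {
        byte[] buf = ByteBuffer.allocate(8).putLong(0x1122334455667788L).array();
        if (readLongByteByByte(buf, 0) != readLongBuffered(buf, 0))
            throw new AssertionError("decoders disagree");
        System.out.println("ok");
    }
}
```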



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10086) Add a "CLEAR" cqlsh command to clear the console

2015-08-17 Thread Paul O'Fallon (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700652#comment-14700652
 ] 

Paul O'Fallon commented on CASSANDRA-10086:
---

Thanks!  Yes, I can certainly reorder the changes to cqlsh.  I'll also see what 
I can do regarding the tests.  I'll update the issue later in the week.

> Add a "CLEAR" cqlsh command to clear the console
> 
>
> Key: CASSANDRA-10086
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10086
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Paul O'Fallon
>Priority: Trivial
>  Labels: cqlsh, doc-impacting
> Attachments: 10086.txt
>
>
> It would be very helpful to have a "CLEAR" command to clear the cqlsh 
> console.  I learned (after researching a patch for this) that lowercase 
> CTRL+L will clear the screen, but having a discrete command would make that 
> more obvious.  To match the expectations of Windows users, an alias to "CLS" 
> would be nice as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-10113) Undroppable messages can be dropped if message queue gets large

2015-08-17 Thread Yuki Morishita (JIRA)
Yuki Morishita created CASSANDRA-10113:
--

 Summary: Undroppable messages can be dropped if message queue gets 
large
 Key: CASSANDRA-10113
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10113
 Project: Cassandra
  Issue Type: Bug
Reporter: Yuki Morishita
Assignee: Yuki Morishita
Priority: Minor


When outgoing messages are queued, OutboundTcpConnection checks the size of 
backlog, and [if it gets more than 1024, it drops expired messages silently 
from the 
backlog|https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/net/OutboundTcpConnection.java#L150].

{{expireMessages()}} just checks the message's timeout, which can be 
{{request_timeout_in_ms}} (10 seconds by default) for non-read/write messages, 
and does not check whether the message is droppable.
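A sketch of the kind of check being suggested, expiry that respects 
droppability, might look like this (names and structure are illustrative, not 
the actual OutboundTcpConnection code):

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.Queue;

// Hypothetical sketch: when trimming a large backlog, drop only messages
// that are both expired AND marked droppable, instead of dropping on
// timeout alone.
public class BacklogExpirySketch {
    static class QueuedMessage {
        final boolean droppable;
        final long enqueuedAtMillis;
        final long timeoutMillis;

        QueuedMessage(boolean droppable, long enqueuedAtMillis, long timeoutMillis) {
            this.droppable = droppable;
            this.enqueuedAtMillis = enqueuedAtMillis;
            this.timeoutMillis = timeoutMillis;
        }

        boolean isTimedOut(long nowMillis) {
            return nowMillis - enqueuedAtMillis > timeoutMillis;
        }
    }

    // Returns how many messages were dropped from the backlog.
    static int expireMessages(Queue<QueuedMessage> backlog, long nowMillis) {
        int dropped = 0;
        for (Iterator<QueuedMessage> it = backlog.iterator(); it.hasNext(); ) {
            QueuedMessage m = it.next();
            if (m.droppable && m.isTimedOut(nowMillis)) {
                it.remove();
                dropped++;
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        Queue<QueuedMessage> backlog = new ArrayDeque<>();
        backlog.add(new QueuedMessage(true, 0, 10_000));   // droppable, expired
        backlog.add(new QueuedMessage(false, 0, 10_000));  // undroppable, expired
        int dropped = expireMessages(backlog, 20_000);
        System.out.println(dropped + " dropped, " + backlog.size() + " kept");
        // prints: 1 dropped, 1 kept
    }
}
```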




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10112) Apply disk_failure_policy to transaction logs

2015-08-17 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700562#comment-14700562
 ] 

Stefania commented on CASSANDRA-10112:
--

If we stashed corrupt sstable files regardless of transactions, then we would 
also fulfill the requirements of CASSANDRA-9812.

> Apply disk_failure_policy to transaction logs
> -
>
> Key: CASSANDRA-10112
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10112
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Stefania
>Assignee: Stefania
>
> Transaction logs were introduced by CASSANDRA-7066 and are read during 
> start-up. In case of file system errors, such as disk corruption, we 
> currently log a panic error and leave the sstable files and transaction logs 
> as they are; this is to avoid rolling back a transaction (i.e. deleting 
> files) by mistake.
> We should instead look at the {{disk_failure_policy}} and refuse to start 
> unless the failure policy is {{ignore}}. 
> We should also consider stashing files that cannot be read during startup, 
> either transaction logs or sstables, by moving them to a dedicated 
> sub-folder. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10112) Apply disk_failure_policy to transaction logs

2015-08-17 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700537#comment-14700537
 ] 

Stefania commented on CASSANDRA-10112:
--

I think we should stash files when the {{disk_failure_policy}} is {{ignore}}, 
or we could add a new policy for this. I'm not sure about the case where we 
refuse to start, though. Perhaps in that case we should add this functionality 
to the offline sstable utility tool and let the operator either clean up or 
stash.

> Apply disk_failure_policy to transaction logs
> -
>
> Key: CASSANDRA-10112
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10112
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Stefania
>Assignee: Stefania
>
> Transaction logs were introduced by CASSANDRA-7066 and are read during 
> start-up. In case of file system errors, such as disk corruption, we 
> currently log a panic error and leave the sstable files and transaction logs 
> as they are; this is to avoid rolling back a transaction (i.e. deleting 
> files) by mistake.
> We should instead look at the {{disk_failure_policy}} and refuse to start 
> unless the failure policy is {{ignore}}. 
> We should also consider stashing files that cannot be read during startup, 
> either transaction logs or sstables, by moving them to a dedicated 
> sub-folder. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers

2015-08-17 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700533#comment-14700533
 ] 

Stefania commented on CASSANDRA-7066:
-

I agree on stashing files with {{ignore}}; I'm not sure about the case where we 
refuse to start, though. Perhaps in that case we should add this functionality 
to the offline sstable utility tool and let the operator either clean up or 
stash. I've opened CASSANDRA-10112; let's move the discussion there. 

[~benedict] could you commit the fix for two COVERITY defects? Commit 
[here|https://github.com/stef1927/cassandra/commit/c94d8a12dc43cd4705f1aa8cf384fb8c3290a5f9].

{code}
** CID 1316515:  FindBugs: Internationalization  (FB.DM_DEFAULT_ENCODING)
/src/java/org/apache/cassandra/db/lifecycle/TransactionLog.java: 363 in 
org.apache.cassandra.db.lifecycle.TransactionLog$TransactionFile.readRecord(java.lang.String,
 boolean)()
357 if (!matcher.matches() || matcher.groupCount() != 2)
358 {
359 handleReadRecordError(String.format("cannot parse line 
\"%s\"", line), isLast);
360 return Record.make(line, isLast);
361 }
362
>>> CID 1316515:  FindBugs: Internationalization  (FB.DM_DEFAULT_ENCODING)
>>> Found reliance on default encoding: String.getBytes().
363 byte[] bytes = matcher.group(1).getBytes();
364 checksum.update(bytes, 0, bytes.length);
365
366 if (checksum.getValue() != Long.valueOf(matcher.group(2)))
367 handleReadRecordError(String.format("invalid line 
checksum %s for \"%s\"", matcher.group(2), line), isLast);
368

** CID 1316514:  FindBugs: Internationalization  (FB.DM_DEFAULT_ENCODING)
/src/java/org/apache/cassandra/db/lifecycle/TransactionLog.java: 231 in 
org.apache.cassandra.db.lifecycle.TransactionLog$Record.getBytes()()
225 {
226 return String.format("%s:[%s,%d,%d]", type.toString(), 
relativeFilePath, updateTime, numFiles);
227 }
228
229 public byte[] getBytes()
230 {
>>> CID 1316514:  FindBugs: Internationalization  (FB.DM_DEFAULT_ENCODING)
>>> Found reliance on default encoding: String.getBytes().
231 return record.getBytes();
232 }
233
234 public boolean verify(String parentFolder, boolean 
lastRecordIsCorrupt)
235 {
236 if (type != RecordType.REMOVE)
{code}
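The standard remedy for these FB.DM_DEFAULT_ENCODING findings is to pass an 
explicit charset to String.getBytes(), so the checksum bytes are stable across 
platform default encodings. A minimal sketch of the pattern (not the committed 
patch):

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Sketch: checksum a log record's text through an explicit charset so the
// result does not depend on the JVM's platform default encoding.
public class ChecksumEncodingSketch {
    static long checksumOf(String record) {
        CRC32 checksum = new CRC32();
        byte[] bytes = record.getBytes(StandardCharsets.UTF_8); // was getBytes()
        checksum.update(bytes, 0, bytes.length);
        return checksum.getValue();
    }

    public static void main(String[] args) {
        // Hypothetical record text, for illustration only.
        System.out.println(checksumOf("REMOVE:[ma-1-big,1439856000,8]"));
    }
}
```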

> Simplify (and unify) cleanup of compaction leftovers
> 
>
> Key: CASSANDRA-7066
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Assignee: Stefania
>Priority: Minor
>  Labels: benedict-to-commit, compaction
> Fix For: 3.0 alpha 1
>
> Attachments: 7066.txt
>
>
> Currently we manage a list of in-progress compactions in a system table, 
> which we use to cleanup incomplete compactions when we're done. The problem 
> with this is that 1) it's a bit clunky (and leaves us in positions where we 
> can unnecessarily clean up completed files, or conversely not clean up files 
> that have been superseded); and 2) it's only used for a regular compaction - 
> no other compaction types are guarded in the same way, so can result in 
> duplication if we fail before deleting the replacements.
> I'd like to see each sstable store in its metadata its direct ancestors, and 
> on startup we simply delete any sstables that occur in the union of all 
> ancestor sets. This way as soon as we finish writing we're capable of 
> cleaning up any leftovers, so we never get duplication. It's also much easier 
> to reason about.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-10112) Apply disk_failure_policy to transaction logs

2015-08-17 Thread Stefania (JIRA)
Stefania created CASSANDRA-10112:


 Summary: Apply disk_failure_policy to transaction logs
 Key: CASSANDRA-10112
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10112
 Project: Cassandra
  Issue Type: Improvement
Reporter: Stefania
Assignee: Stefania


Transaction logs were introduced by CASSANDRA-7066 and are read during 
start-up. In case of file system errors, such as disk corruption, we currently 
log a panic error and leave the sstable files and transaction logs as they are; 
this is to avoid rolling back a transaction (i.e. deleting files) by mistake.

We should instead look at the {{disk_failure_policy}} and refuse to start 
unless the failure policy is {{ignore}}. 

We should also consider stashing files that cannot be read during startup, 
either transaction logs or sstables, by moving them to a dedicated sub-folder. 





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10109) Windows dtest 3.0: ttl_test.py failures

2015-08-17 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700480#comment-14700480
 ] 

Stefania commented on CASSANDRA-10109:
--

This is bad news. As far as I understand the documentation, this means that on 
Windows we cannot list the files in a directory atomically; see the third 
paragraph 
[here|http://docs.oracle.com/javase/7/docs/api/java/nio/file/Files.html#newDirectoryStream(java.nio.file.Path)].

So we could list some sstable temporary files but not the txn log file; a 
racing thread then deletes them along with their txn log file, and because we 
failed to list the txn log file we incorrectly classify those sstable files as 
final. However, these files shouldn't exist any longer, since the txn log is 
deleted last, so this would result in NoSuchFileExceptions when trying to read 
the files.

I think we should check that all final files exist before returning them, and 
repeat the process if some files no longer exist. This should only be done 
when we don't have atomic listing.
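A sketch of that check-and-retry listing (illustrative only, not the eventual 
patch):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Sketch: when the platform lacks SecureDirectoryStream, list the directory,
// then confirm every returned file still exists. If any vanished, a concurrent
// delete raced the listing, so list again.
public class RetryingListerSketch {
    static List<Path> listFiles(Path dir, int maxAttempts) throws IOException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            List<Path> files = new ArrayList<>();
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
                for (Path p : stream)
                    files.add(p);
            }
            boolean allPresent = true;
            for (Path p : files) {
                if (!Files.exists(p)) {
                    allPresent = false;  // a racing thread deleted it; retry
                    break;
                }
            }
            if (allPresent)
                return files;  // consistent snapshot of the directory
        }
        throw new IOException("directory contents kept changing: " + dir);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sketch");
        Files.createFile(dir.resolve("ma-1-big-Data.db"));
        System.out.println(listFiles(dir, 3).size() + " file(s)");
        // prints: 1 file(s)
    }
}
```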

[~benedict] do you think this would be enough or do you see other potential 
races?


> Windows dtest 3.0: ttl_test.py failures
> ---
>
> Key: CASSANDRA-10109
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10109
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Joshua McKenzie
>  Labels: Windows
> Fix For: 3.0.x
>
>
> ttl_test.py:TestTTL.update_column_ttl_with_default_ttl_test2
> ttl_test.py:TestTTL.update_multiple_columns_ttl_test
> ttl_test.py:TestTTL.update_single_column_ttl_test
> Errors locally are different from those on CI yesterday. Yesterday on CI we 
> had timeouts and general node hangs. Today, on all 3 tests when run locally, I see:
> {noformat}
> Traceback (most recent call last):
>   File "c:\src\cassandra-dtest\dtest.py", line 532, in tearDown
> raise AssertionError('Unexpected error in %s node log: %s' % (node.name, 
> errors))
> AssertionError: Unexpected error in node1 node log: ['ERROR [main] 2015-08-17 
> 16:53:43,120 NoSpamLogger.java:97 - This platform does not support atomic 
> directory streams (SecureDirectoryStream); race conditions when loading 
> sstable files could occurr']
> {noformat}
> This traces back to the commit for CASSANDRA-7066 today by [~Stefania] and 
> [~benedict].  Stefania - care to take this ticket and also look further into 
> whether or not we're going to have issues with 7066 on Windows? That error 
> message certainly *sounds* like it's not a good thing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (CASSANDRA-10109) Windows dtest 3.0: ttl_test.py failures

2015-08-17 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania reassigned CASSANDRA-10109:


Assignee: Stefania

> Windows dtest 3.0: ttl_test.py failures
> ---
>
> Key: CASSANDRA-10109
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10109
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Joshua McKenzie
>Assignee: Stefania
>  Labels: Windows
> Fix For: 3.0.x
>
>
> ttl_test.py:TestTTL.update_column_ttl_with_default_ttl_test2
> ttl_test.py:TestTTL.update_multiple_columns_ttl_test
> ttl_test.py:TestTTL.update_single_column_ttl_test
> Errors locally are different from those on CI yesterday. Yesterday on CI we 
> had timeouts and general node hangs. Today, on all 3 tests when run locally, I see:
> {noformat}
> Traceback (most recent call last):
>   File "c:\src\cassandra-dtest\dtest.py", line 532, in tearDown
> raise AssertionError('Unexpected error in %s node log: %s' % (node.name, 
> errors))
> AssertionError: Unexpected error in node1 node log: ['ERROR [main] 2015-08-17 
> 16:53:43,120 NoSpamLogger.java:97 - This platform does not support atomic 
> directory streams (SecureDirectoryStream); race conditions when loading 
> sstable files could occurr']
> {noformat}
> This traces back to the commit for CASSANDRA-7066 today by [~Stefania] and 
> [~benedict].  Stefania - care to take this ticket and also look further into 
> whether or not we're going to have issues with 7066 on Windows? That error 
> message certainly *sounds* like it's not a good thing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8630) Faster sequential IO (on compaction, streaming, etc)

2015-08-17 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700433#comment-14700433
 ] 

Ariel Weisberg edited comment on CASSANDRA-8630 at 8/17/15 11:43 PM:
-

The difference in performance seems too big to be explained by what is covered 
here. Maybe a native call like compression/decompression got slower due to 
additional copying? Flight Recorder happily ignores native code.

Flight recording of 8630 with compression, hot packages
||Package|Sample Count|Percentage(%)||
|org.apache.cassandra.db.rows|1,577|25.403|
|org.apache.cassandra.utils|1,498|24.13|
|org.apache.cassandra.utils.btree|670|10.793|
|com.googlecode.concurrentlinkedhashmap|598|9.633|
|java.util|585|9.423|
|org.apache.cassandra.io.sstable|430|6.927|
|org.apache.cassandra.db.partitions|183|2.948|
|org.apache.cassandra.cache|162|2.61|
|org.apache.cassandra.io.util|139|2.239|
|org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$$Lambda$93|77|1.24|
|org.apache.cassandra.db|74|1.192|

Flight recording trunk, hot packages
||Package|Sample Count|Percentage(%)||
|org.apache.cassandra.utils|1,771|26.732|
|org.apache.cassandra.db.rows|1,599|24.136|
|com.googlecode.concurrentlinkedhashmap|631|9.525|
|java.util|590|8.906|
|org.apache.cassandra.utils.btree|565|8.528|
|org.apache.cassandra.io.sstable|438|6.611|
|org.apache.cassandra.io.util|330|4.981|
|org.apache.cassandra.db.partitions|124|1.872|
|org.apache.cassandra.cache|121|1.826|
|org.apache.cassandra.io.sstable.format.big|105|1.585|
|org.apache.cassandra.db|102|1.54|


was (Author: aweisberg):
Flight recording of 8630 with compression, hot packages
||Package|Sample Count|Percentage(%)||
|org.apache.cassandra.db.rows|1,577|25.403|
|org.apache.cassandra.utils|1,498|24.13|
|org.apache.cassandra.utils.btree|670|10.793|
|com.googlecode.concurrentlinkedhashmap|598|9.633|
|java.util|585|9.423|
|org.apache.cassandra.io.sstable|430|6.927|
|org.apache.cassandra.db.partitions|183|2.948|
|org.apache.cassandra.cache|162|2.61|
|org.apache.cassandra.io.util|139|2.239|
|org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$$Lambda$93|77|1.24|
|org.apache.cassandra.db|74|1.192|

Flight recording trunk, hot packages
||Package|Sample Count|Percentage(%)||
|org.apache.cassandra.utils|1,771|26.732|
|org.apache.cassandra.db.rows|1,599|24.136|
|com.googlecode.concurrentlinkedhashmap|631|9.525|
|java.util|590|8.906|
|org.apache.cassandra.utils.btree|565|8.528|
|org.apache.cassandra.io.sstable|438|6.611|
|org.apache.cassandra.io.util|330|4.981|
|org.apache.cassandra.db.partitions|124|1.872|
|org.apache.cassandra.cache|121|1.826|
|org.apache.cassandra.io.sstable.format.big|105|1.585|
|org.apache.cassandra.db|102|1.54|

> Faster sequential IO (on compaction, streaming, etc)
> 
>
> Key: CASSANDRA-8630
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8630
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core, Tools
>Reporter: Oleg Anastasyev
>Assignee: Stefania
>  Labels: compaction, performance
> Fix For: 3.x
>
> Attachments: 8630-FasterSequencialReadsAndWrites.txt, cpu_load.png, 
> flight_recorder_001_files.tar.gz, flight_recorder_002_files.tar.gz
>
>
> When a node is doing a lot of sequential IO (streaming, compacting, etc.), a 
> lot of CPU is lost in calls to RAF's int read() and DataOutputStream's 
> write(int).
> This is because the default implementations of readShort, readLong, etc., as 
> well as their matching write* methods, are built from numerous byte-by-byte 
> reads and writes.
> This makes a lot of syscalls as well.
> A quick microbenchmark shows that just reimplementing these methods gives an 
> 8x speed increase.
> The attached patch implements the RandomAccessReader.read and 
> SequentialWriter.write methods more efficiently.
> I also eliminated some extra byte copies in CompositeType.split and 
> ColumnNameHelper.maxComponents, which were on my profiler's hotspot method 
> list during tests.
> Stress tests on my laptop show that this patch makes compaction 25-30% 
> faster on uncompressed sstables and 15% faster on compressed ones.
> A deployment to production shows much less CPU load for compaction. 
> (I attached a CPU load graph from one of our production clusters; orange is 
> niced CPU load, i.e. compaction; yellow is user, i.e. tasks not related to 
> compaction.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8630) Faster sequential IO (on compaction, streaming, etc)

2015-08-17 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700433#comment-14700433
 ] 

Ariel Weisberg commented on CASSANDRA-8630:
---

Flight recording of 8630 with compression, hot packages
||Package|Sample Count|Percentage(%)||
|org.apache.cassandra.db.rows|1,577|25.403|
|org.apache.cassandra.utils|1,498|24.13|
|org.apache.cassandra.utils.btree|670|10.793|
|com.googlecode.concurrentlinkedhashmap|598|9.633|
|java.util|585|9.423|
|org.apache.cassandra.io.sstable|430|6.927|
|org.apache.cassandra.db.partitions|183|2.948|
|org.apache.cassandra.cache|162|2.61|
|org.apache.cassandra.io.util|139|2.239|
|org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$$Lambda$93|77|1.24|
|org.apache.cassandra.db|74|1.192|

Flight recording trunk, hot packages
||Package|Sample Count|Percentage(%)||
|org.apache.cassandra.utils|1,771|26.732|
|org.apache.cassandra.db.rows|1,599|24.136|
|com.googlecode.concurrentlinkedhashmap|631|9.525|
|java.util|590|8.906|
|org.apache.cassandra.utils.btree|565|8.528|
|org.apache.cassandra.io.sstable|438|6.611|
|org.apache.cassandra.io.util|330|4.981|
|org.apache.cassandra.db.partitions|124|1.872|
|org.apache.cassandra.cache|121|1.826|
|org.apache.cassandra.io.sstable.format.big|105|1.585|
|org.apache.cassandra.db|102|1.54|

> Faster sequential IO (on compaction, streaming, etc)
> 
>
> Key: CASSANDRA-8630
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8630
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core, Tools
>Reporter: Oleg Anastasyev
>Assignee: Stefania
>  Labels: compaction, performance
> Fix For: 3.x
>
> Attachments: 8630-FasterSequencialReadsAndWrites.txt, cpu_load.png, 
> flight_recorder_001_files.tar.gz, flight_recorder_002_files.tar.gz
>
>
> When a node is doing a lot of sequential IO (streaming, compacting, etc.), a 
> lot of CPU is lost in calls to RAF's int read() and DataOutputStream's 
> write(int).
> This is because the default implementations of readShort, readLong, etc., as 
> well as their matching write* methods, are built from numerous byte-by-byte 
> reads and writes.
> This makes a lot of syscalls as well.
> A quick microbenchmark shows that just reimplementing these methods gives an 
> 8x speed increase.
> The attached patch implements the RandomAccessReader.read and 
> SequentialWriter.write methods more efficiently.
> I also eliminated some extra byte copies in CompositeType.split and 
> ColumnNameHelper.maxComponents, which were on my profiler's hotspot method 
> list during tests.
> Stress tests on my laptop show that this patch makes compaction 25-30% 
> faster on uncompressed sstables and 15% faster on compressed ones.
> A deployment to production shows much less CPU load for compaction. 
> (I attached a CPU load graph from one of our production clusters; orange is 
> niced CPU load, i.e. compaction; yellow is user, i.e. tasks not related to 
> compaction.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8671) Give compaction strategy more control over where sstables are created, including for flushing and streaming.

2015-08-17 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700430#comment-14700430
 ] 

Blake Eggleston commented on CASSANDRA-8671:


Here's the branch: https://github.com/bdeggleston/cassandra/tree/8671-2

and the last test runs: 
http://cassci.datastax.com/job/bdeggleston-8671-2-testall/2/
http://cassci.datastax.com/job/bdeggleston-8671-2-dtest/2/

Doesn't look like there's anything failing that isn't already failing on 
cassandra-3.0.

> Give compaction strategy more control over where sstables are created, 
> including for flushing and streaming.
> 
>
> Key: CASSANDRA-8671
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8671
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
> Fix For: 3.x
>
> Attachments: 
> 0001-C8671-creating-sstable-writers-for-flush-and-stream-.patch, 
> 8671-giving-compaction-strategies-more-control-over.txt
>
>
> This would enable routing different partitions to different disks based on 
> some user defined parameters.
> My initial take on how to do this would be to make an interface from 
> SSTableWriter, and have a table's compaction strategy do all SSTableWriter 
> instantiation. Compaction strategies could then implement their own 
> SSTableWriter implementations (which basically wrap one or more normal 
> sstablewriters) for compaction, flushing, and streaming. 





[jira] [Commented] (CASSANDRA-8630) Faster sequential IO (on compaction, streaming, etc)

2015-08-17 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700401#comment-14700401
 ] 

Ariel Weisberg commented on CASSANDRA-8630:
---

||Version|Time 1|Time 2|Time 3|
|8630 uncompressed|197|204|198|
|8630 compressed|263|262|261|
|3.x uncompressed|199|198|198|
|3.x compressed|200|198|198|

My intuition is that the compressed case has something bad happening, and that 
there is no impact from the changes in the uncompressed case. That kind of 
suggests the time/bottleneck is elsewhere. I am looking at the flight 
recordings now.

Did you measure on OS X or Linux? FYI, I usually set the file read/write and 
contention thresholds to one millisecond. It doesn't seem to impact performance, 
but it does provide a clearer picture.

> Faster sequential IO (on compaction, streaming, etc)
> 
>
> Key: CASSANDRA-8630
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8630
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core, Tools
>Reporter: Oleg Anastasyev
>Assignee: Stefania
>  Labels: compaction, performance
> Fix For: 3.x
>
> Attachments: 8630-FasterSequencialReadsAndWrites.txt, cpu_load.png, 
> flight_recorder_001_files.tar.gz, flight_recorder_002_files.tar.gz
>
>
> When a node is doing a lot of sequential IO (streaming, compacting, etc.), a lot 
> of CPU is lost in calls to RAF's int read() and DataOutputStream's write(int).
> This is because the default implementations of readShort, readLong, etc., as 
> well as their matching write* methods, are built on numerous byte-by-byte 
> reads and writes.
> This incurs a lot of syscalls as well.
> A quick microbenchmark shows that just reimplementing these methods gives an 
> 8x speed increase.
> The attached patch implements the RandomAccessReader.read and 
> SequentialWriter.write methods in a more efficient way.
> I also eliminated some extra byte copies in CompositeType.split and 
> ColumnNameHelper.maxComponents, which were on my profiler's hotspot method 
> list during tests.
> Stress tests on my laptop show that this patch makes compaction 25-30% 
> faster on uncompressed sstables and 15% faster for compressed ones.
> A deployment to production shows much lower CPU load for compaction. 
> (I attached a CPU load graph from one of our production clusters; orange is 
> niced CPU load, i.e. compaction; yellow is user, i.e. non-compaction tasks.)





[jira] [Commented] (CASSANDRA-9738) Migrate key-cache to be fully off-heap

2015-08-17 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700387#comment-14700387
 ] 

Ariel Weisberg commented on CASSANDRA-9738:
---

Good stuff. Coverage of the OHC key cache provider looks good.

* How are the 2i paths tested?
* The null case in makeVal isn't tested; maybe not that interesting.
* SerializationHeader.forKeyCache is racy and can result in an undersized array 
clobbering a properly sized one. But... it doesn't retrieve the value it sets, 
so odds are it will eventually work out to be the longer one. It works; it's 
just intentionally racy.
* In CacheService, does that comment about the singleton weigher even make sense 
anymore?
* NIODataInputStream has a derived class, DataInputBuffer, that exposes the 
constructor you made public.
* The string encoding and decoding helpers you wrote seem like they should be 
factored out somewhere else, maybe ByteBufferUtil? Also, you don't specify a 
string encoding, and there may be some issues with the serialized size of 
non-Latin characters lurking as well.
* An enhancement we can file for later is to replace those strings with vints 
that reference a map of possible table names. For persistence we should 
definitely fully qualify, but in memory we can store more entries that way.
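The intentionally-racy pattern described for forKeyCache can be sketched as benign lazy publication (hypothetical names; this is not the actual SerializationHeader code): several threads may compute and store the value concurrently, and the race is tolerated because every value ever stored is equivalent for readers.

```java
import java.util.Arrays;
import java.util.List;

public class RacyCache {
    private final List<String> source = Arrays.asList("a", "b", "c");
    private String[] cached; // no volatile, no lock: intentionally racy

    // Racy lazy init in the spirit of String.hashCode() caching: two threads
    // may both compute and both store. The pattern tolerates this because
    // every stored value is equivalent, and each caller returns the local
    // reference it computed or read, never re-reading the shared field.
    String[] names() {
        String[] local = cached;
        if (local == null) {
            local = source.toArray(new String[0]);
            cached = local; // last writer wins; all candidates are equivalent
        }
        return local;
    }

    public static void main(String[] args) {
        System.out.println(String.join(",", new RacyCache().names()));
    }
}
```

The review's point is that forKeyCache is this idiom plus one wrinkle: the candidate values are not identical (arrays of different sizes), so a clobber is observable even if harmless in practice.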

> Migrate key-cache to be fully off-heap
> --
>
> Key: CASSANDRA-9738
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9738
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Robert Stupp
>Assignee: Robert Stupp
> Fix For: 3.0 beta 2
>
>
> Key cache still uses a concurrent map on-heap. This could go to off-heap and 
> feels doable now after CASSANDRA-8099.
> Evaluation should be done in advance based on a POC to prove that pure 
> off-heap counter cache buys a performance and/or gc-pressure improvement.
> In theory, elimination of on-heap management of the map should buy us some 
> benefit.





[jira] [Updated] (CASSANDRA-10111) reconnecting snitch can bypass cluster name check

2015-08-17 Thread Chris Burroughs (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Burroughs updated CASSANDRA-10111:

Summary: reconnecting snitch can bypass cluster name check  (was: 
reconnecting snitch can bypass name check)

> reconnecting snitch can bypass cluster name check
> -
>
> Key: CASSANDRA-10111
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10111
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 2.0.x
>Reporter: Chris Burroughs
>  Labels: gossip
>
> Setup:
>  * Two clusters: A & B
>  * Both are two DC cluster
>  * Both use GossipingPropertyFileSnitch with different 
> listen_address/broadcast_address
> A new node was added to cluster A with a broadcast_address of an existing 
> node in cluster B (due to an out-of-date DNS entry). Cluster B added all of 
> the nodes from cluster A, somehow bypassing the cluster name mismatch check 
> for these nodes. The first reference to cluster A nodes in cluster B's logs is 
> when they were added:
> {noformat}
>  INFO [GossipStage:1] 2015-08-17 15:08:33,858 Gossiper.java (line 983) Node 
> /8.37.70.168 is now part of the cluster
> {noformat}
> Cluster B nodes then tried to gossip to cluster A nodes, but cluster A kept 
> them out with 'ClusterName mismatch'. Cluster B nonetheless tried to send 
> reads/writes to cluster A and general mayhem ensued.
> Obviously this is a Bad (TM) config that Should Not Be Done. However, since 
> the consequences of wrongly merged clusters are really bad (the reason the 
> name mismatch check exists in the first place), I think the hole is reasonable 
> to plug. I'm not sure exactly what the code path is that skips the check in 
> GossipDigestSynVerbHandler.





[jira] [Created] (CASSANDRA-10111) reconnecting snitch can bypass name check

2015-08-17 Thread Chris Burroughs (JIRA)
Chris Burroughs created CASSANDRA-10111:
---

 Summary: reconnecting snitch can bypass name check
 Key: CASSANDRA-10111
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10111
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 2.0.x
Reporter: Chris Burroughs


Setup:
 * Two clusters: A & B
 * Both are two DC cluster
 * Both use GossipingPropertyFileSnitch with different 
listen_address/broadcast_address

A new node was added to cluster A with a broadcast_address of an existing node 
in cluster B (due to an out-of-date DNS entry).  Cluster B added all of the 
nodes from cluster A, somehow bypassing the cluster name mismatch check for 
these nodes.  The first reference to cluster A nodes in cluster B's logs is when 
they were added:
{noformat}
 INFO [GossipStage:1] 2015-08-17 15:08:33,858 Gossiper.java (line 983) Node 
/8.37.70.168 is now part of the cluster
{noformat}

Cluster B nodes then tried to gossip to cluster A nodes, but cluster A kept 
them out with 'ClusterName mismatch'.  Cluster B nonetheless tried to send 
reads/writes to cluster A and general mayhem ensued.

Obviously this is a Bad (TM) config that Should Not Be Done.  However, since 
the consequences of wrongly merged clusters are really bad (the reason the name 
mismatch check exists in the first place), I think the hole is reasonable to 
plug.  I'm not sure exactly what the code path is that skips the check in 
GossipDigestSynVerbHandler.
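The intended guard can be sketched as follows (a hypothetical stand-in for the check GossipDigestSynVerbHandler performs; names and shape are illustrative, not Cassandra's actual code). The bug report suggests some code path in cluster B reached "add node" without passing through this kind of check.

```java
import java.util.Objects;

public class ClusterNameCheck {
    // Illustrative only: on receiving a gossip SYN, compare the sender's
    // cluster name against our own before processing its digests.
    static boolean acceptSyn(String localClusterName, String remoteClusterName) {
        if (!Objects.equals(localClusterName, remoteClusterName)) {
            // Cluster A's nodes logged this kind of rejection; the hole is
            // that some path in cluster B apparently skipped it.
            System.out.println("ClusterName mismatch: " + remoteClusterName);
            return false; // drop the message; do not add the node to the ring
        }
        return true; // safe to process digests from this peer
    }

    public static void main(String[] args) {
        System.out.println(acceptSyn("ClusterA", "ClusterA")); // true
        System.out.println(acceptSyn("ClusterA", "ClusterB")); // false
    }
}
```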





[jira] [Commented] (CASSANDRA-9434) If a node loses schema_columns SSTables it could delete all secondary indexes from the schema

2015-08-17 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700301#comment-14700301
 ] 

Aleksey Yeschenko commented on CASSANDRA-9434:
--

So, the good news is that this issue will not happen in 2.1, 2.2, or 3.0. In 
those we assume that this migration had been performed in 2.0 already. 
Furthermore, in 3.0 the indexes are kept in a totally separate table from 
columns.

The bad news is that 2.0 is EOL and that I don't know a solid heuristic for 
determining whether or not we have this data missing. It's possible for a 
pre-upgrade 2.0 node to have a completely empty {{system.schema_columns}} table 
(apart from the system tables' own columns) if no {{REGULAR}} columns were 
defined on any of the tables.

> If a node loses schema_columns SSTables it could delete all secondary indexes 
> from the schema
> -
>
> Key: CASSANDRA-9434
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9434
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Richard Low
>Assignee: Aleksey Yeschenko
> Fix For: 2.0.x
>
>
> It is possible that a single bad node can delete all secondary indexes if it 
> restarts and cannot read its schema_columns SSTables. Here's a reproduction:
> * Create a 2 node cluster (we saw it on 2.0.11)
> * Create the schema:
> {code}
> create keyspace myks with replication = {'class':'SimpleStrategy', 
> 'replication_factor':1};
> use myks;
> create table mytable (a text, b text, c text, PRIMARY KEY (a, b) );
> create index myindex on mytable(b);
> {code}
> NB: the index must be on a clustering column to reproduce
> * Kill one node
> * Wipe its commitlog and system/schema_columns sstables.
> * Start it again
> * Run on this node
> select index_name from system.schema_columns where keyspace_name = 'myks' and 
> columnfamily_name = 'mytable' and column_name = 'b';
> and you'll see the index is null.
> * Run 'describe schema' on the other node. Sometimes it will not show the 
> index, but you might need to bounce for it to disappear.
> I think the culprit is SystemKeyspace.copyAllAliasesToColumnsProper.





[jira] [Commented] (CASSANDRA-8716) "java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed" when running cleanup

2015-08-17 Thread Evin Callahan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700286#comment-14700286
 ] 

Evin Callahan commented on CASSANDRA-8716:
--

+1 for a workaround

> "java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory 
> was freed" when running cleanup
> --
>
> Key: CASSANDRA-8716
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8716
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Centos 6.6, Cassandra 2.0.12, Oracle JDK 1.7.0_67
>Reporter: Imri Zvik
>Assignee: Robert Stupp
>Priority: Minor
>  Labels: qa-resolved
> Fix For: 2.0.13
>
> Attachments: 8716.txt, system.log.gz
>
>
> {code}Error occurred during cleanup
> java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was 
> freed
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:188)
> at 
> org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:234)
> at 
> org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:272)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:1115)
> at 
> org.apache.cassandra.service.StorageService.forceKeyspaceCleanup(StorageService.java:2177)
> at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
> at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
> at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
> at 
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
> at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
> at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
> at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
> at 
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
> at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
> at sun.rmi.transport.Transport$1.run(Transport.java:177)
> at sun.rmi.transport.Transport$1.run(Transport.java:174)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
> at 
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.AssertionError: Memory was freed
> at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259)
> at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211)
> at 
> org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79)
> at 
> org.apache.cassandra.io.ss

[jira] [Commented] (CASSANDRA-6717) Modernize schema tables

2015-08-17 Thread Sam Tunnicliffe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700242#comment-14700242
 ] 

Sam Tunnicliffe commented on CASSANDRA-6717:


I've fixed BatchLogManagerTest and a number of the thrift dtests. I've also 
updated the bundled java driver with the latest from the 
[JAVA-875|https://datastax-oss.atlassian.net/browse/JAVA-875] branch, and the 
bundled python driver with the latest from 
[PYTHON-276|https://datastax-oss.atlassian.net/browse/PYTHON-276]. I'll commit 
to 3.0 when cassci is happy.

For future reference, the command to build a source dist of the python driver 
for internal use during dev is {code}python setup.py egg_info -b-`git rev-parse 
--short HEAD` sdist --formats=zip{code}



> Modernize schema tables
> ---
>
> Key: CASSANDRA-6717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6717
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Sylvain Lebresne
>Assignee: Sam Tunnicliffe
>  Labels: client-impacting, doc-impacting
> Fix For: 3.0 beta 1
>
>
> There are a few problems/improvements that can be addressed in the way we store 
> schema:
> # CASSANDRA-4988: as explained on the ticket, storing the comparator is now 
> redundant (or almost: we'd also need to store whether the table is COMPACT or 
> not, which we don't currently do, but that is easy and probably a good idea 
> anyway); it can be entirely reconstructed from the info in schema_columns (the 
> same is true of key_validator and subcomparator, and replacing default_validator 
> by a COMPACT_VALUE column in all cases is relatively simple). And storing the 
> comparator as an opaque string broke concurrent updates of sub-parts of said 
> comparator (concurrent collection addition, or altering 2 separate clustering 
> columns, typically), so it's really worth removing it.
> # CASSANDRA-4603: it's time to get rid of those ugly json maps. I'll note 
> that schema_keyspaces is a problem due to its use of COMPACT STORAGE, but I 
> think we should fix it once and for all nonetheless (see below).
> # For CASSANDRA-6382 and to allow indexing both map keys and values at the 
> same time, we'd need to be able to have more than one index definition for a 
> given column.
> # There are a few mismatches in table options between the ones stored in the 
> schema and the ones used when declaring/altering a table, which would be nice 
> to fix. The compaction, compression and replication maps are ones already 
> mentioned in CASSANDRA-4603, but also, for some reason, 
> 'dclocal_read_repair_chance' in CQL is called just 'local_read_repair_chance' 
> in the schema table, and 'min/max_compaction_threshold' are column family 
> options in the schema but just compaction options in CQL (which makes more 
> sense).
> None of those issues are major, and we could probably deal with them 
> independently but it might be simpler to just fix them all in one shot so I 
> wanted to sum them all up here. In particular, the fact that 
> 'schema_keyspaces' uses COMPACT STORAGE is annoying (for the replication map, 
> but it may limit future stuff too) which suggest we should migrate it to a 
> new, non COMPACT table. And while that's arguably a detail, it wouldn't hurt 
> to rename schema_columnfamilies to schema_tables for the years to come since 
> that's the preferred vernacular for CQL.
> Overall, what I would suggest is to move all schema tables to a new keyspace, 
> named 'schema' for instance (or 'system_schema' but I prefer the shorter 
> version), and fix all the issues above at once. Since we currently don't 
> exchange schema between nodes of different versions, all we'd need to do that 
> is a one shot startup migration, and overall, I think it could be simpler for 
> clients to deal with one clear migration than to have to handle minor 
> individual changes all over the place. I also think it's somewhat cleaner 
> conceptually to have schema tables in their own keyspace since they are 
> replicated through a different mechanism than other system tables.
> If we do that, we could, for instance, migrate to the following schema tables 
> (details up for discussion of course):
> {noformat}
> CREATE TYPE user_type (
>   name text,
>   column_names list,
>   column_types list
> )
> CREATE TABLE keyspaces (
>   name text PRIMARY KEY,
>   durable_writes boolean,
>   replication map,
>   user_types map
> )
> CREATE TYPE trigger_definition (
>   name text,
>   options map
> )
> CREATE TABLE tables (
>   keyspace text,
>   name text,
>   id uuid,
>   table_type text, // COMPACT, CQL or SUPER
>   dropped_columns map,
>   triggers map,
>   // options
>   comment text,
>   compaction map,
>   compression map,
>   read_repair_chance double,
>   dclocal_read_repair_chance double,
>   gc_grace_seconds int,
>   caching text,
>   rows_per

[jira] [Commented] (CASSANDRA-8630) Faster sequential IO (on compaction, streaming, etc)

2015-08-17 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700234#comment-14700234
 ] 

Ariel Weisberg commented on CASSANDRA-8630:
---

I have an empty box I can run it on. Which compaction strategy are you taking 
those numbers from? When I run the test, it runs 3 times, once for each 
strategy.


> Faster sequential IO (on compaction, streaming, etc)
> 
>
> Key: CASSANDRA-8630
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8630
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core, Tools
>Reporter: Oleg Anastasyev
>Assignee: Stefania
>  Labels: compaction, performance
> Fix For: 3.x
>
> Attachments: 8630-FasterSequencialReadsAndWrites.txt, cpu_load.png, 
> flight_recorder_001_files.tar.gz, flight_recorder_002_files.tar.gz
>
>
> When a node is doing a lot of sequential IO (streaming, compacting, etc.), a lot 
> of CPU is lost in calls to RAF's int read() and DataOutputStream's write(int).
> This is because the default implementations of readShort, readLong, etc., as 
> well as their matching write* methods, are built on numerous byte-by-byte 
> reads and writes.
> This incurs a lot of syscalls as well.
> A quick microbenchmark shows that just reimplementing these methods gives an 
> 8x speed increase.
> The attached patch implements the RandomAccessReader.read and 
> SequentialWriter.write methods in a more efficient way.
> I also eliminated some extra byte copies in CompositeType.split and 
> ColumnNameHelper.maxComponents, which were on my profiler's hotspot method 
> list during tests.
> Stress tests on my laptop show that this patch makes compaction 25-30% 
> faster on uncompressed sstables and 15% faster for compressed ones.
> A deployment to production shows much lower CPU load for compaction. 
> (I attached a CPU load graph from one of our production clusters; orange is 
> niced CPU load, i.e. compaction; yellow is user, i.e. non-compaction tasks.)





[jira] [Updated] (CASSANDRA-10102) java.lang.UnsupportedOperationException after upgrade to 3.0

2015-08-17 Thread Russ Hatch (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Russ Hatch updated CASSANDRA-10102:
---
Summary: java.lang.UnsupportedOperationException after upgrade to 3.0  
(was: java.lang.UnsupportedOperationException after upgrade to 3.0 alpha1)

> java.lang.UnsupportedOperationException after upgrade to 3.0
> 
>
> Key: CASSANDRA-10102
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10102
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Russ Hatch
> Attachments: node1.log, node2.log, node3.log
>
>
> Upgrade tests are showing a potential issue. I'm seeing this during rolling 
> upgrades to 3.0 alpha 1, after one node has been upgraded to the alpha.
> I will attach cassandra logs here, node1.log is where most of the failures 
> are seen.
> {noformat}
> ERROR [MessagingService-Incoming-/127.0.0.1] 2015-08-17 12:22:06,888 
> CassandraDaemon.java:189 - Exception in thread 
> Thread[MessagingService-Incoming-/127.0.0.1,5,main]
> java.lang.UnsupportedOperationException: null
> at 
> org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:485)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:444)
>  ~[main/:na]
> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:98) 
> ~[main/:na]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:195)
>  ~[main/:na]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:172)
>  ~[main/:na]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:90)
>  ~[main/:na]
> INFO  [GossipStage:1] 2015-08-17 12:22:06,914 StorageService.java:1886 - Node 
> /127.0.0.2 state jump to normal
> ERROR [MessagingService-Incoming-/127.0.0.1] 2015-08-17 12:22:06,915 
> CassandraDaemon.java:189 - Exception in thread 
> Thread[MessagingService-Incoming-/127.0.0.1,5,main]
> java.lang.UnsupportedOperationException: null
> at 
> org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:485)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:444)
>  ~[main/:na]
> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:98) 
> ~[main/:na]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:195)
>  ~[main/:na]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:172)
>  ~[main/:na]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:90)
>  ~[main/:na]
> {noformat}
> Another exception showing in logs:
> {noformat}
> ERROR [SharedPool-Worker-1] 2015-08-17 12:22:19,358 ErrorMessage.java:336 - 
> Unexpected exception during request
> java.lang.UnsupportedOperationException: Version is 9
> at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.serializedSize(PartitionUpdate.java:760)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.Mutation$MutationSerializer.serializedSize(Mutation.java:334)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.Mutation$MutationSerializer.serializedSize(Mutation.java:246)
>  ~[main/:na]
> at 
> org.apache.cassandra.net.MessageOut.payloadSize(MessageOut.java:166) 
> ~[main/:na]
> at 
> org.apache.cassandra.net.OutboundTcpConnectionPool.getConnection(OutboundTcpConnectionPool.java:67)
>  ~[main/:na]
> at 
> org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:587)
>  ~[main/:na]
> at 
> org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:737)
>  ~[main/:na]
> at 
> org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:702) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:1084)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy$2.apply(StorageProxy.java:125) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:942) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:549) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:720)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:613)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:599)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processStat

[jira] [Created] (CASSANDRA-10110) Windows dtest 3.0: udtencoding_test.py:TestUDTEncoding.udt_test

2015-08-17 Thread Joshua McKenzie (JIRA)
Joshua McKenzie created CASSANDRA-10110:
---

 Summary: Windows dtest 3.0: 
udtencoding_test.py:TestUDTEncoding.udt_test
 Key: CASSANDRA-10110
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10110
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Joshua McKenzie
 Fix For: 3.0.x


Currently broken by CASSANDRA-7066 (thus depending on CASSANDRA-10109). Error 
message from CI yesterday was:
{noformat}
File "D:\Python27\lib\unittest\case.py", line 329, in run
testMethod()
  File 
"D:\jenkins\workspace\cassandra-3.0_dtest_win32\cassandra-dtest\udtencoding_test.py",
 line 15, in udt_test
cluster.populate(3).start()
  File "build\bdist.win-amd64\egg\ccmlib\cluster.py", line 249, in start
p = node.start(update_pid=False, jvm_args=jvm_args, 
profile_options=profile_options)
  File "build\bdist.win-amd64\egg\ccmlib\node.py", line 447, in start
common.check_socket_available(itf)
  File "build\bdist.win-amd64\egg\ccmlib\common.py", line 343, in 
check_socket_available
raise UnavailableSocketError("Inet address %s:%s is not available: %s" % 
(addr, port, msg))
'Inet address 127.0.0.1:7000 is not available: [Errno 10013] An attempt was 
made to access a socket in a way forbidden by its access 
permissions\n >> begin captured logging << 
\ndtest: DEBUG: cluster ccm directory: 
d:\\temp\\dtest-dpsz3i\n- >> end captured logging << 
-'
{noformat}

Failure history: [regression in build 
#17|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest_win32/lastCompletedBuild/testReport/udtencoding_test/TestUDTEncoding/udt_test/history/].
 Doesn't look like there was any real change to explain that though.

Env: Not sure if this reproduces locally, since the CASSANDRA-7066 error is in the way. 





[jira] [Created] (CASSANDRA-10109) Windows dtest 3.0: ttl_test.py failures

2015-08-17 Thread Joshua McKenzie (JIRA)
Joshua McKenzie created CASSANDRA-10109:
---

 Summary: Windows dtest 3.0: ttl_test.py failures
 Key: CASSANDRA-10109
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10109
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Joshua McKenzie
 Fix For: 3.0.x


ttl_test.py:TestTTL.update_column_ttl_with_default_ttl_test2
ttl_test.py:TestTTL.update_multiple_columns_ttl_test
ttl_test.py:TestTTL.update_single_column_ttl_test

Errors locally are different from those on CI yesterday. Yesterday on CI we had 
timeouts and general node hangs. Today, on all 3 tests when run locally, I see:
{noformat}
Traceback (most recent call last):
  File "c:\src\cassandra-dtest\dtest.py", line 532, in tearDown
raise AssertionError('Unexpected error in %s node log: %s' % (node.name, 
errors))
AssertionError: Unexpected error in node1 node log: ['ERROR [main] 2015-08-17 
16:53:43,120 NoSpamLogger.java:97 - This platform does not support atomic 
directory streams (SecureDirectoryStream); race conditions when loading sstable 
files could occurr']
{noformat}

This traces back to the commit for CASSANDRA-7066 today by [~Stefania] and 
[~benedict].  Stefania - care to take this ticket and also look further into 
whether or not we're going to have issues with 7066 on Windows? That error 
message certainly *sounds* like it's not a good thing.





[jira] [Commented] (CASSANDRA-9917) MVs should validate gc grace seconds on the tables involved

2015-08-17 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700217#comment-14700217
 ] 

Aleksey Yeschenko commented on CASSANDRA-9917:
--

I'm going to be a pedant and bring up another example, just because: you have 
nodes A and B, A has some hints/batches for B; A is down for 1.5 hours, then A 
comes up, but B goes down for 1.5 hours. No single node has been down for 
longer than max hint window, but, assuming gc gs/max hints window of 3 hours, 
none of the batches or hints have survived. You need repair. And with current 
MVs - to drop and recreate the MV?

What I'm saying is that whatever we do, it's going to be broken. Also that 
{{max_hints_window_in_ms}} should not be part of any calculations whatsoever, 
as you can ultimately infer nothing from it.

So let's just validate that it's not set to {{0}} and properly document the 
effects of gc gs in the MV documentation.

> MVs should validate gc grace seconds on the tables involved
> ---
>
> Key: CASSANDRA-9917
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9917
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Aleksey Yeschenko
>Assignee: Paulo Motta
>  Labels: materializedviews
> Fix For: 3.0 beta 2
>
>
> For correctness reasons (potential resurrection of dropped values), batchlog 
> entries are TTLs with the lowest gc grace second of all the tables involved 
> in a batch.
> It means that if gc gs is set to 0 in one of the tables, the batchlog entry 
> will be dead on arrival, and never replayed.
> We should probably warn against such LOGGED writes taking place, in general, 
> but for MVs, we must validate that gc gs on the base table (and on the MV 
> table, if we should allow altering gc gs there at all), is never set too low, 
> or else.
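
The dead-on-arrival behavior described above can be sketched as follows. This is an illustrative model in Python, not Cassandra's actual Java implementation; the function names (`batchlog_ttl`, `validate_mv_gc_grace`) are invented for the example:

```python
# Illustrative model of the batchlog TTL rule described above: a batchlog
# entry is TTL'd with the lowest gc_grace_seconds of all tables in the batch,
# so a single table with gc_grace_seconds = 0 makes the entry dead on arrival.
# All names here are invented for illustration; this is not Cassandra's code.

def batchlog_ttl(gc_grace_seconds_per_table):
    """TTL applied to a batchlog entry for a batch touching these tables."""
    return min(gc_grace_seconds_per_table)

def validate_mv_gc_grace(gc_grace_seconds):
    """The validation proposed in this ticket: reject gc_gs == 0."""
    if gc_grace_seconds == 0:
        raise ValueError("gc_grace_seconds of 0 would make batchlog entries "
                         "dead on arrival and silently break MV updates")

# A batch spanning tables with gc_gs of 864000 and 0 gets TTL 0: the
# batchlog entry expires immediately and is never replayed.
assert batchlog_ttl([864000, 0]) == 0
```

Under this model, validating at DDL time that gc gs is never 0 on MV base tables is the cheap guard the comment above argues for.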





[jira] [Updated] (CASSANDRA-10094) Windows utest 2.2: testCommitLogFailureBeforeInitialization_mustKillJVM failure

2015-08-17 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-10094:

Attachment: 10094-2.2.txt

Backport attached. Also available 
[here|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:10094-2.2].

Test already available:
* [dtest 
results|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-10094-2.2-dtest/lastCompletedBuild/testReport/]
* [utest 
results|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-10094-2.2-testall/lastCompletedBuild/testReport/]

> Windows utest 2.2: testCommitLogFailureBeforeInitialization_mustKillJVM 
> failure
> ---
>
> Key: CASSANDRA-10094
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10094
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Joshua McKenzie
>Assignee: Paulo Motta
>  Labels: Windows
> Fix For: 2.2.x
>
> Attachments: 10094-2.2.txt
>
>
> Error:
> {noformat}
> junit.framework.AssertionFailedError: 
>   at 
> org.apache.cassandra.db.CommitLogFailurePolicyTest.testCommitLogFailureBeforeInitialization_mustKillJVM(CommitLogFailurePolicyTest.java:149)
> {noformat}
> [Failure 
> History|http://cassci.datastax.com/view/cassandra-2.2/job/cassandra-2.2_utest_win32/lastCompletedBuild/testReport/org.apache.cassandra.db/CommitLogFailurePolicyTest/testCommitLogFailureBeforeInitialization_mustKillJVM/history/]:
>   Consistent since build #85
> Env: CI only. Cannot repro locally





[jira] [Created] (CASSANDRA-10108) Windows dtest 3.0: sstablesplit_test.py:TestSSTableSplit.split_test fails

2015-08-17 Thread Joshua McKenzie (JIRA)
Joshua McKenzie created CASSANDRA-10108:
---

 Summary: Windows dtest 3.0: 
sstablesplit_test.py:TestSSTableSplit.split_test fails
 Key: CASSANDRA-10108
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10108
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Joshua McKenzie
 Fix For: 3.0.x


Locally:
{noformat}
-- ma-28-big-Data.db-
Exception in thread "main" java.lang.NoClassDefFoundError: 
org/supercsv/prefs/CsvPreference$Builder
at org.apache.cassandra.config.Config.(Config.java:240)
at 
org.apache.cassandra.config.DatabaseDescriptor.(DatabaseDescriptor.java:105)
at 
org.apache.cassandra.service.StorageService.getPartitioner(StorageService.java:220)
at 
org.apache.cassandra.service.StorageService.(StorageService.java:206)
at 
org.apache.cassandra.service.StorageService.(StorageService.java:211)
at 
org.apache.cassandra.schema.LegacySchemaTables.getSchemaPartitionsForTable(LegacySchemaTables.java:295)
at 
org.apache.cassandra.schema.LegacySchemaTables.readSchemaFromSystemTables(LegacySchemaTables.java:210)
at org.apache.cassandra.config.Schema.loadFromDisk(Schema.java:108)
at 
org.apache.cassandra.tools.StandaloneSplitter.main(StandaloneSplitter.java:58)
Caused by: java.lang.ClassNotFoundException: 
org.supercsv.prefs.CsvPreference$Builder
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 9 more
Number of sstables after split: 1. expected 21.0
{noformat}

on CI:
{noformat}
21.0 not less than or equal to 2

and

[node1 ERROR] Exception calling "CompareTo" with "1" argument(s): "Object must 
be of type 
String."
At D:\temp\dtest-i3xwjx\test\node1\conf\cassandra-env.ps1:336 char:9
+ if ($env:JVM_VERSION.CompareTo("1.8.0_40" -eq -1))
+ ~
+ CategoryInfo  : NotSpecified: (:) [], MethodInvocationException
+ FullyQualifiedErrorId : ArgumentException
-- ma-28-big-Data.db-
{noformat}

Failure history: 
[consistent|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest_win32/lastCompletedBuild/testReport/sstablesplit_test/TestSSTableSplit/split_test/history/]

Env: both CI and local





[jira] [Updated] (CASSANDRA-10107) Windows dtest 3.0: TestScrub and TestScrubIndexes failures

2015-08-17 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-10107:

Summary: Windows dtest 3.0: TestScrub and TestScrubIndexes failures  (was: 
Windows dtest 3.0: )

> Windows dtest 3.0: TestScrub and TestScrubIndexes failures
> --
>
> Key: CASSANDRA-10107
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10107
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Windows dtest 3.0: TestScrub / TestScrubIndexes failures
>Reporter: Joshua McKenzie
>Assignee: Joshua McKenzie
>  Labels: Windows
> Fix For: 3.0.x
>
>
> scrub_test.py:TestScrub.test_standalone_scrub
> scrub_test.py:TestScrub.test_standalone_scrub_essential_files_only
> scrub_test.py:TestScrubIndexes.test_standalone_scrub
> Somewhat different messages between CI and local runs, but consistent within each environment. 
> Locally, I see:
> {noformat}
> dtest: DEBUG: ERROR 20:41:20 This platform does not support atomic directory 
> streams (SecureDirectoryStream); race conditions when loading sstable files 
> could occurr
> {noformat}
> Consistently fails, both on CI and locally.





[jira] [Created] (CASSANDRA-10107) Windows dtest 3.0:

2015-08-17 Thread Joshua McKenzie (JIRA)
Joshua McKenzie created CASSANDRA-10107:
---

 Summary: Windows dtest 3.0: 
 Key: CASSANDRA-10107
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10107
 Project: Cassandra
  Issue Type: Sub-task
  Components: Windows dtest 3.0: TestScrub / TestScrubIndexes failures
Reporter: Joshua McKenzie
Assignee: Joshua McKenzie
 Fix For: 3.0.x


scrub_test.py:TestScrub.test_standalone_scrub
scrub_test.py:TestScrub.test_standalone_scrub_essential_files_only
scrub_test.py:TestScrubIndexes.test_standalone_scrub

Somewhat different messages between CI and local runs, but consistent within each environment. 
Locally, I see:
{noformat}
dtest: DEBUG: ERROR 20:41:20 This platform does not support atomic directory 
streams (SecureDirectoryStream); race conditions when loading sstable files 
could occurr
{noformat}

Consistently fails, both on CI and locally.







[jira] [Commented] (CASSANDRA-10094) Windows utest 2.2: testCommitLogFailureBeforeInitialization_mustKillJVM failure

2015-08-17 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700177#comment-14700177
 ] 

Joshua McKenzie commented on CASSANDRA-10094:
-

Backport and attach here, I'll review and commit.

Sound good?

> Windows utest 2.2: testCommitLogFailureBeforeInitialization_mustKillJVM 
> failure
> ---
>
> Key: CASSANDRA-10094
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10094
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Joshua McKenzie
>Assignee: Paulo Motta
>  Labels: Windows
> Fix For: 2.2.x
>
>
> Error:
> {noformat}
> junit.framework.AssertionFailedError: 
>   at 
> org.apache.cassandra.db.CommitLogFailurePolicyTest.testCommitLogFailureBeforeInitialization_mustKillJVM(CommitLogFailurePolicyTest.java:149)
> {noformat}
> [Failure 
> History|http://cassci.datastax.com/view/cassandra-2.2/job/cassandra-2.2_utest_win32/lastCompletedBuild/testReport/org.apache.cassandra.db/CommitLogFailurePolicyTest/testCommitLogFailureBeforeInitialization_mustKillJVM/history/]:
>   Consistent since build #85
> Env: CI only. Cannot repro locally





[jira] [Commented] (CASSANDRA-10055) High CPU load for Cassandra 2.1.8

2015-08-17 Thread vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700173#comment-14700173
 ] 

vijay commented on CASSANDRA-10055:
---

Benedict, the jstack and top outputs were taken relatively close to each 
other; I will try to get more statistics on this and get back.

Thanks 

> High CPU load for Cassandra 2.1.8
> -
>
> Key: CASSANDRA-10055
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10055
> Project: Cassandra
>  Issue Type: Bug
>  Components: Config
> Environment: Prod
>Reporter: vijay
> Attachments: dstst-lcdn.log, dstst-lcdn2.log, dstst-lcdn3.log, 
> dstst-lcdn4.log, dstst-lcdn5.log, dstst-lcdn6.log, js.log, js2.log, js3.log, 
> js4.log, js5.log, js6.log, top-bHn1-2.log, top-bHn1-3.log, top-bHn1-4.log, 
> top-bHn1-5.log, top-bHn1-6.log, top-bHn1.log
>
>
> We are seeing high CPU load of about 80% to 100% in Cassandra 2.1.8 when doing 
> data ingest; we did not have this issue in the 2.0.x versions of Cassandra.
> We tested this on different cloud platforms and the results are the same.
> CPU: tested with M3 2xlarge AWS instances
> Ingest rate: injecting 1 million inserts; each insert is 1000 bytes
> No operations other than inserts are happening in Cassandra.
> Let me know if more info is needed.





[jira] [Created] (CASSANDRA-10106) Windows dtest 3.0: TestRepair multiple failures

2015-08-17 Thread Joshua McKenzie (JIRA)
Joshua McKenzie created CASSANDRA-10106:
---

 Summary: Windows dtest 3.0: TestRepair multiple failures
 Key: CASSANDRA-10106
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10106
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Joshua McKenzie
 Fix For: 3.0.x


repair_test.py:TestRepair.dc_repair_test
repair_test.py:TestRepair.local_dc_repair_test
repair_test.py:TestRepair.simple_parallel_repair_test
repair_test.py:TestRepair.simple_sequential_repair_test

All failing w/the following error:
{noformat}
File "D:\Python27\lib\unittest\case.py", line 358, in run
self.tearDown()
  File 
"D:\jenkins\workspace\cassandra-3.0_dtest_win32\cassandra-dtest\dtest.py", line 
532, in tearDown
raise AssertionError('Unexpected error in %s node log: %s' % (node.name, 
errors))
"Unexpected error in node3 node log: ['ERROR [STREAM-IN-/127.0.0.1] 2015-08-17 
00:41:09,426 StreamSession.java:520 - [Stream 
#a69fc140-4478-11e5-a8ae-4f8718583077] Streaming error occurred 
java.io.IOException: An existing connection was forcibly closed by the remote 
host \\tat sun.nio.ch.SocketDispatcher.read0(Native Method) ~[na:1.8.0_45] 
\\tat sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:43) ~[na:1.8.0_45] 
\\tat sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[na:1.8.0_45] 
\\tat sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[na:1.8.0_45] \\tat 
sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) ~[na:1.8.0_45] 
\\tat 
org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:53)
 ~[main/:na] \\tat 
org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:261)
 ~[main/:na] \\tat java.lang.Thread.run(Thread.java:745) 
[na:1.8.0_45]']\n >> begin captured logging << 
\ndtest: DEBUG: cluster ccm directory: 
d:\\temp\\dtest-3kmbjb\ndtest: DEBUG: Starting cluster..\ndtest: DEBUG: 
Inserting data...\ndtest: DEBUG: Checking data on node3...\ndtest: DEBUG: 
Checking data on node1...\ndtest: DEBUG: Checking data on node2...\ndtest: 
DEBUG: starting repair...\ndtest: DEBUG: Repair time: 5.3782098\ndtest: 
DEBUG: removing ccm cluster test at: d:\\temp\\dtest-3kmbjb\ndtest: DEBUG: 
clearing ssl stores from [d:\\temp\\dtest-3kmbjb] 
directory\n- >> end captured logging << 
-"
{noformat}

Failure history: 
[consistent|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest_win32/17/testReport/repair_test/TestRepair/dc_repair_test/history/]

Env: ci and local





[jira] [Commented] (CASSANDRA-10094) Windows utest 2.2: testCommitLogFailureBeforeInitialization_mustKillJVM failure

2015-08-17 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700138#comment-14700138
 ] 

Paulo Motta commented on CASSANDRA-10094:
-

[~JoshuaMcKenzie] I originally created that patch for 2.2+ in the context of 
[CASSANDRA-8515|https://issues.apache.org/jira/browse/CASSANDRA-8515]. However, 
the tests did not execute correctly on cassci at the time due to other 
problems. In the meantime, I noticed one of the tests was still not 
working on Windows and updated the original patch, but forgot to mention it on 
the JIRA ticket (sorry for that). After the tests were passing on cassci, 
[~benedict] probably committed an older version of the patch. I suppose 
d2da7606abebd98b11f8b7ec692aa7dcf5388151 was basically meant to bring the 
original commit up to the updated patch, but it was only applied to 3.0+; it 
needs to be backported to 2.2.

> Windows utest 2.2: testCommitLogFailureBeforeInitialization_mustKillJVM 
> failure
> ---
>
> Key: CASSANDRA-10094
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10094
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Joshua McKenzie
>Assignee: Paulo Motta
>  Labels: Windows
> Fix For: 2.2.x
>
>
> Error:
> {noformat}
> junit.framework.AssertionFailedError: 
>   at 
> org.apache.cassandra.db.CommitLogFailurePolicyTest.testCommitLogFailureBeforeInitialization_mustKillJVM(CommitLogFailurePolicyTest.java:149)
> {noformat}
> [Failure 
> History|http://cassci.datastax.com/view/cassandra-2.2/job/cassandra-2.2_utest_win32/lastCompletedBuild/testReport/org.apache.cassandra.db/CommitLogFailurePolicyTest/testCommitLogFailureBeforeInitialization_mustKillJVM/history/]:
>   Consistent since build #85
> Env: CI only. Cannot repro locally





[jira] [Created] (CASSANDRA-10105) Windows dtest 3.0: TestOfflineTools failures

2015-08-17 Thread Joshua McKenzie (JIRA)
Joshua McKenzie created CASSANDRA-10105:
---

 Summary: Windows dtest 3.0: TestOfflineTools failures
 Key: CASSANDRA-10105
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10105
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Joshua McKenzie
Assignee: Joshua McKenzie
 Fix For: 3.0.x


offline_tools_test.py:TestOfflineTools.sstablelevelreset_test
offline_tools_test.py:TestOfflineTools.sstableofflinerelevel_test

Both tests fail with the following:
{noformat}
Traceback (most recent call last):
  File "c:\src\cassandra-dtest\dtest.py", line 532, in tearDown
raise AssertionError('Unexpected error in %s node log: %s' % (node.name, 
errors))
AssertionError: Unexpected error in node1 node log: ['ERROR [main] 2015-08-17 
15:55:05,060 NoSpamLogger.java:97 - This platform does not support atomic 
directory streams (SecureDirectoryStream); race conditions when loading sstable 
files could occurr']
{noformat}

Failure history: 
[consistent|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest_win32/17/testReport/junit/jmx_test/TestJMX/netstats_test/history/]

Env: ci and local





[jira] [Commented] (CASSANDRA-9749) CommitLogReplayer continues startup after encountering errors

2015-08-17 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700124#comment-14700124
 ] 

Blake Eggleston commented on CASSANDRA-9749:


[~benedict] I think most of that discussion occurred in the first 10 or so 
comments on this ticket. At least I don't remember there being another 
discussion outside of it.

> CommitLogReplayer continues startup after encountering errors
> -
>
> Key: CASSANDRA-9749
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9749
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Blake Eggleston
>Assignee: Branimir Lambov
> Fix For: 2.2.x
>
> Attachments: 9749-coverage.tgz
>
>
> There are a few places where the commit log recovery method either skips 
> sections or just returns when it encounters errors.
> Specifically if it can't read the header here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L298
> Or if there are compressor problems here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L314
>  and here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L366
> Whether these are user-fixable or not, I think we should require more direct 
> user intervention (ie: fix what's wrong, or remove the bad file and restart) 
> since we're basically losing data.
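
The fail-fast behavior being argued for can be sketched as follows, in Python rather than Cassandra's Java (the `CommitLogReplayError` class and `replay_segment` function are invented for illustration): instead of skipping unreadable sections, recovery stops and demands user intervention.

```python
# Sketch of the fail-fast replay policy proposed above, with invented names.
# Instead of silently skipping a segment with an unreadable header (which
# loses data), recovery raises so the operator must fix or remove the bad
# file and restart. This is not the actual CommitLogReplayer code.

class CommitLogReplayError(Exception):
    pass

def replay_segment(segment):
    header = segment.get("header")
    if header is None:
        # Current behavior: skip/return here and keep going.
        # Proposed behavior: stop and require direct user intervention.
        raise CommitLogReplayError(
            "Could not read commit log header; fix or remove the segment "
            "and restart instead of silently dropping its mutations")
    return segment["mutations"]

# A readable segment replays its mutations as usual.
assert replay_segment({"header": b"ok", "mutations": [1, 2]}) == [1, 2]
```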





[jira] [Created] (CASSANDRA-10104) Windows dtest 3.0: jmx_test.py:TestJMX.netstats_test fails

2015-08-17 Thread Joshua McKenzie (JIRA)
Joshua McKenzie created CASSANDRA-10104:
---

 Summary: Windows dtest 3.0: jmx_test.py:TestJMX.netstats_test fails
 Key: CASSANDRA-10104
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10104
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Joshua McKenzie
 Fix For: 3.0.x


{noformat}
Unexpected error in node1 node log: ['ERROR [HintedHandoff:2] 2015-08-16 
23:14:04,419 CassandraDaemon.java:191 - Exception in thread 
Thread[HintedHandoff:2,1,main] 
org.apache.cassandra.exceptions.WriteFailureException: Operation failed - 
received 0 responses and 1 failures \tat 
org.apache.cassandra.service.AbstractWriteResponseHandler.get(AbstractWriteResponseHandler.java:106)
 ~[main/:na] \tat 
org.apache.cassandra.db.HintedHandOffManager.checkDelivered(HintedHandOffManager.java:358)
 ~[main/:na] \tat 
org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:414)
 ~[main/:na] \tat 
org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:346)
 ~[main/:na] \tat 
org.apache.cassandra.db.HintedHandOffManager.access$400(HintedHandOffManager.java:91)
 ~[main/:na] \tat 
org.apache.cassandra.db.HintedHandOffManager$5.run(HintedHandOffManager.java:537)
 ~[main/:na] \tat 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_45] \tat 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
~[na:1.8.0_45] \tat java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_45]']
 >> begin captured logging << 
dtest: DEBUG: cluster ccm directory: d:\temp\dtest-j1ttp3
dtest: DEBUG: Nodetool command 
'D:\jenkins\workspace\cassandra-3.0_dtest_win32\cassandra\bin\nodetool.bat -h 
localhost -p 7100 netstats' failed; exit status: 1; stdout: Starting NodeTool
; stderr: nodetool: Failed to connect to 'localhost:7100' - ConnectException: 
'Connection refused: connect'.

dtest: DEBUG: removing ccm cluster test at: d:\temp\dtest-j1ttp3
dtest: DEBUG: clearing ssl stores from [d:\temp\dtest-j1ttp3] directory
- >> end captured logging << -
{noformat}

Failure history: 
[consistent|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest_win32/17/testReport/junit/jmx_test/TestJMX/netstats_test/history/].
 Looks to have regressed on build 
[#5|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest_win32/5/]
 which seems unlikely given the commit.

Env: Both, though on a local run the test fails due to:

{noformat}
Traceback (most recent call last):
  File "c:\src\cassandra-dtest\dtest.py", line 532, in tearDown
raise AssertionError('Unexpected error in %s node log: %s' % (node.name, 
errors))
AssertionError: Unexpected error in node1 node log: ['ERROR [main] 2015-08-17 
15:42:07,717 NoSpamLogger.java:97 - This platform does not support atomic 
directory streams (SecureDirectoryStream); race conditions when loading sstable 
files could occurr', 'ERROR [main] 2015-08-17 15:50:43,978 NoSpamLogger.java:97 
- This platform does not support atomic directory streams 
(SecureDirectoryStream); race conditions when loading sstable files could 
occurr']
{noformat}





[jira] [Commented] (CASSANDRA-10082) Transactional classes shouldn't also implement streams, channels, etc

2015-08-17 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700117#comment-14700117
 ] 

Blake Eggleston commented on CASSANDRA-10082:
-

I wouldn't say it's never ever the right thing to do. Though I would say it's 
not right for most AutoClosable use cases that intersect with Transactional 
implementations (especially OutputStream). In fact, if you're using a 
Transactional class as an OutputStream for the purpose of making a write a 
noop, you may be committing reviewer abuse :).

Regarding the class living in SequentialWriter, it's not meant to be a 
general-purpose wrapper, but a one-off thing for SequentialWriter. I usually 
wouldn't create a generic solution unless it turns out to be a problem with 
more than just SequentialWriter.


> Transactional classes shouldn't also implement streams, channels, etc
> -
>
> Key: CASSANDRA-10082
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10082
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
> Attachments: 
> 0001-replacing-SequentialWriter-OutputStream-extension-wi.patch
>
>
> Since the close method on the Transactional interface means "abort if commit 
> hasn't been called", mixing Transactional and AutoCloseable interfaces where 
> close means "we're done here" is pretty much never the right thing to do. 
> The only class that does this is SequentialWriter. It's not used in a way 
> that causes a problem, but it's still a potential hazard for future 
> development.
> The attached patch replaces the SequentialWriter OutputStream implementation 
> with a wrapper class that implements the expected behavior on close, and adds 
> a warning to the Transactional interface. It also adds a unit test that 
> demonstrates the problem without the fix.
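
The hazard and the fix described above can be sketched as follows, in Python rather than the attached Java patch (the class and method names here are invented for illustration): on a Transactional, close() means "abort unless committed", while the wrapper restores the conventional "close means done" semantics expected of a stream.

```python
# Sketch (invented names, not the attached Java patch) of the semantics
# discussed above: on a Transactional, close() means "abort unless committed",
# so exposing it directly as a stream whose close() means "we're done here"
# is a hazard. The wrapper restores conventional close semantics.

class TransactionalWriter:
    def __init__(self):
        self.committed = False
        self.aborted = False
        self.data = []

    def write(self, x):
        self.data.append(x)

    def commit(self):
        self.committed = True

    def close(self):
        # Transactional contract: close without a prior commit means abort.
        if not self.committed:
            self.aborted = True
            self.data.clear()

class FinishingStream:
    """Wrapper whose close() commits first, matching stream expectations."""
    def __init__(self, writer):
        self.writer = writer

    def write(self, x):
        self.writer.write(x)

    def close(self):
        self.writer.commit()
        self.writer.close()

# Closing the raw Transactional without commit() silently discards the write:
w = TransactionalWriter()
w.write("mutation")
w.close()
assert w.aborted and w.data == []
```

Using `FinishingStream` instead, a caller's ordinary close() keeps the data, which is the expected-behavior-on-close the patch description refers to.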





[jira] [Commented] (CASSANDRA-10102) java.lang.UnsupportedOperationException after upgrade to 3.0 alpha1

2015-08-17 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700110#comment-14700110
 ] 

Russ Hatch commented on CASSANDRA-10102:


[~iamaleksey] I'm still seeing some kind of issue post-upgrade on 3.0 HEAD, not 
sure if it's the same problem or not:

one node shows:

{noformat}
ERROR [SharedPool-Worker-1] 2015-08-17 13:38:03,311 Message.java:611 - 
Unexpected exception during request; channel = [id: 0x68fac00f, 
/127.0.0.1:57115 => /127.0.0.1:9042]
java.lang.AssertionError: null
at 
org.apache.cassandra.db.ReadCommand$Serializer.serializedSize(ReadCommand.java:520)
 ~[main/:na]
at 
org.apache.cassandra.db.ReadCommand$Serializer.serializedSize(ReadCommand.java:461)
 ~[main/:na]
at org.apache.cassandra.net.MessageOut.payloadSize(MessageOut.java:166) 
~[main/:na]
at 
org.apache.cassandra.net.OutboundTcpConnectionPool.getConnection(OutboundTcpConnectionPool.java:72)
 ~[main/:na]
at 
org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:583)
 ~[main/:na]
at 
org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:733) 
~[main/:na]
at 
org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:676) 
~[main/:na]
at 
org.apache.cassandra.net.MessagingService.sendRRWithFailure(MessagingService.java:659)
 ~[main/:na]
at 
org.apache.cassandra.service.AbstractReadExecutor.makeRequests(AbstractReadExecutor.java:103)
 ~[main/:na]
at 
org.apache.cassandra.service.AbstractReadExecutor.makeDataRequests(AbstractReadExecutor.java:76)
 ~[main/:na]
at 
org.apache.cassandra.service.AbstractReadExecutor$AlwaysSpeculatingReadExecutor.executeAsync(AbstractReadExecutor.java:323)
 ~[main/:na]
at 
org.apache.cassandra.service.StorageProxy$SinglePartitionReadLifecycle.doInitialQueries(StorageProxy.java:1599)
 ~[main/:na]
at 
org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1554) 
~[main/:na]
at 
org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1501) 
~[main/:na]
at 
org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1420) 
~[main/:na]
at 
org.apache.cassandra.db.SinglePartitionReadCommand$Group.execute(SinglePartitionReadCommand.java:457)
 ~[main/:na]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:232)
 ~[main/:na]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:202)
 ~[main/:na]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:72)
 ~[main/:na]
at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:204)
 ~[main/:na]
at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:470)
 ~[main/:na]
at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:447)
 ~[main/:na]
at 
org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:139)
 ~[main/:na]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507)
 [main/:na]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401)
 [main/:na]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_45]
at 
org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
 [main/:na]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
[main/:na]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
{noformat}


and another node shows:
{noformat}
ERROR [HintedHandoff:2] 2015-08-17 13:38:07,612 CassandraDaemon.java:191 - 
Exception in thread Thread[HintedHandoff:2,1,main]
java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
java.lang.UnsupportedOperationException
at 
org.apache.cassandra.db.HintedHandOffManager.compact(HintedHandOffManager.java:281)
 ~[main/:na]
at 
org.apache.cassandra.db.HintedHandOffManager$5.run(HintedHandOffManager.java:535)
 ~[main/:na]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPool

[jira] [Commented] (CASSANDRA-10094) Windows utest 2.2: testCommitLogFailureBeforeInitialization_mustKillJVM failure

2015-08-17 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700104#comment-14700104
 ] 

Joshua McKenzie commented on CASSANDRA-10094:
-

The commit message on that isn't clear as to where it originates:
{noformat}
commit d2da7606abebd98b11f8b7ec692aa7dcf5388151
Author: Benedict Elliott Smith 
Date:   Mon Aug 17 09:52:13 2015 +0100

fix CommitLogFailurePolicyTest
{noformat}

[~benedict]: What ticket was that for, if any? And could we get a backport of 
that fix to 2.2 in order to fix the failing test on that branch?

> Windows utest 2.2: testCommitLogFailureBeforeInitialization_mustKillJVM 
> failure
> ---
>
> Key: CASSANDRA-10094
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10094
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Joshua McKenzie
>Assignee: Paulo Motta
>  Labels: Windows
> Fix For: 2.2.x
>
>
> Error:
> {noformat}
> junit.framework.AssertionFailedError: 
>   at 
> org.apache.cassandra.db.CommitLogFailurePolicyTest.testCommitLogFailureBeforeInitialization_mustKillJVM(CommitLogFailurePolicyTest.java:149)
> {noformat}
> [Failure 
> History|http://cassci.datastax.com/view/cassandra-2.2/job/cassandra-2.2_utest_win32/lastCompletedBuild/testReport/org.apache.cassandra.db/CommitLogFailurePolicyTest/testCommitLogFailureBeforeInitialization_mustKillJVM/history/]:
>   Consistent since build #85
> Env: CI only. Cannot repro locally





[jira] [Updated] (CASSANDRA-10089) NullPointerException in Gossip handleStateNormal

2015-08-17 Thread Jim Witschey (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Witschey updated CASSANDRA-10089:
-
Reproduced In: 2.2.x, 3.0.x

> NullPointerException in Gossip handleStateNormal
> 
>
> Key: CASSANDRA-10089
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10089
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stefania
>Assignee: Stefania
>
> Whilst comparing dtests for CASSANDRA-9970 I found [this failing 
> dtest|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-9970-dtest/lastCompletedBuild/testReport/consistency_test/TestConsistency/short_read_test/]
>  in 2.2:
> {code}
> Unexpected error in node1 node log: ['ERROR [GossipStage:1] 2015-08-14 
> 15:39:57,873 CassandraDaemon.java:183 - Exception in thread 
> Thread[GossipStage:1,5,main] java.lang.NullPointerException: null \tat 
> org.apache.cassandra.service.StorageService.getApplicationStateValue(StorageService.java:1731)
>  ~[main/:na] \tat 
> org.apache.cassandra.service.StorageService.getTokensFor(StorageService.java:1804)
>  ~[main/:na] \tat 
> org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:1857)
>  ~[main/:na] \tat 
> org.apache.cassandra.service.StorageService.onChange(StorageService.java:1629)
>  ~[main/:na] \tat 
> org.apache.cassandra.service.StorageService.onJoin(StorageService.java:2312) 
> ~[main/:na] \tat 
> org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1025) 
> ~[main/:na] \tat 
> org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1106) 
> ~[main/:na] \tat 
> org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49)
>  ~[main/:na] \tat 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) 
> ~[main/:na] \tat 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_80] \tat 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[na:1.7.0_80] \tat java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_80]']
> {code}
> I wasn't able to find it on unpatched branches, but it is clearly not related 
> to CASSANDRA-9970; if anything, it could have been a side effect of 
> CASSANDRA-9871.





[jira] [Commented] (CASSANDRA-10089) NullPointerException in Gossip handleStateNormal

2015-08-17 Thread Jim Witschey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700098#comment-14700098
 ] 

Jim Witschey commented on CASSANDRA-10089:
--

The NPE 
[here|http://cassci.datastax.com/view/trunk/job/trunk_dtest/lastCompletedBuild/testReport/junit/consistency_test/TestConsistency/short_read_reversed_test/]
 looks similar, so this likely affects trunk as well.






[jira] [Commented] (CASSANDRA-10043) A NullPointerException is thrown if the column name is unknown for an IN relation

2015-08-17 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700095#comment-14700095
 ] 

Benjamin Lerer commented on CASSANDRA-10043:


[~snazy] could you review?

> A NullPointerException is thrown if the column name is unknown for an IN 
> relation
> -
>
> Key: CASSANDRA-10043
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10043
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
> Attachments: 10043-2.2.txt, 10043-3.0.txt
>
>
> {code}
> cqlsh:test> create table newTable (a int, b int, c int, primary key(a, b));
> cqlsh:test> select * from newTable where d in (1, 2);
> ServerError: <ErrorMessage code=0000 [Server error] 
> message="java.lang.NullPointerException">
> {code}
> The problem seems to occur only for {{IN}} restrictions.
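A defensive check like the one the patch presumably adds can be sketched as follows (a hypothetical Python model; {{InvalidRequest}}, the helper name, and the table layout are illustrative, not Cassandra's actual classes):

```python
# Hypothetical sketch: resolve each column referenced by an IN relation
# before building the restriction, and raise a clear "undefined column"
# error instead of letting a null slip through and surface later as a
# NullPointerException.
class InvalidRequest(Exception):
    pass

def build_in_restriction(table_columns, column_name, values):
    column_type = table_columns.get(column_name)  # None for unknown names
    if column_type is None:
        raise InvalidRequest("Undefined column name %s" % column_name)
    return (column_name, column_type, "IN", list(values))

columns = {"a": "int", "b": "int", "c": "int"}
build_in_restriction(columns, "b", [1, 2])    # ok
# build_in_restriction(columns, "d", [1, 2])  # raises InvalidRequest
```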





[jira] [Updated] (CASSANDRA-10043) A NullPointerException is thrown if the column name is unknown for an IN relation

2015-08-17 Thread Benjamin Lerer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-10043:
---
Attachment: 10043-3.0.txt
10043-2.2.txt

The patches fix the problem and add some unit tests to verify the behaviour.

* The results of the unit test for 2.2 are 
[here|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-10043-2.2-dtest/lastCompletedBuild/testReport/]
* The results of the Dtest for 2.2 are 
[here|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-10043-2.2-dtest/lastCompletedBuild/testReport/]
* The results of the unit test for 3.0 are 
[here|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-10043-3.0-dtest/lastCompletedBuild/testReport/]
* The results of the Dtest for 3.0 are 
[here|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-10043-3.0-dtest/lastCompletedBuild/testReport/]






[jira] [Commented] (CASSANDRA-9917) MVs should validate gc grace seconds on the tables involved

2015-08-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700089#comment-14700089
 ] 

Jonathan Ellis commented on CASSANDRA-9917:
---

bq. we now need repair

Which is why "low" gcgs should be defined as lower than max hint window, 
because that's what causes problems.

> MVs should validate gc grace seconds on the tables involved
> ---
>
> Key: CASSANDRA-9917
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9917
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Aleksey Yeschenko
>Assignee: Paulo Motta
>  Labels: materializedviews
> Fix For: 3.0 beta 2
>
>
> For correctness reasons (potential resurrection of dropped values), batchlog 
> entries are TTLs with the lowest gc grace second of all the tables involved 
> in a batch.
> It means that if gc gs is set to 0 in one of the tables, the batchlog entry 
> will be dead on arrival, and never replayed.
> We should probably warn against such LOGGED writes taking place, in general, 
> but for MVs, we must validate that gc gs on the base table (and on the MV 
> table, if we should allow altering gc gs there at all), is never set too low, 
> or else.
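The TTL rule described above can be sketched as follows (a simplified Python model of the rule, not Cassandra's batchlog code):

```python
def batchlog_entry_ttl(tables):
    """A batchlog entry is TTL'd with the lowest gc_grace_seconds of the
    tables involved in the batch (simplified model)."""
    return min(t["gc_grace_seconds"] for t in tables)

base = {"name": "base", "gc_grace_seconds": 864000}  # 10-day default
view = {"name": "mv", "gc_grace_seconds": 0}

ttl = batchlog_entry_ttl([base, view])
# ttl == 0: the entry is dead on arrival and will never be replayed
```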





[jira] [Updated] (CASSANDRA-9446) Failure detector should ignore local pauses per endpoint

2015-08-17 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9446:
--
Assignee: Stefania  (was: Brandon Williams)

> Failure detector should ignore local pauses per endpoint
> 
>
> Key: CASSANDRA-9446
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9446
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: sankalp kohli
>Assignee: Stefania
>Priority: Minor
> Attachments: 9446.txt, 9644-v2.txt
>
>
> In CASSANDRA-9183, we added a feature to ignore local pauses, but it will 
> only avoid marking two endpoints as down. 
> We should do this per endpoint, as suggested by Brandon in CASSANDRA-9183. 
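The per-endpoint variant suggested above might look roughly like this (a toy Python sketch; the class, method names, and the 5-second pause threshold are illustrative assumptions, not the actual FailureDetector code):

```python
import time

class FailureDetectorSketch:
    """Toy sketch: detect a local pause per endpoint by tracking when we
    last interpreted each endpoint, instead of one global pause flag."""
    MAX_LOCAL_PAUSE_S = 5.0  # assumed threshold

    def __init__(self):
        self.last_interpret = {}  # endpoint -> monotonic seconds

    def interpret(self, endpoint, now=None):
        now = time.monotonic() if now is None else now
        last = self.last_interpret.get(endpoint)
        self.last_interpret[endpoint] = now
        if last is not None and now - last > self.MAX_LOCAL_PAUSE_S:
            return "ignore"  # we paused locally; don't mark endpoint down
        return "evaluate"
```

Because the timestamp is kept per endpoint, one long gap in interpreting endpoint A does not suppress failure detection for endpoint B.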





[jira] [Commented] (CASSANDRA-10086) Add a "CLEAR" cqlsh command to clear the console

2015-08-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700070#comment-14700070
 ] 

Jonathan Ellis commented on CASSANDRA-10086:


I don't think we need it earlier than 3.0, since there is a simple alternative 
in Ctrl+L, as Paul notes.

> Add a "CLEAR" cqlsh command to clear the console
> 
>
> Key: CASSANDRA-10086
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10086
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Paul O'Fallon
>Priority: Trivial
>  Labels: cqlsh, doc-impacting
> Attachments: 10086.txt
>
>
> It would be very helpful to have a "CLEAR" command to clear the cqlsh 
> console.  I learned (after researching a patch for this) that lowercase 
> CTRL+L will clear the screen, but having a discrete command would make that 
> more obvious.  To match the expectations of Windows users, an alias to "CLS" 
> would be nice as well.





[jira] [Commented] (CASSANDRA-9917) MVs should validate gc grace seconds on the tables involved

2015-08-17 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700069#comment-14700069
 ] 

Aleksey Yeschenko commented on CASSANDRA-9917:
--

bq. True, but let me clarify: if no node in the cluster is down for longer than 
max hint window, and you have no hardware failures, and you have batch 
commitlog enabled, then you won't need repair. Fair?

I might be misunderstanding the context, but no, in general this is not true. A 
request times out; a hint or a batchlog entry gets written; a table in the 
mutation has a low gc gs; the batchlog/hint entry expires before it can be 
replayed, and we now need repair.

bq. Especially since Paulo independently came up with the same value we use as 
default max hint window here.

Right. Independently.






[jira] [Commented] (CASSANDRA-8970) Allow custom time_format on cqlsh COPY TO

2015-08-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700067#comment-14700067
 ] 

Jonathan Ellis commented on CASSANDRA-8970:
---

bq. it becomes a "nice to have" if COPY FROM could interpret the default 
exported timestamps correctly

Did anyone create a ticket for that?

> Allow custom time_format on cqlsh COPY TO
> -
>
> Key: CASSANDRA-8970
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8970
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Aaron Ploetz
>Priority: Trivial
>  Labels: cqlsh
> Fix For: 2.1.x
>
> Attachments: CASSANDRA-8970.patch
>
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> When executing a COPY TO from cqlsh, the user currently has no control 
> over the format of exported timestamp columns.  If the user has indicated a 
> {{time_format}} in their cqlshrc file, that format will be used.  Otherwise, 
> the system default format will be used.
> The problem comes into play when the timestamp format used on a COPY TO is 
> not valid when the data is sent back into Cassandra with a COPY FROM.
> For instance, if a user has {{time_format = %Y-%m-%d %H:%M:%S%Z}} specified 
> in their cqlshrc, COPY TO will format timestamp columns like this:
> {{userid|posttime|postcontent}}
> {{0|2015-03-14 14:59:00CDT|rtyeryerweh}}
> {{0|2015-03-14 14:58:00CDT|sdfsdfsdgfjdsgojr}}
> {{0|2015-03-12 14:27:00CDT|sdgfjdsgojr}}
> Executing a COPY FROM on that same file will produce an "unable to coerce to 
> formatted date(long)" error.
> Right now, the only way to change the way timestamps are formatted is to exit 
> cqlsh, modify the {{time_format}} property in cqlshrc, and restart cqlsh.  
> The ability to specify a COPY option of TIME_FORMAT with a Python strftime 
> format, would allow the user to quickly alter the timestamp format for 
> export, without reconfiguring cqlsh.
> {{aploetz@cqlsh:stackoverflow> COPY posts1 TO '/home/aploetz/posts1.csv' WITH 
> DELIMITER='|' AND HEADER=true AND TIME_FORMAT='%Y-%m-%d %H:%M:%S%z';}}
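The round-trip problem is easy to reproduce outside cqlsh: a numeric UTC offset ({{%z}}) survives export and re-import, while a zone abbreviation ({{%Z}}, e.g. "CDT") generally cannot be parsed back. A small Python illustration (standalone, not using cqlsh itself):

```python
from datetime import datetime, timezone

ts = datetime(2015, 3, 14, 14, 59, tzinfo=timezone.utc)

# With a numeric offset (%z) the exported string round-trips through
# strptime; with a zone abbreviation (%Z, e.g. "CDT") strptime generally
# cannot parse it back, which is the "unable to coerce to formatted
# date(long)" failure described above.
exported = ts.strftime('%Y-%m-%d %H:%M:%S%z')
parsed = datetime.strptime(exported, '%Y-%m-%d %H:%M:%S%z')
assert parsed == ts  # same instant recovered
```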





[jira] [Commented] (CASSANDRA-10094) Windows utest 2.2: testCommitLogFailureBeforeInitialization_mustKillJVM failure

2015-08-17 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700063#comment-14700063
 ] 

Paulo Motta commented on CASSANDRA-10094:
-

I'm able to reproduce it locally. The test is fixed after applying 
d2da7606abebd98b11f8b7ec692aa7dcf5388151, which was committed to 3.0+.

* [dtest 
results|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-10094-2.2-dtest/lastCompletedBuild/testReport/]
* [utest 
results|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-10094-2.2-testall/lastCompletedBuild/testReport/]

> Windows utest 2.2: testCommitLogFailureBeforeInitialization_mustKillJVM 
> failure
> ---
>
> Key: CASSANDRA-10094
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10094
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Joshua McKenzie
>Assignee: Paulo Motta
>  Labels: Windows
> Fix For: 2.2.x
>
>
> Error:
> {noformat}
> junit.framework.AssertionFailedError: 
>   at 
> org.apache.cassandra.db.CommitLogFailurePolicyTest.testCommitLogFailureBeforeInitialization_mustKillJVM(CommitLogFailurePolicyTest.java:149)
> {noformat}
> [Failure 
> History|http://cassci.datastax.com/view/cassandra-2.2/job/cassandra-2.2_utest_win32/lastCompletedBuild/testReport/org.apache.cassandra.db/CommitLogFailurePolicyTest/testCommitLogFailureBeforeInitialization_mustKillJVM/history/]:
>   Consistent since build #85
> Env: CI only. Cannot repro locally





[jira] [Commented] (CASSANDRA-9917) MVs should validate gc grace seconds on the tables involved

2015-08-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700046#comment-14700046
 ] 

Jonathan Ellis commented on CASSANDRA-9917:
---

bq. It only affects the decision to write a hint in the first place (down for 
longer than the window? stop writing hints)... It's most often true that if a 
node has been down for longer than max_hint_window_in_ms, it is going to have 
data missing, yes. But there are no guarantees that it being down for shorter 
than that means it doesn't.

True, but let me clarify: if no node in the cluster is down for longer than max 
hint window, and you have no hardware failures, and you have batch commitlog 
enabled, then you won't need repair.  Fair?

I don't really see a difference vs min_batchlog_ttl.  Especially since Paulo 
independently came up with the same value we use as default max hint window 
here.

Let's not offer users more tuning knobs than they can meaningfully distinguish 
between.
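Under that definition, the validation would reduce to a single comparison against the hint window (a hypothetical Python sketch; the function name and warning text are illustrative, and the 3-hour default mirrors Cassandra's default max_hint_window_in_ms of 10800000):

```python
DEFAULT_MAX_HINT_WINDOW_S = 3 * 3600  # mirrors max_hint_window_in_ms default

def validate_gc_grace(table_name, gc_grace_seconds,
                      max_hint_window_s=DEFAULT_MAX_HINT_WINDOW_S):
    """Hypothetical check: warn when gc_grace_seconds is low enough that
    batchlog/hint entries could expire before they are replayed."""
    if gc_grace_seconds < max_hint_window_s:
        return ("gc_grace_seconds %d on %s is lower than the max hint "
                "window (%d s); batchlog entries may expire before replay"
                % (gc_grace_seconds, table_name, max_hint_window_s))
    return None  # no warning needed

validate_gc_grace("base", 0)       # warns: entries dead on arrival
validate_gc_grace("base", 864000)  # 10-day default: fine
```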






[jira] [Commented] (CASSANDRA-10102) java.lang.UnsupportedOperationException after upgrade to 3.0 alpha1

2015-08-17 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700032#comment-14700032
 ] 

Aleksey Yeschenko commented on CASSANDRA-10102:
---

Can you test with the current cassandra-3.0 head? There was an issue with the 
alpha: CASSANDRA-9704 was not committed as planned.

It has been committed now, and this *should* work.
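The "Version is 9" errors below are consistent with a mixed-version cluster: the upgraded node's serializers reject the older messaging version spoken by not-yet-upgraded peers. A toy Python illustration of such a version gate (the constant names and values are assumptions based on 2.2 speaking messaging version 9 and 3.0 speaking version 10; this is not Cassandra's actual serializer):

```python
# Assumed messaging version constants (illustrative).
VERSION_22 = 9
VERSION_30 = 10

def serialized_size(payload, version):
    """Toy version-gated serializer: refuse peers on an older messaging
    version, which is what surfaces as UnsupportedOperationException
    during a rolling upgrade."""
    if version < VERSION_30:
        raise NotImplementedError("Version is %d" % version)
    return len(payload)

serialized_size(b"mutation-bytes", VERSION_30)    # ok on a 3.0 peer
# serialized_size(b"mutation-bytes", VERSION_22)  # raises "Version is 9"
```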

> java.lang.UnsupportedOperationException after upgrade to 3.0 alpha1
> ---
>
> Key: CASSANDRA-10102
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10102
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Russ Hatch
> Attachments: node1.log, node2.log, node3.log
>
>
> Upgrade tests are showing a potential issue. I'm seeing this during rolling 
> upgrades to 3.0 alpha 1, after one node has been upgraded to the alpha.
> I will attach cassandra logs here, node1.log is where most of the failures 
> are seen.
> {noformat}
> ERROR [MessagingService-Incoming-/127.0.0.1] 2015-08-17 12:22:06,888 
> CassandraDaemon.java:189 - Exception in thread 
> Thread[MessagingService-Incoming-/127.0.0.1,5,main]
> java.lang.UnsupportedOperationException: null
> at 
> org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:485)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:444)
>  ~[main/:na]
> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:98) 
> ~[main/:na]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:195)
>  ~[main/:na]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:172)
>  ~[main/:na]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:90)
>  ~[main/:na]
> INFO  [GossipStage:1] 2015-08-17 12:22:06,914 StorageService.java:1886 - Node 
> /127.0.0.2 state jump to normal
> ERROR [MessagingService-Incoming-/127.0.0.1] 2015-08-17 12:22:06,915 
> CassandraDaemon.java:189 - Exception in thread 
> Thread[MessagingService-Incoming-/127.0.0.1,5,main]
> java.lang.UnsupportedOperationException: null
> at 
> org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:485)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:444)
>  ~[main/:na]
> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:98) 
> ~[main/:na]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:195)
>  ~[main/:na]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:172)
>  ~[main/:na]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:90)
>  ~[main/:na]
> {noformat}
> Another exception showing in logs:
> {noformat}
> ERROR [SharedPool-Worker-1] 2015-08-17 12:22:19,358 ErrorMessage.java:336 - 
> Unexpected exception during request
> java.lang.UnsupportedOperationException: Version is 9
> at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.serializedSize(PartitionUpdate.java:760)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.Mutation$MutationSerializer.serializedSize(Mutation.java:334)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.Mutation$MutationSerializer.serializedSize(Mutation.java:246)
>  ~[main/:na]
> at 
> org.apache.cassandra.net.MessageOut.payloadSize(MessageOut.java:166) 
> ~[main/:na]
> at 
> org.apache.cassandra.net.OutboundTcpConnectionPool.getConnection(OutboundTcpConnectionPool.java:67)
>  ~[main/:na]
> at 
> org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:587)
>  ~[main/:na]
> at 
> org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:737)
>  ~[main/:na]
> at 
> org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:702) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:1084)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy$2.apply(StorageProxy.java:125) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:942) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:549) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:720)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:613)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.ModificationStatement.execute(Modi

[jira] [Created] (CASSANDRA-10103) Windows dtest 3.0: incremental_repair_test.py:TestIncRepair.sstable_repairedset_test fails

2015-08-17 Thread Joshua McKenzie (JIRA)
Joshua McKenzie created CASSANDRA-10103:
---

 Summary: Windows dtest 3.0: 
incremental_repair_test.py:TestIncRepair.sstable_repairedset_test fails
 Key: CASSANDRA-10103
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10103
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Joshua McKenzie
 Fix For: 3.0.x


{noformat}
File "D:\Python27\lib\unittest\case.py", line 329, in run
testMethod()
  File 
"D:\jenkins\workspace\cassandra-3.0_dtest_win32\cassandra-dtest\incremental_repair_test.py",
 line 165, in sstable_repairedset_test
self.assertGreaterEqual(len(uniquematches), 2)
  File "D:\Python27\lib\unittest\case.py", line 948, in assertGreaterEqual
self.fail(self._formatMessage(msg, standardMsg))
  File "D:\Python27\lib\unittest\case.py", line 410, in fail
raise self.failureException(msg)
'0 not greater than or equal to 2\n >> begin captured 
logging << \ndtest: DEBUG: cluster ccm directory: 
d:\\temp\\dtest-pq7lpx\ndtest: DEBUG: []\n- >> end captured 
logging << -'
{noformat}

Failure history: 
[consistent|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest_win32/17/testReport/junit/hintedhandoff_test/TestHintedHandoffConfig/hintedhandoff_dc_disabled_test/history/]

Env: both CI and local





[jira] [Updated] (CASSANDRA-10102) java.lang.UnsupportedOperationException after upgrade to 3.0 alpha1

2015-08-17 Thread Russ Hatch (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Russ Hatch updated CASSANDRA-10102:
---
Attachment: node3.log
node2.log
node1.log

adding logs


[jira] [Created] (CASSANDRA-10102) java.lang.UnsupportedOperationException after upgrade to 3.0 alpha1

2015-08-17 Thread Russ Hatch (JIRA)
Russ Hatch created CASSANDRA-10102:
--

 Summary: java.lang.UnsupportedOperationException after upgrade to 
3.0 alpha1
 Key: CASSANDRA-10102
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10102
 Project: Cassandra
  Issue Type: Bug
Reporter: Russ Hatch


Upgrade tests are showing a potential issue. I'm seeing this during rolling 
upgrades to 3.0 alpha 1, after one node has been upgraded to the alpha.

I will attach cassandra logs here, node1.log is where most of the failures are 
seen.

{noformat}
ERROR [MessagingService-Incoming-/127.0.0.1] 2015-08-17 12:22:06,888 
CassandraDaemon.java:189 - Exception in thread 
Thread[MessagingService-Incoming-/127.0.0.1,5,main]
java.lang.UnsupportedOperationException: null
at 
org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:485)
 ~[main/:na]
at 
org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:444)
 ~[main/:na]
at org.apache.cassandra.net.MessageIn.read(MessageIn.java:98) 
~[main/:na]
at 
org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:195)
 ~[main/:na]
at 
org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:172)
 ~[main/:na]
at 
org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:90)
 ~[main/:na]
INFO  [GossipStage:1] 2015-08-17 12:22:06,914 StorageService.java:1886 - Node 
/127.0.0.2 state jump to normal
ERROR [MessagingService-Incoming-/127.0.0.1] 2015-08-17 12:22:06,915 
CassandraDaemon.java:189 - Exception in thread 
Thread[MessagingService-Incoming-/127.0.0.1,5,main]
java.lang.UnsupportedOperationException: null
at 
org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:485)
 ~[main/:na]
at 
org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:444)
 ~[main/:na]
at org.apache.cassandra.net.MessageIn.read(MessageIn.java:98) 
~[main/:na]
at 
org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:195)
 ~[main/:na]
at 
org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:172)
 ~[main/:na]
at 
org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:90)
 ~[main/:na]
{noformat}

Another exception showing in logs:
{noformat}
ERROR [SharedPool-Worker-1] 2015-08-17 12:22:19,358 ErrorMessage.java:336 - 
Unexpected exception during request
java.lang.UnsupportedOperationException: Version is 9
at 
org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.serializedSize(PartitionUpdate.java:760)
 ~[main/:na]
at 
org.apache.cassandra.db.Mutation$MutationSerializer.serializedSize(Mutation.java:334)
 ~[main/:na]
at 
org.apache.cassandra.db.Mutation$MutationSerializer.serializedSize(Mutation.java:246)
 ~[main/:na]
at org.apache.cassandra.net.MessageOut.payloadSize(MessageOut.java:166) 
~[main/:na]
at 
org.apache.cassandra.net.OutboundTcpConnectionPool.getConnection(OutboundTcpConnectionPool.java:67)
 ~[main/:na]
at 
org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:587)
 ~[main/:na]
at 
org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:737) 
~[main/:na]
at 
org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:702) 
~[main/:na]
at 
org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:1084)
 ~[main/:na]
at 
org.apache.cassandra.service.StorageProxy$2.apply(StorageProxy.java:125) 
~[main/:na]
at 
org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:942) 
~[main/:na]
at 
org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:549) 
~[main/:na]
at 
org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:720)
 ~[main/:na]
at 
org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:613)
 ~[main/:na]
at 
org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:599)
 ~[main/:na]
at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:204)
 ~[main/:na]
at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:470)
 ~[main/:na]
at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:447)
 ~[main/:na]
at 
org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:139)
 ~[main/:na]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507)
 [main/:na]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401)
 [main/:na]
 

[jira] [Created] (CASSANDRA-10101) Windows dtest 3.0: HintedHandoff tests failing

2015-08-17 Thread Joshua McKenzie (JIRA)
Joshua McKenzie created CASSANDRA-10101:
---

 Summary: Windows dtest 3.0: HintedHandoff tests failing
 Key: CASSANDRA-10101
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10101
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Joshua McKenzie
Assignee: Joshua McKenzie
 Fix For: 3.0.x


hintedhandoff_test.py:TestHintedHandoffConfig.hintedhandoff_dc_disabled_test
hintedhandoff_test.py:TestHintedHandoffConfig.hintedhandoff_dc_reenabled_test
hintedhandoff_test.py:TestHintedHandoffConfig.hintedhandoff_disabled_test
hintedhandoff_test.py:TestHintedHandoffConfig.hintedhandoff_enabled_test
hintedhandoff_test.py:TestHintedHandoffConfig.nodetool_test

All are failing with some variant of the following:
{noformat}
File "D:\Python27\lib\unittest\case.py", line 329, in run
testMethod()
  File 
"D:\jenkins\workspace\cassandra-3.0_dtest_win32\cassandra-dtest\hintedhandoff_test.py",
 line 130, in hintedhandoff_dc_disabled_test
self.assertEqual('Hinted handoff is running\nData center dc1 is disabled', 
res.rstrip())
  File "D:\Python27\lib\unittest\case.py", line 513, in assertEqual
assertion_func(first, second, msg=msg)
  File "D:\Python27\lib\unittest\case.py", line 506, in _baseAssertEqual
raise self.failureException(msg)
"'Hinted handoff is running\\nData center dc1 is disabled' != 'Starting 
NodeTool\\r\\nHinted handoff is running\\r\\nData center dc1 is 
disabled'\n >> begin captured logging << 
\ndtest: DEBUG: cluster ccm directory: 
d:\\temp\\dtest-pddrcf\n- >> end captured logging << 
-"
{noformat}
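The diff in the assertion message above is purely Windows-specific: a "Starting NodeTool" banner line plus CRLF line endings. A minimal sketch of normalizing nodetool output before comparison (the helper name is hypothetical, not the actual dtest fix):

```python
def normalize_nodetool_output(raw):
    """Strip Windows CRLF line endings and a leading 'Starting NodeTool'
    banner so assertions match identical output on Linux and Windows.
    (Hypothetical helper for illustration only.)"""
    lines = raw.replace('\r\n', '\n').rstrip().split('\n')
    if lines and lines[0].startswith('Starting NodeTool'):
        lines = lines[1:]
    return '\n'.join(lines)
```

With this, both outputs in the failure above reduce to the same string, 'Hinted handoff is running\nData center dc1 is disabled'.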

Failure history: consistent for all jobs

Env: Both CI and local



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-10100) Windows dtest 3.0: commitlog_test.py:TestCommitLog.stop_failure_policy_test fails

2015-08-17 Thread Joshua McKenzie (JIRA)
Joshua McKenzie created CASSANDRA-10100:
---

 Summary: Windows dtest 3.0: 
commitlog_test.py:TestCommitLog.stop_failure_policy_test fails
 Key: CASSANDRA-10100
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10100
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Joshua McKenzie
 Fix For: 3.0.x


{noformat}
FAIL: stop_failure_policy_test (commitlog_test.TestCommitLog)
--
Traceback (most recent call last):
  File "c:\src\cassandra-dtest\commitlog_test.py", line 258, in 
stop_failure_policy_test
self.assertTrue(failure, "Cannot find the commitlog failure message in 
logs")
AssertionError: Cannot find the commitlog failure message in logs
 >> begin captured logging << 
{noformat}

Failure history: 
[consistent|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest_win32/17/testReport/junit/commitlog_test/TestCommitLog/small_segment_size_test/history/]

Env: Both CI and local





[jira] [Commented] (CASSANDRA-9872) only check KeyCache when it is enabled

2015-08-17 Thread Chris Burroughs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699957#comment-14699957
 ] 

Chris Burroughs commented on CASSANDRA-9872:


I *think* this is correct (and simpler!) with all of the 3.0 branch changes.  I 
looked into adding unit tests to `KeyCacheTest`, but it's pretty end-to-end and 
I didn't see any existing tests that call `getCachedPosition` directly.

> only check KeyCache when it is enabled
> --
>
> Key: CASSANDRA-9872
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9872
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Chris Burroughs
>Assignee: Chris Burroughs
>  Labels: cache, metrics
> Attachments: j9872-2.0-v1.txt, j9872-3.0-v1.txt
>
>
> If the KeyCache exists (because at least one column family is using it), we 
> currently check the key cache even for requests to column families where the 
> key cache is disabled.  I think it would be better to only check the cache if 
> entries *could* be there.
>  * This will align the key cache with how the row cache behaves.
>  * This makes the key cache metrics much more useful.  For example, 
> 'requests' becomes 'requests to things that could be in the key cache' and 
> not just 'total requests'.
>  * This might be a micro-optimization saving a few metric updates.





[jira] [Commented] (CASSANDRA-9882) DTCS (maybe other strategies) can block flushing when there are lots of sstables

2015-08-17 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699956#comment-14699956
 ] 

Yuki Morishita commented on CASSANDRA-9882:
---

+1 for fixup.
Also created CASSANDRA-10099 to further discuss concurrency issue.

> DTCS (maybe other strategies) can block flushing when there are lots of 
> sstables
> 
>
> Key: CASSANDRA-9882
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9882
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jeremiah Jordan
>Assignee: Marcus Eriksson
>  Labels: dtcs
> Fix For: 2.1.9, 2.0.17, 2.2.1, 3.0 beta 1
>
>
> MemtableFlushWriter tasks can get blocked by Compaction 
> getNextBackgroundTask.  This is in a wonky cluster with 200k sstables in the 
> CF, but seems bad for flushing to be blocked by getNextBackgroundTask when we 
> are trying to make these new "smart" strategies that may take some time to 
> calculate what to do.
> {noformat}
> "MemtableFlushWriter:21" daemon prio=10 tid=0x7ff7ad965000 nid=0x6693 
> waiting for monitor entry [0x7ff78a667000]
>java.lang.Thread.State: BLOCKED (on object monitor)
>   at 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:237)
>   - waiting to lock <0x0006fcdbbf60> (a 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
>   at org.apache.cassandra.db.DataTracker.notifyAdded(DataTracker.java:518)
>   at 
> org.apache.cassandra.db.DataTracker.replaceFlushed(DataTracker.java:178)
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:234)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1475)
>   at 
> org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:336)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1127)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
>   - <0x000743b3ac38> (a 
> java.util.concurrent.ThreadPoolExecutor$Worker)
> "MemtableFlushWriter:19" daemon prio=10 tid=0x7ff7ac57a000 nid=0x649b 
> waiting for monitor entry [0x7ff78b8ee000]
>java.lang.Thread.State: BLOCKED (on object monitor)
>   at 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:237)
>   - waiting to lock <0x0006fcdbbf60> (a 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
>   at org.apache.cassandra.db.DataTracker.notifyAdded(DataTracker.java:518)
>   at 
> org.apache.cassandra.db.DataTracker.replaceFlushed(DataTracker.java:178)
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:234)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1475)
>   at 
> org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:336)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1127)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> "CompactionExecutor:14" daemon prio=10 tid=0x7ff7ad359800 nid=0x4d59 
> runnable [0x7fecce3ea000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.cassandra.io.sstable.SSTableReader.equals(SSTableReader.java:628)
>   at 
> com.google.common.collect.ImmutableSet.construct(ImmutableSet.java:206)
>   at 
> com.google.common.collect.ImmutableSet.construct(ImmutableSet.java:220)
>   at 
> com.google.common.collect.ImmutableSet.access$000(ImmutableSet.java:74)
>   at 
> com.google.common.collect.ImmutableSet$Builder.build(ImmutableSet.java:531)
>   at com.google.common.collect.Sets$1.immutableCopy(Sets.java:606)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.getOverlappingSSTables(Colu

[jira] [Created] (CASSANDRA-10099) Improve concurrency in CompactionStrategyManager

2015-08-17 Thread Yuki Morishita (JIRA)
Yuki Morishita created CASSANDRA-10099:
--

 Summary: Improve concurrency in CompactionStrategyManager
 Key: CASSANDRA-10099
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10099
 Project: Cassandra
  Issue Type: Improvement
Reporter: Yuki Morishita
 Fix For: 3.x


Continue discussion from CASSANDRA-9882.

CompactionStrategyManager (WrappingCompactionStrategy for <3.0) tracks SSTable 
changes, mainly to separate repaired and unrepaired SSTables (plus level 
management for LCS).

This is a blocking operation, and it can block flushes and other operations 
when determining the next background task takes a long time.

Explore ways to mitigate this concurrency issue.
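One common pattern for decoupling the notifying thread (here, the flush path) from a strategy that does slow work under a lock is to enqueue notifications and drain them on a dedicated thread. A hedged sketch of that general pattern, not a design proposed in the ticket:

```python
import queue
import threading

class AsyncNotifier:
    """Decouple SSTable-change notifications from the caller: the flush
    thread enqueues and returns immediately, while a worker thread drains
    the queue and invokes the (possibly slow, lock-holding) handler."""

    def __init__(self, handler):
        self._q = queue.Queue()
        self._handler = handler
        threading.Thread(target=self._drain, daemon=True).start()

    def notify(self, change):
        self._q.put(change)        # non-blocking for the flush path

    def _drain(self):
        while True:
            change = self._q.get()
            if change is None:     # shutdown sentinel
                break
            self._handler(change)
            self._q.task_done()

    def join(self):
        self._q.join()             # wait until queued changes are handled

    def close(self):
        self._q.put(None)
```

The flush thread then spends only a queue insertion instead of waiting on the strategy's monitor, at the cost of notifications being applied slightly later.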





[jira] [Created] (CASSANDRA-10098) Windows dtest 3.0: commitlog_test.py:TestCommitLog.small_segment_size_test fails

2015-08-17 Thread Joshua McKenzie (JIRA)
Joshua McKenzie created CASSANDRA-10098:
---

 Summary: Windows dtest 3.0: 
commitlog_test.py:TestCommitLog.small_segment_size_test fails
 Key: CASSANDRA-10098
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10098
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Joshua McKenzie
 Fix For: 3.0.x


{noformat}
  File "D:\Python27\lib\unittest\case.py", line 329, in run
testMethod()
  File 
"D:\jenkins\workspace\cassandra-3.0_dtest_win32\cassandra-dtest\tools.py", line 
243, in wrapped
f(obj)
  File 
"D:\jenkins\workspace\cassandra-3.0_dtest_win32\cassandra-dtest\commitlog_test.py",
 line 226, in small_segment_size_test
self._commitlog_test(segment_size_in_mb, 62.5, 13, files_error=0.2)
  File 
"D:\jenkins\workspace\cassandra-3.0_dtest_win32\cassandra-dtest\commitlog_test.py",
 line 99, in _commitlog_test
error=files_error)
  File 
"D:\jenkins\workspace\cassandra-3.0_dtest_win32\cassandra-dtest\assertions.py", 
line 62, in assert_almost_equal
assert vmin > vmax * (1.0 - error) or vmin == vmax, "values not within 
%.2f%% of the max: %s" % (error * 100, args)
'values not within 20.00% of the max: (10, 13)\n >> begin 
captured logging << \ndtest: DEBUG: cluster ccm directory: 
d:\\temp\\dtest-qnguzs\n- >> end captured logging << 
-'
{noformat}
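The failure is plain arithmetic: with values (10, 13) and error=0.2, the minimum must exceed 13 × 0.8 = 10.4, which 10 does not. A minimal re-implementation of the check shown in the traceback (the real dtest helper may differ in signature):

```python
def assert_almost_equal(*args, error):
    """All values must lie within `error` (a fraction) of the maximum,
    mirroring the dtest assertion quoted in the traceback above."""
    vmax, vmin = max(args), min(args)
    assert vmin > vmax * (1.0 - error) or vmin == vmax, \
        "values not within %.2f%% of the max: %s" % (error * 100, args)
```

So assert_almost_equal(10, 13, error=0.2) raises (10 <= 10.4), while 11 or more would pass.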

Failure history: 
[consistent|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest_win32/17/testReport/junit/commitlog_test/TestCommitLog/small_segment_size_test/]

Env: Both CI and local





[jira] [Updated] (CASSANDRA-9872) only check KeyCache when it is enabled

2015-08-17 Thread Chris Burroughs (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Burroughs updated CASSANDRA-9872:
---
Attachment: j9872-3.0-v1.txt

> only check KeyCache when it is enabled
> --
>
> Key: CASSANDRA-9872
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9872
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Chris Burroughs
>Assignee: Chris Burroughs
>  Labels: cache, metrics
> Attachments: j9872-2.0-v1.txt, j9872-3.0-v1.txt
>
>
> If the KeyCache exists (because at least one column family is using it), we 
> currently check the key cache even for requests to column families where the 
> key cache is disabled.  I think it would be better to only check the cache if 
> entries *could* be there.
>  * This will align the key cache with how the row cache behaves.
>  * This makes the key cache metrics much more useful.  For example, 
> 'requests' becomes 'requests to things that could be in the key cache' and 
> not just 'total requests'.
>  * This might be a micro-optimization saving a few metric updates.





[jira] [Created] (CASSANDRA-10097) Windows dtest 3.0: bootstrap_test.py:TestBootstrap.bootstrap_with_reset_bootstrap_state_test fails

2015-08-17 Thread Joshua McKenzie (JIRA)
Joshua McKenzie created CASSANDRA-10097:
---

 Summary: Windows dtest 3.0: 
bootstrap_test.py:TestBootstrap.bootstrap_with_reset_bootstrap_state_test fails
 Key: CASSANDRA-10097
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10097
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Joshua McKenzie
 Fix For: 3.0.x


{noformat}
  File "D:\Python27\lib\unittest\case.py", line 329, in run
testMethod()
  File 
"D:\jenkins\workspace\cassandra-3.0_dtest_win32\cassandra-dtest\tools.py", line 
243, in wrapped
f(obj)
  File 
"D:\jenkins\workspace\cassandra-3.0_dtest_win32\cassandra-dtest\bootstrap_test.py",
 line 184, in bootstrap_with_reset_bootstrap_state_test
node3.watch_log_for("Resetting bootstrap progress to start fresh", 
from_mark=mark)
  File "build\bdist.win-amd64\egg\ccmlib\node.py", line 382, in watch_log_for
raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " [" 
+ self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + reads)
{noformat}

Failure history: 
[consistent|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest_win32/17/testReport/junit/bootstrap_test/TestBootstrap/bootstrap_with_reset_bootstrap_state_test/history/]

Env: Both CI and local





[jira] [Updated] (CASSANDRA-9917) MVs should validate gc grace seconds on the tables involved

2015-08-17 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-9917:
-
Reviewer: Aleksey Yeschenko  (was: Marcus Eriksson)

> MVs should validate gc grace seconds on the tables involved
> ---
>
> Key: CASSANDRA-9917
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9917
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Aleksey Yeschenko
>Assignee: Paulo Motta
>  Labels: materializedviews
> Fix For: 3.0 beta 2
>
>
> For correctness reasons (potential resurrection of dropped values), batchlog 
> entries are TTLs with the lowest gc grace second of all the tables involved 
> in a batch.
> It means that if gc gs is set to 0 in one of the tables, the batchlog entry 
> will be dead on arrival, and never replayed.
> We should probably warn against such LOGGED writes taking place, in general, 
> but for MVs, we must validate that gc gs on the base table (and on the MV 
> table, if we should allow altering gc gs there at all), is never set too low, 
> or else.





[jira] [Created] (CASSANDRA-10096) SerializationHelper should provide a rewindable in-order tester

2015-08-17 Thread Benedict (JIRA)
Benedict created CASSANDRA-10096:


 Summary: SerializationHelper should provide a rewindable in-order 
tester
 Key: CASSANDRA-10096
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10096
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Priority: Minor
 Fix For: 3.x


When deserializing a row we perform a logarithmic lookup on column name for 
every cell. There is also a lot of unnecessary indirection to reach this method 
call.
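The idea of a rewindable in-order tester is to replace the per-cell logarithmic column lookup with a cursor that only moves forward while cells arrive in sorted order, rewinding at row boundaries. A hedged Python sketch of the concept (not the actual Cassandra implementation):

```python
class InOrderTester:
    """Amortized O(1) membership test for column names queried in sorted
    order, with a rewind for when the next row restarts from the start."""

    def __init__(self, sorted_names):
        self.names = sorted_names  # queried columns, sorted
        self.pos = 0

    def rewind(self):
        self.pos = 0

    def contains(self, name):
        # Advance the cursor past smaller names; it never moves backwards,
        # so a full row costs O(n) total rather than O(n log n).
        while self.pos < len(self.names) and self.names[self.pos] < name:
            self.pos += 1
        return self.pos < len(self.names) and self.names[self.pos] == name
```

Because deserialization visits cells in clustering order, each row needs one rewind plus a single forward sweep instead of a binary search per cell.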





[jira] [Commented] (CASSANDRA-9922) Add Materialized View WHERE schema support

2015-08-17 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699880#comment-14699880
 ] 

Aleksey Yeschenko commented on CASSANDRA-9922:
--

Basically, {{WHERE}} support will be a new feature, and only new nodes will be 
able to support it. Same way with 3.0 and materialized views - can't properly 
use them until your whole cluster is on 3.0.

So putting this in 3.0.0 would be nice, and would be my preference - if we find 
time, but if not, I'm ready to deal with schema-level ugliness necessary later 
in 3.x.

> Add Materialized View WHERE schema support
> --
>
> Key: CASSANDRA-9922
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9922
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Carl Yeksigian
>  Labels: materializedviews
> Fix For: 3.x
>
>
> In order to provide forward compatibility with the 3.x series, we should add 
> schema support for capturing the where clause of the MV.





[jira] [Commented] (CASSANDRA-9901) Make AbstractType.isByteOrderComparable abstract

2015-08-17 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699879#comment-14699879
 ] 

Sylvain Lebresne commented on CASSANDRA-9901:
-

bq. One of the advantages of using an enum that I did not enumerate was the 
possibility of performing more efficient despatch for "mixed" clustering data

Alright, fair enough. I guess one of the things that put me off is that 
{{COMPARE_COMPARABLE}} sounds really weird/unintuitive to me. If we go with the 
{{compareValue}} change I discuss above, then I'd suggest renaming the enum to 
something like:
{noformat}
enum ComparisonType { UNCOMPARABLE, BYTE_ORDER, CUSTOM }
{noformat}
and the {{compareValue}} discussed above could instead be {{compareCustom()}}. 
We also wouldn't need {{isByteOrderComparable}}, I believe, since it would be 
handled directly by {{compare}} (which won't be a 
virtual call).

bq. Admittedly I haven't confirmed this, but it looks fine to me

I read that too quickly and missed the package check, my bad. I guess it's not 
entirely foolproof, but we probably can't do much better short of having a 
whitelist, which would be ugly, so I'm fine with that.

bq. I prefer to log more often than less, since there's more chance of it being 
spotted. I don't think we rebuild so often - just during schema changes, no?

{{rebuild}} happens every time a {{CFMetaData}} is created and validated, which 
means at least on every startup and multiple times per schema change (since 
it's called during validation), and that's not counting any cases I've forgotten.

A bit of context: I strongly suspect that while there are likely people 
already using custom types, not many people are creating new custom types now 
that we provide a relatively rich set of types out of 
the box (that was not always the case). So I'm more worried about annoying 
people who have existing custom types, for whom the message is basically 
useless, since it's not currently actionable and they can't miss the change 
anyway: they'll have to update their own code.

In fact, I'm not even really sure a warning is necessary in the first place. As 
said, for people already having a custom type, the warning is mostly an annoyance. 
And for new users who might decide to write a custom type, I think being extra 
clear in the javadoc of the {{compareCustom()}} method that you should not 
implement it in newly created types would be fair warning (we could 
additionally note in the {{AbstractType}} javadoc that creating custom subclasses 
is frowned upon nowadays).


> Make AbstractType.isByteOrderComparable abstract
> 
>
> Key: CASSANDRA-9901
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9901
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
> Fix For: 3.0 beta 2
>
>
> I can't recall _precisely_ what was agreed at the NGCC, but I'm reasonably 
> sure we agreed to make this method abstract, put some javadoc explaining we 
> may require fields to yield true in the near future, and potentially log a 
> warning on startup if a user-defined type returns false.
> This should make it into 3.0, IMO, so that we can look into migrating to 
> byte-order comparable types in the post-3.0 world.





[jira] [Commented] (CASSANDRA-9922) Add Materialized View WHERE schema support

2015-08-17 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699871#comment-14699871
 ] 

Aleksey Yeschenko commented on CASSANDRA-9922:
--

Right, but it's not just about schema. If an older node doesn't have the code 
to handle {{WHERE}}, it won't be able to support those queries either way - 
even if there is schema support for it.

> Add Materialized View WHERE schema support
> --
>
> Key: CASSANDRA-9922
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9922
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Carl Yeksigian
>  Labels: materializedviews
> Fix For: 3.x
>
>
> In order to provide forward compatibility with the 3.x series, we should add 
> schema support for capturing the where clause of the MV.





[jira] [Updated] (CASSANDRA-9857) Deal with backward compatibilty issue of broken AbstractBounds serialization

2015-08-17 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9857:
--
Fix Version/s: (was: 3.0.0 rc1)
   3.0 beta 2

> Deal with backward compatibilty issue of broken AbstractBounds serialization
> 
>
> Key: CASSANDRA-9857
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9857
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 3.0 beta 2
>
>
> This ticket is related to CASSANDRA-9856 and CASSANDRA-9775. Even if the 
> broken/incomplete serialization of {{AbstractBounds}} is not a problem per-se 
> for pre-3.0 versions, it's still a problem for trunk and even though it's 
> fixed by CASSANDRA-9775 for 3.0 nodes, it might be a problem for 3.0 nodes 
> talking to older nodes.
> As the paging tests were those that exposed the problem (on trunk) in the 
> first place, it would be nice to modify said paging tests to work on 
> mixed-version clusters so we can validate whether it is a problem. If it is, 
> then we'll probably have to add redundant checks on trunk so they ignore 
> anything the 3.0 node sends incorrectly.





[jira] [Commented] (CASSANDRA-9922) Add Materialized View WHERE schema support

2015-08-17 Thread Carl Yeksigian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699844#comment-14699844
 ] 

Carl Yeksigian commented on CASSANDRA-9922:
---

In order to actually support using the WHERE clause, we'll need to make sure 
that the nodes have been upgraded; otherwise, old nodes won't be processing 
mutations in the same way as upgraded nodes. If this does slip, we'll already 
be preventing the use of WHERE clauses when not all nodes have been upgraded 
to support it.

> Add Materialized View WHERE schema support
> --
>
> Key: CASSANDRA-9922
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9922
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Carl Yeksigian
>  Labels: materializedviews
> Fix For: 3.x
>
>
> In order to provide forward compatibility with the 3.x series, we should add 
> schema support for capturing the where clause of the MV.





[jira] [Updated] (CASSANDRA-10076) Windows dtest 2.2: thrift_hsha_test.py:ThriftHSHATest.test_6285

2015-08-17 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-10076:

Assignee: Paulo Motta
Reviewer: Joshua McKenzie

> Windows dtest 2.2: thrift_hsha_test.py:ThriftHSHATest.test_6285
> ---
>
> Key: CASSANDRA-10076
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10076
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Joshua McKenzie
>Assignee: Paulo Motta
>  Labels: Windows
> Fix For: 2.2.x
>
>
> [Error 
> text|http://cassci.datastax.com/view/cassandra-2.2/job/cassandra-2.2_dtest_win32/61/testReport/thrift_hsha_test/ThriftHSHATest/test_6285/]:
> {noformat}
> Unexpected error in node1 node log: ['ERROR 
> [MessagingService-Outgoing-/127.0.0.2] 2015-08-13 18:27:05,264 
> OutboundTcpConnection.java:318 - error writing to /127.0.0.2 
> java.lang.RuntimeException: java.io.IOException: An established connection 
> was aborted by the software in your host machine \tat 
> org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:85) 
> ~[main/:na] \tat 
> org.apache.cassandra.db.ColumnFamilySerializer.serialize(ColumnFamilySerializer.java:70)
>  ~[main/:na] \tat 
> org.apache.cassandra.db.Mutation$MutationSerializer.serialize(Mutation.java:286)
>  ~[main/:na] \tat 
> org.apache.cassandra.db.Mutation$MutationSerializer.serialize(Mutation.java:272)
>  ~[main/:na] \tat 
> org.apache.cassandra.net.MessageOut.serialize(MessageOut.java:125) 
> ~[main/:na] \tat 
> org.apache.cassandra.net.OutboundTcpConnection.writeInternal(OutboundTcpConnection.java:335)
>  [main/:na] \tat 
> org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:287)
>  [main/:na] \tat 
> org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:221)
>  [main/:na] Caused by: java.io.IOException: An established connection was 
> aborted by the software in your host machine \tat 
> sun.nio.ch.SocketDispatcher.write0(Native Method) ~[na:1.8.0_51] \tat 
> sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:51) ~[na:1.8.0_51] 
> \tat sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.8.0_51] 
> \tat sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[na:1.8.0_51] \tat 
> sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) ~[na:1.8.0_51] 
> \tat 
> {noformat}
> [Failure 
> History|http://cassci.datastax.com/view/cassandra-2.2/job/cassandra-2.2_dtest_win32/61/testReport/thrift_hsha_test/ThriftHSHATest/test_6285/history/]
>  (flaky)
> Env: CI only. Passes locally.





[jira] [Updated] (CASSANDRA-10068) Batchlog replay fails with exception after a node is decommissioned

2015-08-17 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10068:
--
Fix Version/s: (was: 3.0.0 rc1)
   3.0 beta 2

> Batchlog replay fails with exception after a node is decommissioned
> ---
>
> Key: CASSANDRA-10068
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10068
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Joel Knighton
>Assignee: Marcus Eriksson
> Fix For: 3.0 beta 2
>
> Attachments: n1.log, n2.log, n3.log, n4.log, n5.log
>
>
> This issue is reproducible through a Jepsen test of materialized views that 
> crashes and decommissions nodes throughout the test.
> At the conclusion of the test, a batchlog replay is initiated through 
> nodetool and hits the following assertion due to a missing host ID: 
> https://github.com/apache/cassandra/blob/3413e557b95d9448b0311954e9b4f53eaf4758cd/src/java/org/apache/cassandra/service/StorageProxy.java#L1197
> A nodetool status on the node with failed batchlog replay shows the following 
> entry for the decommissioned node:
> DN  10.0.0.5  ?  256  ?   null
>   rack1
> On the unaffected nodes, there is no entry for the decommissioned node as 
> expected.
> There are occasional hits of the same assertions for logs in other nodes; it 
> looks like the issue might occasionally resolve itself, but one node seems to 
> have the errant null entry indefinitely.
> In logs for the nodes, this possibly unrelated exception also appears:
> java.lang.RuntimeException: Trying to get the view natural endpoint on a 
> non-data replica
>   at 
> org.apache.cassandra.db.view.MaterializedViewUtils.getViewNaturalEndpoint(MaterializedViewUtils.java:91)
>  ~[apache-cassandra-3.0.0-alpha1-SNAPSHOT.jar:3.0.0-alpha1-SNAPSHOT]
> I have a running cluster with the issue on my machine; it is also repeatable.
> Nothing stands out in the logs of the decommissioned node (n4) for me. The 
> logs of each node in the cluster are attached.





[jira] [Created] (CASSANDRA-10095) Fix dtests on 3.0 branch on Windows

2015-08-17 Thread Joshua McKenzie (JIRA)
Joshua McKenzie created CASSANDRA-10095:
---

 Summary: Fix dtests on 3.0 branch on Windows
 Key: CASSANDRA-10095
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10095
 Project: Cassandra
  Issue Type: Bug
Reporter: Joshua McKenzie
Assignee: Joshua McKenzie
 Fix For: 3.0.x


Parent ticket to track subtasks for dtest failures on Windows on the 3.0 branch





[jira] [Commented] (CASSANDRA-9922) Add Materialized View WHERE schema support

2015-08-17 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699829#comment-14699829
 ] 

Aleksey Yeschenko commented on CASSANDRA-9922:
--

Would be nice to do this in rc1, but we really can add it later, even though it 
will be more painful. Not too painful.

> Add Materialized View WHERE schema support
> --
>
> Key: CASSANDRA-9922
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9922
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Carl Yeksigian
>  Labels: materializedviews
> Fix For: 3.x
>
>
> In order to provide forward compatibility with the 3.x series, we should add 
> schema support for capturing the where clause of the MV.





[jira] [Updated] (CASSANDRA-9922) Add Materialized View WHERE schema support

2015-08-17 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-9922:
-
Fix Version/s: (was: 3.0.0 rc1)
   3.x

> Add Materialized View WHERE schema support
> --
>
> Key: CASSANDRA-9922
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9922
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Carl Yeksigian
>  Labels: materializedviews
> Fix For: 3.x
>
>
> In order to provide forward compatibility with the 3.x series, we should add 
> schema support for capturing the where clause of the MV.





[jira] [Comment Edited] (CASSANDRA-9749) CommitLogReplayer continues startup after encountering errors

2015-08-17 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699822#comment-14699822
 ] 

Ariel Weisberg edited comment on CASSANDRA-9749 at 8/17/15 5:06 PM:


I remember that conversation as well. Part of that was making an effort to 
distinguish between expected failures (end of log) and unexpected ones. I think 
there is going to be some pain there because it's really hard to tell the two 
apart.

I am not dead set against stop as a default behavior for replay, I just don't 
think linking the two settings together -isn't- is a good idea.

Extra negatives were not intentional.


was (Author: aweisberg):
I remember that conversation as well. Part of that was making an effort to 
distinguish between expected failures (end of log) and unexpected ones. I think 
there is going to be some pain there because it's really hard to tell the two 
apart.

I am not dead set against stop as a default behavior for replay, I just don't 
think linking the two settings together -isn't- a good idea.

Extra negatives were not intentional.

> CommitLogReplayer continues startup after encountering errors
> -
>
> Key: CASSANDRA-9749
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9749
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Blake Eggleston
>Assignee: Branimir Lambov
> Fix For: 2.2.x
>
> Attachments: 9749-coverage.tgz
>
>
> There are a few places where the commit log recovery method either skips 
> sections or just returns when it encounters errors.
> Specifically if it can't read the header here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L298
> Or if there are compressor problems here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L314
>  and here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L366
> Whether these are user-fixable or not, I think we should require more direct 
> user intervention (ie: fix what's wrong, or remove the bad file and restart) 
> since we're basically losing data.





[jira] [Comment Edited] (CASSANDRA-9749) CommitLogReplayer continues startup after encountering errors

2015-08-17 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699822#comment-14699822
 ] 

Ariel Weisberg edited comment on CASSANDRA-9749 at 8/17/15 5:06 PM:


I remember that conversation as well. Part of that was making an effort to 
distinguish between expected failures (end of log) and unexpected ones. I think 
there is going to be some pain there because it's really hard to tell the two 
apart.

I am not dead set against stop as a default behavior for replay, I just don't 
think linking the two settings together -isn't- a good idea.

Extra negatives were not intentional.


was (Author: aweisberg):
I remember that conversation as well. Part of that was making an effort to 
distinguish between expected failures (end of log) and unexpected ones. I think 
there is going to be some pain there because it's really hard to tell the two 
apart.

I am not dead set against stop as a default behavior for replay, I just don't 
think linking the two settings together isn't a good idea.

> CommitLogReplayer continues startup after encountering errors
> -
>
> Key: CASSANDRA-9749
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9749
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Blake Eggleston
>Assignee: Branimir Lambov
> Fix For: 2.2.x
>
> Attachments: 9749-coverage.tgz
>
>
> There are a few places where the commit log recovery method either skips 
> sections or just returns when it encounters errors.
> Specifically if it can't read the header here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L298
> Or if there are compressor problems here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L314
>  and here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L366
> Whether these are user-fixable or not, I think we should require more direct 
> user intervention (ie: fix what's wrong, or remove the bad file and restart) 
> since we're basically losing data.





[jira] [Commented] (CASSANDRA-9749) CommitLogReplayer continues startup after encountering errors

2015-08-17 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699826#comment-14699826
 ] 

Benedict commented on CASSANDRA-9749:
-

bq. I just don't think linking the two settings together isn't a good idea.

That was too many negatives for me to parse (and be confident you'd typed 
correctly) :)

I'll note FTR I don't (and didn't) have a strong position on this.

> CommitLogReplayer continues startup after encountering errors
> -
>
> Key: CASSANDRA-9749
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9749
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Blake Eggleston
>Assignee: Branimir Lambov
> Fix For: 2.2.x
>
> Attachments: 9749-coverage.tgz
>
>
> There are a few places where the commit log recovery method either skips 
> sections or just returns when it encounters errors.
> Specifically if it can't read the header here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L298
> Or if there are compressor problems here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L314
>  and here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L366
> Whether these are user-fixable or not, I think we should require more direct 
> user intervention (ie: fix what's wrong, or remove the bad file and restart) 
> since we're basically losing data.





[jira] [Resolved] (CASSANDRA-9505) Expose sparse formatting via JMX and/or sstablemetadata

2015-08-17 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko resolved CASSANDRA-9505.
--
Resolution: Not A Problem

Agreed. This is a non-issue.

> Expose sparse formatting via JMX and/or sstablemetadata
> ---
>
> Key: CASSANDRA-9505
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9505
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jim Witschey
> Fix For: 3.0.0 rc1
>
>
> It'd be helpful for us in TE if we could differentiate between data written 
> in the sparse and dense formats as described 
> [here|https://github.com/pcmanus/cassandra/blob/8099/guide_8099.md#storage-format-on-disk-and-on-wire].
>  It'd help us to measure speed and space performance and to make sure the 
> format is chosen correctly and consistently.
> I don't know if this would be best exposed through a JMX endpoint, 
> {{sstablemetadata}}, or both, but those seem like the most obvious exposure 
> points.





[jira] [Commented] (CASSANDRA-9749) CommitLogReplayer continues startup after encountering errors

2015-08-17 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699822#comment-14699822
 ] 

Ariel Weisberg commented on CASSANDRA-9749:
---

I remember that conversation as well. Part of that was making an effort to 
distinguish between expected failures (end of log) and unexpected ones. I think 
there is going to be some pain there because it's really hard to tell the two 
apart.

I am not dead set against stop as a default behavior for replay, I just don't 
think linking the two settings together isn't a good idea.

> CommitLogReplayer continues startup after encountering errors
> -
>
> Key: CASSANDRA-9749
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9749
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Blake Eggleston
>Assignee: Branimir Lambov
> Fix For: 2.2.x
>
> Attachments: 9749-coverage.tgz
>
>
> There are a few places where the commit log recovery method either skips 
> sections or just returns when it encounters errors.
> Specifically if it can't read the header here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L298
> Or if there are compressor problems here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L314
>  and here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L366
> Whether these are user-fixable or not, I think we should require more direct 
> user intervention (ie: fix what's wrong, or remove the bad file and restart) 
> since we're basically losing data.





[jira] [Commented] (CASSANDRA-9892) Add support for unsandboxed UDF

2015-08-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699812#comment-14699812
 ] 

Jonathan Ellis commented on CASSANDRA-9892:
---

Let's push this to 3.2 rather than feature creeping 3.0.

> Add support for unsandboxed UDF
> ---
>
> Key: CASSANDRA-9892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9892
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jonathan Ellis
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.x
>
>
> From discussion on CASSANDRA-9402,
> The approach postgresql takes is to distinguish between "trusted" (sandboxed) 
> and "untrusted" (anything goes) UDF languages. 
> Creating an untrusted language always requires superuser mode. Once that is 
> done, creating functions in it requires nothing special.
> Personally I would be fine with this approach, but I think it would be more 
> useful to have the extra permission on creating the function, which also 
> wouldn't require adding an explicit CREATE LANGUAGE.
> So I'd suggest just providing different CQL permissions for trusted and 
> untrusted, i.e. if you have CREATE FUNCTION permission that allows you to 
> create sandboxed UDF, but you can only create unsandboxed if you have CREATE 
> UNTRUSTED.





[jira] [Updated] (CASSANDRA-9892) Add support for unsandboxed UDF

2015-08-17 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9892:
--
Fix Version/s: (was: 3.0.0 rc1)
   3.x

> Add support for unsandboxed UDF
> ---
>
> Key: CASSANDRA-9892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9892
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jonathan Ellis
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.x
>
>
> From discussion on CASSANDRA-9402,
> The approach postgresql takes is to distinguish between "trusted" (sandboxed) 
> and "untrusted" (anything goes) UDF languages. 
> Creating an untrusted language always requires superuser mode. Once that is 
> done, creating functions in it requires nothing special.
> Personally I would be fine with this approach, but I think it would be more 
> useful to have the extra permission on creating the function, which also 
> wouldn't require adding an explicit CREATE LANGUAGE.
> So I'd suggest just providing different CQL permissions for trusted and 
> untrusted, i.e. if you have CREATE FUNCTION permission that allows you to 
> create sandboxed UDF, but you can only create unsandboxed if you have CREATE 
> UNTRUSTED.





[jira] [Commented] (CASSANDRA-10093) Invalid internal query for static compact tables

2015-08-17 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699808#comment-14699808
 ] 

Aleksey Yeschenko commented on CASSANDRA-10093:
---

+1 so long as cassci is happy.

> Invalid internal query for static compact tables
> 
>
> Key: CASSANDRA-10093
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10093
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 3.0 beta 1
>
>
> When dealing with a static compact table on the CQL side, a {{SELECT * 
> FROM table;}} query generates the wrong clustering filter. More precisely, 
> we create a name query that selects the {{EMPTY}} clustering, but that's an 
> invalid clustering since static compact tables have 1 clustering column 
> (internally at least). What we really want to query is the static parts.
> This is the reason for the failure of some dtests 
> ({{bootstrap_test:TestBootstrap.read_from_bootstrapped_node_test}} for 
> instance). More precisely, the invalid filter created breaks serialization, 
> which is why this is only really a problem on multi-node tests.





[jira] [Commented] (CASSANDRA-9749) CommitLogReplayer continues startup after encountering errors

2015-08-17 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699803#comment-14699803
 ] 

Benedict commented on CASSANDRA-9749:
-

I cannot find where the conversation happened, so perhaps it was on IRC, but 
the consensus had shifted since we last discussed this over a year ago. There 
was wide support for failing on startup if the commit log is corrupted, and 
printing an error message for the user to opt into continuing in the face of 
those errors. iirc, [~aweisberg], [~bdeggleston] and [~jjordan] were 
participants, amongst others, so perhaps they can corroborate this since I 
cannot find a reference.

> CommitLogReplayer continues startup after encountering errors
> -
>
> Key: CASSANDRA-9749
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9749
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Blake Eggleston
>Assignee: Branimir Lambov
> Fix For: 2.2.x
>
> Attachments: 9749-coverage.tgz
>
>
> There are a few places where the commit log recovery method either skips 
> sections or just returns when it encounters errors.
> Specifically if it can't read the header here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L298
> Or if there are compressor problems here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L314
>  and here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L366
> Whether these are user-fixable or not, I think we should require more direct 
> user intervention (ie: fix what's wrong, or remove the bad file and restart) 
> since we're basically losing data.





[jira] [Commented] (CASSANDRA-9901) Make AbstractType.isByteOrderComparable abstract

2015-08-17 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699798#comment-14699798
 ] 

Benedict commented on CASSANDRA-9901:
-

bq. What I could suggest, on top of

If we were to say "instead of" and we stuck with the enum, I'd be with you. One 
of the advantages of using an enum that I did not enumerate was the possibility 
of performing more efficient despatch for "mixed" clustering data (i.e. with 
some byte comparable, some not), especially given we now always consult the 
boolean parameter from the class property, since the comparison of each 
clustering column is performed in a different method call now (so if we have a 
different class property to consult as cheaply, we may as well do so). 

Having two virtual invocations instead of one on this path is a pretty 
significant burden we should avoid, however.

bq. From a quick look, it looks like the patch will log warning for every 
internal type that is not byte comparable

{code}
+if (!getClass().getPackage().equals(AbstractType.class.getPackage()))
+    logger.warn("Type " + this + " is not comparable by its unsigned sequence of raw bytes. A future (major) release of Cassandra may remove support for such arbitrary comparisons, however upgrade steps will be provided to ensure a smooth transition.");
{code}

Admittedly I haven't confirmed this, but it looks fine to me, and I'll double 
check before we commit. 

bq.  Also, logging in CFMetadaData.rebuild is going to be more noisy than 
necessary 

I prefer to log more often than less, since there's more chance of it being 
spotted. I don't think we rebuild _so_ often - just during schema changes, no?

> Make AbstractType.isByteOrderComparable abstract
> 
>
> Key: CASSANDRA-9901
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9901
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
> Fix For: 3.0.0 rc1
>
>
> I can't recall _precisely_ what was agreed at the NGCC, but I'm reasonably 
> sure we agreed to make this method abstract, put some javadoc explaining we 
> may require fields to yield true in the near future, and potentially log a 
> warning on startup if a user-defined type returns false.
> This should make it into 3.0, IMO, so that we can look into migrating to 
> byte-order comparable types in the post-3.0 world.





[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers

2015-08-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699792#comment-14699792
 ] 

Jonathan Ellis commented on CASSANDRA-7066:
---

Assuming we apply disk_failure_policy to the corrupt xlog, then if we've 
started up despite that then either policy was ignore, or user manually moved 
the xlog and restarted without it.  So IMO we should always move them aside 
since user either explicitly or implicitly wants that.

> Simplify (and unify) cleanup of compaction leftovers
> 
>
> Key: CASSANDRA-7066
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Assignee: Stefania
>Priority: Minor
>  Labels: benedict-to-commit, compaction
> Fix For: 3.0 alpha 1
>
> Attachments: 7066.txt
>
>
> Currently we manage a list of in-progress compactions in a system table, 
> which we use to cleanup incomplete compactions when we're done. The problem 
> with this is that 1) it's a bit clunky (and leaves us in positions where we 
> can unnecessarily cleanup completed files, or conversely not cleanup files 
> that have been superseded); and 2) it's only used for a regular compaction - 
> no other compaction types are guarded in the same way, so can result in 
> duplication if we fail before deleting the replacements.
> I'd like to see each sstable store in its metadata its direct ancestors, and 
> on startup we simply delete any sstables that occur in the union of all 
> ancestor sets. This way as soon as we finish writing we're capable of 
> cleaning up any leftovers, so we never get duplication. It's also much easier 
> to reason about.





[jira] [Commented] (CASSANDRA-9749) CommitLogReplayer continues startup after encountering errors

2015-08-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699778#comment-14699778
 ] 

Jonathan Ellis commented on CASSANDRA-9749:
---

bq. Due to CASSANDRA-8515, the effective commit log failure policy in 3.0+ at 
time of replay is always 'die'.

Hmm, was that intended? /cc [~pauloricardomg] [~benedict]

> CommitLogReplayer continues startup after encountering errors
> -
>
> Key: CASSANDRA-9749
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9749
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Blake Eggleston
>Assignee: Branimir Lambov
> Fix For: 2.2.x
>
> Attachments: 9749-coverage.tgz
>
>
> There are a few places where the commit log recovery method either skips 
> sections or just returns when it encounters errors.
> Specifically if it can't read the header here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L298
> Or if there are compressor problems here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L314
>  and here: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L366
> Whether these are user-fixable or not, I think we should require more direct 
> user intervention (ie: fix what's wrong, or remove the bad file and restart) 
> since we're basically losing data.





[jira] [Updated] (CASSANDRA-9974) Improve debuggability

2015-08-17 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9974:
--
Fix Version/s: 3.x

> Improve debuggability
> -
>
> Key: CASSANDRA-9974
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9974
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
> Fix For: 3.x
>
>
> While 8099 has brought a number of improvements, currently it is making 
> debugging a bit of a nightmare (for me at least). This slows down development 
> and test resolution, and so we should fix it sooner rather than later. This 
> ticket 
> is intended to aggregate tickets that will improve this situation.





[jira] [Commented] (CASSANDRA-9901) Make AbstractType.isByteOrderComparable abstract

2015-08-17 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699767#comment-14699767
 ] 

Sylvain Lebresne commented on CASSANDRA-9901:
-

From a quick look, it looks like the patch will log warning for every internal 
type that is not byte comparable, which is not what we want. Also, logging in 
{{CFMetadaData.rebuild}} is going to be more noisy than necessary since that's 
called reasonably often. Ideally we'd want to only warn when the type is used 
in the first place.

On a less important note, but I'm not a fan of using an enum. I'm not convinced 
it'll add clarity for the user, but on the other side, we don't validate that 
what the enum said is consistent with what the compare method does which feels 
error prone to me. I also find it more clunky (than just making 
{{isByteOrderComparable}} abstract) but that's probably more a question of 
personal taste.

What I could suggest, on top of making {{isByteOrderComparable}} abstract, is 
to create some {{compareValue()}} (or some other name) that would be the 
existing {{compare()}}, and the {{compare()}} we actually used would basically 
be:
{noformat}
public final int compare(ByteBuffer b1, ByteBuffer b2)
{
    return isByteOrderComparable() ? ByteBufferUtil.compareUnsigned(b1, b2)
                                   : compareValue(b1, b2);
}
{noformat}
And {{compareValue}} would throw {{UnsupportedOperationException}} by default, 
and only the implementations that are not byte comparable would have to 
provide it.
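For reference, a minimal, self-contained Java sketch of the dispatch pattern described above (the class name {{AbstractTypeSketch}} and the inlined unsigned comparison are stand-ins for illustration, not the actual Cassandra API):

```java
import java.nio.ByteBuffer;

// Sketch only: stands in for AbstractType; compareValue and
// isByteOrderComparable follow the names proposed in the comment above.
abstract class AbstractTypeSketch
{
    public abstract boolean isByteOrderComparable();

    // Only types that are not byte-order comparable need to override this.
    protected int compareValue(ByteBuffer b1, ByteBuffer b2)
    {
        throw new UnsupportedOperationException();
    }

    // Final dispatch: byte-order-comparable types take the cheap unsigned
    // path; everything else falls back to the type-specific comparator.
    public final int compare(ByteBuffer b1, ByteBuffer b2)
    {
        return isByteOrderComparable() ? compareUnsigned(b1, b2) : compareValue(b1, b2);
    }

    // Stand-in for ByteBufferUtil.compareUnsigned: lexicographic comparison
    // of unsigned bytes, shorter buffer wins ties.
    static int compareUnsigned(ByteBuffer b1, ByteBuffer b2)
    {
        for (int i = b1.position(), j = b2.position(); i < b1.limit() && j < b2.limit(); i++, j++)
        {
            int cmp = (b1.get(i) & 0xff) - (b2.get(j) & 0xff);
            if (cmp != 0)
                return cmp;
        }
        return b1.remaining() - b2.remaining();
    }
}

public class CompareSketch
{
    public static void main(String[] args)
    {
        AbstractTypeSketch bytesLike = new AbstractTypeSketch()
        {
            public boolean isByteOrderComparable() { return true; }
        };
        int cmp = bytesLike.compare(ByteBuffer.wrap(new byte[]{ 1, 2 }),
                                    ByteBuffer.wrap(new byte[]{ 1, 3 }));
        System.out.println(cmp < 0); // negative: {1,2} sorts before {1,3}
    }
}
```

The point of making {{compare}} final is that subclasses cannot accidentally bypass the byte-order fast path; they only decide the flag and, where needed, the fallback.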



> Make AbstractType.isByteOrderComparable abstract
> 
>
> Key: CASSANDRA-9901
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9901
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
> Fix For: 3.0.0 rc1
>
>
> I can't recall _precisely_ what was agreed at the NGCC, but I'm reasonably 
> sure we agreed to make this method abstract, put some javadoc explaining we 
> may require fields to yield true in the near future, and potentially log a 
> warning on startup if a user-defined type returns false.
> This should make it into 3.0, IMO, so that we can look into migrating to 
> byte-order comparable types in the post-3.0 world.





[jira] [Updated] (CASSANDRA-9414) Windows utest 2.2: org.apache.cassandra.db.CommitLogTest.testDeleteIfNotDirty intermittent failure

2015-08-17 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-9414:
---
Reviewer: Paulo Motta

> Windows utest 2.2: org.apache.cassandra.db.CommitLogTest.testDeleteIfNotDirty 
> intermittent failure
> --
>
> Key: CASSANDRA-9414
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9414
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Joshua McKenzie
>Assignee: Joshua McKenzie
>Priority: Minor
>  Labels: Windows
> Fix For: 2.2.x
>
>
> Failure is intermittent enough that bisect is proving to be more hassle than 
> it's worth. Seems pretty consistent in CI.
> {noformat}
> [junit] Testcase: 
> testDeleteIfNotDirty(org.apache.cassandra.db.CommitLogTest):  Caused an 
> ERROR
> [junit] java.nio.file.AccessDeniedException: 
> build\test\cassandra\commitlog;0\CommitLog-5-1431965988394.log
> [junit] FSWriteError in 
> build\test\cassandra\commitlog;0\CommitLog-5-1431965988394.log
> [junit] at 
> org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:131)
> [junit] at 
> org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:148)
> [junit] at 
> org.apache.cassandra.db.commitlog.CommitLogSegmentManager.recycleSegment(CommitLogSegmentManager.java:360)
> [junit] at 
> org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:166)
> [junit] at 
> org.apache.cassandra.db.commitlog.CommitLog.startUnsafe(CommitLog.java:416)
> [junit] at 
> org.apache.cassandra.db.commitlog.CommitLog.resetUnsafe(CommitLog.java:389)
> [junit] at 
> org.apache.cassandra.db.CommitLogTest.testDeleteIfNotDirty(CommitLogTest.java:178)
> [junit] Caused by: java.nio.file.AccessDeniedException: 
> build\test\cassandra\commitlog;0\CommitLog-5-1431965988394.log
> [junit] at 
> sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:83)
> [junit] at 
> sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
> [junit] at 
> sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102)
> [junit] at 
> sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269)
> [junit] at 
> sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103)
> [junit] at java.nio.file.Files.delete(Files.java:1126)
> [junit] at 
> org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:125)
> {noformat}





[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers

2015-08-17 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699744#comment-14699744
 ] 

Benedict commented on CASSANDRA-7066:
-

bq. So the scenario is, we crash hard AND suffer xlog corruption so we don't 
know which sstables are in-progress?

Right.

bq. (Is offline scrub xlog-aware? It probably should be.)

It is, but it hard fails on encountering a corrupted txn log; the operator can 
then manually delete that log if they so desire (or move it aside, stash it, 
whatever)

What about the sstables though? Right now we just leave them all there, but the 
last "new" file may be partially written, which will end up crashing some read 
queries. So the question is if we just fail and alert the user, or if we try to 
establish that this is the case and stash those that are corrupted, or if we 
just always move them aside.


> Simplify (and unify) cleanup of compaction leftovers
> 
>
> Key: CASSANDRA-7066
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Assignee: Stefania
>Priority: Minor
>  Labels: benedict-to-commit, compaction
> Fix For: 3.0 alpha 1
>
> Attachments: 7066.txt
>
>
> Currently we manage a list of in-progress compactions in a system table, 
> which we use to cleanup incomplete compactions when we're done. The problem 
> with this is that 1) it's a bit clunky (and leaves us in positions where we 
> can unnecessarily cleanup completed files, or conversely not cleanup files 
> that have been superseded); and 2) it's only used for a regular compaction - 
> no other compaction types are guarded in the same way, so can result in 
> duplication if we fail before deleting the replacements.
> I'd like to see each sstable store in its metadata its direct ancestors, and 
> on startup we simply delete any sstables that occur in the union of all 
> ancestor sets. This way as soon as we finish writing we're capable of 
> cleaning up any leftovers, so we never get duplication. It's also much easier 
> to reason about.





[jira] [Created] (CASSANDRA-10094) Windows utest 2.2: testCommitLogFailureBeforeInitialization_mustKillJVM failure

2015-08-17 Thread Joshua McKenzie (JIRA)
Joshua McKenzie created CASSANDRA-10094:
---

 Summary: Windows utest 2.2: 
testCommitLogFailureBeforeInitialization_mustKillJVM failure
 Key: CASSANDRA-10094
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10094
 Project: Cassandra
  Issue Type: Bug
Reporter: Joshua McKenzie
Assignee: Paulo Motta
 Fix For: 2.2.x


Error:
{noformat}
junit.framework.AssertionFailedError: 
at 
org.apache.cassandra.db.CommitLogFailurePolicyTest.testCommitLogFailureBeforeInitialization_mustKillJVM(CommitLogFailurePolicyTest.java:149)
{noformat}

[Failure 
History|http://cassci.datastax.com/view/cassandra-2.2/job/cassandra-2.2_utest_win32/lastCompletedBuild/testReport/org.apache.cassandra.db/CommitLogFailurePolicyTest/testCommitLogFailureBeforeInitialization_mustKillJVM/history/]:
  Consistent since build #85

Env: CI only. Cannot repro locally





[jira] [Updated] (CASSANDRA-9683) Get mucher higher load and latencies after upgrading from 2.1.6 to cassandra 2.1.7

2015-08-17 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-9683:
--
Fix Version/s: (was: 2.1.x)
   2.0.17
   2.1.9

> Get mucher higher load and latencies after upgrading from 2.1.6 to cassandra 
> 2.1.7
> --
>
> Key: CASSANDRA-9683
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9683
> Project: Cassandra
>  Issue Type: Bug
> Environment: Ubuntu 12.04 (3.13 Kernel) * 3
> JDK: Oracle JDK 7
> RAM: 32GB
> Cores 4 (+4 HT)
>Reporter: Loic Lambiel
>Assignee: Ariel Weisberg
> Fix For: 2.1.9
>
> Attachments: cassandra-env.sh, cassandra.yaml, cfstats.txt, 
> os_load.png, pending_compactions.png, read_latency.png, schema.txt, 
> system.log, write_latency.png
>
>
> After upgrading our cassandra staging cluster version from 2.1.6 to 2.1.7, 
> the average load grows from 0.1-0.3 to 1.8.
> Latencies did increase as well.
> We see an increase of pending compactions, probably due to CASSANDRA-9592.
> This cluster has almost no workload (staging environment)





[jira] [Updated] (CASSANDRA-9683) Get mucher higher load and latencies after upgrading from 2.1.6 to cassandra 2.1.7

2015-08-17 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-9683:
--
Fix Version/s: (was: 2.0.17)

> Get mucher higher load and latencies after upgrading from 2.1.6 to cassandra 
> 2.1.7
> --
>
> Key: CASSANDRA-9683
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9683
> Project: Cassandra
>  Issue Type: Bug
> Environment: Ubuntu 12.04 (3.13 Kernel) * 3
> JDK: Oracle JDK 7
> RAM: 32GB
> Cores 4 (+4 HT)
>Reporter: Loic Lambiel
>Assignee: Ariel Weisberg
> Fix For: 2.1.9
>
> Attachments: cassandra-env.sh, cassandra.yaml, cfstats.txt, 
> os_load.png, pending_compactions.png, read_latency.png, schema.txt, 
> system.log, write_latency.png
>
>
> After upgrading our cassandra staging cluster version from 2.1.6 to 2.1.7, 
> the average load grows from 0.1-0.3 to 1.8.
> Latencies did increase as well.
> We see an increase of pending compactions, probably due to CASSANDRA-9592.
> This cluster has almost no workload (staging environment)





[jira] [Commented] (CASSANDRA-10084) Very slow performance streaming a large query from a single CF

2015-08-17 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699733#comment-14699733
 ] 

Benedict commented on CASSANDRA-10084:
--

bq. A whole new column family should work, right?

Yes, absolutely.

bq. would upgrading to 3.0 fix this?

I would be very surprised if it didn't. I won't promise, as there are a lot of 
unknowns, but given my assumptions about the problem, and the changes to 3.0: 
yes.

bq. how soon is that?

Not long, but I'd rather not give you our targets in case we slip :)

> Very slow performance streaming a large query from a single CF
> --
>
> Key: CASSANDRA-10084
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10084
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 2.1.8
> 12GB EC2 instance
> 12 node cluster
> 32 concurrent reads
> 32 concurrent writes
> 6GB heap space
>Reporter: Brent Haines
> Attachments: cassandra.yaml
>
>
> We have a relatively simple column family that we use to track event data 
> from different providers. We have been utilizing it for some time. Here is 
> what it looks like: 
> {code}
> CREATE TABLE data.stories_by_text (
> ref_id timeuuid,
> second_type text,
> second_value text,
> object_type text,
> field_name text,
> value text,
> story_id timeuuid,
> data map,
> PRIMARY KEY ((ref_id, second_type, second_value, object_type, 
> field_name), value, story_id)
> ) WITH CLUSTERING ORDER BY (value ASC, story_id ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
> AND comment = 'Searchable fields and actions in a story are indexed by 
> ref id which corresponds to a brand, app, app instance, or user.'
> AND compaction = {'min_threshold': '4', 'cold_reads_to_omit': '0.0', 
> 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32'}
> AND compression = {'sstable_compression': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99.0PERCENTILE';
> {code}
> We will, on a daily basis, pull a query of the complete data for a given 
> index; it will look like this: 
> {code}
> select * from stories_by_text where ref_id = 
> f0124740-2f5a-11e5-a113-03cdf3f3c6dc and second_type = 'Day' and second_value 
> = '20150812' and object_type = 'booshaka:user' and field_name = 'hashedEmail';
> {code}
> In the past, we have been able to pull millions of records out of the CF in a 
> few seconds. We recently added the data column so that we could filter on 
> event data and provide more detailed analysis of activity for our reports. 
> The data map, declared with 'data map' is very small; only 2 or 3 
> name/value pairs.
> Since we have added this column, our streaming query performance has gone 
> straight to hell. I just ran the above query and it took 46 minutes to read 
> 86K rows and then it timed out.
> I am uncertain what other data you need to see in order to diagnose this. We 
> are using STCS and are considering a change to Leveled Compaction. The table 
> is repaired nightly, and the updates, which are at a very fast clip, will only 
> impact the partition key for today, while the queries are for previous days 
> only. 
> To my knowledge these queries no longer finish ever. They time out, even 
> though I put a 60 second timeout on the read for the cluster. I can watch it 
> pause for 30 to 50 seconds many times during the stream. 
> Again, this only started happening when we added the data column.
> Please let me know what else you need for this. It is having a very big 
> impact on our system.





[jira] [Commented] (CASSANDRA-10056) Fix AggregationTest post-test error messages

2015-08-17 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699732#comment-14699732
 ] 

Benjamin Lerer commented on CASSANDRA-10056:


LGTM

> Fix AggregationTest post-test error messages
> 
>
> Key: CASSANDRA-10056
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10056
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Trivial
> Fix For: 2.2.x
>
>
> AggregationTest prints error messages after test execution because some UDTs 
> cannot be dropped. This is not critical to the tests themselves, but fixing it 
> makes the log cleaner.





[jira] [Commented] (CASSANDRA-9906) get_slice and multiget_slice failing on trunk

2015-08-17 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699729#comment-14699729
 ] 

Benjamin Lerer commented on CASSANDRA-9906:
---

I will have another look at the patch as this change is probably wrong. The 
original {{ColumnFilter}} was not working properly and the thrift call was not 
returning anything. I changed it to make it work but I did not realize that I 
was returning more data than expected. 

> get_slice and multiget_slice failing on trunk
> -
>
> Key: CASSANDRA-9906
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9906
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Mike Adamson
>Assignee: Benjamin Lerer
>Priority: Blocker
> Fix For: 3.0.0 rc1
>
> Attachments: 9906.txt, dtest-CASSANDRA-9906.txt
>
>
> {{get_slice}} and {{multiget_slice}} are failing on trunk with the following 
> error:
> {noformat}
> java.lang.AssertionError: null
>   at 
> org.apache.cassandra.db.filter.ClusteringIndexNamesFilter.(ClusteringIndexNamesFilter.java:53)
>  ~[cassandra-all-3.0.0.592.jar:3.0.0.592]
>   at 
> org.apache.cassandra.thrift.CassandraServer.toInternalFilter(CassandraServer.java:405)
>  ~[cassandra-all-3.0.0.592.jar:5.0.0-SNAPSHOT]
>   at 
> org.apache.cassandra.thrift.CassandraServer.multigetSliceInternal(CassandraServer.java:547)
>  ~[cassandra-all-3.0.0.592.jar:5.0.0-SNAPSHOT]
>   at 
> org.apache.cassandra.thrift.CassandraServer.multiget_slice(CassandraServer.java:348)
>  ~[cassandra-all-3.0.0.592.jar:5.0.0-SNAPSHOT]
>   at 
> org.apache.cassandra.thrift.Cassandra$Processor$multiget_slice.getResult(Cassandra.java:3716)
>  ~[cassandra-thrift-3.0.0.592.jar:5.0.0-SNAPSHOT]
>   at 
> org.apache.cassandra.thrift.Cassandra$Processor$multiget_slice.getResult(Cassandra.java:3700)
>  ~[cassandra-thrift-3.0.0.592.jar:5.0.0-SNAPSHOT]
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
> ~[libthrift-0.9.2.jar:0.9.2]
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
> ~[libthrift-0.9.2.jar:0.9.2]
>   at 
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:204)
>  ~[cassandra-all-3.0.0.592.jar:5.0.0-SNAPSHOT]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  ~[na:1.8.0_45]
>   at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_45]
> {noformat}
> The schema used for this was
> {noformat}
> create table test (k int, v int, primary key(k)) with compact storage;
> {noformat}
> and the code used for the call was
> {noformat}
> SlicePredicate predicate = new SlicePredicate();
> predicate.column_names = 
> Collections.singletonList(ByteBufferUtil.bytes("v"));
> client.multiget_slice(Collections.singletonList(key), new 
> ColumnParent("test"), predicate, ConsistencyLevel.ONE);
> {noformat}
> The error is coming from this line in {{ClusteringIndexNamesFilter}}
> {noformat}
> assert !clusterings.contains(Clustering.STATIC_CLUSTERING);
> {noformat}
> which is failing the assertion because column 'v' is static.
> Apologies for the line mismatches in {{ClusteringIndexNamesFilter}}; I had 
> some debug statements in the code to help track down what was happening.





[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers

2015-08-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699727#comment-14699727
 ] 

Jonathan Ellis commented on CASSANDRA-7066:
---

bq. we can (and probably will) leave incomplete sstables

So the scenario is, we crash hard AND suffer xlog corruption so we don't know 
which sstables are in-progress?

I don't think any operator will realistically be able to do anything useful 
with a xlog file that C* can't read.  On the other hand, it could help prove or 
disprove that it was actual corruption and not a C* bug.  So on balance I would 
lean towards stashing it.

(Is offline scrub xlog-aware?  It probably should be.)

> Simplify (and unify) cleanup of compaction leftovers
> 
>
> Key: CASSANDRA-7066
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Assignee: Stefania
>Priority: Minor
>  Labels: benedict-to-commit, compaction
> Fix For: 3.0 alpha 1
>
> Attachments: 7066.txt
>
>
> Currently we manage a list of in-progress compactions in a system table, 
> which we use to cleanup incomplete compactions when we're done. The problem 
> with this is that 1) it's a bit clunky (and leaves us in positions where we 
> can unnecessarily cleanup completed files, or conversely not cleanup files 
> that have been superseded); and 2) it's only used for a regular compaction - 
> no other compaction types are guarded in the same way, so can result in 
> duplication if we fail before deleting the replacements.
> I'd like to see each sstable store in its metadata its direct ancestors, and 
> on startup we simply delete any sstables that occur in the union of all 
> ancestor sets. This way as soon as we finish writing we're capable of 
> cleaning up any leftovers, so we never get duplication. It's also much easier 
> to reason about.
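The cleanup rule proposed in the description can be sketched as follows. This is an illustrative sketch only, not Cassandra's actual implementation: the function name, the dict-based representation of "sstable name -> set of ancestor names", and the example data are all hypothetical.

```python
# Hypothetical sketch of the proposed startup cleanup: each sstable
# records its direct ancestors (the sstables it was compacted from);
# on startup, any sstable that appears in the union of all ancestor
# sets is a leftover input to a finished compaction and can be deleted.

def leftover_ancestors(sstables):
    """sstables: dict mapping sstable name -> set of ancestor names.

    Returns the names of on-disk sstables that are ancestors of some
    other on-disk sstable, i.e. compaction leftovers safe to delete.
    """
    union = set()
    for ancestors in sstables.values():
        union |= ancestors
    # Only report ancestors that are actually still present on disk.
    return union & set(sstables)

# Example: 'a' and 'b' were compacted into 'c'; 'b' was already
# deleted, but 'a' survived a crash and is a duplicate-data leftover.
tables = {"a": set(), "c": {"a", "b"}}
print(sorted(leftover_ancestors(tables)))  # ['a']
```

Because the rule is a pure function of per-sstable metadata, it needs no separate in-progress ledger, which is what makes it easier to reason about than the system-table approach.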





[jira] [Commented] (CASSANDRA-9414) Windows utest 2.2: org.apache.cassandra.db.CommitLogTest.testDeleteIfNotDirty intermittent failure

2015-08-17 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699717#comment-14699717
 ] 

Joshua McKenzie commented on CASSANDRA-9414:


Going to hijack this ticket (since it's quite possibly the descendant of the 
original flaky test failure).

Error:
{noformat}
Error Message

java.nio.file.AccessDeniedException: 
build\test\cassandra\commitlog;69\CommitLog-5-1439816200722.log
Stacktrace

FSWriteError in build\test\cassandra\commitlog;69\CommitLog-5-1439816200722.log
at 
org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:132)
at 
org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:149)
at 
org.apache.cassandra.db.commitlog.CommitLogSegmentManager.recycleSegment(CommitLogSegmentManager.java:359)
at 
org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:167)
at 
org.apache.cassandra.db.commitlog.CommitLog.startUnsafe(CommitLog.java:439)
at 
org.apache.cassandra.db.commitlog.CommitLog.resetUnsafe(CommitLog.java:412)
at 
org.apache.cassandra.db.CommitLogTest.testDeleteIfNotDirty(CommitLogTest.java:186)
Caused by: java.nio.file.AccessDeniedException: 
build\test\cassandra\commitlog;69\CommitLog-5-1439816200722.log
at 
sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:83)
at 
sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
at 
sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102)
at 
sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269)
at 
sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103)
at java.nio.file.Files.delete(Files.java:1126)
at 
org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:126)
{noformat}
Consistency: 
[Flaky|http://cassci.datastax.com/view/cassandra-2.2/job/cassandra-2.2_utest_win32/lastCompletedBuild/testReport/org.apache.cassandra.db/CommitLogTest/testDeleteIfNotDirty/history/]

Env: CI only. Cannot repro locally.

> Windows utest 2.2: org.apache.cassandra.db.CommitLogTest.testDeleteIfNotDirty 
> intermittent failure
> --
>
> Key: CASSANDRA-9414
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9414
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Joshua McKenzie
>Assignee: Joshua McKenzie
>Priority: Minor
>  Labels: Windows
> Fix For: 2.2.x
>
>
> Failure is intermittent enough that bisect is proving to be more hassle than 
> it's worth. Seems pretty consistent in CI.
> {noformat}
> [junit] Testcase: 
> testDeleteIfNotDirty(org.apache.cassandra.db.CommitLogTest):  Caused an 
> ERROR
> [junit] java.nio.file.AccessDeniedException: 
> build\test\cassandra\commitlog;0\CommitLog-5-1431965988394.log
> [junit] FSWriteError in 
> build\test\cassandra\commitlog;0\CommitLog-5-1431965988394.log
> [junit] at 
> org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:131)
> [junit] at 
> org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:148)
> [junit] at 
> org.apache.cassandra.db.commitlog.CommitLogSegmentManager.recycleSegment(CommitLogSegmentManager.java:360)
> [junit] at 
> org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:166)
> [junit] at 
> org.apache.cassandra.db.commitlog.CommitLog.startUnsafe(CommitLog.java:416)
> [junit] at 
> org.apache.cassandra.db.commitlog.CommitLog.resetUnsafe(CommitLog.java:389)
> [junit] at 
> org.apache.cassandra.db.CommitLogTest.testDeleteIfNotDirty(CommitLogTest.java:178)
> [junit] Caused by: java.nio.file.AccessDeniedException: 
> build\test\cassandra\commitlog;0\CommitLog-5-1431965988394.log
> [junit] at 
> sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:83)
> [junit] at 
> sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
> [junit] at 
> sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102)
> [junit] at 
> sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269)
> [junit] at 
> sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103)
> [junit] at java.nio.file.Files.delete(Files.java:1126)
> [junit] at 
> org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:125)
> {noformat}





[jira] [Commented] (CASSANDRA-10084) Very slow performance streaming a large query from a single CF

2015-08-17 Thread Brent Haines (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699716#comment-14699716
 ] 

Brent Haines commented on CASSANDRA-10084:
--

Ah. I was afraid of that. We'd probably create a new table with the desired 
format, direct our processing to the new table and write a storm topology to 
migrate data over. A whole new column family should work, right? 

I'll try to capture the profile today. This is on a large cluster, but if I set 
the fetch size up high I should be able to keep the query on a single box long 
enough to capture data.

Appreciate the help. If we can mitigate this to a reasonable point, would 
upgrading to 3.0 fix this? It would be favorable to keep things the way they 
are, muddle through it and then upgrade when the time comes (how soon is that?) 
and live happily ever after. ;)
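The fetch-size mechanism referred to above can be illustrated with a toy simulation. This is not driver code; the function name and signature are made up for illustration (real client drivers expose this as a paging/fetch-size option and pull one page of rows from the coordinator at a time instead of materializing the whole result).

```python
# Toy model of client-side paging: a large result set is consumed in
# successive pages of at most fetch_size rows, so a larger fetch size
# means fewer round trips (and, per the comment above, longer stretches
# of the query served by a single coordinator).

def paged(rows, fetch_size):
    """Yield successive pages of at most fetch_size rows."""
    for start in range(0, len(rows), fetch_size):
        yield rows[start:start + fetch_size]

pages = list(paged(list(range(10)), fetch_size=4))
print([len(p) for p in pages])  # [4, 4, 2]
```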

> Very slow performance streaming a large query from a single CF
> --
>
> Key: CASSANDRA-10084
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10084
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 2.1.8
> 12GB EC2 instance
> 12 node cluster
> 32 concurrent reads
> 32 concurrent writes
> 6GB heap space
>Reporter: Brent Haines
> Attachments: cassandra.yaml
>
>
> We have a relatively simple column family that we use to track event data 
> from different providers. We have been utilizing it for some time. Here is 
> what it looks like: 
> {code}
> CREATE TABLE data.stories_by_text (
> ref_id timeuuid,
> second_type text,
> second_value text,
> object_type text,
> field_name text,
> value text,
> story_id timeuuid,
> data map,
> PRIMARY KEY ((ref_id, second_type, second_value, object_type, 
> field_name), value, story_id)
> ) WITH CLUSTERING ORDER BY (value ASC, story_id ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
> AND comment = 'Searchable fields and actions in a story are indexed by 
> ref id which corresponds to a brand, app, app instance, or user.'
> AND compaction = {'min_threshold': '4', 'cold_reads_to_omit': '0.0', 
> 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32'}
> AND compression = {'sstable_compression': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99.0PERCENTILE';
> {code}
> We will, on a daily basis, pull a query of the complete data for a given 
> index; it will look like this: 
> {code}
> select * from stories_by_text where ref_id = 
> f0124740-2f5a-11e5-a113-03cdf3f3c6dc and second_type = 'Day' and second_value 
> = '20150812' and object_type = 'booshaka:user' and field_name = 'hashedEmail';
> {code}
> In the past, we have been able to pull millions of records out of the CF in a 
> few seconds. We recently added the data column so that we could filter on 
> event data and provide more detailed analysis of activity for our reports. 
> The data map, declared with 'data map' is very small; only 2 or 3 
> name/value pairs.
> Since we have added this column, our streaming query performance has gone 
> straight to hell. I just ran the above query and it took 46 minutes to read 
> 86K rows and then it timed out.
> I am uncertain what other data you need to see in order to diagnose this. We 
> are using STCS and are considering a change to Leveled Compaction. The table 
> is repaired nightly, and the updates, which are at a very fast clip, will only 
> impact the partition key for today, while the queries are for previous days 
> only. 
> To my knowledge these queries no longer finish ever. They time out, even 
> though I put a 60 second timeout on the read for the cluster. I can watch it 
> pause for 30 to 50 seconds many times during the stream. 
> Again, this only started happening when we added the data column.
> Please let me know what else you need for this. It is having a very big 
> impact on our system.




