[jira] [Commented] (CASSANDRA-9563) Rename class for DATE type in Java driver
[ https://issues.apache.org/jira/browse/CASSANDRA-9563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583136#comment-14583136 ] Robert Stupp commented on CASSANDRA-9563: - Would be great to have this for C* 2.2 rc2. Rename class for DATE type in Java driver - Key: CASSANDRA-9563 URL: https://issues.apache.org/jira/browse/CASSANDRA-9563 Project: Cassandra Issue Type: Improvement Reporter: Olivier Michallat Priority: Minor Fix For: 2.2.x An early preview of the Java driver 2.2 was provided for inclusion in Cassandra 2.2.0-rc1. It uses a custom Java type to represent CQL type {{DATE}}. Currently that Java type is called {{DateWithoutTime}}. We'd like to rename it to {{LocalDate}}. This would be a breaking change for Cassandra, because that type is visible from UDF implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
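As a sketch of why the rename is user-visible: a Java UDF over a CQL {{DATE}} argument receives an instance of the driver's date class directly, so the function body compiles against that class. The function name and method call below are illustrative assumptions about the driver 2.2 date API, not taken from the ticket:

```sql
-- Hypothetical UDF: 'd' arrives as the driver's Java type for DATE, so
-- renaming DateWithoutTime to LocalDate changes what UDF bodies reference.
CREATE FUNCTION year_of (d date)
    RETURNS NULL ON NULL INPUT
    RETURNS int
    LANGUAGE java
    AS 'return Integer.valueOf(d.getYear());';
```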
[jira] [Commented] (CASSANDRA-9424) 3.X Schema Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-9424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583148#comment-14583148 ] Sylvain Lebresne commented on CASSANDRA-9424: - bq. Is it possible to make official way(API) to load schema offline? That is, the ability to read schema from stored SSTables without waking up unnecessary server components. I agree we should get this ultimately. What I'd suggest is to serialize the schema as a sstable metadata component (only the table the sstable is of, of course). This would be useful for offline tools, but I've wanted that for debugging more than once too. So I went ahead and created CASSANDRA-9587. 3.X Schema Improvements --- Key: CASSANDRA-9424 URL: https://issues.apache.org/jira/browse/CASSANDRA-9424 Project: Cassandra Issue Type: Improvement Reporter: Aleksey Yeschenko Fix For: 3.x C* schema code is both more brittle and less efficient than I'd like it to be. This ticket will aggregate the improvement tickets to go into 3.X and 4.X to improve the situation.
[jira] [Created] (CASSANDRA-9587) Serialize table schema as a sstable component
Sylvain Lebresne created CASSANDRA-9587: --- Summary: Serialize table schema as a sstable component Key: CASSANDRA-9587 URL: https://issues.apache.org/jira/browse/CASSANDRA-9587 Project: Cassandra Issue Type: Sub-task Reporter: Sylvain Lebresne Having the schema with each sstable would be tremendously useful for offline tools and for debugging purposes.
[jira] [Commented] (CASSANDRA-9586) ant eclipse-warnings fails in trunk
[ https://issues.apache.org/jira/browse/CASSANDRA-9586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583155#comment-14583155 ] Stefania commented on CASSANDRA-9586: - It's a false positive, because the channel is released in close(). Making the wrapper auto-closeable does not fix it. Passing in a channel that is closed in a try, and taking a new reference in the constructor, does fix it however. ant eclipse-warnings fails in trunk --- Key: CASSANDRA-9586 URL: https://issues.apache.org/jira/browse/CASSANDRA-9586 Project: Cassandra Issue Type: Bug Reporter: Michael Shuler Assignee: Stefania Fix For: 3.x
{noformat}
eclipse-warnings:
    [mkdir] Created dir: /home/mshuler/git/cassandra/build/ecj
     [echo] Running Eclipse Code Analysis. Output logged to /home/mshuler/git/cassandra/build/ecj/eclipse_compiler_checks.txt
     [java] incorrect classpath: /home/mshuler/git/cassandra/build/cobertura/classes
     [java] ----------
     [java] 1. ERROR in /home/mshuler/git/cassandra/src/java/org/apache/cassandra/io/util/RandomAccessReader.java (at line 81)
     [java]     super(new ChannelProxy(file), DEFAULT_BUFFER_SIZE, -1L, BufferType.OFF_HEAP);
     [java]           ^^
     [java] Potential resource leak: 'unassigned Closeable value' may not be closed
     [java] ----------
     [java] 1 problem (1 error)

BUILD FAILED
{noformat}
(checked 2.2 and did not find this issue) git blame on line 81 shows commit 17dd4cc for CASSANDRA-8897
[jira] [Created] (CASSANDRA-9588) Make sstableofflinerelevel print stats before relevel
Jens Rantil created CASSANDRA-9588: -- Summary: Make sstableofflinerelevel print stats before relevel Key: CASSANDRA-9588 URL: https://issues.apache.org/jira/browse/CASSANDRA-9588 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Jens Rantil Priority: Trivial The current version of sstableofflinerelevel prints the new level hierarchy. While nodetool cfstats ... will show the current hierarchy, it would be nice to have sstableofflinerelevel also output the current level histograms, for easy comparison of the changes that will be made. This is especially relevant since sstableofflinerelevel must run while the node isn't running, and nodetool cfstats ... doesn't work in that case.
[jira] [Commented] (CASSANDRA-9582) MarshalException after upgrading to 2.1.6
[ https://issues.apache.org/jira/browse/CASSANDRA-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583088#comment-14583088 ] Tom van den Berge commented on CASSANDRA-9582: --
{code}
 keyspace_name | columnfamily_name | column_name        | component_index | index_name | index_options | index_type | type           | validator
---------------+-------------------+--------------------+-----------------+------------+---------------+------------+----------------+---------------------------------------------
     drillster | InvoiceItem       | column1            |               0 |       null |          null |       null | clustering_key | org.apache.cassandra.db.marshal.UUIDType
     drillster | InvoiceItem       | currencyCode       |            null |       null |          null |       null | regular        | org.apache.cassandra.db.marshal.UTF8Type
     drillster | InvoiceItem       | description        |            null |       null |          null |       null | regular        | org.apache.cassandra.db.marshal.UTF8Type
     drillster | InvoiceItem       | key                |            null |       null |          null |       null | partition_key  | org.apache.cassandra.db.marshal.BytesType
     drillster | InvoiceItem       | priceGross         |            null |       null |          null |       null | regular        | org.apache.cassandra.db.marshal.LongType
     drillster | InvoiceItem       | priceNett          |            null |       null |          null |       null | regular        | org.apache.cassandra.db.marshal.LongType
     drillster | InvoiceItem       | quantity           |            null |       null |          null |       null | regular        | org.apache.cassandra.db.marshal.IntegerType
     drillster | InvoiceItem       | sku                |            null |       null |          null |       null | regular        | org.apache.cassandra.db.marshal.UTF8Type
     drillster | InvoiceItem       | unitPriceGross     |            null |       null |          null |       null | regular        | org.apache.cassandra.db.marshal.LongType
     drillster | InvoiceItem       | unitPriceNett      |            null |       null |          null |       null | regular        | org.apache.cassandra.db.marshal.LongType
     drillster | InvoiceItem       | vat                |            null |       null |          null |       null | regular        | org.apache.cassandra.db.marshal.LongType
     drillster | InvoiceItem       | vatRateBasisPoints |            null |       null |          null |       null | regular        | org.apache.cassandra.db.marshal.IntegerType
{code}
{code}
 keyspace_name               | drillster
 columnfamily_name           | InvoiceItem
 bloom_filter_fp_chance      | null
 caching                     | KEYS_ONLY
 column_aliases              | []
 comment                     |
 compaction_strategy_class   | org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
 compaction_strategy_options | {}
 comparator                  | org.apache.cassandra.db.marshal.UUIDType
 compression_parameters      | {}
 default_time_to_live        | 0
 default_validator           | org.apache.cassandra.db.marshal.BytesType
 dropped_columns             | null
 gc_grace_seconds            | 864000
 index_interval              | 128
 is_dense                    | False
 key_aliases                 | []
 key_validator               | org.apache.cassandra.db.marshal.BytesType
 local_read_repair_chance    | 0
 max_compaction_threshold    | 32
 memtable_flush_period_in_ms | 0
 min_compaction_threshold    | 4
 populate_io_cache_on_flush  | False
 read_repair_chance          | 1
 replicate_on_write          | True
 speculative_retry           | 99.0PERCENTILE
 subcomparator               | org.apache.cassandra.db.marshal.UTF8Type
 type                        | Super
 value_alias                 | null
{code}
MarshalException after upgrading to 2.1.6
[jira] [Commented] (CASSANDRA-9587) Serialize table schema as a sstable component
[ https://issues.apache.org/jira/browse/CASSANDRA-9587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583172#comment-14583172 ] Aleksey Yeschenko commented on CASSANDRA-9587: -- You don't mean the full schema, right? Only the schema for the table, and dependent user types? Serialize table schema as a sstable component - Key: CASSANDRA-9587 URL: https://issues.apache.org/jira/browse/CASSANDRA-9587 Project: Cassandra Issue Type: Sub-task Reporter: Sylvain Lebresne Fix For: 3.x Having the schema with each sstable would be tremendously useful for offline tools and for debugging purposes.
[jira] [Comment Edited] (CASSANDRA-9160) Migrate CQL dtests to unit tests
[ https://issues.apache.org/jira/browse/CASSANDRA-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583107#comment-14583107 ] Stefania edited comment on CASSANDRA-9160 at 6/12/15 7:43 AM: -- The dtests are ready for review, I've already created a pull request: https://github.com/riptano/cassandra-dtest/pull/321. The problem with the cas unit tests failure was because of the CAS ballot time uuid, which I had incorrectly set to request.now in ModificationStatement.casInternal(). I fixed it so that it should always be bigger than the timestamp returned by QueryState. SP.beginAndRepairPaxos() does something similar, but it doesn't look 100% correct to me. It might still fail under heavy load, what do you think? A tentative rearrangement, pending CI: I had to move all {{CQLTester}} based tests into a separate folder, _validation_, to distinguish the CQL tests from the following: - tests based on {{SchemaLoader}}, occupying file names that we needed, such as BatchTests or DeleteTest - unit tests for Java classes (e.g. cql3/statements/SelectStatementTest in 2.1) Inside this new folder I created these sub-folders: - _operations_, for statements - _entities_, for collections, secondary index, various types - _util_, to host CQLTester - _miscellaneous_, for everything else. I am not too happy with the _validation_ folder so if you can think of something else do tell, we could perhaps move them somewhere else entirely. was (Author: stefania): The dtests are ready for review, I've already created a pull request: https://github.com/riptano/cassandra-dtest/pull/321. The problem with the cas unit tests failure was because of the CAS ballot time uuid, which I had incorrectly set to request.now in ModificationStatement.casInternal(). I fixed it so that it should always be bigger than the timestamp returned by QueryState..SP.beginAndRepairPaxos() does something similar, but it doesn't look 100% correct to me.
It might still fail under heavy load, what do you think? A tentative rearrangement, pending CI: I had to move all {{CQLTester}} based tests into a separate folder, _validation_, to distinguish the CQL tests from the following: - tests based on {{SchemaLoader}}, occupying file names that we needed, such as BatchTests or DeleteTest - unit tests for Java classes (e.g. cql3/statements/SelectStatementTest in 2.1) Inside this new folder I created these sub-folders: - _operations_, for statements - _entities_, for collections, secondary index, various types - _util_, to host CQLTester - _miscellaneous_, for everything else. I am not too happy with the _validation_ folder so if you can think of something else do tell, we could perhaps move them somewhere else entirely. Migrate CQL dtests to unit tests Key: CASSANDRA-9160 URL: https://issues.apache.org/jira/browse/CASSANDRA-9160 Project: Cassandra Issue Type: Test Reporter: Sylvain Lebresne Assignee: Stefania We have CQL tests in 2 places: dtests and unit tests. The unit tests are actually somewhat better in the sense that they have the ability to test both prepared and unprepared statements at the flip of a switch. It's also better to have all those tests in the same place so we can improve the test framework in only one place (CASSANDRA-7959, CASSANDRA-9159, etc...). So we should move the CQL dtests to the unit tests (which will be a good occasion to organize them better).
[jira] [Created] (CASSANDRA-9589) Unclear difference between Improvement and Wish in JIRA
Jens Rantil created CASSANDRA-9589: -- Summary: Unclear difference between Improvement and Wish in JIRA Key: CASSANDRA-9589 URL: https://issues.apache.org/jira/browse/CASSANDRA-9589 Project: Cassandra Issue Type: Bug Components: Documentation website, Tools Reporter: Jens Rantil Priority: Trivial The JIRA issue types Wish and Improvement sound the same to me; every time, I have no idea which of them I should choose. Filing this bug to 1) get clarity, 2) propose that one of them be merged into the other, or 3) rename them to make it clear why they differ.
cassandra git commit: ninja suppresswarnings
Repository: cassandra
Updated Branches:
  refs/heads/trunk 887bbc141 -> b1abcd048

ninja suppresswarnings

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b1abcd04
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b1abcd04
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b1abcd04

Branch: refs/heads/trunk
Commit: b1abcd048e3780f11256e455e6024dcf05887f71
Parents: 887bbc1
Author: Benedict Elliott Smith bened...@apache.org
Authored: Fri Jun 12 11:58:42 2015 +0100
Committer: Benedict Elliott Smith bened...@apache.org
Committed: Fri Jun 12 11:58:42 2015 +0100

----------------------------------------------------------------------
 src/java/org/apache/cassandra/io/util/RandomAccessReader.java | 1 +
 1 file changed, 1 insertion(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b1abcd04/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/io/util/RandomAccessReader.java b/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
index fef206e..c4be8e9 100644
--- a/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
+++ b/src/java/org/apache/cassandra/io/util/RandomAccessReader.java
@@ -76,6 +76,7 @@ public class RandomAccessReader extends AbstractDataInput implements FileDataInp
     // not have a shared channel.
     private static class RandomAccessReaderWithChannel extends RandomAccessReader
     {
+        @SuppressWarnings("resource")
         RandomAccessReaderWithChannel(File file)
         {
             super(new ChannelProxy(file), DEFAULT_BUFFER_SIZE, -1L, BufferType.OFF_HEAP);
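The warning being suppressed here follows a common false-positive pattern: a resource is created in a this()/super() call and handed to the wrapper, which releases it in its own close(). A minimal self-contained sketch of that pattern (class and method names are hypothetical, not Cassandra's):

```java
import java.io.Closeable;

// DummyChannel stands in for ChannelProxy: a Closeable the wrapper owns.
class DummyChannel implements Closeable
{
    boolean closed = false;

    @Override
    public void close() { closed = true; }
}

// ChannelReader stands in for RandomAccessReaderWithChannel: it creates the
// resource in a this() call, so static analysis flags a potential leak even
// though the resource is reliably released in close().
class ChannelReader implements Closeable
{
    private final DummyChannel channel;

    @SuppressWarnings("resource") // false positive: the channel is closed in close()
    ChannelReader() { this(new DummyChannel()); }

    ChannelReader(DummyChannel channel) { this.channel = channel; }

    DummyChannel channel() { return channel; }

    @Override
    public void close() { channel.close(); } // the owned resource is released here
}

public class ResourceWarningSketch
{
    public static void main(String[] args)
    {
        ChannelReader reader = new ChannelReader();
        reader.close();
        if (!reader.channel().closed)
            throw new AssertionError("reader must close the channel it owns");
        System.out.println("ok");
    }
}
```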
[jira] [Commented] (CASSANDRA-8099) Refactor and modernize the storage engine
[ https://issues.apache.org/jira/browse/CASSANDRA-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583262#comment-14583262 ] Benedict commented on CASSANDRA-8099: - I've pushed a small semantic-changing suggestion for serialization and merging of RTs [here|https://github.com/belliottsmith/cassandra/tree/8099-RTMarker] I'm happy to split this (and further changes) out into a separate ticket, but while this does cross the threshold for discussion/mention, it's actually a pretty small/contained change. Basically, on a RT boundary, instead of issuing a close _and_ open marker, we just issue the new open marker - both during merge and serialization. On read, encountering an open marker when we _already_ have one open for that iterator is treated as a close/open pair. This not only reduces storage on disk, especially for large records (where RT markers are both more frequent and, obviously, larger), but also gets rid of the UnfilteredRowIterators.MergedUnfiltered ugliness. Refactor and modernize the storage engine - Key: CASSANDRA-8099 URL: https://issues.apache.org/jira/browse/CASSANDRA-8099 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Fix For: 3.0 beta 1 Attachments: 8099-nit The current storage engine (which for this ticket I'll loosely define as the code implementing the read/write path) is suffering from old age. One of the main problems is that the only structure it deals with is the cell, which completely ignores the higher-level CQL structure that groups cells into (CQL) rows. This leads to many inefficiencies, like the fact that during a read we have to group cells multiple times (to count on the replica, then to count on the coordinator, then to produce the CQL resultset) because we forget about the grouping right away each time (so lots of useless cell name comparisons in particular).
But beyond inefficiencies, having to manually recreate the CQL structure every time we need it for something is hindering new features and makes the code more complex than it should be. Said storage engine also has tons of technical debt. To pick an example, the fact that during range queries we update {{SliceQueryFilter.count}} is pretty hacky and error prone. Or the overly complex lengths {{AbstractQueryPager}} has to go to simply to remove the last query result. So I want to bite the bullet and modernize this storage engine. I propose to do 2 main things: # Make the storage engine more aware of the CQL structure. In practice, instead of having partitions be a simple iterable map of cells, it should be an iterable list of rows (each being itself composed of per-column cells, though obviously not exactly the same kind of cell we have today). # Make the engine more iterative. What I mean here is that in the read path, we end up reading all cells in memory (we put them in a ColumnFamily object), but there is really no reason to. If instead we were working with iterators all the way through, we could get to a point where we're basically transferring data from disk to the network, and we should be able to reduce GC substantially. Please note that such a refactor should provide some performance improvements right off the bat, but that's not its primary goal either. Its primary goal is to simplify the storage engine and add abstractions that are better suited to further optimizations.
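The range-tombstone marker change in the comment above can be sketched as a small decoder: a serialized stream drops the close marker at each boundary, and the reader treats "open while already open" as an implicit close/open pair. All names here are illustrative, not Cassandra's actual classes:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RTMarkerSketch
{
    static final int OPEN = 0, CLOSE = 1;

    // A serialized range-tombstone marker: kind (open/close) and its position.
    static class Marker
    {
        final int kind;
        final int position;
        Marker(int kind, int position) { this.kind = kind; this.position = position; }
    }

    // Decode a marker stream back into [start, end) ranges. An OPEN seen while
    // a range is already open closes the previous range at the same position.
    static List<int[]> decode(List<Marker> stream)
    {
        List<int[]> ranges = new ArrayList<>();
        int start = 0;
        boolean isOpen = false;
        for (Marker m : stream)
        {
            if (m.kind == OPEN)
            {
                if (isOpen) // implicit close/open pair at the boundary
                    ranges.add(new int[]{ start, m.position });
                start = m.position;
                isOpen = true;
            }
            else // explicit close ends the current range
            {
                ranges.add(new int[]{ start, m.position });
                isOpen = false;
            }
        }
        return ranges;
    }

    public static void main(String[] args)
    {
        // Two adjacent ranges [1,5) and [5,9) need three markers instead of four.
        List<int[]> rs = decode(Arrays.asList(new Marker(OPEN, 1),
                                              new Marker(OPEN, 5),
                                              new Marker(CLOSE, 9)));
        if (rs.size() != 2 || rs.get(0)[1] != 5 || rs.get(1)[0] != 5)
            throw new AssertionError("unexpected decode result");
        System.out.println("ok");
    }
}
```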
[jira] [Commented] (CASSANDRA-9587) Serialize table schema as a sstable component
[ https://issues.apache.org/jira/browse/CASSANDRA-9587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583278#comment-14583278 ] Sylvain Lebresne commented on CASSANDRA-9587: - Yes, I mean only the information relevant to the sstable. Serialize table schema as a sstable component - Key: CASSANDRA-9587 URL: https://issues.apache.org/jira/browse/CASSANDRA-9587 Project: Cassandra Issue Type: Sub-task Reporter: Sylvain Lebresne Fix For: 3.x Having the schema with each sstable would be tremendously useful for offline tools and for debugging purposes.
[jira] [Commented] (CASSANDRA-9581) pig-tests spend time waiting on /dev/random for SecureRandom
[ https://issues.apache.org/jira/browse/CASSANDRA-9581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583327#comment-14583327 ] Joshua McKenzie commented on CASSANDRA-9581: Welp - that's totally on me for not checking those 2 links. Sorry about that and thanks for confirming. :) pig-tests spend time waiting on /dev/random for SecureRandom Key: CASSANDRA-9581 URL: https://issues.apache.org/jira/browse/CASSANDRA-9581 Project: Cassandra Issue Type: Test Reporter: Ariel Weisberg Assignee: Ariel Weisberg We don't need secure random numbers (for unit tests), so waiting for entropy doesn't make much sense. Luckily, Java makes it easy to point to /dev/urandom for entropy. It also transparently handles it correctly on Windows.
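The usual way to point the JVM's SecureRandom at /dev/urandom is the `java.security.egd` system property. A hedged sketch (the ant invocation in the comment is illustrative; the ticket does not say exactly where the flag was added):

```shell
# The standard JVM property redirecting SecureRandom seeding to the
# non-blocking /dev/urandom. The extra "/./" works around older JDKs that
# special-cased the literal value "file:/dev/urandom".
EGD_FLAG='-Djava.security.egd=file:/dev/./urandom'

# It would then be passed to the test JVMs, e.g. (invocation illustrative):
#   JAVA_TOOL_OPTIONS="$EGD_FLAG" ant pig-test
echo "$EGD_FLAG"
```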
[jira] [Created] (CASSANDRA-9590) Support for both encrypted and unencrypted native transport connections
Stefan Podkowinski created CASSANDRA-9590: - Summary: Support for both encrypted and unencrypted native transport connections Key: CASSANDRA-9590 URL: https://issues.apache.org/jira/browse/CASSANDRA-9590 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stefan Podkowinski Enabling encryption for native transport currently turns SSL exclusively on or off for the opened socket. Migrating from plain to encrypted requires migrating all native clients as well, and redeploying all of them at the same time after starting the SSL-enabled Cassandra nodes. This patch would allow starting Cassandra with both an unencrypted and an SSL-enabled native port. Clients can connect to either, based on whether they support SSL. This has been implemented by introducing a new {{native_transport_port_ssl}} config option. There would be three scenarios: * client encryption disabled: native_transport_port unencrypted, port_ssl not used * client encryption enabled, port_ssl not set: encrypted native_transport_port * client encryption enabled and port_ssl set: native_transport_port unencrypted, port_ssl encrypted This approach would keep configuration behavior fully backwards compatible. Patch proposal (tests will be added later in case people speak out in favor of the patch): [Diff trunk|https://github.com/apache/cassandra/compare/trunk...spodkowinski:feat/optionalnativessl], [Patch against trunk|https://github.com/apache/cassandra/compare/trunk...spodkowinski:feat/optionalnativessl.patch]
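The third scenario could look roughly as follows in cassandra.yaml. The {{native_transport_port_ssl}} option name comes from the ticket; the surrounding keys, port number, and keystore values are illustrative assumptions:

```yaml
# Scenario 3: both ports served, so clients can migrate to SSL at their own pace.
native_transport_port: 9042        # remains unencrypted
native_transport_port_ssl: 9142    # proposed: SSL-only port; omit for old behavior
client_encryption_options:
    enabled: true
    keystore: conf/.keystore       # illustrative path
    keystore_password: cassandra   # illustrative credential
```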
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583297#comment-14583297 ] Marcus Eriksson commented on CASSANDRA-9045: I've been looking at this again today, but I have to say I have no idea what is going on; I'm not able to reproduce. Could you post your current schema (describe table bounces;) and logs between 2015-06-04T11:31:38 and 2015-06-08T08:27:36 for the nodes involved in your last example? Could you also run tools/bin/sstablemetadata over the sstables on one of those nodes, just to check that the timestamps look ok? Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Marcus Eriksson Priority: Critical Fix For: 2.0.x Attachments: 9045-debug-tracing.txt, another.txt, apache-cassandra-2.0.13-SNAPSHOT.jar, cqlsh.txt, debug.txt, inconsistency.txt Hey guys, After almost a week of researching the issue and trying out multiple things with (almost) no luck, I was advised (on the user@cass list) to file a report here. h5. Setup Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if it goes away) Multi-datacenter 12+6 node cluster. h5.
Schema
{code}
cqlsh> describe keyspace blackbook;

CREATE KEYSPACE blackbook WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'IAD': '3',
  'ORD': '3'
};

USE blackbook;

CREATE TABLE bounces (
  domainid text,
  address text,
  message text,
  timestamp bigint,
  PRIMARY KEY (domainid, address)
) WITH
  bloom_filter_fp_chance=0.10 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.10 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.00 AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};
{code}
h5. Use case
Each row (defined by a domainid) can have many, many columns (bounce entries), so rows can get pretty wide. In practice, most of the rows are not that big but some of them contain hundreds of thousands and even millions of columns. Columns are not TTL'ed but can be deleted using the following CQL3 statement:
{code}
delete from bounces where domainid = 'domain.com' and address = 'al...@example.com';
{code}
All queries are performed using LOCAL_QUORUM CL.
h5. Problem
We weren't very diligent about running repairs on the cluster initially, but shortly after we started doing it we noticed that some of the previously deleted columns (bounce entries) are there again, as if tombstones have disappeared. I have run this test multiple times via cqlsh, on the row of the customer who originally reported the issue:
* delete an entry
* verify it's not returned even with CL=ALL
* run repair on nodes that own this row's key
* the columns reappear and are returned even with CL=ALL
I tried the same test on another row with much less data and everything was correctly deleted and didn't reappear after repair.
h5. Other steps I've taken so far
Made sure NTP is running on all servers and clocks are synchronized.
Increased gc_grace_seconds to 100 days, ran full repair (on the affected keyspace) on all nodes, then changed it back to the default 10 days again. Didn't help. Performed one more test. Updated one of the resurrected columns, then deleted it and ran repair again. This time the updated version of the column reappeared. Finally, I noticed these log entries for the row in question:
{code}
INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally
{code}
Figuring it may be related, I bumped in_memory_compaction_limit_in_mb to 512MB so the row fits into it, deleted the entry and ran repair once again. The log entry for this row was gone and the columns didn't reappear. We have a lot of rows much larger than 512MB, so we can't increase this parameter forever, if that is indeed the issue. Please let me know if you need more information on the case or if I can run more experiments. Thanks! Roman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584019#comment-14584019 ] Joshua McKenzie commented on CASSANDRA-7918: Fair point. My initial thought was 1 arg to name the output file (i.e. name for the test you're doing) and the rest passed through to stress, but as you said it's not a pressing question. Provide graphing tool along with cassandra-stress - Key: CASSANDRA-7918 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Ryan McGuire Priority: Minor Attachments: 7918.patch, reads.svg Whilst cstar makes some pretty graphs, they're a little limited and also require you to run your tests through it. It would be useful to be able to graph results from any stress run easily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9567) Windows does not handle ipv6 addresses
[ https://issues.apache.org/jira/browse/CASSANDRA-9567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9567: --- Reviewer: Joshua McKenzie Attachment: 9567.txt Windows does not handle ipv6 addresses -- Key: CASSANDRA-9567 URL: https://issues.apache.org/jira/browse/CASSANDRA-9567 Project: Cassandra Issue Type: Bug Reporter: Philip Thompson Assignee: Philip Thompson Fix For: 3.x, 2.1.x, 2.2.x Attachments: 9567.txt In cassandra.ps1, we are pulling the listen and rpc addresses from the yaml by splitting on {{:}}, then selecting [1] from the resulting split, to separate the yaml key from the value. Unfortunately, because ipv6 addresses contain {{:}} characters, this means we are not grabbing the whole address, causing problems when starting the node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
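The failure mode described can be illustrated outside PowerShell; the snippet below is a hedged sketch (with a made-up `listen_address` value) showing why splitting on every {{:}} truncates an ipv6 address, while splitting once on the key/value separator {{: }} keeps it intact:

```python
# A yaml line with an ipv6 value: the address itself contains ":" characters.
line = "listen_address: fe80::202:b3ff:fe1e:8329"

# Buggy approach: split on every ":" and take element [1] -> only "fe80" survives.
broken = line.split(":")[1].strip()

# Fixed approach: yaml block mappings require a space after the key's ":",
# so splitting once on ": " recovers the full address.
key, value = line.split(": ", 1)

assert broken == "fe80"
assert value == "fe80::202:b3ff:fe1e:8329"
```

The same one-split-on-`": "` idea is what the attached patch relies on, per the comment on this ticket about yaml requiring a space after the colon.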
[jira] [Updated] (CASSANDRA-9591) Scrub (recover) sstables even when -Index.db is missing
[ https://issues.apache.org/jira/browse/CASSANDRA-9591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-9591: - Fix Version/s: (was: 2.0.15) 2.0.x Scrub (recover) sstables even when -Index.db is missing --- Key: CASSANDRA-9591 URL: https://issues.apache.org/jira/browse/CASSANDRA-9591 Project: Cassandra Issue Type: Improvement Reporter: mck Assignee: mck Labels: sstablescrub Fix For: 2.0.x Attachments: 9591-2.0.txt Today SSTableReader needs at minimum 3 files to load an sstable:
- -Data.db
- -CompressionInfo.db
- -Index.db
But during the scrub process the -Index.db file isn't actually necessary, unless there's corruption in the -Data.db file and we want to be able to skip over corrupted rows. Given that there is still a fair chance that there's nothing wrong with the -Data.db file and we're just missing the -Index.db file, this patch addresses that situation. The following patch makes it possible for the StandaloneScrubber (sstablescrub) to recover sstables despite missing -Index.db files. This can happen after a catastrophic incident where data directories have been lost and/or corrupted, or wiped and the backup not healthy. I'm aware that normally one depends on replicas or snapshots to avoid such situations, but such catastrophic incidents do occur in the wild. I have not tested this patch against normal c* operations and all the other (more critical) ways SSTableReader is used. I'll happily do that and add the needed unit tests if people see merit in accepting the patch. Otherwise the patch can live with the issue, in case anyone else needs it. There's also a cassandra distribution bundled with the patch [here|https://github.com/michaelsembwever/cassandra/releases/download/2.0.15-recover-sstables-without-indexdb/apache-cassandra-2.0.15-recover-sstables-without-indexdb.tar.gz] to make life a little easier for anyone finding themselves in such a bad situation.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9596) Tombstone timestamps aren't used to skip SSTables while they are still in the memtable
Richard Low created CASSANDRA-9596: -- Summary: Tombstone timestamps aren't used to skip SSTables while they are still in the memtable Key: CASSANDRA-9596 URL: https://issues.apache.org/jira/browse/CASSANDRA-9596 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Richard Low Fix For: 2.0.x If you have one SSTable containing a partition-level tombstone at timestamp t, and all other SSTables only have cells with timestamps older than t, Cassandra will skip all the other SSTables and return nothing quickly. However, if the partition tombstone is still in the memtable it doesn't skip any SSTables. It should use the same timestamp logic to skip all SSTables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
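A minimal sketch of the timestamp-skipping logic in question, with illustrative names (not Cassandra's actual read path): an sstable whose newest cell is older than the partition tombstone cannot contribute live data, so it can be skipped entirely.

```python
# Hypothetical model: given the partition tombstone's timestamp and each
# sstable's maximum cell timestamp, return the indices of sstables that
# still need to be read (everything else is fully shadowed by the tombstone).
def sstables_to_read(tombstone_timestamp, sstable_max_timestamps):
    return [i for i, max_ts in enumerate(sstable_max_timestamps)
            if max_ts > tombstone_timestamp]

# A tombstone at t=100 shadows sstables whose newest data is at or below 100.
assert sstables_to_read(100, [50, 90, 100]) == []
assert sstables_to_read(100, [50, 150]) == [1]
```

The ticket's point is that this check is applied when the tombstone has been flushed to an sstable, but not while it still sits in the memtable, even though the same comparison would be valid there.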
[jira] [Assigned] (CASSANDRA-9567) Windows does not handle ipv6 addresses
[ https://issues.apache.org/jira/browse/CASSANDRA-9567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson reassigned CASSANDRA-9567: -- Assignee: Philip Thompson (was: Joshua McKenzie) Windows does not handle ipv6 addresses -- Key: CASSANDRA-9567 URL: https://issues.apache.org/jira/browse/CASSANDRA-9567 Project: Cassandra Issue Type: Bug Reporter: Philip Thompson Assignee: Philip Thompson Fix For: 3.x, 2.1.x, 2.2.x In cassandra.ps1, we are pulling the listen and rpc addresses from the yaml by splitting on {{:}}, then selecting [1] from the resulting split, to separate the yaml key from the value. Unfortunately, because ipv6 addresses contain {{:}} characters, this means we are not grabbing the whole address, causing problems when starting the node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8061) tmplink files are not removed
[ https://issues.apache.org/jira/browse/CASSANDRA-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583998#comment-14583998 ] Benedict commented on CASSANDRA-8061: - [~rstrickland]: could you confirm you're definitely seeing these files stick around indefinitely? That would be an independent problem to the cfstats issue, which happens without this issue. CASSANDRA-9580 has already been posted for that, and a fix is available (although we may tweak the fix before release) tmplink files are not removed - Key: CASSANDRA-8061 URL: https://issues.apache.org/jira/browse/CASSANDRA-8061 Project: Cassandra Issue Type: Bug Components: Core Environment: Linux Reporter: Gianluca Borello Assignee: Joshua McKenzie Fix For: 2.1.x Attachments: 8061_v1.txt, 8248-thread_dump.txt After installing 2.1.0, I'm experiencing a bunch of tmplink files that are filling my disk. I found https://issues.apache.org/jira/browse/CASSANDRA-7803 and that is very similar, and I confirm it happens both on 2.1.0 as well as from the latest commit on the cassandra-2.1 branch (https://github.com/apache/cassandra/commit/aca80da38c3d86a40cc63d9a122f7d45258e4685 from the cassandra-2.1) Even starting with a clean keyspace, after a few hours I get: {noformat} $ sudo find /raid0 | grep tmplink | xargs du -hs 2.7G /raid0/cassandra/data/draios/protobuf1-ccc6dce04beb11e4abf997b38fbf920b/draios-protobuf1-tmplink-ka-4515-Data.db 13M /raid0/cassandra/data/draios/protobuf1-ccc6dce04beb11e4abf997b38fbf920b/draios-protobuf1-tmplink-ka-4515-Index.db 1.8G /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-1788-Data.db 12M /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-1788-Index.db 5.2M /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-2678-Index.db 822M 
/raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-2678-Data.db 7.3M /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-3283-Index.db 1.2G /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-3283-Data.db 6.7M /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-3951-Index.db 1.1G /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-3951-Data.db 11M /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-4799-Index.db 1.7G /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-4799-Data.db 812K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-234-Index.db 122M /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-208-Data.db 744K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-739-Index.db 660K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-193-Index.db 796K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-230-Index.db 137M /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-230-Data.db 161M /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-269-Data.db 139M 
/raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-234-Data.db 940K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-786-Index.db 936K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-269-Index.db 161M /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-786-Data.db 672K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-197-Index.db 113M
[jira] [Commented] (CASSANDRA-8061) tmplink files are not removed
[ https://issues.apache.org/jira/browse/CASSANDRA-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584007#comment-14584007 ] Robbie Strickland commented on CASSANDRA-8061: -- [~benedict] I can confirm that I made an invalid assumption that the issue was related; in fact the files are transient and the issue is CASSANDRA-9580. tmplink files are not removed - Key: CASSANDRA-8061 URL: https://issues.apache.org/jira/browse/CASSANDRA-8061 Project: Cassandra Issue Type: Bug Components: Core Environment: Linux Reporter: Gianluca Borello Assignee: Joshua McKenzie Fix For: 2.1.x Attachments: 8061_v1.txt, 8248-thread_dump.txt
[jira] [Updated] (CASSANDRA-8831) Create a system table to expose prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-8831: -- Labels: client-impacting docs-impacting (was: client-impacting doc-impacting) Create a system table to expose prepared statements --- Key: CASSANDRA-8831 URL: https://issues.apache.org/jira/browse/CASSANDRA-8831 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Robert Stupp Labels: client-impacting, docs-impacting Fix For: 3.x Attachments: 8831-3.0-v1.txt, 8831-v1.txt, 8831-v2.txt Because drivers abstract from users the handling of up/down nodes, they have to deal with the fact that when a node is restarted (or joins), it won't know any prepared statements. Drivers could somewhat ignore that problem and wait for a query to return an error (that the statement is unknown by the node) to re-prepare the query on that node, but it's relatively inefficient because every time a node comes back up, you'll get bad latency spikes due to some queries first failing, then being re-prepared and only then being executed. So instead, drivers (at least the java driver, but I believe others do as well) pro-actively re-prepare statements when a node comes up. It solves the latency problem, but currently every driver instance blindly re-prepares all statements, meaning that in a large cluster with many clients there is a lot of duplication of work (it would be enough for a single client to prepare the statements) and a bigger than necessary load on the node that started. An idea to solve this is to have a (cheap) way for clients to check if some statements are prepared on the node. There are different options to provide that, but what I'd suggest is to add a system table to expose the (cached) prepared statements because: # it's reasonably straightforward to implement: we just add a line to the table when a statement is prepared and remove it when it's evicted (we already have eviction listeners). 
We'd also truncate the table on startup, but that's easy enough. We can even switch it to a virtual table if/when CASSANDRA-7622 lands, but it's trivial to do with a normal table in the meantime. # it doesn't require a change to the protocol or anything like that. It could even be done in 2.1 if we wish to. # exposing prepared statements feels like genuinely useful information to have (outside of the problem exposed here, that is), if only for debugging/educational purposes. The exposed table could look something like:
{noformat}
CREATE TABLE system.prepared_statements (
  keyspace_name text,
  table_name text,
  prepared_id blob,
  query_string text,
  PRIMARY KEY (keyspace_name, table_name, prepared_id)
)
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
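The add-on-prepare / remove-on-evict bookkeeping from point 1 can be sketched as follows; the registry class and method names are hypothetical, and a plain dict stands in for the proposed system table:

```python
# Hypothetical sketch of mirroring the prepared-statement cache into a table:
# add a row on prepare, drop it on eviction, truncate on startup.
class PreparedStatementRegistry:
    def __init__(self):
        self.table = {}  # prepared_id -> (keyspace, table, query_string)

    def on_prepare(self, prepared_id, keyspace, table, query):
        self.table[prepared_id] = (keyspace, table, query)

    def on_evict(self, prepared_id):
        # Eviction listeners (which already exist per the ticket) would call this.
        self.table.pop(prepared_id, None)

    def on_startup(self):
        self.table.clear()  # truncate on startup, as proposed

reg = PreparedStatementRegistry()
reg.on_prepare(b"\x01", "blackbook", "bounces",
               "SELECT * FROM bounces WHERE domainid = ?")
reg.on_evict(b"\x01")
```

A driver could then issue a cheap read against such a table on node-up and re-prepare only the statements that are missing, instead of blindly re-preparing everything.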
[jira] [Updated] (CASSANDRA-9229) Add functions to convert timeuuid to date or time
[ https://issues.apache.org/jira/browse/CASSANDRA-9229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9229: -- Labels: cql docs-impacting (was: cql doc-impacting) Add functions to convert timeuuid to date or time - Key: CASSANDRA-9229 URL: https://issues.apache.org/jira/browse/CASSANDRA-9229 Project: Cassandra Issue Type: New Feature Reporter: Michaël Figuière Assignee: Benjamin Lerer Labels: cql, docs-impacting Fix For: 2.2.0 rc2 Attachments: 9229.txt, CASSANDRA-9229-V2.txt, CASSANDRA-9229.txt As CASSANDRA-7523 brings the {{date}} and {{time}} native types to Cassandra, it would be useful to add builtin functions to convert {{timeuuid}} to these two new types, just like {{dateOf()}} does for timestamps. {{timeOf()}} would extract the time component from a {{timeuuid}}. An example use case could be at insert time, for instance {{timeOf(now())}}, as well as at read time to compare the time component of a {{timeuuid}} column in a {{WHERE}} clause. The use cases would be similar for {{date}}, but the solution is slightly less obvious: in a perfect world we would want {{dateOf()}} to convert to {{date}} and {{timestampOf()}} to convert to {{timestamp}}; unfortunately {{dateOf()}} already exists and converts to a {{timestamp}}, not a {{date}}. Making this change would break many existing CQL queries, which is not acceptable. Therefore we could use a different naming convention such as {{toDate}} or {{dateFrom}}. We could then also consider using this new naming convention for the 3 date-related types and just have {{dateOf}} become a deprecated alias. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
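For illustration, the conversion such builtins perform can be sketched in Python: a version-1 (time-based) UUID encodes 100 ns ticks since 1582-10-15, and the function names below ({{timestamp_of}}, {{date_of}}) are hypothetical stand-ins for the proposed CQL functions, not actual builtins.

```python
import datetime
import uuid

# A v1 UUID's time field counts 100-nanosecond intervals since the start of
# the Gregorian calendar, 1582-10-15 00:00:00 UTC.
GREGORIAN_EPOCH = datetime.datetime(1582, 10, 15)

def timestamp_of(timeuuid):
    """Extract the full timestamp, analogous to what dateOf() does today."""
    return GREGORIAN_EPOCH + datetime.timedelta(microseconds=timeuuid.time // 10)

def date_of(timeuuid):
    """Keep only the date component, analogous to the proposed toDate()."""
    return timestamp_of(timeuuid).date()

u = uuid.uuid1()
print(timestamp_of(u), date_of(u))
```

This also makes the naming problem concrete: the first function returns a full timestamp, the second a bare date, so calling the first one `dateOf()` (as CQL historically did) is the misnomer the ticket wants to retire via a `toDate`-style convention.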
[jira] [Updated] (CASSANDRA-9402) Implement proper sandboxing for UDFs
[ https://issues.apache.org/jira/browse/CASSANDRA-9402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9402: -- Labels: docs-impacting security (was: doc-impacting security) Implement proper sandboxing for UDFs Key: CASSANDRA-9402 URL: https://issues.apache.org/jira/browse/CASSANDRA-9402 Project: Cassandra Issue Type: Task Reporter: T Jake Luciani Assignee: Robert Stupp Priority: Critical Labels: docs-impacting, security Fix For: 3.0 beta 1 Attachments: 9402-warning.txt We want to avoid a security exploit for our users. We need to make sure we ship 2.2 UDFs with good defaults, so that someone accidentally exposing them to the internet doesn't open themselves up to having arbitrary code run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9567) Windows does not handle ipv6 addresses
[ https://issues.apache.org/jira/browse/CASSANDRA-9567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583989#comment-14583989 ] Philip Thompson commented on CASSANDRA-9567: [~JoshuaMcKenzie], this fixes the issue I was running into. Windows does not handle ipv6 addresses -- Key: CASSANDRA-9567 URL: https://issues.apache.org/jira/browse/CASSANDRA-9567 Project: Cassandra Issue Type: Bug Reporter: Philip Thompson Assignee: Philip Thompson Fix For: 3.x, 2.1.x, 2.2.x Attachments: 9567.txt In cassandra.ps1, we are pulling the listen and rpc addresses from the yaml by splitting on {{:}}, then selecting [1] from the resulting split, to separate the yaml key from the value. Unfortunately, because ipv6 addresses contain {{:}} characters, this means we are not grabbing the whole address, causing problems when starting the node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9567) Windows does not handle ipv6 addresses
[ https://issues.apache.org/jira/browse/CASSANDRA-9567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583989#comment-14583989 ] Philip Thompson edited comment on CASSANDRA-9567 at 6/12/15 8:08 PM: - [~JoshuaMcKenzie], this fixes the issue I was running into. It relies on the (valid) assumption that a yaml requires a space after the {{:}} between the key and value. was (Author: philipthompson): [~JoshuaMcKenzie], this fixes the issue I was running into. It relies on the (valid) assumption that a yaml requires {{: }} between the key and value, not just a {{:}} Windows does not handle ipv6 addresses -- Key: CASSANDRA-9567 URL: https://issues.apache.org/jira/browse/CASSANDRA-9567 Project: Cassandra Issue Type: Bug Reporter: Philip Thompson Assignee: Philip Thompson Fix For: 3.x, 2.1.x, 2.2.x Attachments: 9567.txt In cassandra.ps1, we are pulling the listen and rpc addresses from the yaml by splitting on {{:}}, then selecting [1] from the resulting split, to separate the yaml key from the value. Unfortunately, because ipv6 addresses contain {{:}} characters, this means we are not grabbing the whole address, causing problems when starting the node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9567) Windows does not handle ipv6 addresses
[ https://issues.apache.org/jira/browse/CASSANDRA-9567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583989#comment-14583989 ] Philip Thompson edited comment on CASSANDRA-9567 at 6/12/15 8:08 PM: - [~JoshuaMcKenzie], this fixes the issue I was running into. It relies on the (valid) assumption that a yaml requires {{: }} between the key and value, not just a {{:}} was (Author: philipthompson): [~JoshuaMcKenzie], this fixes the issue I was running into. Windows does not handle ipv6 addresses -- Key: CASSANDRA-9567 URL: https://issues.apache.org/jira/browse/CASSANDRA-9567 Project: Cassandra Issue Type: Bug Reporter: Philip Thompson Assignee: Philip Thompson Fix For: 3.x, 2.1.x, 2.2.x Attachments: 9567.txt In cassandra.ps1, we are pulling the listen and rpc addresses from the yaml by splitting on {{:}}, then selecting [1] from the resulting split, to separate the yaml key from the value. Unfortunately, because ipv6 addresses contain {{:}} characters, this means we are not grabbing the whole address, causing problems when starting the node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9526) Provide a JMX hook to monitor phi values in the FailureDetector
[ https://issues.apache.org/jira/browse/CASSANDRA-9526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ron Kuris updated CASSANDRA-9526: - Attachment: (was: Monitor-Phi-JMX.patch.txt) Provide a JMX hook to monitor phi values in the FailureDetector --- Key: CASSANDRA-9526 URL: https://issues.apache.org/jira/browse/CASSANDRA-9526 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ron Kuris Fix For: 2.0.x phi_convict_threshold can be tuned, but there's currently no way to monitor the phi values to see if you're getting close. The attached patch adds the ability to get these values via JMX. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9592) Periodically attempt to submit background compaction tasks
[ https://issues.apache.org/jira/browse/CASSANDRA-9592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuki Morishita updated CASSANDRA-9592: -- Reviewer: Yuki Morishita Periodically attempt to submit background compaction tasks --- Key: CASSANDRA-9592 URL: https://issues.apache.org/jira/browse/CASSANDRA-9592 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 2.1.x There are more race conditions affecting compaction task submission than CASSANDRA-7745, so to prevent some of these problems stalling compactions, I propose simply submitting background compactions once every minute, if possible. This will typically be a no-op, but there's no harm in that, since it's very cheap to do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9592) Periodically attempt to submit background compaction tasks
[ https://issues.apache.org/jira/browse/CASSANDRA-9592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584158#comment-14584158 ] Yuki Morishita commented on CASSANDRA-9592: ---
* {{CompactionManager#submitBackground}} will not return null, so we need an {{isEmpty}} check.
* I think this change makes the compaction kick scheduled here (https://github.com/belliottsmith/cassandra/blob/d1ddae1b61a9ca037b5edc137b5c9915e86dece6/src/java/org/apache/cassandra/service/CassandraDaemon.java#L371-L386) obsolete, so we can delete it.
* nit: compile error because of a missing {{;}}
Periodically attempt to submit background compaction tasks --- Key: CASSANDRA-9592 URL: https://issues.apache.org/jira/browse/CASSANDRA-9592 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 2.1.x There are more race conditions affecting compaction task submission than CASSANDRA-7745, so to prevent some of these problems stalling compactions, I propose simply submitting background compactions once every minute, if possible. This will typically be a no-op, but there's no harm in that, since it's very cheap to do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
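A hedged sketch of the proposal and of the {{isEmpty}} point above, with made-up names (not Cassandra's actual {{CompactionManager}} API): the submitter returns an empty collection rather than null when there is nothing to do, so a periodic tick is a harmless no-op.

```python
# Illustrative model of once-a-minute background submission.  The real code
# would run periodic_tick on a scheduled executor every 60 seconds.
class FakeCompactionManager:
    def __init__(self):
        self.pending = []

    def submit_background(self):
        """Never returns None: an empty list means nothing to submit."""
        submitted, self.pending = self.pending, []
        return submitted

def periodic_tick(manager):
    tasks = manager.submit_background()
    if not tasks:     # isEmpty check, since None is never returned
        return 0      # typical case: cheap no-op
    return len(tasks)

mgr = FakeCompactionManager()
mgr.pending = ["compact-L0"]
assert periodic_tick(mgr) == 1
assert periodic_tick(mgr) == 0  # subsequent ticks are no-ops
```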
[jira] [Commented] (CASSANDRA-8061) tmplink files are not removed
[ https://issues.apache.org/jira/browse/CASSANDRA-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583927#comment-14583927 ] Robbie Strickland commented on CASSANDRA-8061: -- I can also verify I am seeing this after upgrading to 2.1.6. It breaks nodetool cfstats with an AssertionError:
{noformat}
error: /var/lib/cassandra/xvdb/data/prod_analytics_events/locationupdateevents-52f73af0fd5111e489f75b9deb90b453/prod_analytics_events-locationupdateevents-tmplink-ka-1460-Data.db
-- StackTrace --
java.lang.AssertionError: /var/lib/cassandra/xvdb/data/prod_analytics_events/locationupdateevents-52f73af0fd5111e489f75b9deb90b453/prod_analytics_events-locationupdateevents-tmplink-ka-1460-Data.db
    at org.apache.cassandra.io.sstable.SSTableReader.getApproximateKeyCount(SSTableReader.java:270)
    at org.apache.cassandra.metrics.ColumnFamilyMetrics$9.value(ColumnFamilyMetrics.java:296)
    at org.apache.cassandra.metrics.ColumnFamilyMetrics$9.value(ColumnFamilyMetrics.java:290)
    at com.yammer.metrics.reporting.JmxReporter$Gauge.getValue(JmxReporter.java:63)
    at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
    at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
    at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
    at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83)
    at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
    at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1443)
    at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
    at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1307)
    at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1399)
    at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:637)
    at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:323)
    at sun.rmi.transport.Transport$1.run(Transport.java:200)
    at sun.rmi.transport.Transport$1.run(Transport.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$241(TCPTransport.java:683)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$1/602091790.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{noformat}
tmplink files are not removed - Key:
CASSANDRA-8061 URL: https://issues.apache.org/jira/browse/CASSANDRA-8061 Project: Cassandra Issue Type: Bug Components: Core Environment: Linux Reporter: Gianluca Borello Assignee: Joshua McKenzie Fix For: 2.1.x Attachments: 8061_v1.txt, 8248-thread_dump.txt After installing 2.1.0, I'm experiencing a bunch of tmplink files that are filling my disk. I found https://issues.apache.org/jira/browse/CASSANDRA-7803 and that is very similar, and I confirm it
[jira] [Created] (CASSANDRA-9594) metrics reporter doesn't start until after a bootstrap
Eric Evans created CASSANDRA-9594: - Summary: metrics reporter doesn't start until after a bootstrap Key: CASSANDRA-9594 URL: https://issues.apache.org/jira/browse/CASSANDRA-9594 Project: Cassandra Issue Type: Bug Components: Core Reporter: Eric Evans Priority: Minor In {{o.a.c.service.CassandraDaemon#setup}}, the metrics reporter is started immediately after the invocation of {{o.a.c.service.StorageService#initServer}}, which for a bootstrapping node may block for a considerable period of time. If the metrics reporter is your only source of visibility, then you are blind until the bootstrap completes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
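The fix the ticket implies is an ordering change: bring the metrics reporter up before the long-blocking init/bootstrap step rather than after it. A minimal sketch of that ordering (the names and the event list are illustrative, not Cassandra's actual startup code):

```python
import threading
import time

def start_node(events):
    """Start a metrics reporter thread *before* the blocking
    bootstrap step, so metrics remain visible during bootstrap."""
    reporter = threading.Thread(target=lambda: events.append("metrics-up"),
                                daemon=True)
    reporter.start()
    reporter.join()  # reporter is confirmed up before we block
    events.append("bootstrap-start")
    time.sleep(0.01)  # stand-in for a long initServer()/bootstrap
    events.append("bootstrap-done")
```

Run against an empty event list, `start_node` records `metrics-up` strictly before the bootstrap events.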
[jira] [Updated] (CASSANDRA-9526) Provide a JMX hook to monitor phi values in the FailureDetector
[ https://issues.apache.org/jira/browse/CASSANDRA-9526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ron Kuris updated CASSANDRA-9526: - Attachment: Monitor-Phi-JMX.patch Tiny-Race-Condition.patch Phi-Log-Debug-When-Close.patch Fixed some minor problems found while running this code at high volume for a while, so uploaded a revised patchset. Also corrected header reorganization. Provide a JMX hook to monitor phi values in the FailureDetector --- Key: CASSANDRA-9526 URL: https://issues.apache.org/jira/browse/CASSANDRA-9526 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ron Kuris Fix For: 2.0.x Attachments: Monitor-Phi-JMX.patch, Phi-Log-Debug-When-Close.patch, Tiny-Race-Condition.patch phi_convict_threshold can be tuned, but there's currently no way to monitor the phi values to see if you're getting close. The attached patch adds the ability to get these values via JMX. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
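For context on what these patches expose: phi in the FailureDetector is the suspicion level of an accrual failure detector, which grows as heartbeats go missing. A minimal sketch under the exponential heartbeat-interval model commonly used for such detectors (the function name and units here are illustrative, not Cassandra's actual JMX attribute):

```python
import math

def phi(time_since_last_heartbeat, mean_interval):
    """Accrual-failure-detector suspicion level:
    phi = -log10(P(interval > t)). Assuming exponentially
    distributed heartbeat intervals, P(interval > t) = e^(-t/mean),
    so phi = (t / mean) * log10(e) and grows linearly with t."""
    return (time_since_last_heartbeat / mean_interval) * math.log10(math.e)
```

Under this model, a phi_convict_threshold of 8 is crossed after roughly 18 mean heartbeat intervals of silence, which is why watching phi over JMX tells you how close a node is to being convicted.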
svn commit: r1685145 - /cassandra/site/publish/doc/cql3/CQL-2.2.html
Author: tylerhobbs Date: Fri Jun 12 18:42:50 2015 New Revision: 1685145 URL: http://svn.apache.org/r1685145 Log: Add collections, tuple, UDT to JSON types documentation Modified: cassandra/site/publish/doc/cql3/CQL-2.2.html URL: http://svn.apache.org/viewvc/cassandra/site/publish/doc/cql3/CQL-2.2.html?rev=1685145r1=1685144r2=1685145view=diff
[diff omitted: tag-stripped HTML of the CQL-2.2.html (CQL v3.3.0) table of contents; the change adds the JSON Support entries (SELECT JSON, INSERT JSON, JSON Encoding of Cassandra Data Types) to the documentation]
[jira] [Updated] (CASSANDRA-9591) Scrub (recover) sstables even when -Index.db is missing
[ https://issues.apache.org/jira/browse/CASSANDRA-9591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-9591: --- Reviewer: Stefania Scrub (recover) sstables even when -Index.db is missing --- Key: CASSANDRA-9591 URL: https://issues.apache.org/jira/browse/CASSANDRA-9591 Project: Cassandra Issue Type: Improvement Reporter: mck Assignee: mck Labels: sstablescrub Fix For: 2.0.15 Attachments: 9591-2.0.txt Today SSTableReader needs at minimum 3 files to load an sstable: - -Data.db - -CompressionInfo.db - -Index.db But during the scrub process the -Index.db file isn't actually necessary, unless there's corruption in the -Data.db and we want to be able to skip over corrupted rows. Given that there is still a fair chance that there's nothing wrong with the -Data.db file and we're just missing the -Index.db file, this patch addresses that situation. So the following patch makes it possible for the StandaloneScrubber (sstablescrub) to recover sstables despite missing -Index.db files. This can happen from a catastrophic incident where data directories have been lost and/or corrupted, or wiped and the backup not healthy. I'm aware that normally one depends on replicas or snapshots to avoid such situations, but such catastrophic incidents do occur in the wild. I have not tested this patch against normal c* operations and all the other (more critical) ways SSTableReader is used. I'll happily do that and add the needed unit tests if people see merit in accepting the patch. Otherwise the patch can live with the issue, in case anyone else needs it. There's also a cassandra distribution bundled with the patch [here|https://github.com/michaelsembwever/cassandra/releases/download/2.0.15-recover-sstables-without-indexdb/apache-cassandra-2.0.15-recover-sstables-without-indexdb.tar.gz] to make life a little easier for anyone finding themselves in such a bad situation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
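The ticket's premise can be sketched as a component check: for a compressed sstable, -Data.db and -CompressionInfo.db are required, while -Index.db is only needed to skip over corrupted rows. A hypothetical helper (not the patch's actual code) illustrating the decision:

```python
def can_recovery_scrub(components):
    """Return True if a recovery scrub could be attempted on a
    compressed sstable. Per the ticket, -Data.db and
    -CompressionInfo.db are required; -Index.db is optional
    (only needed to skip past corruption in -Data.db)."""
    required = {"Data.db", "CompressionInfo.db"}
    return required.issubset(set(components))
```

So a directory containing only `-Data.db` and `-CompressionInfo.db` would still be a candidate for the patched sstablescrub.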
[jira] [Commented] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583891#comment-14583891 ] Joshua McKenzie commented on CASSANDRA-7918: I think the general concern is that maintaining a code-base with gnuplot in it isn't something your fellow contributors are thrilled about, not the potential difficulty of a user interacting with it. How about something like a verbose_stress.sh that dumps current commit sha, yaml settings, and stress args to a file, passes all args through to cassandra-stress.* and appends the stress output to that file, then compresses the final results to an archive named with a datetime stamp? Some simple section delimiters and our graph generator could parse that trivially. Avoids the coupling with stress, keeps the collection of metadata and test output as a separate logical entity, and we get our canonical source of truth. Provide graphing tool along with cassandra-stress - Key: CASSANDRA-7918 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Ryan McGuire Priority: Minor Attachments: 7918.patch, reads.svg Whilst cstar makes some pretty graphs, they're a little limited and also require you to run your tests through it. It would be useful to be able to graph results from any stress run easily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583898#comment-14583898 ] Benedict commented on CASSANDRA-7918: - bq. How about something like a verbose_stress.sh SGTM. Although I'm not such a fan of datetime naming - they need to be prohibitively long to get uniqueness, and are really ugly to parse (mentally). Might prefer a mix of date + short ascii hash. Not exactly a pressing question though. Provide graphing tool along with cassandra-stress - Key: CASSANDRA-7918 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Ryan McGuire Priority: Minor Attachments: 7918.patch, reads.svg Whilst cstar makes some pretty graphs, they're a little limited and also require you to run your tests through it. It would be useful to be able to graph results from any stress run easily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
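Benedict's naming suggestion — a date plus a short ascii hash instead of a full datetime stamp — could look like the following sketch (the `run_id` helper and `stress-` prefix are hypothetical, not part of any proposed patch):

```python
import hashlib
from datetime import date

def run_id(stress_args, run_date=None):
    """Archive name built from the run date plus a short hash of the
    stress invocation: unique per distinct invocation, but far easier
    to read (and mentally parse) than a full datetime stamp."""
    d = run_date or date.today()
    digest = hashlib.sha1(" ".join(stress_args).encode()).hexdigest()[:8]
    return "stress-%s-%s" % (d.isoformat(), digest)
```

Identical stress invocations on the same day map to the same id, and the 8-hex-digit suffix keeps collisions between different invocations unlikely.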
[jira] [Updated] (CASSANDRA-9526) Provide a JMX hook to monitor phi values in the FailureDetector
[ https://issues.apache.org/jira/browse/CASSANDRA-9526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ron Kuris updated CASSANDRA-9526: - Attachment: (was: PHI-Log-Debug-When-Close.patch.txt) Provide a JMX hook to monitor phi values in the FailureDetector --- Key: CASSANDRA-9526 URL: https://issues.apache.org/jira/browse/CASSANDRA-9526 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ron Kuris Fix For: 2.0.x phi_convict_threshold can be tuned, but there's currently no way to monitor the phi values to see if you're getting close. The attached patch adds the ability to get these values via JMX. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9526) Provide a JMX hook to monitor phi values in the FailureDetector
[ https://issues.apache.org/jira/browse/CASSANDRA-9526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ron Kuris updated CASSANDRA-9526: - Attachment: (was: PHI-Race-Condition.patch.txt) Provide a JMX hook to monitor phi values in the FailureDetector --- Key: CASSANDRA-9526 URL: https://issues.apache.org/jira/browse/CASSANDRA-9526 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ron Kuris Fix For: 2.0.x phi_convict_threshold can be tuned, but there's currently no way to monitor the phi values to see if you're getting close. The attached patch adds the ability to get these values via JMX. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8460) Make it possible to move non-compacting sstables to slow/big storage in DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-8460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583935#comment-14583935 ] Marcus Eriksson commented on CASSANDRA-8460: bq. So my initial approach was to define a second config item, separate from data_file_directories yeah lets keep it simple for now, add a new config variable like you suggest Make it possible to move non-compacting sstables to slow/big storage in DTCS Key: CASSANDRA-8460 URL: https://issues.apache.org/jira/browse/CASSANDRA-8460 Project: Cassandra Issue Type: Improvement Reporter: Marcus Eriksson Labels: dtcs It would be nice if we could configure DTCS to have a set of extra data directories where we move the sstables once they are older than max_sstable_age_days. This would enable users to have a quick, small SSD for hot, new data, and big spinning disks for data that is rarely read and never compacted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
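The proposed config variable amounts to an age-based directory choice: once an sstable's newest data is older than max_sstable_age_days (the point after which DTCS stops compacting it), it can live on slow/big storage. A sketch, with hypothetical directory names standing in for the new config item:

```python
import time

def target_directory(sstable_max_timestamp_s, max_sstable_age_days,
                     fast_dir="/mnt/ssd/data", slow_dir="/mnt/hdd/data",
                     now_s=None):
    """Pick a data directory for an sstable: past max_sstable_age_days
    it is never compacted again under DTCS, so it can be moved to the
    big spinning disks; newer sstables stay on the fast SSD."""
    now_s = now_s if now_s is not None else time.time()
    age_days = (now_s - sstable_max_timestamp_s) / 86400.0
    return slow_dir if age_days > max_sstable_age_days else fast_dir
```

The age is measured from the sstable's maximum timestamp, so an sstable only migrates once all of its data has passed the cutoff.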
[jira] [Commented] (CASSANDRA-9567) Windows does not handle ipv6 addresses
[ https://issues.apache.org/jira/browse/CASSANDRA-9567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583984#comment-14583984 ] Philip Thompson commented on CASSANDRA-9567: [~kishkaru], can you test with this patch as well? Windows does not handle ipv6 addresses -- Key: CASSANDRA-9567 URL: https://issues.apache.org/jira/browse/CASSANDRA-9567 Project: Cassandra Issue Type: Bug Reporter: Philip Thompson Assignee: Philip Thompson Fix For: 3.x, 2.1.x, 2.2.x Attachments: 9567.txt In cassandra.ps1, we are pulling the listen and rpc addresses from the yaml by splitting on {{:}}, then selecting [1] from the resulting split, to separate the yaml key from the value. Unfortunately, because ipv6 addresses contain {{:}} characters, this means we are not grabbing the whole address, causing problems when starting the node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
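The bug boils down to splitting on every colon instead of only the first one. In Python terms (the startup script itself is PowerShell; this `yaml_scalar` helper is just an illustration of the fix's logic):

```python
def yaml_scalar(line):
    """Split a simple 'key: value' yaml line on the FIRST colon only,
    so ipv6 values such as 'listen_address: ::1' keep their colons
    intact instead of being truncated at the second colon."""
    key, _, value = line.partition(":")
    return key.strip(), value.strip()
```

Splitting on every `:` and taking element [1] would return an empty string for `listen_address: ::1`; partitioning on the first colon preserves the whole address.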
[jira] [Created] (CASSANDRA-9595) Compacting an empty table sometimes doesn't delete SSTables in 2.0 with LCS
Jim Witschey created CASSANDRA-9595: --- Summary: Compacting an empty table sometimes doesn't delete SSTables in 2.0 with LCS Key: CASSANDRA-9595 URL: https://issues.apache.org/jira/browse/CASSANDRA-9595 Project: Cassandra Issue Type: Bug Reporter: Jim Witschey Fix For: 2.0.x On 2.0, when compaction is run on a table with all rows deleted and configured with LCS, sometimes SSTables remain on disk afterwards. This causes one of our dtests to fail periodically, for instance [here|http://cassci.datastax.com/view/cassandra-2.0/job/cassandra-2.0_dtest/68/testReport/compaction_test/TestCompaction_with_LeveledCompactionStrategy/sstable_deletion_test/]. This can be reproduced in dtests with {code} CASSANDRA_VERSION=git:cassandra-2.0 nosetests ./compaction_test.py:TestCompaction_with_LeveledCompactionStrategy.sstable_deletion_test {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/2] cassandra git commit: Fix ipv6 parsing on Windows startup
Repository: cassandra Updated Branches: refs/heads/cassandra-2.2 a5be8f199 - 2e92cf899 Fix ipv6 parsing on Windows startup Patch by Philip Thompson; reviewed by jmckenzie for CASSANDRA-9567 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c1702b0b Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c1702b0b Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c1702b0b Branch: refs/heads/cassandra-2.2 Commit: c1702b0b3c24040ed8b402e684a8d0ffd4e4359f Parents: 69b7dd3 Author: Philip Thompson ptnapol...@gmail.com Authored: Fri Jun 12 18:17:13 2015 -0400 Committer: Josh McKenzie josh.mcken...@datastax.com Committed: Fri Jun 12 18:17:13 2015 -0400 -- bin/cassandra.ps1 | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/c1702b0b/bin/cassandra.ps1 -- diff --git a/bin/cassandra.ps1 b/bin/cassandra.ps1 index 80049ee..41ea7c1 100644 --- a/bin/cassandra.ps1 +++ b/bin/cassandra.ps1 @@ -299,12 +299,12 @@ Function VerifyPortsAreAvailable { if ($line -match ^listen_address:) { -$args = $line -Split : +$args = $line -Split : $listenAddress = $args[1] -replace , } if ($line -match ^rpc_address:) { -$args = $line -Split : +$args = $line -Split : $rpcAddress = $args[1] -replace , } }
[jira] [Commented] (CASSANDRA-9592) Periodically attempt to submit background compaction tasks
[ https://issues.apache.org/jira/browse/CASSANDRA-9592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584182#comment-14584182 ] Jeremiah Jordan commented on CASSANDRA-9592: bq. I think this change makes compaction kick scheduled here obsolete so we can delete it. We should make this new task run every 5/10 minutes then so that we don't start compactions early. The 5 minute window with no compactions is nice to have, giving an operator time to disable compaction over JMX or other such things. So we shouldn't lower it down to only 1 minute. Periodically attempt to submit background compaction tasks --- Key: CASSANDRA-9592 URL: https://issues.apache.org/jira/browse/CASSANDRA-9592 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 2.1.x There are more race conditions affecting compaction task submission than CASSANDRA-7745, so to prevent some of these problems stalling compactions, I propose simply submitting background compactions once every minute, if possible. This will typically be a no-op, but there's no harm in that, since it's very cheap to do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[2/2] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2e92cf89 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2e92cf89 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2e92cf89 Branch: refs/heads/cassandra-2.2 Commit: 2e92cf8996f42dc40fade41b73001affdfbd6f7d Parents: a5be8f1 c1702b0 Author: Josh McKenzie josh.mcken...@datastax.com Authored: Fri Jun 12 18:18:45 2015 -0400 Committer: Josh McKenzie josh.mcken...@datastax.com Committed: Fri Jun 12 18:18:45 2015 -0400 -- bin/cassandra.ps1 | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --
cassandra git commit: Fix ipv6 parsing on Windows startup
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 69b7dd327 - c1702b0b3 Fix ipv6 parsing on Windows startup Patch by Philip Thompson; reviewed by jmckenzie for CASSANDRA-9567 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c1702b0b Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c1702b0b Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c1702b0b Branch: refs/heads/cassandra-2.1 Commit: c1702b0b3c24040ed8b402e684a8d0ffd4e4359f Parents: 69b7dd3 Author: Philip Thompson ptnapol...@gmail.com Authored: Fri Jun 12 18:17:13 2015 -0400 Committer: Josh McKenzie josh.mcken...@datastax.com Committed: Fri Jun 12 18:17:13 2015 -0400 -- bin/cassandra.ps1 | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/c1702b0b/bin/cassandra.ps1 -- diff --git a/bin/cassandra.ps1 b/bin/cassandra.ps1 index 80049ee..41ea7c1 100644 --- a/bin/cassandra.ps1 +++ b/bin/cassandra.ps1 @@ -299,12 +299,12 @@ Function VerifyPortsAreAvailable { if ($line -match ^listen_address:) { -$args = $line -Split : +$args = $line -Split : $listenAddress = $args[1] -replace , } if ($line -match ^rpc_address:) { -$args = $line -Split : +$args = $line -Split : $rpcAddress = $args[1] -replace , } }
[2/3] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2e92cf89 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2e92cf89 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2e92cf89 Branch: refs/heads/trunk Commit: 2e92cf8996f42dc40fade41b73001affdfbd6f7d Parents: a5be8f1 c1702b0 Author: Josh McKenzie josh.mcken...@datastax.com Authored: Fri Jun 12 18:18:45 2015 -0400 Committer: Josh McKenzie josh.mcken...@datastax.com Committed: Fri Jun 12 18:18:45 2015 -0400 -- bin/cassandra.ps1 | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --
[1/3] cassandra git commit: Fix ipv6 parsing on Windows startup
Repository: cassandra Updated Branches: refs/heads/trunk 40c3e8922 - 7476d83b4 Fix ipv6 parsing on Windows startup Patch by Philip Thompson; reviewed by jmckenzie for CASSANDRA-9567 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c1702b0b Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c1702b0b Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c1702b0b Branch: refs/heads/trunk Commit: c1702b0b3c24040ed8b402e684a8d0ffd4e4359f Parents: 69b7dd3 Author: Philip Thompson ptnapol...@gmail.com Authored: Fri Jun 12 18:17:13 2015 -0400 Committer: Josh McKenzie josh.mcken...@datastax.com Committed: Fri Jun 12 18:17:13 2015 -0400 -- bin/cassandra.ps1 | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/c1702b0b/bin/cassandra.ps1 -- diff --git a/bin/cassandra.ps1 b/bin/cassandra.ps1 index 80049ee..41ea7c1 100644 --- a/bin/cassandra.ps1 +++ b/bin/cassandra.ps1 @@ -299,12 +299,12 @@ Function VerifyPortsAreAvailable { if ($line -match ^listen_address:) { -$args = $line -Split : +$args = $line -Split : $listenAddress = $args[1] -replace , } if ($line -match ^rpc_address:) { -$args = $line -Split : +$args = $line -Split : $rpcAddress = $args[1] -replace , } }
[3/3] cassandra git commit: Merge branch 'cassandra-2.2' into trunk
Merge branch 'cassandra-2.2' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7476d83b Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7476d83b Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7476d83b Branch: refs/heads/trunk Commit: 7476d83b44fa58a354fc0c7330b74b8e7ed7a3a3 Parents: 40c3e89 2e92cf8 Author: Josh McKenzie josh.mcken...@datastax.com Authored: Fri Jun 12 18:19:06 2015 -0400 Committer: Josh McKenzie josh.mcken...@datastax.com Committed: Fri Jun 12 18:19:06 2015 -0400 -- bin/cassandra.ps1 | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --
[jira] [Comment Edited] (CASSANDRA-9596) Tombstone timestamps aren't used to skip SSTables while they are still in the memtable
[ https://issues.apache.org/jira/browse/CASSANDRA-9596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584246#comment-14584246 ] Aleksey Yeschenko edited comment on CASSANDRA-9596 at 6/12/15 11:23 PM: We do for {{collectTimeOrderedData()}} since CASSANDRA-7394 (2.0.9). I did miss {{collectAllData()}} there, though. Benedict fixed it in CASSANDRA-9298 (2.1.6). was (Author: iamaleksey): We do for {{collectTimeOrderedData()}} since CASSANDRA-7394 (2.0.9). I did miss {{collectAllData()}} there, though. Benedict fixed it in CASSANDRA-9228 (2.1.6). Tombstone timestamps aren't used to skip SSTables while they are still in the memtable -- Key: CASSANDRA-9596 URL: https://issues.apache.org/jira/browse/CASSANDRA-9596 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Richard Low Fix For: 2.0.x If you have one SSTable containing a partition level tombstone at timestamp t and all other SSTables have cells with timestamp t, Cassandra will skip all the other SSTables and return nothing quickly. However, if the partition tombstone is still in the memtable it doesn’t skip any SSTables. It should use the same timestamp logic to skip all SSTables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9576) Connection leak in CQLRecordWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-9576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584285#comment-14584285 ] Philip Thompson commented on CASSANDRA-9576: [~beobal], could you check this out next week? I'm not seeing where we are leaking connections. Connection leak in CQLRecordWriter -- Key: CASSANDRA-9576 URL: https://issues.apache.org/jira/browse/CASSANDRA-9576 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: T Meyarivan Assignee: Philip Thompson Ran into connection leaks when using CQLCassandra apache-cassandra-2.2.0-beta1-src + CQLOutputFormat (via CqlNativeStorage). It seems like the order blocks of code starting at https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/hadoop/cql3/CqlRecordWriter.java#L298 were reversed in 2.2 which leads to the connection leaks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
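The suspected cause is ordering of cleanup blocks in CqlRecordWriter. Whatever the actual Java fix turns out to be, the defensive pattern is to guarantee the session close runs even when the preceding step fails. A generic sketch (names are illustrative, not Cassandra's CqlRecordWriter API):

```python
class RecordWriter:
    """Sketch of a writer whose close() guarantees the underlying
    session is released even if flushing pending writes fails."""
    def __init__(self, session):
        self.session = session

    def flush_pending(self):
        # Stand-in for draining queued mutations; may raise.
        pass

    def close(self):
        try:
            self.flush_pending()
        finally:
            # Runs regardless of whether flush raised, so the
            # connection is never leaked on an error path.
            self.session.close()
```

If the two steps are simply reversed, or the close sits outside a finally, any exception during flushing leaks the connection, which matches the symptom reported against 2.2.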
[jira] [Commented] (CASSANDRA-9596) Tombstone timestamps aren't used to skip SSTables while they are still in the memtable
[ https://issues.apache.org/jira/browse/CASSANDRA-9596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584246#comment-14584246 ] Aleksey Yeschenko commented on CASSANDRA-9596: -- We do for {{collectTimeOrderedData()}} since CASSANDRA-7394 (2.0.9). I did miss {{collectAllData()}} there, though. Benedict fixed it in CASSANDRA-9228 (2.1.6). Tombstone timestamps aren't used to skip SSTables while they are still in the memtable -- Key: CASSANDRA-9596 URL: https://issues.apache.org/jira/browse/CASSANDRA-9596 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Richard Low Fix For: 2.0.x If you have one SSTable containing a partition level tombstone at timestamp t and all other SSTables have cells with timestamp t, Cassandra will skip all the other SSTables and return nothing quickly. However, if the partition tombstone is still in the memtable it doesn’t skip any SSTables. It should use the same timestamp logic to skip all SSTables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
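The skipping logic the ticket wants applied to the memtable case is simple timestamp arithmetic: any sstable whose newest cell is not newer than the partition tombstone contributes nothing to the result. A sketch (a simplification; the real read path also considers per-cell tombstones and memtable state):

```python
def sstables_to_read(partition_tombstone_ts, sstable_max_timestamps):
    """Given a partition-level tombstone at timestamp t, an sstable
    whose maximum cell timestamp is <= t can be skipped entirely:
    everything in it is shadowed by the tombstone. Returns the max
    timestamps of the sstables that must still be read."""
    return [max_ts for max_ts in sstable_max_timestamps
            if max_ts > partition_tombstone_ts]
```

The reported gap is that this pruning fires when the tombstone has been flushed to an sstable, but not while it still sits in the memtable.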
[jira] [Commented] (CASSANDRA-9592) Periodically attempt to submit background compaction tasks
[ https://issues.apache.org/jira/browse/CASSANDRA-9592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584250#comment-14584250 ] Aleksey Yeschenko commented on CASSANDRA-9592: -- bq. We should make this new task every 5/10 minutes then so that we don't start compactions early. Make it every 1 minute, just delay it by 5? Periodically attempt to submit background compaction tasks --- Key: CASSANDRA-9592 URL: https://issues.apache.org/jira/browse/CASSANDRA-9592 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 2.1.x There are more race conditions affecting compaction task submission than CASSANDRA-7745, so to prevent some of these problems stalling compactions, I propose simply submitting background compactions once every minute, if possible. This will typically be a no-op, but there's no harm in that, since it's very cheap to do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
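Aleksey's "every 1 minute, delayed by 5" compromise is the standard initial-delay-plus-fixed-period schedule. A tiny sketch of the resulting submission times (illustrative only; the real code would use a scheduled executor):

```python
def fire_times(initial_delay_s, period_s, count):
    """Submission times for a periodic task with an initial delay:
    e.g. wait 5 minutes (preserving the operator's window to disable
    compaction over JMX), then attempt a submission every minute."""
    return [initial_delay_s + i * period_s for i in range(count)]
```

With a 300 s delay and 60 s period, the first submission happens at the 5-minute mark and every minute thereafter, satisfying both Jeremiah's operator window and the original once-a-minute proposal.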
[jira] [Commented] (CASSANDRA-9487) CommitLogTest hangs intermittently in 2.0
[ https://issues.apache.org/jira/browse/CASSANDRA-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583664#comment-14583664 ] Ariel Weisberg commented on CASSANDRA-9487: --- Was this merged? Can it be closed? CommitLogTest hangs intermittently in 2.0 - Key: CASSANDRA-9487 URL: https://issues.apache.org/jira/browse/CASSANDRA-9487 Project: Cassandra Issue Type: Bug Components: Tests Reporter: Michael Shuler Assignee: Branimir Lambov Fix For: 2.0.x Attachments: system.log Possibly related to CASSANDRA-8992 ? 2.0 unit tests are hanging periodically in the same way (I have not gone through all the branches, so can't say we're in the clear everywhere - marking for just 2.x at the moment). CommitLogTest hung system.log attached from local reproduction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9572) DateTieredCompactionStrategy fails to combine SSTables correctly when TTL is used.
[ https://issues.apache.org/jira/browse/CASSANDRA-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583465#comment-14583465 ] Sylvain Lebresne commented on CASSANDRA-9572: - patch lgtm DateTieredCompactionStrategy fails to combine SSTables correctly when TTL is used. -- Key: CASSANDRA-9572 URL: https://issues.apache.org/jira/browse/CASSANDRA-9572 Project: Cassandra Issue Type: Bug Components: Core Reporter: Antti Nissinen Assignee: Marcus Eriksson Labels: dtcs Fix For: 3.x, 2.1.x, 2.0.x, 2.2.x Attachments: cassandra_sstable_metadata_reader.py, cassandra_sstable_timespan_graph.py, compaction_stage_test01_jira.log, compaction_stage_test02_jira.log, datagen.py, explanation_jira.txt, first_results_after_patch.txt, motivation_jira.txt, src_2.1.5_with_debug.zip DateTieredCompaction works correctly when data is dumped for a certain time period into short SSTables in a timely manner and then compacted together. However, if a TTL is applied to the data columns, DTCS fails to compact files correctly in a timely manner. In our opinion the problem is caused by two issues: A) During the DateTieredCompaction process, getFullyExpiredSSTables is called twice: first from the DateTieredCompactionStrategy class and a second time from the CompactionTask class. On the first call the goal is to find fully expired SSTables that do not overlap with any non-fully-expired SSTables. That works correctly. When getFullyExpiredSSTables is called the second time, from the CompactionTask class, the selection of fully expired SSTables differs from the first selection. B) The minimum timestamp of the new SSTables created by combining a fully expired SSTable with files from the most interesting bucket is not correct. Together, these two issues cause problems for the DTCS process when it combines SSTables that overlap in time and have a TTL on the columns.
This is demonstrated by first generating test data without compactions and showing the time distribution of files. When compaction is enabled, DTCS combines files together, but the end result is not what would be expected. This is demonstrated in the file motivation_jira.txt. The attachments contain the following material: - motivation_jira.txt: practical examples of how DTCS behaves with TTL - explanation_jira.txt: gives more details, explains the test cases, and demonstrates the problems in the compaction process - Log file for the compactions in the first test case (compaction_stage_test01_jira.log) - Log file for the compactions in the second test case (compaction_stage_test02_jira.log) - Source code zip file for version 2.1.5 with additional comment statements (src_2.1.5_with_debug.zip) - Python script to generate test data (datagen.py) - Python script to read metadata from SSTables (cassandra_sstable_metadata_reader.py) - Python script to generate a timeline representation of SSTables (cassandra_sstable_timespan_graph.py) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
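The "fully expired" selection described in issue A can be modeled in a few lines. This is an illustrative sketch, not Cassandra's actual API: the `SSTable` class and `fullyExpired` method here are hypothetical simplifications of the rule that an SSTable is droppable wholesale only if every cell in it has passed `gcBefore` and no live (non-expired) SSTable holds older data that the expired table's newer writes or tombstones might still shadow.

```java
import java.util.*;

public class FullyExpiredCheck {
    static final class SSTable {
        final long minTimestamp, maxTimestamp; // write timestamps of oldest/newest cells
        final int maxLocalDeletionTime;        // seconds; when the last cell's TTL expires
        SSTable(long minTs, long maxTs, int maxDeletion) {
            minTimestamp = minTs; maxTimestamp = maxTs; maxLocalDeletionTime = maxDeletion;
        }
    }

    // An SSTable may be dropped without reading it only if all of its cells
    // are past gcBefore AND nothing in it is newer than the oldest data in any
    // live SSTable (otherwise its tombstones may still be needed for shadowing).
    static Set<SSTable> fullyExpired(Collection<SSTable> sstables, int gcBefore) {
        List<SSTable> live = new ArrayList<>();
        List<SSTable> expired = new ArrayList<>();
        for (SSTable s : sstables)
            (s.maxLocalDeletionTime < gcBefore ? expired : live).add(s);

        long minLiveTimestamp = Long.MAX_VALUE;
        for (SSTable s : live)
            minLiveTimestamp = Math.min(minLiveTimestamp, s.minTimestamp);

        Set<SSTable> droppable = new HashSet<>();
        for (SSTable s : expired)
            if (s.maxTimestamp < minLiveTimestamp) // nothing newer than any live data
                droppable.add(s);
        return droppable;
    }

    public static void main(String[] args) {
        SSTable expiredOld = new SSTable(0, 100, 50);                  // TTL long expired
        SSTable liveRecent = new SSTable(200, 300, Integer.MAX_VALUE); // still live
        Set<SSTable> droppable = fullyExpired(Arrays.asList(expiredOld, liveRecent), 1000);
        System.out.println("droppable count = " + droppable.size());
    }
}
```

The bug report's point is that this selection must be computed consistently: if the second call (from CompactionTask) sees a different candidate set than the first (from the strategy), the two disagree about which tables are safely droppable.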
[jira] [Comment Edited] (CASSANDRA-9572) DateTieredCompactionStrategy fails to combine SSTables correctly when TTL is used.
[ https://issues.apache.org/jira/browse/CASSANDRA-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583313#comment-14583313 ] Antti Nissinen edited comment on CASSANDRA-9572 at 6/12/15 12:01 PM: - I ran the test again and looked at the log file in detail. Now it works as expected. Thank you very much for all involved! was (Author: anissinen): I ran the tests again and looked at the log file in detail. Now it works as expected. Thank you very much for all involved! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9572) DateTieredCompactionStrategy fails to combine SSTables correctly when TTL is used.
[ https://issues.apache.org/jira/browse/CASSANDRA-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583313#comment-14583313 ] Antti Nissinen commented on CASSANDRA-9572: --- I ran the tests again and looked at the log file in detail. Now it works as expected. Thank you very much for all involved! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8099) Refactor and modernize the storage engine
[ https://issues.apache.org/jira/browse/CASSANDRA-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583533#comment-14583533 ] Sylvain Lebresne commented on CASSANDRA-8099: - bq. I've pushed a small semantic-changing suggestion for serialization and merging of RT Thanks. I hesitated doing this initially and don't remember why I didn't. But this does clean up things a bit, so I'll look at integrating it on Monday unless I remember a good reason not to (which there probably isn't). Refactor and modernize the storage engine - Key: CASSANDRA-8099 URL: https://issues.apache.org/jira/browse/CASSANDRA-8099 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Fix For: 3.0 beta 1 Attachments: 8099-nit The current storage engine (which for this ticket I'll loosely define as the code implementing the read/write path) is suffering from old age. One of the main problems is that the only structure it deals with is the cell, which completely ignores the higher-level CQL structure that groups cells into (CQL) rows. This leads to many inefficiencies, like the fact that during a read we have to group cells multiple times (to count on the replica, then to count on the coordinator, then to produce the CQL result set) because we forget about the grouping right away each time (so lots of useless cell name comparisons in particular). But beyond inefficiencies, having to manually recreate the CQL structure every time we need it for something is hindering new features and makes the code more complex than it should be. Said storage engine also has tons of technical debt. To pick an example, the fact that during range queries we update {{SliceQueryFilter.count}} is pretty hacky and error prone. So are the overly complex lengths {{AbstractQueryPager}} has to go to simply to remove the last query result. So I want to bite the bullet and modernize this storage engine.
I propose to do 2 main things: # Make the storage engine more aware of the CQL structure. In practice, instead of having partitions be a simple iterable map of cells, they should be an iterable list of rows (each itself composed of per-column cells, though obviously not exactly the same kind of cell we have today). # Make the engine more iterative. What I mean here is that on the read path we end up reading all cells into memory (we put them in a ColumnFamily object), but there is really no reason to. If instead we were working with iterators all the way through, we could get to a point where we're basically transferring data from disk to the network, and we should be able to reduce GC substantially. Please note that such a refactor should provide some performance improvements right off the bat, but that's not its primary goal either. Its primary goal is to simplify the storage engine and add abstractions that are better suited to further optimizations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
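The second point above, the iterative read path, boils down to composing lazy iterators instead of materializing a container. A minimal sketch under assumed names (`LazyReadPath`, the "serialized:" transform, and the row strings are all illustrative, not Cassandra's real types):

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

public class LazyReadPath {
    // Wraps a source iterator in a transforming one: each element is read and
    // converted only when the consumer asks for it, so nothing is buffered.
    static <A, B> Iterator<B> transform(Iterator<A> source, Function<A, B> fn) {
        return new Iterator<B>() {
            public boolean hasNext() { return source.hasNext(); }
            public B next() { return fn.apply(source.next()); } // lazy conversion
        };
    }

    public static void main(String[] args) {
        // Stand-in for rows streamed off disk.
        List<String> rowsOnDisk = Arrays.asList("row1", "row2", "row3");
        // Stand-in for serializing each row to the wire as it flows through.
        Iterator<String> wire = transform(rowsOnDisk.iterator(), r -> "serialized:" + r);
        while (wire.hasNext())
            System.out.println(wire.next()); // one row in flight at a time
    }
}
```

Chaining several such wrappers (merge, filter tombstones, serialize) is how "disk to network" streaming avoids holding a whole partition in memory, which is where the GC reduction the description mentions would come from.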
[jira] [Commented] (CASSANDRA-9577) Cassandra not performing GC on stale SStables after compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-9577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583376#comment-14583376 ] Marcus Eriksson commented on CASSANDRA-9577: So, the lsof output and data directory contents are after the compaction? How long after? The sstables are not deleted immediately, the deletion is done in the background. Cassandra not performing GC on stale SStables after compaction -- Key: CASSANDRA-9577 URL: https://issues.apache.org/jira/browse/CASSANDRA-9577 Project: Cassandra Issue Type: Bug Components: Core Environment: 2.0.12.200 / DSE 4.6.1. Reporter: Jeff Ferland Assignee: Marcus Eriksson Space used (live), bytes: 878681716067 Space used (total), bytes: 2227857083852 jbf@ip-10-0-2-98:/ebs/cassandra/data/trends/trends$ sudo lsof *-Data.db COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME java4473 cassandra 446r REG 0,26 17582559172 39241 trends-trends-jb-144864-Data.db java4473 cassandra 448r REG 0,26 62040962 37431 trends-trends-jb-144731-Data.db java4473 cassandra 449r REG 0,26 829935047545 21150 trends-trends-jb-143581-Data.db java4473 cassandra 452r REG 0,26 8980406 39503 trends-trends-jb-144882-Data.db java4473 cassandra 454r REG 0,26 8980406 39503 trends-trends-jb-144882-Data.db java4473 cassandra 462r REG 0,26 9487703 39542 trends-trends-jb-144883-Data.db java4473 cassandra 463r REG 0,26 36158226 39629 trends-trends-jb-144889-Data.db java4473 cassandra 468r REG 0,26105693505 39447 trends-trends-jb-144881-Data.db java4473 cassandra 530r REG 0,26 17582559172 39241 trends-trends-jb-144864-Data.db java4473 cassandra 535r REG 0,26105693505 39447 trends-trends-jb-144881-Data.db java4473 cassandra 542r REG 0,26 9487703 39542 trends-trends-jb-144883-Data.db java4473 cassandra 553u REG 0,26 6431729821 39556 trends-trends-tmp-jb-144884-Data.db jbf@ip-10-0-2-98:/ebs/cassandra/data/trends/trends$ ls *-Data.db trends-trends-jb-142631-Data.db trends-trends-jb-143562-Data.db 
trends-trends-jb-143581-Data.db trends-trends-jb-144731-Data.db trends-trends-jb-144883-Data.db trends-trends-jb-142633-Data.db trends-trends-jb-143563-Data.db trends-trends-jb-144530-Data.db trends-trends-jb-144864-Data.db trends-trends-jb-144889-Data.db trends-trends-jb-143026-Data.db trends-trends-jb-143564-Data.db trends-trends-jb-144551-Data.db trends-trends-jb-144881-Data.db trends-trends-tmp-jb-144884-Data.db trends-trends-jb-143533-Data.db trends-trends-jb-143578-Data.db trends-trends-jb-144552-Data.db trends-trends-jb-144882-Data.db jbf@ip-10-0-2-98:/ebs/cassandra/data/trends/trends$ cd - /mnt/cassandra/data/trends/trends jbf@ip-10-0-2-98:/mnt/cassandra/data/trends/trends$ sudo lsof * jbf@ip-10-0-2-98:/mnt/cassandra/data/trends/trends$ ls *-Data.db trends-trends-jb-124502-Data.db trends-trends-jb-141113-Data.db trends-trends-jb-141377-Data.db trends-trends-jb-141846-Data.db trends-trends-jb-144890-Data.db trends-trends-jb-125457-Data.db trends-trends-jb-141123-Data.db trends-trends-jb-141391-Data.db trends-trends-jb-141871-Data.db trends-trends-jb-41121-Data.db trends-trends-jb-130016-Data.db trends-trends-jb-141137-Data.db trends-trends-jb-141538-Data.db trends-trends-jb-141883-Data.db trends-trends.trends_date_idx-jb-2100-Data.db trends-trends-jb-139563-Data.db trends-trends-jb-141358-Data.db trends-trends-jb-141806-Data.db trends-trends-jb-142033-Data.db trends-trends-jb-141102-Data.db trends-trends-jb-141363-Data.db trends-trends-jb-141829-Data.db trends-trends-jb-144553-Data.db Compaction started INFO [CompactionExecutor:6661] 2015-06-05 14:02:36,515 CompactionTask.java (line 120) Compacting [SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-124502-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-141358-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-141883-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-141846-Data.db'), 
SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-141871-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-141391-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-139563-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-125457-Data.db'), SSTableReader(path='/mnt/cassandra/data/trends/trends/trends-trends-jb-141806-Data.db'),
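Marcus's remark that "the sstables are not deleted immediately, the deletion is done in the background" can be illustrated with a reference-counting sketch. This is a hypothetical model, not Cassandra's actual API: the file is only removed once the last reader releases its reference, so compacted-away files legitimately linger while reads are still in flight.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RefCountedSSTable {
    private final AtomicInteger refs = new AtomicInteger(1); // owner's (compaction's) ref
    private volatile boolean deleted = false;

    // A reader must acquire a reference before touching the file; fails if the
    // table has already been fully released.
    boolean acquire() {
        int r;
        do {
            r = refs.get();
            if (r == 0) return false; // already released; file may be gone
        } while (!refs.compareAndSet(r, r + 1));
        return true;
    }

    // Last release triggers deletion; in the real system this would schedule
    // the actual file removal on a background thread.
    void release() {
        if (refs.decrementAndGet() == 0)
            deleted = true;
    }

    boolean isDeleted() { return deleted; }

    public static void main(String[] args) {
        RefCountedSSTable table = new RefCountedSSTable();
        table.acquire();  // an in-flight read grabs a reference
        table.release();  // reader finishes
        System.out.println("deleted after reader: " + table.isDeleted());
        table.release();  // owner releases after compaction obsoletes the file
        System.out.println("deleted after owner release: " + table.isDeleted());
    }
}
```

Under this model, the lsof output in the report (old `-Data.db` files still open) is expected for a while after compaction; the bug is only real if the references are never released.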
[jira] [Commented] (CASSANDRA-9577) Cassandra not performing GC on stale SStables after compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-9577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583407#comment-14583407 ] Jeff Ferland commented on CASSANDRA-9577: - Per timestamps above, this was the case more than 24 hours after completion. Restarting the host did clear the files. On the same host just after a restart I'm encountering the same pattern. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-6710) Support union types
[ https://issues.apache.org/jira/browse/CASSANDRA-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-6710: - Fix Version/s: 3.x Support union types --- Key: CASSANDRA-6710 URL: https://issues.apache.org/jira/browse/CASSANDRA-6710 Project: Cassandra Issue Type: Improvement Components: API, Core Reporter: Tupshin Harper Priority: Minor Labels: ponies Fix For: 3.x I sometimes find myself wanting to abuse Cassandra datatypes when I want to interleave two different types in the same column. An example is in CASSANDRA-6167 where an approach is to tag what would normally be a numeric field with text indicating that it is special in some ways. A more elegant approach would be to be able to explicitly define disjoint unions in the style of Haskell's and Scala's Either types. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8099) Refactor and modernize the storage engine
[ https://issues.apache.org/jira/browse/CASSANDRA-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583530#comment-14583530 ] Sylvain Lebresne commented on CASSANDRA-8099: - Some update on this. I've pushed a rebased (and squashed, because that made it a *lot* easier to rebase) version in [my 8099 branch|https://github.com/pcmanus/cassandra/tree/8099]. It's still missing wire backward compatibility ([~thobbs] is finishing this so it should hopefully be ready soon). Regarding tests: * unit tests are almost green: mostly some failures remain in the Hadoop tests. I could actually use the experience of someone who knows these tests and the code involved, as it's not immediately clear to me what this is even doing. * dtests still have a fair amount of failures, but I've only looked at them recently and the count is going down quickly. h2. OpOrder I think the main problem was that a local read (done through {{SP.LocalReadRunnable}}) was potentially keeping a group open while waiting on other nodes. I also realized this path meant local reads (the actual read of sstables) were done outside of the {{StorageProxy}} methods, and so 1) not on the thread they were supposed to be on and 2) outside of the timeout check. I changed this so that a local response actually materializes everything upfront (similarly to what we do today), solving the problems above. This is not perfect and I'm sure we'll improve on this in the future, but that feels like a good enough option initially. Regarding moving {{OpOrder}} out of {{close}}, the only way to do that I can see is to move it up the stack, into {{ReadCommandVerbHandler}} and {{SP.LocalReadRunnable}} (as suggested by Branimir above). I'm working on that (I just started and might not have the time to finish today, but it'll be done early Monday for sure). h2. Branimir's review remarks I've integrated fixes for most of the remarks. I discuss the rest below. bq.
[CompactionIterable 125|https://github.com/pcmanus/cassandra/blob/75b98620e30b5df31431618cc21e090481f33967/src/java/org/apache/cassandra/db/compaction/CompactionIterable.java#L125]: I doubt index update belongs here, as side effect of iteration. Ideally index should be collected, not updated. Though I don't disagree on principle, this is not different from how it's done currently (it's done the same in {{LazilyCompactedRow}}, but it just happens that the old {{LazilyCompactedRow}} has been merged into {{CompactionIterable}} (now {{CompactionIterator}}) because simplifications of the patch made it unnecessary to have separate classes). Happy to look at cleaning this in a separate ticket however (it probably belongs to cleaning the 2ndary index API in practice). bq. [CompactionIterable 237|https://github.com/pcmanus/cassandra/blob/75b98620e30b5df31431618cc21e090481f33967/src/java/org/apache/cassandra/db/compaction/CompactionIterable.java#L237]: Another side effect of iteration that preferably should be handled by writer. Maybe, but it's not that simple. Merging (which is done directly by the {{CompactionIterator}}) gets rid of empty partitions, and more generally we get rid of them as soon as possible. I think that's the right thing to do as it's easier for the rest of the code, but it means we have to do invalidation in {{CompactionIterator}}. Of course, we could special-case {{CompactionIterator}} to not remove empty partitions and do cache invalidation externally, but I'm not sure it would be cleaner overall (it would be somewhat more error-prone). Besides, I could argue that cache invalidation is very much a product of compaction and having it in {{CompactionIterator}} is not that bad. bq. Validation compaction now uses CompactionIterable and thus has side effects (index cache removal).
I've fixed that, but I'll note for posterity that as far as I can tell, index removal is done for validation compaction on trunk (and all previous versions) due to the use of {{LazilyCompactedRow}}. I've still disabled it (for anything that isn't a true compaction) because I think that's the right thing to do, but that's a difference introduced by this ticket. bq. add that there is never content between two corresponding tombstone markers on any iterator. That's mentioned in Dealing with tombstones and shadowed cells. More precisely, that's what the part of the contract of an AtomIterator saying it must not shadow its own data means. But I need to clean up/update the guide, so I'll make sure to clarify further while at it. bq. These objects should be Iterable instead. Having that would give clear separation between the iteration process and the entity-level data Yes, it would be cleaner from that standpoint. And the use of iterators in the first place is indeed largely carried over from the existing code; I just hadn't really thought of the alternative tbh. I'll try to check next week how easily such
[jira] [Commented] (CASSANDRA-9487) CommitLogTest hangs intermittently in 2.0
[ https://issues.apache.org/jira/browse/CASSANDRA-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583542#comment-14583542 ] Branimir Lambov commented on CASSANDRA-9487: Ported the patch and uploaded it [here|https://github.com/apache/cassandra/compare/trunk...blambov:9487-2.0-cl-test-hang]. I tested that it normally succeeds, but I could not verify that it stops the hangs, because I could not reproduce the hanging in the first place. The log looks like it's the same problem, though. CommitLogTest hangs intermittently in 2.0 - Key: CASSANDRA-9487 URL: https://issues.apache.org/jira/browse/CASSANDRA-9487 Project: Cassandra Issue Type: Bug Components: Tests Reporter: Michael Shuler Assignee: Branimir Lambov Fix For: 2.0.x Attachments: system.log Possibly related to CASSANDRA-8992 ? 2.0 unit tests are hanging periodically in the same way (I have not gone through all the branches, so can't say we're in the clear everywhere - marking for just 2.x at the moment). CommitLogTest hung system.log attached from local reproduction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
cassandra git commit: undeprecate cache recentHitRate metrics
Repository: cassandra Updated Branches: refs/heads/trunk b1abcd048 - 8c19fd638 undeprecate cache recentHitRate metrics patch by Chris Burroughs; reviewed by benedict for CASSANDRA-6591 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8c19fd63 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8c19fd63 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8c19fd63 Branch: refs/heads/trunk Commit: 8c19fd638da7d5525e85d0cce41aa86e02798108 Parents: b1abcd0 Author: Chris Burroughs chris.burroughs+apa...@gmail.com Authored: Fri Jun 12 12:22:30 2015 +0100 Committer: Benedict Elliott Smith bened...@apache.org Committed: Fri Jun 12 12:22:30 2015 +0100 -- CHANGES.txt | 1 + .../apache/cassandra/metrics/CacheMetrics.java | 29 - .../org/apache/cassandra/utils/DynamicList.java | 2 +- .../apache/cassandra/utils/FasterRandom.java| 116 --- .../cassandra/stress/generate/FasterRandom.java | 116 +++ .../cassandra/stress/generate/values/Bytes.java | 2 +- .../stress/generate/values/Strings.java | 2 +- 7 files changed, 148 insertions(+), 120 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/8c19fd63/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index b80f272..35e02a2 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -5,6 +5,7 @@ * Change gossip stabilization to use endpoit size (CASSANDRA-9401) * Change default garbage collector to G1 (CASSANDRA-7486) * Populate TokenMetadata early during startup (CASSANDRA-9317) + * undeprecate cache recentHitRate (CASSANDRA-6591) 2.2 http://git-wip-us.apache.org/repos/asf/cassandra/blob/8c19fd63/src/java/org/apache/cassandra/metrics/CacheMetrics.java -- diff --git a/src/java/org/apache/cassandra/metrics/CacheMetrics.java b/src/java/org/apache/cassandra/metrics/CacheMetrics.java index 8b00e1c..151268b 100644 --- a/src/java/org/apache/cassandra/metrics/CacheMetrics.java +++ b/src/java/org/apache/cassandra/metrics/CacheMetrics.java @@ -37,8 
+37,14 @@ public class CacheMetrics public final Meter hits; /** Total number of cache requests */ public final Meter requests; -/** cache hit rate */ +/** all time cache hit rate */ public final GaugeDouble hitRate; +/** 1m hit rate */ +public final GaugeDouble oneMinuteHitRate; +/** 5m hit rate */ +public final GaugeDouble fiveMinuteHitRate; +/** 15m hit rate */ +public final GaugeDouble fifteenMinuteHitRate; /** Total size of cache, in bytes */ public final GaugeLong size; /** Total number of cache entries */ @@ -71,6 +77,27 @@ public class CacheMetrics return Ratio.of(hits.getCount(), requests.getCount()); } }); +oneMinuteHitRate = Metrics.register(factory.createMetricName(OneMinuteHitRate), new RatioGauge() +{ +protected Ratio getRatio() +{ +return Ratio.of(hits.getOneMinuteRate(), requests.getOneMinuteRate()); +} +}); +fiveMinuteHitRate = Metrics.register(factory.createMetricName(FiveMinuteHitRate), new RatioGauge() +{ +protected Ratio getRatio() +{ +return Ratio.of(hits.getFiveMinuteRate(), requests.getFiveMinuteRate()); +} +}); +fifteenMinuteHitRate = Metrics.register(factory.createMetricName(FifteenMinuteHitRate), new RatioGauge() +{ +protected Ratio getRatio() +{ +return Ratio.of(hits.getFifteenMinuteRate(), requests.getFifteenMinuteRate()); +} +}); size = Metrics.register(factory.createMetricName(Size), new GaugeLong() { public Long getValue() http://git-wip-us.apache.org/repos/asf/cassandra/blob/8c19fd63/src/java/org/apache/cassandra/utils/DynamicList.java -- diff --git a/src/java/org/apache/cassandra/utils/DynamicList.java b/src/java/org/apache/cassandra/utils/DynamicList.java index fc3d523..30f5160 100644 --- a/src/java/org/apache/cassandra/utils/DynamicList.java +++ b/src/java/org/apache/cassandra/utils/DynamicList.java @@ -238,7 +238,7 @@ public class DynamicListE canon.add(c); c++; } -FasterRandom rand = new FasterRandom(); +ThreadLocalRandom rand = ThreadLocalRandom.current(); assert list.isWellFormed(); for (int loop = 0 ; loop 100 ; loop++) {
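The gauges this commit adds all compute the same kind of ratio over a meter's counts or windowed rates. A stand-alone model of that calculation (mirroring RatioGauge's behavior in the Metrics library, which yields NaN for a zero denominator):

```java
public class HitRateModel {
    // The all-time gauge divides hit count by request count; the 1m/5m/15m
    // gauges divide the corresponding exponentially-weighted meter rates.
    // A zero denominator yields NaN rather than throwing, matching RatioGauge.
    static double ratio(double hits, double requests) {
        return requests == 0 ? Double.NaN : hits / requests;
    }

    public static void main(String[] args) {
        System.out.println(ratio(75, 100)); // e.g. all-time hit rate
        System.out.println(ratio(0, 0));    // no requests yet: NaN, not an error
    }
}
```

This is why undeprecating recentHitRate-style metrics is cheap: each extra gauge is just another ratio over rates the Meter already tracks.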
[jira] [Commented] (CASSANDRA-8099) Refactor and modernize the storage engine
[ https://issues.apache.org/jira/browse/CASSANDRA-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583553#comment-14583553 ] Benedict commented on CASSANDRA-8099: - bq. Yes, it would be cleaner from that standpoint. And the use of iterators in the first place is indeed largely carried from the existing code, I just hadn't really thought of the alternative tbh. I'll try to check next week how easy such a change would be. That said, I'm not sure the use of iterators directly is that confusing either, so if it turns hairy, I don't think it's worth blocking on this (that is, we can very well change that later). It does change the semantics quite a bit, since the state needed for iterating must be constructed again each time, and is likely constructed in the caller of .iterator(). This has both advantages and disadvantages. One advantage of an Iterator, though, is that you cannot (easily) iterate over its contents twice. I'm personally not so upset at the use of Iterator, since it's a continuation of the existing approach, and Java 8 makes working with iterators a little easier. We can, for instance, make use of the forEachRemaining() method, or otherwise transform the iterator. I don't think there's any increased ugliness inherent in exposing the higher-level information in the Iterator, though. I believe [~iamaleksey] is working on a way to integrate the Java Streams API at some point in the future, which may lead to other benefits that Iterable cannot deliver. Either way, I think getting this ticket in sooner rather than later is better, and we can focus on how we might make the Iterator abstraction a little nicer in a follow-up. 
Refactor and modernize the storage engine - Key: CASSANDRA-8099 URL: https://issues.apache.org/jira/browse/CASSANDRA-8099 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Fix For: 3.0 beta 1 Attachments: 8099-nit The current storage engine (which for this ticket I'll loosely define as the code implementing the read/write path) is suffering from old age. One of the main problems is that the only structure it deals with is the cell, which completely ignores the higher-level CQL structure that groups cells into (CQL) rows. This leads to many inefficiencies, like the fact that during a read we have to group cells multiple times (to count on the replica, then to count on the coordinator, then to produce the CQL resultset) because we forget about the grouping right away each time (so lots of useless cell name comparisons in particular). But beyond inefficiencies, having to manually recreate the CQL structure every time we need it for something is hindering new features and makes the code more complex than it should be. Said storage engine also has tons of technical debt. To pick an example, the fact that during range queries we update {{SliceQueryFilter.count}} is pretty hacky and error prone. Or the overly complex lengths {{AbstractQueryPager}} has to go to just to remove the last query result. So I want to bite the bullet and modernize this storage engine. I propose to do 2 main things: # Make the storage engine more aware of the CQL structure. In practice, instead of having partitions be a simple iterable map of cells, they should be an iterable list of rows (each being itself composed of per-column cells, though obviously not exactly the same kind of cell we have today). # Make the engine more iterative. What I mean here is that in the read path, we end up reading all cells in memory (we put them in a ColumnFamily object), but there is really no reason to. 
If instead we were working with iterators all the way through, we could get to a point where we're basically transferring data from disk to the network, and we should be able to reduce GC substantially. Please note that such a refactor should provide some performance improvements right off the bat, but that's not its primary goal. Its primary goal is to simplify the storage engine and add abstractions that are better suited to further optimizations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
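As a small illustration of the Java 8 point in the discussion above — draining an iterator with `forEachRemaining` so data streams through once instead of being collected up front — a sketch (using `String` as a stand-in for the engine's real row type):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class IteratorDemo {
    // Drain an iterator exactly once with Java 8's forEachRemaining, the way
    // an iterator-based read path streams rows without first collecting them
    // into an in-memory container (e.g. a ColumnFamily object).
    static List<String> drain(Iterator<String> rows) {
        List<String> out = new ArrayList<>();
        rows.forEachRemaining(r -> out.add(r.toUpperCase()));
        return out;
    }

    public static void main(String[] args) {
        Iterator<String> it = Arrays.asList("a", "b").iterator();
        System.out.println(drain(it)); // [A, B]
        // The one-shot property mentioned in the comment:
        // the iterator is now exhausted, so no accidental second pass.
        System.out.println(it.hasNext()); // false
    }
}
```

The exhausted-after-one-pass behavior is exactly the "advantage of an Iterator" Benedict notes: callers cannot silently re-traverse data that was meant to be consumed once.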
[3/4] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/271c9e4a Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/271c9e4a Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/271c9e4a Branch: refs/heads/trunk Commit: 271c9e4ac7a71252e4f4f1984fd4f8f16058bcde Parents: b61da9b 69b7dd3 Author: Marcus Eriksson marc...@apache.org Authored: Fri Jun 12 18:51:48 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jun 12 18:51:48 2015 +0200 -- .../apache/cassandra/db/compaction/CompactionController.java | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/271c9e4a/src/java/org/apache/cassandra/db/compaction/CompactionController.java --
[jira] [Commented] (CASSANDRA-9590) Support for both encrypted and unencrypted native transport connections
[ https://issues.apache.org/jira/browse/CASSANDRA-9590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583700#comment-14583700 ] Vishy Kasar commented on CASSANDRA-9590: Thanks, this is similar to one of the features we requested: https://issues.apache.org/jira/browse/CASSANDRA-8803 : Implement transitional mode in C* that will accept both encrypted and non-encrypted client traffic Support for both encrypted and unencrypted native transport connections --- Key: CASSANDRA-9590 URL: https://issues.apache.org/jira/browse/CASSANDRA-9590 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stefan Podkowinski Enabling encryption for native transport currently turns SSL exclusively on or off for the opened socket. Migrating from plain to encrypted currently requires migrating all native clients as well and redeploying all of them at the same time after starting the SSL-enabled Cassandra nodes. This patch would allow starting Cassandra with both an unencrypted and an SSL-enabled native port. Clients can connect to either, based on whether they support SSL or not. This has been implemented by introducing a new {{native_transport_port_ssl}} config option. There would be three scenarios: * client encryption disabled: native_transport_port unencrypted, port_ssl not used * client encryption enabled, port_ssl not set: encrypted native_transport_port * client encryption enabled and port_ssl set: native_transport_port unencrypted, port_ssl encrypted This approach would keep configuration behavior fully backwards compatible. Patch proposal (tests will be added later if people speak out in favor of the patch): [Diff trunk|https://github.com/apache/cassandra/compare/trunk...spodkowinski:feat/optionalnativessl], [Patch against trunk|https://github.com/apache/cassandra/compare/trunk...spodkowinski:feat/optionalnativessl.patch] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
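The three configuration scenarios in the description reduce to a small decision table. A sketch of that resolution logic (class and method names are hypothetical, not the actual Config code; 9042/9142 are illustrative port numbers):

```java
public class NativePortResolver {
    // Resolve which port speaks plaintext and which speaks SSL, following the
    // three scenarios in CASSANDRA-9590's description. A null portSsl means
    // native_transport_port_ssl is unset.
    static String resolve(boolean encryptionEnabled, int port, Integer portSsl) {
        if (!encryptionEnabled)
            return "plain=" + port + " ssl=none";     // ssl port not used
        if (portSsl == null)
            return "plain=none ssl=" + port;          // single, encrypted port
        return "plain=" + port + " ssl=" + portSsl;   // both ports, side by side
    }

    public static void main(String[] args) {
        System.out.println(resolve(false, 9042, 9142)); // plain=9042 ssl=none
        System.out.println(resolve(true, 9042, null));  // plain=none ssl=9042
        System.out.println(resolve(true, 9042, 9142));  // plain=9042 ssl=9142
    }
}
```

Because unset `port_ssl` falls back to the pre-patch behavior in both the enabled and disabled cases, existing configurations resolve exactly as before — the backwards-compatibility claim in the ticket.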
[jira] [Created] (CASSANDRA-9591) Scrub (recover) sstables even when -Index.db is missing
mck created CASSANDRA-9591: -- Summary: Scrub (recover) sstables even when -Index.db is missing Key: CASSANDRA-9591 URL: https://issues.apache.org/jira/browse/CASSANDRA-9591 Project: Cassandra Issue Type: Improvement Reporter: mck Assignee: mck Today SSTableReader needs at minimum 3 files to load an sstable: - -Data.db - -CompressionInfo.db - -Index.db But during the scrub process the -Index.db file isn't actually necessary, unless there's corruption in the -Data.db and we want to be able to skip over corrupted rows. Given that there is still a fair chance that there's nothing wrong with the -Data.db file and we're just missing the -Index.db file, this patch addresses that situation. So the following patch makes it possible for the StandaloneScrubber (sstablescrub) to recover sstables despite missing -Index.db files. This can happen after a catastrophic incident where data directories have been lost and/or corrupted, or wiped and the backup not healthy. I'm aware that normally one depends on replicas or snapshots to avoid such situations, but such catastrophic incidents do occur in the wild. I have not tested this patch against normal c* operations and all the other (more critical) ways SSTableReader is used. I'll happily do that and add the needed unit tests if people see merit in accepting the patch. Otherwise the patch can live with the issue, in case anyone else needs it. I've uploaded a cassandra distribution bundled with the patch as well to make life a little easier for anyone finding themselves in such a bad situation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/3] cassandra git commit: Ignore fully expired sstables when finding min timestamp
Repository: cassandra Updated Branches: refs/heads/cassandra-2.2 b61da9b56 -> 271c9e4ac Ignore fully expired sstables when finding min timestamp Patch by marcuse; reviewed by slebresne for CASSANDRA-9572 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9e60611f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9e60611f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9e60611f Branch: refs/heads/cassandra-2.2 Commit: 9e60611fb807ad1bd03a13ef1fe55bf905100064 Parents: 3ddd17b Author: Marcus Eriksson marc...@apache.org Authored: Thu Jun 11 08:33:54 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jun 12 18:50:01 2015 +0200 -- .../apache/cassandra/db/compaction/CompactionController.java | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/9e60611f/src/java/org/apache/cassandra/db/compaction/CompactionController.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionController.java b/src/java/org/apache/cassandra/db/compaction/CompactionController.java index 7a4b7d9..59453cc 100644 --- a/src/java/org/apache/cassandra/db/compaction/CompactionController.java +++ b/src/java/org/apache/cassandra/db/compaction/CompactionController.java @@ -102,7 +102,12 @@ public class CompactionController long minTimestamp = Long.MAX_VALUE; for (SSTableReader sstable : overlapping) -minTimestamp = Math.min(minTimestamp, sstable.getMinTimestamp()); +{ +// Overlapping might include fully expired sstables. What we care about here is +// the min timestamp of the overlapping sstables that actually contain live data. +if (sstable.getSSTableMetadata().maxLocalDeletionTime >= gcBefore) +minTimestamp = Math.min(minTimestamp, sstable.getMinTimestamp()); +} for (SSTableReader candidate : compacting) {
[1/2] cassandra git commit: Ignore fully expired sstables when finding min timestamp
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 16665ee19 -> 69b7dd327 Ignore fully expired sstables when finding min timestamp Patch by marcuse; reviewed by slebresne for CASSANDRA-9572 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9e60611f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9e60611f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9e60611f Branch: refs/heads/cassandra-2.1 Commit: 9e60611fb807ad1bd03a13ef1fe55bf905100064 Parents: 3ddd17b Author: Marcus Eriksson marc...@apache.org Authored: Thu Jun 11 08:33:54 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jun 12 18:50:01 2015 +0200 -- .../apache/cassandra/db/compaction/CompactionController.java | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/9e60611f/src/java/org/apache/cassandra/db/compaction/CompactionController.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionController.java b/src/java/org/apache/cassandra/db/compaction/CompactionController.java index 7a4b7d9..59453cc 100644 --- a/src/java/org/apache/cassandra/db/compaction/CompactionController.java +++ b/src/java/org/apache/cassandra/db/compaction/CompactionController.java @@ -102,7 +102,12 @@ public class CompactionController long minTimestamp = Long.MAX_VALUE; for (SSTableReader sstable : overlapping) -minTimestamp = Math.min(minTimestamp, sstable.getMinTimestamp()); +{ +// Overlapping might include fully expired sstables. What we care about here is +// the min timestamp of the overlapping sstables that actually contain live data. +if (sstable.getSSTableMetadata().maxLocalDeletionTime >= gcBefore) +minTimestamp = Math.min(minTimestamp, sstable.getMinTimestamp()); +} for (SSTableReader candidate : compacting) {
[2/3] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/69b7dd32 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/69b7dd32 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/69b7dd32 Branch: refs/heads/cassandra-2.2 Commit: 69b7dd327443239b70a104dfe960bd0aa2ccf0a5 Parents: 16665ee 9e60611 Author: Marcus Eriksson marc...@apache.org Authored: Fri Jun 12 18:51:39 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jun 12 18:51:39 2015 +0200 -- .../apache/cassandra/db/compaction/CompactionController.java | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/69b7dd32/src/java/org/apache/cassandra/db/compaction/CompactionController.java --
cassandra git commit: Ignore fully expired sstables when finding min timestamp
Repository: cassandra Updated Branches: refs/heads/cassandra-2.0 3ddd17b77 -> 9e60611fb Ignore fully expired sstables when finding min timestamp Patch by marcuse; reviewed by slebresne for CASSANDRA-9572 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9e60611f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9e60611f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9e60611f Branch: refs/heads/cassandra-2.0 Commit: 9e60611fb807ad1bd03a13ef1fe55bf905100064 Parents: 3ddd17b Author: Marcus Eriksson marc...@apache.org Authored: Thu Jun 11 08:33:54 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jun 12 18:50:01 2015 +0200 -- .../apache/cassandra/db/compaction/CompactionController.java | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/9e60611f/src/java/org/apache/cassandra/db/compaction/CompactionController.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionController.java b/src/java/org/apache/cassandra/db/compaction/CompactionController.java index 7a4b7d9..59453cc 100644 --- a/src/java/org/apache/cassandra/db/compaction/CompactionController.java +++ b/src/java/org/apache/cassandra/db/compaction/CompactionController.java @@ -102,7 +102,12 @@ public class CompactionController long minTimestamp = Long.MAX_VALUE; for (SSTableReader sstable : overlapping) -minTimestamp = Math.min(minTimestamp, sstable.getMinTimestamp()); +{ +// Overlapping might include fully expired sstables. What we care about here is +// the min timestamp of the overlapping sstables that actually contain live data. +if (sstable.getSSTableMetadata().maxLocalDeletionTime >= gcBefore) +minTimestamp = Math.min(minTimestamp, sstable.getMinTimestamp()); +} for (SSTableReader candidate : compacting) {
[2/2] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/69b7dd32 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/69b7dd32 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/69b7dd32 Branch: refs/heads/cassandra-2.1 Commit: 69b7dd327443239b70a104dfe960bd0aa2ccf0a5 Parents: 16665ee 9e60611 Author: Marcus Eriksson marc...@apache.org Authored: Fri Jun 12 18:51:39 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jun 12 18:51:39 2015 +0200 -- .../apache/cassandra/db/compaction/CompactionController.java | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/69b7dd32/src/java/org/apache/cassandra/db/compaction/CompactionController.java --
[3/3] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/271c9e4a Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/271c9e4a Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/271c9e4a Branch: refs/heads/cassandra-2.2 Commit: 271c9e4ac7a71252e4f4f1984fd4f8f16058bcde Parents: b61da9b 69b7dd3 Author: Marcus Eriksson marc...@apache.org Authored: Fri Jun 12 18:51:48 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jun 12 18:51:48 2015 +0200 -- .../apache/cassandra/db/compaction/CompactionController.java | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/271c9e4a/src/java/org/apache/cassandra/db/compaction/CompactionController.java --
[4/4] cassandra git commit: Merge branch 'cassandra-2.2' into trunk
Merge branch 'cassandra-2.2' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c2e54ddc Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c2e54ddc Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c2e54ddc Branch: refs/heads/trunk Commit: c2e54ddc3f47912814323dcc4fca45300db2c518 Parents: 8c19fd6 271c9e4 Author: Marcus Eriksson marc...@apache.org Authored: Fri Jun 12 18:52:05 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jun 12 18:52:05 2015 +0200 -- .../apache/cassandra/db/compaction/CompactionController.java | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) --
[1/4] cassandra git commit: Ignore fully expired sstables when finding min timestamp
Repository: cassandra Updated Branches: refs/heads/trunk 8c19fd638 -> c2e54ddc3 Ignore fully expired sstables when finding min timestamp Patch by marcuse; reviewed by slebresne for CASSANDRA-9572 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9e60611f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9e60611f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9e60611f Branch: refs/heads/trunk Commit: 9e60611fb807ad1bd03a13ef1fe55bf905100064 Parents: 3ddd17b Author: Marcus Eriksson marc...@apache.org Authored: Thu Jun 11 08:33:54 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jun 12 18:50:01 2015 +0200 -- .../apache/cassandra/db/compaction/CompactionController.java | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/9e60611f/src/java/org/apache/cassandra/db/compaction/CompactionController.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionController.java b/src/java/org/apache/cassandra/db/compaction/CompactionController.java index 7a4b7d9..59453cc 100644 --- a/src/java/org/apache/cassandra/db/compaction/CompactionController.java +++ b/src/java/org/apache/cassandra/db/compaction/CompactionController.java @@ -102,7 +102,12 @@ public class CompactionController long minTimestamp = Long.MAX_VALUE; for (SSTableReader sstable : overlapping) -minTimestamp = Math.min(minTimestamp, sstable.getMinTimestamp()); +{ +// Overlapping might include fully expired sstables. What we care about here is +// the min timestamp of the overlapping sstables that actually contain live data. +if (sstable.getSSTableMetadata().maxLocalDeletionTime >= gcBefore) +minTimestamp = Math.min(minTimestamp, sstable.getMinTimestamp()); +} for (SSTableReader candidate : compacting) {
[2/4] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/69b7dd32 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/69b7dd32 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/69b7dd32 Branch: refs/heads/trunk Commit: 69b7dd327443239b70a104dfe960bd0aa2ccf0a5 Parents: 16665ee 9e60611 Author: Marcus Eriksson marc...@apache.org Authored: Fri Jun 12 18:51:39 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jun 12 18:51:39 2015 +0200 -- .../apache/cassandra/db/compaction/CompactionController.java | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/69b7dd32/src/java/org/apache/cassandra/db/compaction/CompactionController.java --
[jira] [Updated] (CASSANDRA-9591) Scrub (recover) sstables even when -Index.db is missing
[ https://issues.apache.org/jira/browse/CASSANDRA-9591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mck updated CASSANDRA-9591: --- Fix Version/s: 2.0.15 Scrub (recover) sstables even when -Index.db is missing --- Key: CASSANDRA-9591 URL: https://issues.apache.org/jira/browse/CASSANDRA-9591 Project: Cassandra Issue Type: Improvement Reporter: mck Assignee: mck Fix For: 2.0.15 Today SSTableReader needs at minimum 3 files to load an sstable: - -Data.db - -CompressionInfo.db - -Index.db But during the scrub process the -Index.db file isn't actually necessary, unless there's corruption in the -Data.db and we want to be able to skip over corrupted rows. Given that there is still a fair chance that there's nothing wrong with the -Data.db file and we're just missing the -Index.db file this patch addresses that situation. So the following patch makes it possible for the StandaloneScrubber (sstablescrub) to recover sstables despite missing -Index.db files. This can happen from a catastrophic incident where data directories have been lost and/or corrupted, or wiped and the backup not healthy. I'm aware that normally one depends on replicas or snapshots to avoid such situations, but such catastrophic incidents do occur in the wild. I have not tested this patch against normal c* operations and all the other (more critical) ways SSTableReader is used. i'll happily do that and add the needed units tests if people see merit in accepting the patch. Otherwise the patch can live with the issue, in-case anyone else needs it. I've uploaded a cassandra distribution bundled with the patch as well to make life a little easier for anyone finding themselves in such a bad situation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9591) Scrub (recover) sstables even when -Index.db is missing
[ https://issues.apache.org/jira/browse/CASSANDRA-9591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mck updated CASSANDRA-9591: --- Attachment: 9591-2.0.txt Scrub (recover) sstables even when -Index.db is missing --- Key: CASSANDRA-9591 URL: https://issues.apache.org/jira/browse/CASSANDRA-9591 Project: Cassandra Issue Type: Improvement Reporter: mck Assignee: mck Labels: sstablescrub Fix For: 2.0.15 Attachments: 9591-2.0.txt Today SSTableReader needs at minimum 3 files to load an sstable: - -Data.db - -CompressionInfo.db - -Index.db But during the scrub process the -Index.db file isn't actually necessary, unless there's corruption in the -Data.db and we want to be able to skip over corrupted rows. Given that there is still a fair chance that there's nothing wrong with the -Data.db file and we're just missing the -Index.db file this patch addresses that situation. So the following patch makes it possible for the StandaloneScrubber (sstablescrub) to recover sstables despite missing -Index.db files. This can happen from a catastrophic incident where data directories have been lost and/or corrupted, or wiped and the backup not healthy. I'm aware that normally one depends on replicas or snapshots to avoid such situations, but such catastrophic incidents do occur in the wild. I have not tested this patch against normal c* operations and all the other (more critical) ways SSTableReader is used. i'll happily do that and add the needed units tests if people see merit in accepting the patch. Otherwise the patch can live with the issue, in-case anyone else needs it. I've uploaded a cassandra distribution bundled with the patch as well to make life a little easier for anyone finding themselves in such a bad situation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9592) Periodically attempt to submit background compaction tasks
Benedict created CASSANDRA-9592: --- Summary: Periodically attempt to submit background compaction tasks Key: CASSANDRA-9592 URL: https://issues.apache.org/jira/browse/CASSANDRA-9592 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 2.1.x There are more race conditions affecting compaction task submission than just CASSANDRA-7745, so to prevent some of these problems stalling compactions, I propose simply submitting background compactions once every minute, if possible. This will typically be a no-op, but there's no harm in that, since it's very cheap to do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
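The periodic "submit if possible" idea above can be sketched with a `ScheduledExecutorService`. This is an illustrative stand-in, not the ticket's actual patch: the real task would be the compaction manager's background-submission call (a no-op when there is nothing to do) on a one-minute period, shortened here so the demo runs quickly.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PeriodicSubmitter {
    // Fire `runs` submission attempts at a fixed period; returns true once
    // they have all fired within the timeout. The countDown() stands in for
    // the cheap, usually-no-op background compaction submission.
    static boolean fires(int runs, long periodMs, long timeoutSec) throws InterruptedException {
        CountDownLatch ran = new CountDownLatch(runs);
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(ran::countDown, 0, periodMs, TimeUnit.MILLISECONDS);
        try {
            return ran.await(timeoutSec, TimeUnit.SECONDS);
        } finally {
            timer.shutdownNow(); // stop the timer once the demo is done
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(fires(3, 20, 5)); // true: the task keeps firing
    }
}
```

The point of the design is that the timer needs no knowledge of which race was lost: as long as the periodic attempt runs, any missed submission is repaired within one period.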
[jira] [Commented] (CASSANDRA-9590) Support for both encrypted and unencrypted native transport connections
[ https://issues.apache.org/jira/browse/CASSANDRA-9590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583429#comment-14583429 ] Mike Adamson commented on CASSANDRA-9590: - Have you considered doing this with TLS instead SSL? That would allow encrypted and unencrypted connections over the same port. Support for both encrypted and unencrypted native transport connections --- Key: CASSANDRA-9590 URL: https://issues.apache.org/jira/browse/CASSANDRA-9590 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stefan Podkowinski Enabling encryption for native transport currently turns SSL exclusively on or off for the opened socket. Migrating from plain to encrypted requires to migrate all native clients as well and redeploy all of them at the same time after starting the SSL enabled Cassandra nodes. This patch would allow to start Cassandra with both an unencrypted and ssl enabled native port. Clients can connect to either, based whether they support ssl or not. This has been implemented by introducing a new {{native_transport_port_ssl}} config option. There would be three scenarios: * client encryption disabled: native_transport_port unencrypted, port_ssl not used * client encryption enabled, port_ssl not set: encrypted native_transport_port * client encryption enabled and port_ssl set: native_transport_port unencrypted, port_ssl encrypted This approach would keep configuration behavior fully backwards compatible. Patch proposal (tests will be added later in case people will speak out in favor for the patch): [Diff trunk|https://github.com/apache/cassandra/compare/trunk...spodkowinski:feat/optionalnativessl], [Patch against trunk|https://github.com/apache/cassandra/compare/trunk...spodkowinski:feat/optionalnativessl.patch] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-7918: --- Reviewer: Joshua McKenzie Provide graphing tool along with cassandra-stress - Key: CASSANDRA-7918 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Ryan McGuire Priority: Minor Attachments: 7918.patch Whilst cstar makes some pretty graphs, they're a little limited and also require you to run your tests through it. It would be useful to be able to graph results from any stress run easily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8460) Make it possible to move non-compacting sstables to slow/big storage in DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-8460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583809#comment-14583809 ] Jeff Jirsa edited comment on CASSANDRA-8460 at 6/12/15 6:07 PM: {quote}yes, I've been thinking maybe adding priorities or tags to the data directories, but that is probably not needed now. Adding a flag to each data_directory that states whether it is for archival storage or not is probably enough for now.{quote} Asking for clarification to make sure I don't go too far into pony land: So my initial approach was to define a second config item, separate from {{data_file_directories}} entirely, so that no other code needed to be aware of it except for classes explicitly wanting to use `archive` tier storage ( {{dd.getAllDataFileLocations()}} would not return the archive tier, but rather add a {{dd.getArchiveDataFileLocations()}} specifically for the slow class of storage). It sounds from your description you're envisioning changing the list of data_file_locations to a map {noformat} tag1:location1,tag1:location2,tag3:location3 {noformat} or {noformat} tag1:[location1,location2],tag3:[location3] {noformat} In this case, we'd also need to maintain backwards compatibility, which seems fairly straight forward to do (check to see if the provided {{data_files_directory}} is an old-format list rather than map and apply some default tag?) The first approach is clean and isolated, unlikely to introduce surprises, but potentially limits us from being able to do more interesting work with tagged data file directories later (ie: only store data for KS W in data directories tagged X, and KS Y in data directories tagged Z). Can you clarify which best fits your expectations? was (Author: jjirsa): {quote}yes, I've been thinking maybe adding priorities or tags to the data directories, but that is probably not needed now. 
Adding a flag to each data_directory that states whether it is for archival storage or not is probably enough for now.{quote} Asking for clarification to make sure I don't go too far into pony land: So my initial approach was to define a second config item, separate from {{data_file_directories}} entirely, so that no other code needed to be aware of it except for classes explicitly wanting to use `archive` tier storage ( {{dd.getAllDataFileLocations()}} would not return the archive tier, but rather add a {{dd.getArchiveDataFileLocations()}} specifically for the slow class of storage). It sounds from your description you're envisioning changing the list of data_file_locations to a map {noformat} [tag1:location1,tag1:location2,tag3:location3] {noformat} or {noformat} tag1:[location1,location2],tag3:[location3] {noformat} In this case, we'd also need to maintain backwards compatibility, which seems fairly straight forward to do (check to see if the provided {{data_files_directory}} is an old-format list rather than map and apply some default tag?) The first approach is clean and isolated, unlikely to introduce surprises, but potentially limits us from being able to do more interesting work with tagged data file directories later (ie: only store data for KS W in data directories tagged X, and KS Y in data directories tagged Z). Can you clarify which best fits your expectations? Make it possible to move non-compacting sstables to slow/big storage in DTCS Key: CASSANDRA-8460 URL: https://issues.apache.org/jira/browse/CASSANDRA-8460 Project: Cassandra Issue Type: Improvement Reporter: Marcus Eriksson Labels: dtcs It would be nice if we could configure DTCS to have a set of extra data directories where we move the sstables once they are older than max_sstable_age_days. This would enable users to have a quick, small SSD for hot, new data, and big spinning disks for data that is rarely read and never compacted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
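The backwards-compatibility check Jeff describes — accept the old flat list of directories, or the proposed tagged map, applying a default tag to untagged entries — can be sketched as follows. The format and names here are hypothetical, mirroring the comment's `tag:location` idea, not an agreed-upon config syntax:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DataDirConfig {
    // Parse data directories from either the legacy flat list
    // "/data1,/data2" or the proposed tagged form "hot:/ssd1,archive:/hdd1".
    // Returns location -> tag; untagged entries get the "default" tag.
    static Map<String, String> parse(String raw) {
        Map<String, String> dirs = new LinkedHashMap<>();
        for (String entry : raw.split(",")) {
            int colon = entry.indexOf(':');
            if (colon < 0)
                dirs.put(entry.trim(), "default");            // legacy entry
            else
                dirs.put(entry.substring(colon + 1).trim(),
                         entry.substring(0, colon).trim());   // tagged entry
        }
        return dirs;
    }

    public static void main(String[] args) {
        System.out.println(parse("/data1,/data2"));
        System.out.println(parse("hot:/ssd1,archive:/hdd1"));
    }
}
```

Because untagged entries map to a default tag, callers that only know about the default tier see exactly the old behavior, while tier-aware code (e.g. DTCS archiving) can ask for a specific tag.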
[jira] [Created] (CASSANDRA-9593) Compaction may stall due to race condition
Benedict created CASSANDRA-9593: --- Summary: Compaction may stall due to race condition Key: CASSANDRA-9593 URL: https://issues.apache.org/jira/browse/CASSANDRA-9593 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Priority: Minor Fix For: 2.2.x If the maximum number of compactions are running, and they all terminate simultaneously, they can fail to submit any further compaction tasks. Further, since each only submits one on completion, we only need two of these to race with each other to reduce the number of active compactions below the configured concurrency level. There are a couple of ways to get around this. The simplest is to submit a task to another thread pool to perform the submitBackgroundTask(), but this may be unnecessarily delayed. Another is to maintain a separate count of active compaction tasks, that is decremented while the thread is still serving the request. A partial solution is to just discount the calling thread from the count of active tasks, so at least one of any competitors will win. The problem is mitigated considerably by CASSANDRA-9592, so there's no urgency to this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
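The second idea in the ticket above, a separate count of active compaction tasks that is decremented while the finishing thread is still serving the request, can be sketched as below. This is a minimal illustration of the counting scheme, not Cassandra's CompactionManager; the class and method names are invented.

```python
# Sketch: a slot counter where a finishing task decrements the active
# count *before* checking whether another task may be submitted, so the
# finishing thread does not count itself against the concurrency limit.
import threading

class CompactionSlots:
    def __init__(self, concurrency):
        self.concurrency = concurrency
        self.active = 0
        self.lock = threading.Lock()

    def try_acquire(self):
        """Claim a slot if the pool is below its concurrency level."""
        with self.lock:
            if self.active < self.concurrency:
                self.active += 1
                return True
            return False

    def release_and_check(self):
        """Decrement while 'still serving the request', then report
        whether a follow-up task could be submitted immediately."""
        with self.lock:
            self.active -= 1
            return self.active < self.concurrency

slots = CompactionSlots(concurrency=2)
assert slots.try_acquire() and slots.try_acquire()
assert not slots.try_acquire()     # pool saturated
assert slots.release_and_check()   # finishing thread sees free capacity
```

If the decrement instead happened only after the submission check, two simultaneously finishing tasks could each see a "full" pool and submit nothing, which is the stall described in the ticket.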
[jira] [Updated] (CASSANDRA-9591) Scrub (recover) sstables even when -Index.db is missing
[ https://issues.apache.org/jira/browse/CASSANDRA-9591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mck updated CASSANDRA-9591: --- Description: Today SSTableReader needs at minimum 3 files to load an sstable: - -Data.db - -CompressionInfo.db - -Index.db But during the scrub process the -Index.db file isn't actually necessary, unless there's corruption in the -Data.db and we want to be able to skip over corrupted rows. Given that there is still a fair chance that there's nothing wrong with the -Data.db file and we're just missing the -Index.db file, this patch addresses that situation. So the following patch makes it possible for the StandaloneScrubber (sstablescrub) to recover sstables despite missing -Index.db files. This can happen from a catastrophic incident where data directories have been lost and/or corrupted, or wiped and the backup not healthy. I'm aware that normally one depends on replicas or snapshots to avoid such situations, but such catastrophic incidents do occur in the wild. I have not tested this patch against normal c* operations and all the other (more critical) ways SSTableReader is used. I'll happily do that and add the needed unit tests if people see merit in accepting the patch. Otherwise the patch can live with the issue, in case anyone else needs it. There's also a cassandra distribution bundled with the patch [here|https://github.com/michaelsembwever/cassandra/releases/download/2.0.15-recover-sstables-without-indexdb/apache-cassandra-2.0.15-recover-sstables-without-indexdb.tar.gz] to make life a little easier for anyone finding themselves in such a bad situation. was: Today SSTableReader needs at minimum 3 files to load an sstable: - -Data.db - -CompressionInfo.db - -Index.db But during the scrub process the -Index.db file isn't actually necessary, unless there's corruption in the -Data.db and we want to be able to skip over corrupted rows. 
Given that there is still a fair chance that there's nothing wrong with the -Data.db file and we're just missing the -Index.db file, this patch addresses that situation. So the following patch makes it possible for the StandaloneScrubber (sstablescrub) to recover sstables despite missing -Index.db files. This can happen from a catastrophic incident where data directories have been lost and/or corrupted, or wiped and the backup not healthy. I'm aware that normally one depends on replicas or snapshots to avoid such situations, but such catastrophic incidents do occur in the wild. I have not tested this patch against normal c* operations and all the other (more critical) ways SSTableReader is used. I'll happily do that and add the needed unit tests if people see merit in accepting the patch. Otherwise the patch can live with the issue, in case anyone else needs it. I've uploaded a cassandra distribution bundled with the patch as well to make life a little easier for anyone finding themselves in such a bad situation. Scrub (recover) sstables even when -Index.db is missing --- Key: CASSANDRA-9591 URL: https://issues.apache.org/jira/browse/CASSANDRA-9591 Project: Cassandra Issue Type: Improvement Reporter: mck Assignee: mck Labels: sstablescrub Fix For: 2.0.15 Attachments: 9591-2.0.txt Today SSTableReader needs at minimum 3 files to load an sstable: - -Data.db - -CompressionInfo.db - -Index.db But during the scrub process the -Index.db file isn't actually necessary, unless there's corruption in the -Data.db and we want to be able to skip over corrupted rows. Given that there is still a fair chance that there's nothing wrong with the -Data.db file and we're just missing the -Index.db file, this patch addresses that situation. So the following patch makes it possible for the StandaloneScrubber (sstablescrub) to recover sstables despite missing -Index.db files. 
This can happen from a catastrophic incident where data directories have been lost and/or corrupted, or wiped and the backup not healthy. I'm aware that normally one depends on replicas or snapshots to avoid such situations, but such catastrophic incidents do occur in the wild. I have not tested this patch against normal c* operations and all the other (more critical) ways SSTableReader is used. I'll happily do that and add the needed unit tests if people see merit in accepting the patch. Otherwise the patch can live with the issue, in case anyone else needs it. There's also a cassandra distribution bundled with the patch [here|https://github.com/michaelsembwever/cassandra/releases/download/2.0.15-recover-sstables-without-indexdb/apache-cassandra-2.0.15-recover-sstables-without-indexdb.tar.gz] to make life a little easier for anyone finding themselves in such a bad situation.
[jira] [Updated] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-7918: Attachment: reads.svg Provide graphing tool along with cassandra-stress - Key: CASSANDRA-7918 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Ryan McGuire Priority: Minor Attachments: 7918.patch, reads.svg Whilst cstar makes some pretty graphs, they're a little limited and also require you to run your tests through it. It would be useful to be able to graph results from any stress run easily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583829#comment-14583829 ] Benedict edited comment on CASSANDRA-7918 at 6/12/15 6:19 PM: -- FTR, gnuplot _does_ (apparently) work on Windows :) edit: to avoid hunting around inside CASSANDRA-7282, I've uploaded the read comparison graph to this ticket was (Author: benedict): FTR, gnuplot _does_ (apparently) work on Windows :) Provide graphing tool along with cassandra-stress - Key: CASSANDRA-7918 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Ryan McGuire Priority: Minor Attachments: 7918.patch, reads.svg Whilst cstar makes some pretty graphs, they're a little limited and also require you to run your tests through it. It would be useful to be able to graph results from any stress run easily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-5322) Make dtest logging more granular
[ https://issues.apache.org/jira/browse/CASSANDRA-5322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Wang resolved CASSANDRA-5322. --- Resolution: Fixed Fix Version/s: (was: 3.x) 2.2.x 2.1.x Reviewer: Philip Thompson Reproduced In: 2.1.5 Modified ccmlib/cluster.py, ccmlib/common.py, and cassandra-dtest/dtest.py. I modified the dtest environment variables DEBUG and TRACE so that they could not only accept true/yes and false/no, but also names of C* classes (can add multiple by separating them with a colon). I did this using three functions: var_debug, var_trace, and modify_log. The first two change the log_level of the cluster for a specific class (if that is the case), and modify_log calls all the potential changes to the log levels all at once. In cluster.py, I modified the add and set_log_level functions, and also added two global arrays, _debug and _trace. The two global variables serve to keep track of what classes have the respective log levels. In the set_log_level function, we check if there is a class_name provided, and if there is, we make sure it's not already being called. We then append the class to the respective global array, and then change the log_level on the node level. In the add function, I added a feature so that whenever a node is added, it'll automatically take in the settings already set forth for class logging levels. Finally, in common.py, I modified the replaces_or_add_into_file_tail function. Before, all additional modifications would be written on the very last line of the file after the closing tag, which means it wasn't being read. This includes modifications to the log level. I changed it so that it would be added before the closing tag. 
Make dtest logging more granular - Key: CASSANDRA-5322 URL: https://issues.apache.org/jira/browse/CASSANDRA-5322 Project: Cassandra Issue Type: Test Reporter: Ryan McGuire Assignee: Steve Wang Fix For: 2.1.x, 2.2.x From Brandon: We need a way (might need to go in ccm, I haven't looked) to just set one class to DEBUG or TRACE, like we'd do in conf/log4j-server.properties but with an env var preferably, so I can control it via buildbot, since it's better at reproducing some issues than I am sometimes, but I don't want to run the full hammer debug all the time. Also, a way to set Tester.allow_log_errors to false via an env var, since sometimes there's an error there that takes a while to fix but is cosmetic, and in the meantime I want to catch new failures so we don't fall behind. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
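The env-var convention described in the resolution comment above (DEBUG and TRACE accept true/yes, false/no, or a colon-separated list of C* class names) can be sketched like this. parse_log_env is a hypothetical helper for illustration, not the actual cassandra-dtest code.

```python
# Sketch of parsing a DEBUG/TRACE env var per the convention above:
# true/yes enables the level globally; false/no (or unset) disables it;
# anything else is treated as colon-separated class names.
def parse_log_env(value):
    """Return (enabled_globally, [class_names]) for a DEBUG/TRACE value."""
    if value is None:
        return (False, [])
    lowered = value.strip().lower()
    if lowered in ("true", "yes"):
        return (True, [])
    if lowered in ("false", "no", ""):
        return (False, [])
    # Otherwise the value names specific classes, separated by colons.
    return (False, [c for c in value.split(":") if c])

assert parse_log_env("yes") == (True, [])
assert parse_log_env(
    "org.apache.cassandra.db.Memtable:org.apache.cassandra.gms.Gossiper"
) == (False, ["org.apache.cassandra.db.Memtable",
              "org.apache.cassandra.gms.Gossiper"])
```

A caller would then set the log level per returned class name on each node, which matches the per-class set_log_level behavior the comment describes.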
[jira] [Commented] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583853#comment-14583853 ] Benedict commented on CASSANDRA-7918: - It's worth pointing out that the user doesn't have to ever touch gnuplot; it compiles scripts for gnuplot, and shells out itself. I don't have any specific attachment to it, though, and if we can get the same info via some other means I'm thrilled. My _ideal_ world would be one with graphs akin to those I produced with gnuplot, but in javascript, with interactive buttons _most especially_ for turning on/off certain aspects of the graph, so that they can more easily be viewed. For instance, adding/removing specific branches, or latency bands. I think stress should output all of the settings it receives if {{-log level=verbose}} is provided. However I'm not sure we want to tightly couple stress to the cassandra.yaml or the SHA. The approach I took was to parse a stress output, so if we standardise our performance tests to always run stress in verbose mode, the output file can become the canonical source of truth, and the graph generated on the fly. Perhaps we can SHA the output file, and store it in its entirety somewhere, inside a zip containing the cassandra.yaml, so that the graph can just contain this hash of the output file to route us to the permanent record? Provide graphing tool along with cassandra-stress - Key: CASSANDRA-7918 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Ryan McGuire Priority: Minor Attachments: 7918.patch, reads.svg Whilst cstar makes some pretty graphs, they're a little limited and also require you to run your tests through it. It would be useful to be able to graph results from any stress run easily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
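The record-keeping idea Benedict floats above, hash the stress output file and store it with the cassandra.yaml in an archive so a graph need only carry the hash, can be sketched as below. The file names and function are invented for illustration; the ticket does not specify a concrete layout.

```python
# Sketch: hash a stress output file, bundle it with cassandra.yaml in a
# zip named after the hash, and return the hash for the graph to embed.
import hashlib
import os
import tempfile
import zipfile

def archive_stress_run(stress_output: bytes, yaml_text: bytes, out_dir: str) -> str:
    sha = hashlib.sha256(stress_output).hexdigest()
    path = os.path.join(out_dir, f"stress-{sha[:12]}.zip")
    with zipfile.ZipFile(path, "w") as z:
        z.writestr("stress-output.log", stress_output)
        z.writestr("cassandra.yaml", yaml_text)
    return sha

with tempfile.TemporaryDirectory() as d:
    sha = archive_stress_run(b"op rate: 1000/s\n", b"concurrent_reads: 32\n", d)
    assert len(sha) == 64  # hex SHA-256 digest
```

With the verbose stress output as the canonical source of truth, any graph rendered from it can cite the hash to route back to the permanent record.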
cassandra git commit: Add collections, tuple, and UDT to JSON type documentation
Repository: cassandra Updated Branches: refs/heads/cassandra-2.2 271c9e4ac - a5be8f199 Add collections, tuple, and UDT to JSON type documentation These were accidentally omitted when the docs were first written for CASSANDRA-7970. Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a5be8f19 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a5be8f19 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a5be8f19 Branch: refs/heads/cassandra-2.2 Commit: a5be8f199150c5f7a7ad9df3babad1bf950dd4b3 Parents: 271c9e4 Author: Tyler Hobbs tylerlho...@gmail.com Authored: Fri Jun 12 13:39:44 2015 -0500 Committer: Tyler Hobbs tylerlho...@gmail.com Committed: Fri Jun 12 13:39:44 2015 -0500 -- doc/cql3/CQL.textile | 5 + 1 file changed, 5 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a5be8f19/doc/cql3/CQL.textile -- diff --git a/doc/cql3/CQL.textile b/doc/cql3/CQL.textile index 3755a2d..69c6032 100644 --- a/doc/cql3/CQL.textile +++ b/doc/cql3/CQL.textile @@ -1972,10 +1972,15 @@ Where possible, Cassandra will represent and accept data types in their native @ |@float@|integer, float, string|float|String must be valid integer or float| |@inet@ |string|string |IPv4 or IPv6 address| |@int@ |integer, string |integer |String must be valid 32 bit integer| +|@list@ |list, string |list |Uses JSON's native list representation| +|@map@ |map, string |map |Uses JSON's native map representation| +|@set@ |list, string |list |Uses JSON's native list representation| |@text@ |string|string |Uses JSON's @\u@ character escape| |@time@ |string|string |Time of day in format @HH-MM-SS[.f]@| |@timestamp@|integer, string |string |A timestamp. Strings constant are allow to input timestamps as dates, see Working with dates:#usingdates below for more information. Datestamps with format @-MM-DD HH:MM:SS.SSS@ are returned.| |@timeuuid@ |string|string |Type 1 UUID. 
See Constants:#constants for the UUID format| +|@tuple@|list, string |list |Uses JSON's native list representation| +|@UDT@ |map, string |map |Uses JSON's native map representation with field names as keys| |@uuid@ |string|string |See Constants:#constants for the UUID format| |@varchar@ |string|string |Uses JSON's @\u@ character escape| |@varint@ |integer, string |integer |Variable length; may overflow 32 or 64 bit integers in client-side decoder|
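The rows added in the commit above can be illustrated with plain JSON: lists, sets, and tuples use JSON's native list representation, while maps and UDTs use JSON's native map representation (UDT field names as keys). The column names and values below are invented examples, not taken from the commit.

```python
# Example JSON payload for a hypothetical row, following the type
# mappings documented above (list/set/tuple -> JSON list, map/UDT ->
# JSON map with field names as keys).
import json

row = {
    "tags": ["cql", "json"],                  # list<text>   -> JSON list
    "emails": ["a@x.org", "b@x.org"],         # set<text>    -> JSON list too
    "scores": {"alice": 10, "bob": 7},        # map<text,int> -> JSON map
    "coords": [12.5, -3.25],                  # tuple<double,double> -> JSON list
    "address": {"street": "Main St", "zip": "78701"},  # UDT -> JSON map
}
payload = json.dumps(row, sort_keys=True)
assert json.loads(payload) == row  # round-trips cleanly
```

Note that because sets are rendered as JSON lists, a client cannot distinguish a set<text> column from a list<text> column by the JSON alone; the table schema supplies that distinction.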
[1/2] cassandra git commit: Add collections, tuple, and UDT to JSON type documentation
Repository: cassandra Updated Branches: refs/heads/trunk c2e54ddc3 - 40c3e8922 Add collections, tuple, and UDT to JSON type documentation These were accidentally omitted when the docs were first written for CASSANDRA-7970. Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a5be8f19 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a5be8f19 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a5be8f19 Branch: refs/heads/trunk Commit: a5be8f199150c5f7a7ad9df3babad1bf950dd4b3 Parents: 271c9e4 Author: Tyler Hobbs tylerlho...@gmail.com Authored: Fri Jun 12 13:39:44 2015 -0500 Committer: Tyler Hobbs tylerlho...@gmail.com Committed: Fri Jun 12 13:39:44 2015 -0500 -- doc/cql3/CQL.textile | 5 + 1 file changed, 5 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a5be8f19/doc/cql3/CQL.textile -- diff --git a/doc/cql3/CQL.textile b/doc/cql3/CQL.textile index 3755a2d..69c6032 100644 --- a/doc/cql3/CQL.textile +++ b/doc/cql3/CQL.textile @@ -1972,10 +1972,15 @@ Where possible, Cassandra will represent and accept data types in their native @ |@float@|integer, float, string|float|String must be valid integer or float| |@inet@ |string|string |IPv4 or IPv6 address| |@int@ |integer, string |integer |String must be valid 32 bit integer| +|@list@ |list, string |list |Uses JSON's native list representation| +|@map@ |map, string |map |Uses JSON's native map representation| +|@set@ |list, string |list |Uses JSON's native list representation| |@text@ |string|string |Uses JSON's @\u@ character escape| |@time@ |string|string |Time of day in format @HH-MM-SS[.f]@| |@timestamp@|integer, string |string |A timestamp. Strings constant are allow to input timestamps as dates, see Working with dates:#usingdates below for more information. Datestamps with format @-MM-DD HH:MM:SS.SSS@ are returned.| |@timeuuid@ |string|string |Type 1 UUID. 
See Constants:#constants for the UUID format| +|@tuple@|list, string |list |Uses JSON's native list representation| +|@UDT@ |map, string |map |Uses JSON's native map representation with field names as keys| |@uuid@ |string|string |See Constants:#constants for the UUID format| |@varchar@ |string|string |Uses JSON's @\u@ character escape| |@varint@ |integer, string |integer |Variable length; may overflow 32 or 64 bit integers in client-side decoder|
[2/2] cassandra git commit: Merge branch 'cassandra-2.2' into trunk
Merge branch 'cassandra-2.2' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/40c3e892 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/40c3e892 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/40c3e892 Branch: refs/heads/trunk Commit: 40c3e892291315dd1531159f5dc5f51e74bb1ac2 Parents: c2e54dd a5be8f1 Author: Tyler Hobbs tylerlho...@gmail.com Authored: Fri Jun 12 13:40:56 2015 -0500 Committer: Tyler Hobbs tylerlho...@gmail.com Committed: Fri Jun 12 13:40:56 2015 -0500 -- doc/cql3/CQL.textile | 5 + 1 file changed, 5 insertions(+) --
[jira] [Commented] (CASSANDRA-8460) Make it possible to move non-compacting sstables to slow/big storage in DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-8460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583809#comment-14583809 ] Jeff Jirsa commented on CASSANDRA-8460: --- {quote}yes, I've been thinking maybe adding priorities or tags to the data directories, but that is probably not needed now. Adding a flag to each data_directory that states whether it is for archival storage or not is probably enough for now.{quote} Asking for clarification to make sure I don't go too far into pony land: So my initial approach was to define a second config item, separate from {{data_file_directories}} entirely, so that no other code needed to be aware of it except for classes explicitly wanting to use `archive` tier storage ( {{dd.getAllDataFileLocations()}} would not return the archive tier, but rather add a {{dd.getArchiveDataFileLocations()}} specifically for the slow class of storage). It sounds from your description like you're envisioning changing the list of data_file_locations to a list of maps {noformat} [tag1:location1,tag1:location2,tag3:location3] {noformat} or {noformat} tag1:[location1,location2],tag3:[location3] {noformat} In this case, we'd also need to maintain backwards compatibility, which seems fairly straightforward to do (check to see if the provided {{data_files_directory}} is an old-format list rather than a map and apply some default tag?) The first approach is clean and isolated, unlikely to introduce surprises, but potentially limits us from being able to do more interesting work with tagged data file directories later (i.e., only store data for KS W in data directories tagged X, and KS Y in data directories tagged Z). Can you clarify which best fits your expectations? 
Make it possible to move non-compacting sstables to slow/big storage in DTCS Key: CASSANDRA-8460 URL: https://issues.apache.org/jira/browse/CASSANDRA-8460 Project: Cassandra Issue Type: Improvement Reporter: Marcus Eriksson Labels: dtcs It would be nice if we could configure DTCS to have a set of extra data directories where we move the sstables once they are older than max_sstable_age_days. This would enable users to have a quick, small SSD for hot, new data, and big spinning disks for data that is rarely read and never compacted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8460) Make it possible to move non-compacting sstables to slow/big storage in DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-8460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583809#comment-14583809 ] Jeff Jirsa edited comment on CASSANDRA-8460 at 6/12/15 6:07 PM: {quote}yes, I've been thinking maybe adding priorities or tags to the data directories, but that is probably not needed now. Adding a flag to each data_directory that states whether it is for archival storage or not is probably enough for now.{quote} Asking for clarification to make sure I don't go too far into pony land: So my initial approach was to define a second config item, separate from {{data_file_directories}} entirely, so that no other code needed to be aware of it except for classes explicitly wanting to use `archive` tier storage ( {{dd.getAllDataFileLocations()}} would not return the archive tier, but rather add a {{dd.getArchiveDataFileLocations()}} specifically for the slow class of storage). It sounds from your description like you're envisioning changing the list of data_file_locations to a map {noformat} [tag1:location1,tag1:location2,tag3:location3] {noformat} or {noformat} tag1:[location1,location2],tag3:[location3] {noformat} In this case, we'd also need to maintain backwards compatibility, which seems fairly straightforward to do (check to see if the provided {{data_files_directory}} is an old-format list rather than a map and apply some default tag?) The first approach is clean and isolated, unlikely to introduce surprises, but potentially limits us from being able to do more interesting work with tagged data file directories later (i.e., only store data for KS W in data directories tagged X, and KS Y in data directories tagged Z). Can you clarify which best fits your expectations? was (Author: jjirsa): {quote}yes, I've been thinking maybe adding priorities or tags to the data directories, but that is probably not needed now. 
Adding a flag to each data_directory that states whether it is for archival storage or not is probably enough for now.{quote} Asking for clarification to make sure I don't go too far into pony land: So my initial approach was to define a second config item, separate from {{data_file_directories}} entirely, so that no other code needed to be aware of it except for classes explicitly wanting to use `archive` tier storage ( {{dd.getAllDataFileLocations()}} would not return the archive tier, but rather add a {{dd.getArchiveDataFileLocations()}} specifically for the slow class of storage). It sounds from your description like you're envisioning changing the list of data_file_locations to a list of maps {noformat} [tag1:location1,tag1:location2,tag3:location3] {noformat} or {noformat} tag1:[location1,location2],tag3:[location3] {noformat} In this case, we'd also need to maintain backwards compatibility, which seems fairly straightforward to do (check to see if the provided {{data_files_directory}} is an old-format list rather than a map and apply some default tag?) The first approach is clean and isolated, unlikely to introduce surprises, but potentially limits us from being able to do more interesting work with tagged data file directories later (i.e., only store data for KS W in data directories tagged X, and KS Y in data directories tagged Z). Can you clarify which best fits your expectations? Make it possible to move non-compacting sstables to slow/big storage in DTCS Key: CASSANDRA-8460 URL: https://issues.apache.org/jira/browse/CASSANDRA-8460 Project: Cassandra Issue Type: Improvement Reporter: Marcus Eriksson Labels: dtcs It would be nice if we could configure DTCS to have a set of extra data directories where we move the sstables once they are older than max_sstable_age_days. This would enable users to have a quick, small SSD for hot, new data, and big spinning disks for data that is rarely read and never compacted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583829#comment-14583829 ] Benedict commented on CASSANDRA-7918: - FTR, gnuplot _does_ (apparently) work on Windows :) Provide graphing tool along with cassandra-stress - Key: CASSANDRA-7918 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Ryan McGuire Priority: Minor Attachments: 7918.patch Whilst cstar makes some pretty graphs, they're a little limited and also require you to run your tests through it. It would be useful to be able to graph results from any stress run easily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress
[ https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583840#comment-14583840 ] Joshua McKenzie commented on CASSANDRA-7918: Given our recent regressions and the upcoming effort for a performance testing harness, we need to move on this. Right now we have 1) Benedict's option that has more information but that's written using gnuplot which people feel strongly against and 2) Ryan's option that has less information available but is perhaps more immediately / intuitively digestible and doesn't use gnuplot. [~enigmacurry]: what are the chances you could integrate the throughput/latency/gc and tri-graphing approach benedict took into the existing cstar framework, giving us the best of both worlds? I wouldn't mind seeing the current format of the #'s from your solution below the graphs. One other thing - we need to scrape the cassandra.yaml file and dump out the relevant settings used for the test (or perhaps just all of them at the outset) as well as snapshotting the specific cassandra-stress command used to generate the test results for reproduction. A SHA for commit used on the test would also help, and I think that would give us a solid initial framework to start testing with and have reproducible tests. We can pursue future additions onto this later (capturing system info, /proc/cpuinfo, etc) but there's no point in holding it up to get it to be perfect for our 1st revision. Provide graphing tool along with cassandra-stress - Key: CASSANDRA-7918 URL: https://issues.apache.org/jira/browse/CASSANDRA-7918 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Ryan McGuire Priority: Minor Attachments: 7918.patch, reads.svg Whilst cstar makes some pretty graphs, they're a little limited and also require you to run your tests through it. It would be useful to be able to graph results from any stress run easily. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)