[jira] [Created] (CASSANDRA-13443) V5 protocol flags decoding broken

2017-04-13 Thread Robert Stupp (JIRA)
Robert Stupp created CASSANDRA-13443:


 Summary: V5 protocol flags decoding broken
 Key: CASSANDRA-13443
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13443
 Project: Cassandra
  Issue Type: Bug
Reporter: Robert Stupp
Assignee: Robert Stupp
Priority: Minor


Since native protocol version 5, we deserialize the flags in 
{{org.apache.cassandra.cql3.QueryOptions.Codec#decode}} as follows:
{code}
EnumSet<Flag> flags = Flag.deserialize(version.isGreaterOrEqualTo(ProtocolVersion.V5)
                                       ? (int)body.readUnsignedInt()
                                       : (int)body.readByte());
{code}

This only works as long as the highest bit (0x80) is not used, because {{readByte}} 
sign-extends the value. {{readByte}} must be changed to {{readUnsignedByte}}.
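
For illustration, the sign-extension effect in plain Java (the same widening that the 
{{(int)}} cast performs on the {{readByte}} result; illustrative snippet, not part of the patch):
{code}
byte flagsByte = (byte) 0x80;      // wire byte with the highest flag bit set
int signed   = flagsByte;          // sign-extended to -128 == 0xFFFFFF80 (spurious high bits)
int unsigned = flagsByte & 0xFF;   // 128 == 0x00000080, what readUnsignedByte() yields
{code}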



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13443) V5 protocol flags decoding broken

2017-04-13 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-13443:
-
Status: Patch Available  (was: Open)

||cassandra-3.11|[branch|https://github.com/apache/cassandra/compare/cassandra-3.11...snazy:13443-proto-flags-3.11]|[testall|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-13443-proto-flags-3.11-testall/lastSuccessfulBuild/]|[dtest|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-13443-proto-flags-3.11-dtest/lastSuccessfulBuild/]
||trunk|[branch|https://github.com/apache/cassandra/compare/trunk...snazy:13443-proto-flags-trunk]|[testall|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-13443-proto-flags-trunk-testall/lastSuccessfulBuild/]|[dtest|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-13443-proto-flags-trunk-dtest/lastSuccessfulBuild/]


> V5 protocol flags decoding broken
> -
>
> Key: CASSANDRA-13443
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13443
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
>
> Since native protocol version 5 we deserialize the flags in 
> {{org.apache.cassandra.cql3.QueryOptions.Codec#decode}} as follows:
> {code}
> EnumSet flags = 
> Flag.deserialize(version.isGreaterOrEqualTo(ProtocolVersion.V5)
>? 
> (int)body.readUnsignedInt()
>: (int)body.readByte());
> {code}
> This works until the highest bit (0x80) is not used. {{readByte}} must be 
> changed to {{readUnsignedByte}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13443) V5 protocol flags decoding broken

2017-04-13 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-13443:
-
Fix Version/s: 4.0
   3.11.0

> V5 protocol flags decoding broken
> -
>
> Key: CASSANDRA-13443
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13443
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.11.0, 4.0
>
>
> Since native protocol version 5 we deserialize the flags in 
> {{org.apache.cassandra.cql3.QueryOptions.Codec#decode}} as follows:
> {code}
> EnumSet flags = 
> Flag.deserialize(version.isGreaterOrEqualTo(ProtocolVersion.V5)
>? 
> (int)body.readUnsignedInt()
>: (int)body.readByte());
> {code}
> This works until the highest bit (0x80) is not used. {{readByte}} must be 
> changed to {{readUnsignedByte}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-10145) Change protocol to allow sending key space independent of query string

2017-04-13 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967230#comment-15967230
 ] 

Robert Stupp commented on CASSANDRA-10145:
--

[~jjirsa] what happened to your comment?

> Change protocol to allow sending key space independent of query string
> --
>
> Key: CASSANDRA-10145
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10145
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Vishy Kasar
>Assignee: Sandeep Tamhankar
>  Labels: client-impacting, protocolv5
> Fix For: 4.0
>
> Attachments: 10145-trunk.txt
>
>
> Currently keyspace is either embedded in the query string or set through "use 
> keyspace" on a connection by client driver. 
> There are practical use cases where client user has query and keyspace 
> independently. In order for that scenario to work, they will have to create 
> one client session per keyspace or have to resort to some string replace 
> hackery.
> It will be nice if protocol allowed sending keyspace separately from the 
> query. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13443) V5 protocol flags decoding broken

2017-04-13 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-13443:
-
Reviewer: Stefania

> V5 protocol flags decoding broken
> -
>
> Key: CASSANDRA-13443
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13443
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.11.0, 4.0
>
>
> Since native protocol version 5 we deserialize the flags in 
> {{org.apache.cassandra.cql3.QueryOptions.Codec#decode}} as follows:
> {code}
> EnumSet flags = 
> Flag.deserialize(version.isGreaterOrEqualTo(ProtocolVersion.V5)
>? 
> (int)body.readUnsignedInt()
>: (int)body.readByte());
> {code}
> This works until the highest bit (0x80) is not used. {{readByte}} must be 
> changed to {{readUnsignedByte}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13443) V5 protocol flags decoding broken

2017-04-13 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967251#comment-15967251
 ] 

Stefania commented on CASSANDRA-13443:
--

Nice catch, +1, provided tests pass.

> V5 protocol flags decoding broken
> -
>
> Key: CASSANDRA-13443
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13443
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.11.0, 4.0
>
>
> Since native protocol version 5 we deserialize the flags in 
> {{org.apache.cassandra.cql3.QueryOptions.Codec#decode}} as follows:
> {code}
> EnumSet flags = 
> Flag.deserialize(version.isGreaterOrEqualTo(ProtocolVersion.V5)
>? 
> (int)body.readUnsignedInt()
>: (int)body.readByte());
> {code}
> This works until the highest bit (0x80) is not used. {{readByte}} must be 
> changed to {{readUnsignedByte}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13432) MemtableReclaimMemory can get stuck because of lack of timeout in getTopLevelColumns()

2017-04-13 Thread Corentin Chary (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967256#comment-15967256
 ] 

Corentin Chary commented on CASSANDRA-13432:


Tried the patch, setting the tombstone threshold to one:
{code}
WARN  [SharedPool-Worker-4] 2017-04-13 09:51:55,894 
AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread 
Thread[SharedPool-Worker-4,10,main]: {}
java.lang.RuntimeException: 
org.apache.cassandra.db.filter.TombstoneOverwhelmingException
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2249)
 ~[main/:na]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_121]
at 
org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
 ~[main/:na]
at 
org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$TraceSessionFutureTask.run(AbstractTracingAwareExecutorService.java:136)
 [main/:na]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
[main/:na]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
Caused by: org.apache.cassandra.db.filter.TombstoneOverwhelmingException: null
at 
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:202) 
~[main/:na]
at 
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:163) 
~[main/:na]
at 
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:146)
 ~[main/:na]
at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:125)
 ~[main/:na]
at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:99)
 ~[main/:na]
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
 ~[guava-16.0.jar:na]
at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) 
~[guava-16.0.jar:na]
at 
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:263)
 ~[main/:na]
at 
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:114) 
~[main/:na]
at 
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:88)
 ~[main/:na]
at 
org.apache.cassandra.db.RowIteratorFactory$2.getReduced(RowIteratorFactory.java:99)
 ~[main/:na]
at 
org.apache.cassandra.db.RowIteratorFactory$2.getReduced(RowIteratorFactory.java:71)
 ~[main/:na]
at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:117)
 ~[main/:na]
at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
 ~[main/:na]
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
 ~[guava-16.0.jar:na]
at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) 
~[guava-16.0.jar:na]
at 
org.apache.cassandra.db.ColumnFamilyStore$9.computeNext(ColumnFamilyStore.java:2115)
 ~[main/:na]
at 
org.apache.cassandra.db.ColumnFamilyStore$9.computeNext(ColumnFamilyStore.java:2111)
 ~[main/:na]
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
 ~[guava-16.0.jar:na]
at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) 
~[guava-16.0.jar:na]
at 
org.apache.cassandra.db.ColumnFamilyStore.filter(ColumnFamilyStore.java:2266) 
~[main/:na]
at 
org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:2224)
 ~[main/:na]
at 
org.apache.cassandra.db.PagedRangeCommand.executeLocally(PagedRangeCommand.java:115)
 ~[main/:na]
at 
org.apache.cassandra.service.StorageProxy$LocalRangeSliceRunnable.runMayThrow(StorageProxy.java:1572)
 ~[main/:na]
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2246)
 ~[main/:na]
... 5 common frames omitted
{code}

> MemtableReclaimMemory can get stuck because of lack of timeout in 
> getTopLevelColumns()
> --
>
> Key: CASSANDRA-13432
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13432
> Project: Cassandra
>  Issue Type: Bug
> Environment: cassandra 2.1.15
>Reporter: Corentin Chary
> Fix For: 2.1.x
>
>
> This might affect 3.x too, I'm not sure.
> {code}
> $ nodetool tpstats
> Pool NameActive   Pending  Completed   Blocked  All 
> time blocked
> MutationStage 0 0   32135875 0
>  0
> ReadStage   114 0   29492940 0
>

[jira] [Comment Edited] (CASSANDRA-13432) MemtableReclaimMemory can get stuck because of lack of timeout in getTopLevelColumns()

2017-04-13 Thread Corentin Chary (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967256#comment-15967256
 ] 

Corentin Chary edited comment on CASSANDRA-13432 at 4/13/17 7:54 AM:
-

Tried the patch, setting the tombstone threshold to one:
{code}
ERROR [SharedPool-Worker-4] 2017-04-13 09:51:55,891 QueryFilter.java:201 - 
Scanned over 1 tombstones in system.size_estimates for key: unknown; query 
aborted (see tombstone_failure_threshold).
WARN  [SharedPool-Worker-4] 2017-04-13 09:51:55,894 
AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread 
Thread[SharedPool-Worker-4,10,main]: {}
java.lang.RuntimeException: 
org.apache.cassandra.db.filter.TombstoneOverwhelmingException
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2249)
 ~[main/:na]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_121]
at 
org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
 ~[main/:na]
at 
org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$TraceSessionFutureTask.run(AbstractTracingAwareExecutorService.java:136)
 [main/:na]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
[main/:na]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
Caused by: org.apache.cassandra.db.filter.TombstoneOverwhelmingException: null
at 
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:202) 
~[main/:na]
at 
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:163) 
~[main/:na]
at 
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:146)
 ~[main/:na]
at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:125)
 ~[main/:na]
at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:99)
 ~[main/:na]
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
 ~[guava-16.0.jar:na]
at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) 
~[guava-16.0.jar:na]
at 
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:263)
 ~[main/:na]
at 
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:114) 
~[main/:na]
at 
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:88)
 ~[main/:na]
at 
org.apache.cassandra.db.RowIteratorFactory$2.getReduced(RowIteratorFactory.java:99)
 ~[main/:na]
at 
org.apache.cassandra.db.RowIteratorFactory$2.getReduced(RowIteratorFactory.java:71)
 ~[main/:na]
at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:117)
 ~[main/:na]
at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
 ~[main/:na]
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
 ~[guava-16.0.jar:na]
at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) 
~[guava-16.0.jar:na]
at 
org.apache.cassandra.db.ColumnFamilyStore$9.computeNext(ColumnFamilyStore.java:2115)
 ~[main/:na]
at 
org.apache.cassandra.db.ColumnFamilyStore$9.computeNext(ColumnFamilyStore.java:2111)
 ~[main/:na]
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
 ~[guava-16.0.jar:na]
at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) 
~[guava-16.0.jar:na]
at 
org.apache.cassandra.db.ColumnFamilyStore.filter(ColumnFamilyStore.java:2266) 
~[main/:na]
at 
org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:2224)
 ~[main/:na]
at 
org.apache.cassandra.db.PagedRangeCommand.executeLocally(PagedRangeCommand.java:115)
 ~[main/:na]
at 
org.apache.cassandra.service.StorageProxy$LocalRangeSliceRunnable.runMayThrow(StorageProxy.java:1572)
 ~[main/:na]
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2246)
 ~[main/:na]
... 5 common frames omitted
{code}


was (Author: iksaif):
Tried the patch, setting the tombstone threshold to one:
{code}
WARN  [SharedPool-Worker-4] 2017-04-13 09:51:55,894 
AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread 
Thread[SharedPool-Worker-4,10,main]: {}
java.lang.RuntimeException: 
org.apache.cassandra.db.filter.TombstoneOverwhelmingException
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2249)
 ~[main/:na]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_121]
   

[jira] [Created] (CASSANDRA-13444) Fast and garbage-free Streaming Histogram

2017-04-13 Thread Fuud (JIRA)
Fuud created CASSANDRA-13444:


 Summary: Fast and garbage-free Streaming Histogram
 Key: CASSANDRA-13444
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13444
 Project: Cassandra
  Issue Type: Improvement
  Components: Compaction
Reporter: Fuud


StreamingHistogram is a cause of high CPU usage and GC pressure.
It was improved in CASSANDRA-13038 by introducing an intermediate buffer that 
accumulates distinct values in a big map before merging them into the smaller 
one.

But that was not enough for TTLs spread over a large time range. Rounding 
(also introduced in 13038) can help, but it reduces histogram precision, especially 
when the TTLs are not distributed uniformly.

There are several improvements that can help reduce CPU and GC usage. They are 
all included in the pull request as separate commits, so you can test them 
independently.

Improvements list:
# Use Map.computeIfAbsent instead of the get -> check-for-null -> put chain. This way an 
"add-or-accumulate" operation takes one map operation instead of two. Note that this 
method (default-implemented in the Map interface) is overridden in HashMap but not in 
TreeMap, so I changed the spool type to HashMap (see the sketch after this list).
# As we round incoming values to _roundSeconds_, we can also round values on 
merge. That increases the hit rate for bin operations.
# Because we only ever insert integers into the histogram and round values to 
integers, we can use the *int* type everywhere.
# The histogram spends a huge amount of time merging values. In the merge method, most 
of the time goes into finding the nearest points. That can be eliminated by keeping 
an additional TreeSet of the differences, sorted from smallest to greatest.
# Because we know the maximum size of the _bin_ and _differences_ maps, we can replace 
them with sorted arrays. Searching can be done with _Arrays.binarySearch_ and 
insertions/deletions with _System.arraycopy_. This also allows merging some 
operations into one (see the sketch at the end of this description).
# Because the spool map is also bounded, we can replace it with an open-addressing 
primitive map. This finally brings the allocation rate down to zero.
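
As a rough illustration of item 1 (simplified, assumed code with a mutable accumulator as 
the map value; not the actual patch):
{code}
import java.util.HashMap;
import java.util.Map;

// Single-probe "add-or-accumulate": computeIfAbsent is the only map operation
// per added value, and HashMap (unlike TreeMap) overrides it with a single-lookup
// implementation.
class SpoolSketch
{
    private final Map<Integer, long[]> spool = new HashMap<>();

    void add(int point, long count)
    {
        spool.computeIfAbsent(point, k -> new long[1])[0] += count;
    }
}
{code}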

You can see the gain from each step in the attached file. The first number is the time 
for one benchmark invocation, the second the allocation rate in Mb per operation.

Overall gain:

||Payload/SpoolSize||Metric||original||optimized||% from original||
|secondInMonth/0|time ms/op|10747,684|5545,063|51,6|
|secondInMonth/0|allocation Mb/op|2441,38858|0,002105713|0|
|secondInMonth/1000|time ms/op|8988,578|5791,179|64,4|
|secondInMonth/1000|allocation Mb/op|2440,951141|0,017715454|0|
|secondInMonth/1|time ms/op|10711,671|5765,243|53,8|
|secondInMonth/1|allocation Mb/op|2437,022537|0,264083862|0|
|secondInMonth/10|time ms/op|13001,841|5638,069|43,4|
|secondInMonth/10|allocation Mb/op|2396,947113|2,003662109|0,1|
|secondInDay/0|time ms/op|10381,833|5497,804|53|
|secondInDay/0|allocation Mb/op|2441,166107|0,002105713|0|
|secondInDay/1000|time ms/op|8522,157|5929,871|69,6|
|secondInDay/1000|allocation Mb/op|1973,112381|0,017715454|0|
|secondInDay/1|time ms/op|10234,978|5480,077|53,5|
|secondInDay/1|allocation Mb/op|2306,057404|0,262969971|0|
|secondInDay/10|time ms/op|2971,178|139,079|4,7|
|secondInDay/10|allocation Mb/op|172,1276245|2,001721191|1,2|
|secondIn3Hour/0|time ms/op|10663,123|5605,672|52,6|
|secondIn3Hour/0|allocation Mb/op|2439,456818|0,002105713|0|
|secondIn3Hour/1000|time ms/op|9029,788|5838,618|64,7|
|secondIn3Hour/1000|allocation Mb/op|2331,839249|0,180664063|0|
|secondIn3Hour/1|time ms/op|4862,409|89,001|1,8|
|secondIn3Hour/1|allocation Mb/op|965,4871887|0,251711652|0|
|secondIn3Hour/10|time ms/op|1484,454|95,044|6,4|
|secondIn3Hour/10|allocation Mb/op|153,2464722|2,001712809|1,3|
|secondInMin/0|time ms/op|875,118|424,11|48,5|
|secondInMin/0|allocation Mb/op|610,3554993|0,001776123|0|
|secondInMin/1000|time ms/op|568,7|84,208|14,8|
|secondInMin/1000|allocation Mb/op|0,007598114|0,01810023|238,2|
|secondInMin/1|time ms/op|573,595|83,862|14,6|
|secondInMin/1|allocation Mb/op|0,007597351|0,252473872|3323,2|
|secondInMin/10|time ms/op|584,457|86,554|14,8|
|secondInMin/10|allocation Mb/op|0,007595825|2,002506106|26363,2|

You may notice the increased allocation rate for the secondInMin payload. That is 
because the test uses small values and Integer.valueOf caches them; in a real case, 
where the incoming values are timestamps, this cache will not help. In any case, a 
constant 2 Mb of memory per StreamingHistogram is quite good.
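
For reference, a self-contained sketch of the sorted-array idea from item 5 (illustrative 
names and capacity, not the patch itself): lookups become a single {{Arrays.binarySearch}} 
and insertions a single {{System.arraycopy}}, with no per-operation allocation.
{code}
import java.util.Arrays;

// Sketch only: fixed-capacity sorted array of bin points. MAX_BINS is an
// assumed bound; the real histogram merges bins down before it can overflow.
class SortedBinsSketch
{
    private static final int MAX_BINS = 128;
    private final int[] points = new int[MAX_BINS];
    private int size;

    boolean contains(int point)
    {
        return Arrays.binarySearch(points, 0, size, point) >= 0;
    }

    void insert(int point)   // caller ensures size < MAX_BINS
    {
        int idx = Arrays.binarySearch(points, 0, size, point);
        if (idx >= 0)
            return;                           // already present
        int at = -idx - 1;                    // binarySearch encodes the insertion point
        System.arraycopy(points, at, points, at + 1, size - at);
        points[at] = point;
        size++;
    }
}
{code}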



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13444) Fast and garbage-free Streaming Histogram

2017-04-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967281#comment-15967281
 ] 

ASF GitHub Bot commented on CASSANDRA-13444:


GitHub user Fuud opened a pull request:

https://github.com/apache/cassandra/pull/106

Fast streaming hist

PR for CASSANDRA-13444
https://issues.apache.org/jira/browse/CASSANDRA-13444

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Fuud/cassandra fast-streaming-hist

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/cassandra/pull/106.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #106


commit 3d7ce27aa9bc28c792502ffa5732e5e606e305b9
Author: Fedor Bobin 
Date:   2017-04-12T05:10:03Z

Benchmark refactoring

commit 5a87a7f3d08c582b690e7e6718a07e3b843cc108
Author: Fedor Bobin 
Date:   2017-04-12T05:40:20Z

computeIfAbsent instead of get->put chain

commit 64438350d640489bd8891305709253c7ee0d9fb6
Author: Fedor Bobin 
Date:   2017-04-12T06:09:45Z

fast streaming histogram: round on merge

commit d99ea1afcc7791637873d4cd71afe6d08fb69f01
Author: Fedor Bobin 
Date:   2017-04-12T06:40:56Z

fast stream histogram: explicitly same type everywhere: Integer instead of 
Number{Integer, Double}

commit 2f7c3ac64305c7e792ff910d4e424a3f4aff7d15
Author: Fedor Bobin 
Date:   2017-04-12T07:14:03Z

Use Set with distances to eliminate full-scan when merging

commit c6ea73429dcc1f4efd0409709241a79d8442e7eb
Author: Fedor Bobin 
Date:   2017-04-12T10:19:15Z

fast stream histogram: Use sorted arrays instead of Sets.

commit 95c238ccaadc27f51b29a03ee85e301c6ebcd4c1
Author: Fedor Bobin 
Date:   2017-04-13T07:17:26Z

fast stream histogram: Use open address primitive map as spool.




> Fast and garbage-free Streaming Histogram
> -
>
> Key: CASSANDRA-13444
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13444
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Fuud
>
> StreamingHistogram is cause of high cpu usage and GC pressure.
> It was improved at CASSANDRA-13038 by introducing intermediate buffer to try 
> accumulate different values into the big map before merging them into smaller 
> one.
> But there was not enought for TTL's distributed within large time. Rounding 
> (also introduced at 13038) can help but it reduce histogram precision 
> specially in case where TTL's does not distributed uniformly.
> There are several improvements that can help to reduce cpu and gc usage. Them 
> all included in the pool-request as separate revisions thus you can test them 
> independently.
> Improvements list:
> # Use Map.computeIfAbsent instead of get->checkIfNull->put chain. In this way 
> "add-or-accumulate" operation takes one map operation instead of two. But 
> this method (default-defined in interface Map) is overriden in HashMap but 
> not in TreeMap. Thus I changed spool type to HashMap.
> # As we round incoming values to _roundSeconds_ we can also round value on 
> merge. It will enlarge hit rate for bin operations.
> # Because we inserted only integers into Histogram and rounding values to 
> integers we can use *int* type everywhere.
> # Histogram takes huge amount of time merging values. In merge method largest 
> amount of time taken by finding nearest points. It can be eliminated by 
> holding additional TreeSet with differences, sorted from smalest to greatest.
> # Because we know max size of _bin_ and _differences_ maps we can replace 
> them with sorted arrays. Search can be done with _Arrays.binarySearch_ and 
> insertion/deletions can be done by _System.arraycopy_. Also it helps to merge 
> some operations into one.
> # Because spool map is also limited we can replace it with open address 
> primitive map. It's finaly reduce allocation rate to zero.
> You can see gain given by each step in the attached file. First number is 
> time for one benchmark invocation and second - is allocation rate in Mb per 
> operation.
> Overall gain:
> |.|.|Payload/SpoolSize|.|.|.|% from original
> |.|.|.|original|.|optimized|
> |.|.|secondInMonth/0|.|.|.|
> |time ms/op|.|.|10747,684|.|5545,063|51,6
> |allocation Mb/op|.|.|2441,38858|.|0,002105713|0
> |.|.|.|.|.|.|
> |.|.|secondInMonth/1000|.|.|.|
> |time ms/op|.|.|8988,578|.|5791,179|64,4
> |allocation Mb/op|.|.|2440,951141|.|0,017715454|0
> |.|.|.|.|.|.|
> |.|.|secondInMonth/1|.|.|.|
> |time ms/op|.|.|10711,671|.|5765,243|53,8
> |allocation Mb/op|.|.|2437,022537|.|0,264083862|0
> |.|.|.|.|.|.|
> |.|.|secondInMonth/10|.|.|.|
> |time ms/op|.|.|13001,841|.|5638,069|43,4
> |allocation Mb/op|.|.|2396,947113|.|2,0036

[jira] [Updated] (CASSANDRA-13444) Fast and garbage-free Streaming Histogram

2017-04-13 Thread Fuud (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuud updated CASSANDRA-13444:
-
Attachment: results.xlsx
results.csv

> Fast and garbage-free Streaming Histogram
> -
>
> Key: CASSANDRA-13444
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13444
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Fuud
> Attachments: results.csv, results.xlsx
>
>
> StreamingHistogram is cause of high cpu usage and GC pressure.
> It was improved at CASSANDRA-13038 by introducing intermediate buffer to try 
> accumulate different values into the big map before merging them into smaller 
> one.
> But there was not enought for TTL's distributed within large time. Rounding 
> (also introduced at 13038) can help but it reduce histogram precision 
> specially in case where TTL's does not distributed uniformly.
> There are several improvements that can help to reduce cpu and gc usage. Them 
> all included in the pool-request as separate revisions thus you can test them 
> independently.
> Improvements list:
> # Use Map.computeIfAbsent instead of get->checkIfNull->put chain. In this way 
> "add-or-accumulate" operation takes one map operation instead of two. But 
> this method (default-defined in interface Map) is overriden in HashMap but 
> not in TreeMap. Thus I changed spool type to HashMap.
> # As we round incoming values to _roundSeconds_ we can also round value on 
> merge. It will enlarge hit rate for bin operations.
> # Because we inserted only integers into Histogram and rounding values to 
> integers we can use *int* type everywhere.
> # Histogram takes huge amount of time merging values. In merge method largest 
> amount of time taken by finding nearest points. It can be eliminated by 
> holding additional TreeSet with differences, sorted from smalest to greatest.
> # Because we know max size of _bin_ and _differences_ maps we can replace 
> them with sorted arrays. Search can be done with _Arrays.binarySearch_ and 
> insertion/deletions can be done by _System.arraycopy_. Also it helps to merge 
> some operations into one.
> # Because spool map is also limited we can replace it with open address 
> primitive map. It's finaly reduce allocation rate to zero.
> You can see gain given by each step in the attached file. First number is 
> time for one benchmark invocation and second - is allocation rate in Mb per 
> operation.
> Overall gain:
> |.|.|Payload/SpoolSize|.|.|.|% from original
> |.|.|.|original|.|optimized|
> |.|.|secondInMonth/0|.|.|.|
> |time ms/op|.|.|10747,684|.|5545,063|51,6
> |allocation Mb/op|.|.|2441,38858|.|0,002105713|0
> |.|.|.|.|.|.|
> |.|.|secondInMonth/1000|.|.|.|
> |time ms/op|.|.|8988,578|.|5791,179|64,4
> |allocation Mb/op|.|.|2440,951141|.|0,017715454|0
> |.|.|.|.|.|.|
> |.|.|secondInMonth/1|.|.|.|
> |time ms/op|.|.|10711,671|.|5765,243|53,8
> |allocation Mb/op|.|.|2437,022537|.|0,264083862|0
> |.|.|.|.|.|.|
> |.|.|secondInMonth/10|.|.|.|
> |time ms/op|.|.|13001,841|.|5638,069|43,4
> |allocation Mb/op|.|.|2396,947113|.|2,003662109|0,1
> |.|.|.|.|.|.|
> |.|.|secondInDay/0|.|.|.|
> |time ms/op|.|.|10381,833|.|5497,804|53
> |allocation Mb/op|.|.|2441,166107|.|0,002105713|0
> |.|.|.|.|.|.|
> |.|.|secondInDay/1000|.|.|.|
> |time ms/op|.|.|8522,157|.|5929,871|69,6
> |allocation Mb/op|.|.|1973,112381|.|0,017715454|0
> |.|.|.|.|.|.|
> |.|.|secondInDay/1|.|.|.|
> |time ms/op|.|.|10234,978|.|5480,077|53,5
> |allocation Mb/op|.|.|2306,057404|.|0,262969971|0
> |.|.|.|.|.|.|
> |.|.|secondInDay/10|.|.|.|
> |time ms/op|.|.|2971,178|.|139,079|4,7
> |allocation Mb/op|.|.|172,1276245|.|2,001721191|1,2
> |.|.|.|.|.|.|
> |.|.|secondIn3Hour/0|.|.|.|
> |time ms/op|.|.|10663,123|.|5605,672|52,6
> |allocation Mb/op|.|.|2439,456818|.|0,002105713|0
> |.|.|.|.|.|.|
> |.|.|secondIn3Hour/1000|.|.|.|
> |time ms/op|.|.|9029,788|.|5838,618|64,7
> |allocation Mb/op|.|.|2331,839249|.|0,180664063|0
> |.|.|.|.|.|.|
> |.|.|secondIn3Hour/1|.|.|.|
> |time ms/op|.|.|4862,409|.|89,001|1,8
> |allocation Mb/op|.|.|965,4871887|.|0,251711652|0
> |.|.|.|.|.|.|
> |.|.|secondIn3Hour/10|.|.|.|
> |time ms/op|.|.|1484,454|.|95,044|6,4
> |allocation Mb/op|.|.|153,2464722|.|2,001712809|1,3
> |.|.|.|.|.|.|
> |.|.|secondInMin/0|.|.|.|
> |time ms/op|.|.|875,118|.|424,11|48,5
> |allocation Mb/op|.|.|610,3554993|.|0,001776123|0
> |.|.|.|.|.|.|
> |.|.|secondInMin/1000|.|.|.|
> |time ms/op|.|.|568,7|.|84,208|14,8
> |allocation Mb/op|.|.|0,007598114|.|0,01810023|238,2
> |.|.|.|.|.|.|
> |.|.|secondInMin/1|.|.|.|
> |time ms/op|.|.|573,595|.|83,862|14,6
> |allocation Mb/op|.|.|0,007597351|.|0,252473872|3323,2
> |.|.|.|.|.|.|
> |.|.|secondInMin/10|.|.|.|
> |time ms/op|.|.|584,457|.|86,554|14,8
> |allocation Mb/op|.|.|0,007595825|.|2,002506106|

[jira] [Updated] (CASSANDRA-13444) Fast and garbage-free Streaming Histogram

2017-04-13 Thread Fuud (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuud updated CASSANDRA-13444:
-
Description: 
StreamingHistogram is a cause of high CPU usage and GC pressure.
It was improved in CASSANDRA-13038 by introducing an intermediate buffer that 
accumulates distinct values in a big map before merging them into the smaller 
one.

But that was not enough for TTLs spread over a large time range. Rounding 
(also introduced in 13038) can help, but it reduces histogram precision, especially 
when the TTLs are not distributed uniformly.

There are several improvements that can help reduce CPU and GC usage. They are 
all included in the pull request as separate commits, so you can test them 
independently.

Improvements list:
# Use Map.computeIfAbsent instead of the get -> check-for-null -> put chain. This way an 
"add-or-accumulate" operation takes one map operation instead of two. Note that this 
method (default-implemented in the Map interface) is overridden in HashMap but not in 
TreeMap, so I changed the spool type to HashMap.
# As we round incoming values to _roundSeconds_, we can also round values on 
merge. That increases the hit rate for bin operations.
# Because we only ever insert integers into the histogram and round values to 
integers, we can use the *int* type everywhere.
# The histogram spends a huge amount of time merging values. In the merge method, most 
of the time goes into finding the nearest points. That can be eliminated by keeping 
an additional TreeSet of the differences, sorted from smallest to greatest.
# Because we know the maximum size of the _bin_ and _differences_ maps, we can replace 
them with sorted arrays. Searching can be done with _Arrays.binarySearch_ and 
insertions/deletions with _System.arraycopy_. This also allows merging some 
operations into one.
# Because the spool map is also bounded, we can replace it with an open-addressing 
primitive map. This finally brings the allocation rate down to zero.

You can see the gain from each step in the attached file. The first number is the time 
for one benchmark invocation, the second the allocation rate in Mb per operation.

Depending on the payload, time is reduced by up to 90%.

Overall gain:

||Payload/SpoolSize||Metric||original||optimized||% from original||
|secondInMonth/0|time ms/op|10747,684|5545,063|51,6|
|secondInMonth/0|allocation Mb/op|2441,38858|0,002105713|0|
|secondInMonth/1000|time ms/op|8988,578|5791,179|64,4|
|secondInMonth/1000|allocation Mb/op|2440,951141|0,017715454|0|
|secondInMonth/1|time ms/op|10711,671|5765,243|53,8|
|secondInMonth/1|allocation Mb/op|2437,022537|0,264083862|0|
|secondInMonth/10|time ms/op|13001,841|5638,069|43,4|
|secondInMonth/10|allocation Mb/op|2396,947113|2,003662109|0,1|
|secondInDay/0|time ms/op|10381,833|5497,804|53|
|secondInDay/0|allocation Mb/op|2441,166107|0,002105713|0|
|secondInDay/1000|time ms/op|8522,157|5929,871|69,6|
|secondInDay/1000|allocation Mb/op|1973,112381|0,017715454|0|
|secondInDay/1|time ms/op|10234,978|5480,077|53,5|
|secondInDay/1|allocation Mb/op|2306,057404|0,262969971|0|
|secondInDay/10|time ms/op|2971,178|139,079|4,7|
|secondInDay/10|allocation Mb/op|172,1276245|2,001721191|1,2|
|secondIn3Hour/0|time ms/op|10663,123|5605,672|52,6|
|secondIn3Hour/0|allocation Mb/op|2439,456818|0,002105713|0|
|secondIn3Hour/1000|time ms/op|9029,788|5838,618|64,7|
|secondIn3Hour/1000|allocation Mb/op|2331,839249|0,180664063|0|
|secondIn3Hour/1|time ms/op|4862,409|89,001|1,8|
|secondIn3Hour/1|allocation Mb/op|965,4871887|0,251711652|0|
|secondIn3Hour/10|time ms/op|1484,454|95,044|6,4|
|secondIn3Hour/10|allocation Mb/op|153,2464722|2,001712809|1,3|
|secondInMin/0|time ms/op|875,118|424,11|48,5|
|secondInMin/0|allocation Mb/op|610,3554993|0,001776123|0|
|secondInMin/1000|time ms/op|568,7|84,208|14,8|
|secondInMin/1000|allocation Mb/op|0,007598114|0,01810023|238,2|
|secondInMin/1|time ms/op|573,595|83,862|14,6|
|secondInMin/1|allocation Mb/op|0,007597351|0,252473872|3323,2|
|secondInMin/10|time ms/op|584,457|86,554|14,8|
|secondInMin/10|allocation Mb/op|0,007595825|2,002506106|26363,2|

You may notice the increased allocation rate for the secondInMin payload. That is 
because the test uses small values and Integer.valueOf caches them; in a real case, 
where the incoming values are timestamps, this cache will not help. In any case, a 
constant 2 Mb of memory per StreamingHistogram is quite good.

  was:
StreamingHistogram is cause of high cpu usage and GC pressure.
It was improved at CASSANDRA-13038 by introducing intermediate buffer to try 
accumulate different values into the big map before merging them into smaller 
one.

But there was not enought for TTL's

[jira] [Commented] (CASSANDRA-13412) Update of column with TTL results in secondary index not returning row

2017-04-13 Thread Andrés de la Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967286#comment-15967286
 ] 

Andrés de la Peña commented on CASSANDRA-13412:
---

The problem also affects indexes on partition key components:
{code}
CREATE TABLE k.t (
pk1 int,
pk2 int,
a int,
b int,
PRIMARY KEY ((pk1, pk2))
);
CREATE INDEX ON k.t(pk1);

INSERT INTO k.t (pk1, pk2, a, b) VALUES (1, 2, 3, 4);
UPDATE k.t USING TTL 10 SET b = 10 WHERE pk1 = 1 AND pk2 = 2;
-- Wait 10 seconds

SELECT * FROM k.t WHERE pk1 = 1 AND pk2 = 2; -- 1 row
SELECT * FROM k.t WHERE pk1 = 1; -- 0 rows
{code}

Index entries inherit the TTL of the indexed cell, which is correct for regular 
columns. However, indexes on primary key columns index every column, meaning that 
their index entries always get the TTL of the last-updated row column.
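
In other words, the index-entry TTL should be chosen roughly like this (hypothetical 
helper, illustrative only; 0 stands for "no expiry"):
{code}
// Sketch: entries for regular columns may expire with the indexed cell;
// entries for primary-key components must not, because the "indexed cell"
// there is just whichever row cell happened to be written last.
static int indexEntryTtl(boolean indexedColumnIsPartOfPrimaryKey, int indexedCellTtl)
{
    return indexedColumnIsPartOfPrimaryKey ? 0 : indexedCellTtl;
}
{code}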

The proposed solution is to not generate expiring cells when the indexed column 
is part of the primary key. Here is the patch for 2.1 and 2.2:

||[2.1|https://github.com/apache/cassandra/compare/cassandra-2.1...adelapena:13412-2.1]|[utests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13412-2.1-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13412-2.1-dtest/]|
||[2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...adelapena:13412-2.2]|[utests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13412-2.2-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13412-2.2-dtest/]|


> Update of column with TTL results in secondary index not returning row
> --
>
> Key: CASSANDRA-13412
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13412
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Enrique Bautista Barahona
>Assignee: Andrés de la Peña
>
> Cassandra versions: 2.2.3, 3.0.11
> 1 datacenter, keyspace has RF 3. Default consistency level.
> Steps:
> 1. I create these table and index.
> {code}
> CREATE TABLE my_table (
> a text,
> b text,
> c text,
> d set,
> e float,
> f text,
> g int,
> h double,
> j set,
> k float,
> m set,
> PRIMARY KEY (a, b, c)
> ) WITH read_repair_chance = 0.0
>AND dclocal_read_repair_chance = 0.1
>AND gc_grace_seconds = 864000
>AND bloom_filter_fp_chance = 0.01
>AND caching = { 'keys' : 'ALL', 'rows_per_partition' : 'NONE' }
>AND comment = ''
>AND compaction = { 'class' : 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' }
>AND compression = { 'sstable_compression' : 
> 'org.apache.cassandra.io.compress.LZ4Compressor' }
>AND default_time_to_live = 0
>AND speculative_retry = '99.0PERCENTILE'
>AND min_index_interval = 128
>AND max_index_interval = 2048;
> CREATE INDEX my_index ON my_table (c);
> {code}
> 2. I have 9951 INSERT statements in a file and I run the following command to 
> execute them. The INSERT statements have no TTL and no consistency level is 
> specified.
> {code}
> cqlsh   -u  -f 
> {code}
> 3. I update a column filtering by the whole primary key, and setting a TTL. 
> For example:
> {code}
> UPDATE my_table USING TTL 30 SET h = 10 WHERE a = 'test_a' AND b = 'test_b' 
> AND c = 'test_c';
> {code}
> 4. After the time specified in the TTL I run the following queries:
> {code}
> SELECT * FROM my_table WHERE a = 'test_a' AND b = 'test_b' AND c = 'test_c';
> SELECT * FROM my_table WHERE c = 'test_c';
> {code}
> The first one returns the correct row with an empty h column (as it has 
> expired). However, the second query (which uses the secondary index on column 
> c) returns nothing.
> I've done the query through my app which uses the Java driver v3.0.4 and 
> reads with CL local_one, from the cql shell and from DBeaver 3.8.5. All 
> display the same behaviour. The queries are performed minutes after the 
> writes and the servers don't have a high load, so I think it's unlikely to be 
> a consistency issue.
> I've tried to reproduce the issue in ccm and cqlsh by creating a new keyspace 
> and table, and inserting just 1 row, and the bug doesn't manifest. This leads 
> me to think that it's an issue only present with not trivially small amounts 
> of data, or maybe present only after Cassandra compacts or performs whatever 
> maintenance it needs to do.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13444) Fast and garbage-free Streaming Histogram

2017-04-13 Thread Fuud (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967288#comment-15967288
 ] 

Fuud commented on CASSANDRA-13444:
--

[~jjirsa] please review

> Fast and garbage-free Streaming Histogram
> -
>
> Key: CASSANDRA-13444
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13444
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Fuud
> Attachments: results.csv, results.xlsx
>
>
> StreamingHistogram is cause of high cpu usage and GC pressure.
> It was improved at CASSANDRA-13038 by introducing intermediate buffer to try 
> accumulate different values into the big map before merging them into smaller 
> one.
> But there was not enought for TTL's distributed within large time. Rounding 
> (also introduced at 13038) can help but it reduce histogram precision 
> specially in case where TTL's does not distributed uniformly.
> There are several improvements that can help to reduce cpu and gc usage. Them 
> all included in the pool-request as separate revisions thus you can test them 
> independently.
> Improvements list:
> # Use Map.computeIfAbsent instead of get->checkIfNull->put chain. In this way 
> "add-or-accumulate" operation takes one map operation instead of two. But 
> this method (default-defined in interface Map) is overriden in HashMap but 
> not in TreeMap. Thus I changed spool type to HashMap.
> # As we round incoming values to _roundSeconds_ we can also round value on 
> merge. It will enlarge hit rate for bin operations.
> # Because we inserted only integers into Histogram and rounding values to 
> integers we can use *int* type everywhere.
> # Histogram takes huge amount of time merging values. In merge method largest 
> amount of time taken by finding nearest points. It can be eliminated by 
> holding additional TreeSet with differences, sorted from smalest to greatest.
> # Because we know max size of _bin_ and _differences_ maps we can replace 
> them with sorted arrays. Search can be done with _Arrays.binarySearch_ and 
> insertion/deletions can be done by _System.arraycopy_. Also it helps to merge 
> some operations into one.
> # Because spool map is also limited we can replace it with open address 
> primitive map. It's finaly reduce allocation rate to zero.
> You can see gain given by each step in the attached file. First number is 
> time for one benchmark invocation and second - is allocation rate in Mb per 
> operation.
> Dependends of payload time is reduced up to 90%.
> Overall gain:
> |.|.|Payload/SpoolSize|.|.|.|% from original
> |.|.|.|original|.|optimized|
> |.|.|secondInMonth/0|.|.|.|
> |time ms/op|.|.|10747,684|.|5545,063|51,6
> |allocation Mb/op|.|.|2441,38858|.|0,002105713|0
> |.|.|.|.|.|.|
> |.|.|secondInMonth/1000|.|.|.|
> |time ms/op|.|.|8988,578|.|5791,179|64,4
> |allocation Mb/op|.|.|2440,951141|.|0,017715454|0
> |.|.|.|.|.|.|
> |.|.|secondInMonth/1|.|.|.|
> |time ms/op|.|.|10711,671|.|5765,243|53,8
> |allocation Mb/op|.|.|2437,022537|.|0,264083862|0
> |.|.|.|.|.|.|
> |.|.|secondInMonth/10|.|.|.|
> |time ms/op|.|.|13001,841|.|5638,069|43,4
> |allocation Mb/op|.|.|2396,947113|.|2,003662109|0,1
> |.|.|.|.|.|.|
> |.|.|secondInDay/0|.|.|.|
> |time ms/op|.|.|10381,833|.|5497,804|53
> |allocation Mb/op|.|.|2441,166107|.|0,002105713|0
> |.|.|.|.|.|.|
> |.|.|secondInDay/1000|.|.|.|
> |time ms/op|.|.|8522,157|.|5929,871|69,6
> |allocation Mb/op|.|.|1973,112381|.|0,017715454|0
> |.|.|.|.|.|.|
> |.|.|secondInDay/1|.|.|.|
> |time ms/op|.|.|10234,978|.|5480,077|53,5
> |allocation Mb/op|.|.|2306,057404|.|0,262969971|0
> |.|.|.|.|.|.|
> |.|.|secondInDay/10|.|.|.|
> |time ms/op|.|.|2971,178|.|139,079|4,7
> |allocation Mb/op|.|.|172,1276245|.|2,001721191|1,2
> |.|.|.|.|.|.|
> |.|.|secondIn3Hour/0|.|.|.|
> |time ms/op|.|.|10663,123|.|5605,672|52,6
> |allocation Mb/op|.|.|2439,456818|.|0,002105713|0
> |.|.|.|.|.|.|
> |.|.|secondIn3Hour/1000|.|.|.|
> |time ms/op|.|.|9029,788|.|5838,618|64,7
> |allocation Mb/op|.|.|2331,839249|.|0,180664063|0
> |.|.|.|.|.|.|
> |.|.|secondIn3Hour/1|.|.|.|
> |time ms/op|.|.|4862,409|.|89,001|1,8
> |allocation Mb/op|.|.|965,4871887|.|0,251711652|0
> |.|.|.|.|.|.|
> |.|.|secondIn3Hour/10|.|.|.|
> |time ms/op|.|.|1484,454|.|95,044|6,4
> |allocation Mb/op|.|.|153,2464722|.|2,001712809|1,3
> |.|.|.|.|.|.|
> |.|.|secondInMin/0|.|.|.|
> |time ms/op|.|.|875,118|.|424,11|48,5
> |allocation Mb/op|.|.|610,3554993|.|0,001776123|0
> |.|.|.|.|.|.|
> |.|.|secondInMin/1000|.|.|.|
> |time ms/op|.|.|568,7|.|84,208|14,8
> |allocation Mb/op|.|.|0,007598114|.|0,01810023|238,2
> |.|.|.|.|.|.|
> |.|.|secondInMin/1|.|.|.|
> |time ms/op|.|.|573,595|.|83,862|14,6
> |allocation Mb/op|.|.|0,007597351|.|0,252473872|3323,2
> |.|.|.|.|.|.|
> |.|.|secondInMin/10|.|.|.|
> |time ms/op|

[jira] [Updated] (CASSANDRA-13412) Update of column with TTL results in secondary index not returning row

2017-04-13 Thread Andrés de la Peña (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrés de la Peña updated CASSANDRA-13412:
--
Fix Version/s: 2.2.x
   2.1.x
Reproduced In: 2.2.9, 2.1.17  (was: 2.2.3, 3.0.11)
   Status: Patch Available  (was: Open)

> Update of column with TTL results in secondary index not returning row
> --
>
> Key: CASSANDRA-13412
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13412
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Enrique Bautista Barahona
>Assignee: Andrés de la Peña
> Fix For: 2.1.x, 2.2.x
>
>
> Cassandra versions: 2.2.3, 3.0.11
> 1 datacenter, keyspace has RF 3. Default consistency level.
> Steps:
> 1. I create these table and index.
> {code}
> CREATE TABLE my_table (
> a text,
> b text,
> c text,
> d set,
> e float,
> f text,
> g int,
> h double,
> j set,
> k float,
> m set,
> PRIMARY KEY (a, b, c)
> ) WITH read_repair_chance = 0.0
>AND dclocal_read_repair_chance = 0.1
>AND gc_grace_seconds = 864000
>AND bloom_filter_fp_chance = 0.01
>AND caching = { 'keys' : 'ALL', 'rows_per_partition' : 'NONE' }
>AND comment = ''
>AND compaction = { 'class' : 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' }
>AND compression = { 'sstable_compression' : 
> 'org.apache.cassandra.io.compress.LZ4Compressor' }
>AND default_time_to_live = 0
>AND speculative_retry = '99.0PERCENTILE'
>AND min_index_interval = 128
>AND max_index_interval = 2048;
> CREATE INDEX my_index ON my_table (c);
> {code}
> 2. I have 9951 INSERT statements in a file and I run the following command to 
> execute them. The INSERT statements have no TTL and no consistency level is 
> specified.
> {code}
> cqlsh   -u  -f 
> {code}
> 3. I update a column filtering by the whole primary key, and setting a TTL. 
> For example:
> {code}
> UPDATE my_table USING TTL 30 SET h = 10 WHERE a = 'test_a' AND b = 'test_b' 
> AND c = 'test_c';
> {code}
> 4. After the time specified in the TTL I run the following queries:
> {code}
> SELECT * FROM my_table WHERE a = 'test_a' AND b = 'test_b' AND c = 'test_c';
> SELECT * FROM my_table WHERE c = 'test_c';
> {code}
> The first one returns the correct row with an empty h column (as it has 
> expired). However, the second query (which uses the secondary index on column 
> c) returns nothing.
> I've done the query through my app which uses the Java driver v3.0.4 and 
> reads with CL local_one, from the cql shell and from DBeaver 3.8.5. All 
> display the same behaviour. The queries are performed minutes after the 
> writes and the servers don't have a high load, so I think it's unlikely to be 
> a consistency issue.
> I've tried to reproduce the issue in ccm and cqlsh by creating a new keyspace 
> and table, and inserting just 1 row, and the bug doesn't manifest. This leads 
> me to think that it's an issue only present with not trivially small amounts 
> of data, or maybe present only after Cassandra compacts or performs whatever 
> maintenance it needs to do.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13393) Fix weightedSize() for row-cache reported by JMX and NodeTool

2017-04-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967294#comment-15967294
 ] 

ASF GitHub Bot commented on CASSANDRA-13393:


Github user Fuud closed the pull request at:

https://github.com/apache/cassandra/pull/105


> Fix weightedSize() for row-cache reported by JMX and NodeTool
> -
>
> Key: CASSANDRA-13393
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13393
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Fuud
>Assignee: Fuud
>Priority: Minor
>  Labels: lhf
> Fix For: 2.2.10, 3.0.13, 3.11.0, 4.0
>
>
> Row Cache size is reported in entries but should be reported in bytes (as 
> KeyCache do).
> It happens because incorrect OHCProvider.OHCacheAdapter.weightedSize method. 
> Currently it returns cache size but should return ohCache.memUsed()



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13432) MemtableReclaimMemory can get stuck because of lack of timeout in getTopLevelColumns()

2017-04-13 Thread Corentin Chary (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Corentin Chary updated CASSANDRA-13432:
---
Attachment: CASSANDRA-13432.patch

> MemtableReclaimMemory can get stuck because of lack of timeout in 
> getTopLevelColumns()
> --
>
> Key: CASSANDRA-13432
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13432
> Project: Cassandra
>  Issue Type: Bug
> Environment: cassandra 2.1.15
>Reporter: Corentin Chary
> Fix For: 2.1.x
>
> Attachments: CASSANDRA-13432.patch
>
>
> This might affect 3.x too, I'm not sure.
> {code}
> $ nodetool tpstats
> Pool NameActive   Pending  Completed   Blocked  All 
> time blocked
> MutationStage 0 0   32135875 0
>  0
> ReadStage   114 0   29492940 0
>  0
> RequestResponseStage  0 0   86090931 0
>  0
> ReadRepairStage   0 0 166645 0
>  0
> CounterMutationStage  0 0  0 0
>  0
> MiscStage 0 0  0 0
>  0
> HintedHandoff 0 0 47 0
>  0
> GossipStage   0 0 188769 0
>  0
> CacheCleanupExecutor  0 0  0 0
>  0
> InternalResponseStage 0 0  0 0
>  0
> CommitLogArchiver 0 0  0 0
>  0
> CompactionExecutor0 0  86835 0
>  0
> ValidationExecutor0 0  0 0
>  0
> MigrationStage0 0  0 0
>  0
> AntiEntropyStage  0 0  0 0
>  0
> PendingRangeCalculator0 0 92 0
>  0
> Sampler   0 0  0 0
>  0
> MemtableFlushWriter   0 0563 0
>  0
> MemtablePostFlush 0 0   1500 0
>  0
> MemtableReclaimMemory 129534 0
>  0
> Native-Transport-Requests41 0   54819182 0
>   1896
> {code}
> {code}
> "MemtableReclaimMemory:195" - Thread t@6268
>java.lang.Thread.State: WAITING
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>   at 
> org.apache.cassandra.utils.concurrent.WaitQueue$AbstractSignal.awaitUninterruptibly(WaitQueue.java:283)
>   at 
> org.apache.cassandra.utils.concurrent.OpOrder$Barrier.await(OpOrder.java:417)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore$Flush$1.runMayThrow(ColumnFamilyStore.java:1151)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
>   - locked <6e7b1160> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> "SharedPool-Worker-195" - Thread t@989
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.cassandra.db.RangeTombstoneList.addInternal(RangeTombstoneList.java:690)
>   at 
> org.apache.cassandra.db.RangeTombstoneList.insertFrom(RangeTombstoneList.java:650)
>   at 
> org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:171)
>   at 
> org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:143)
>   at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:240)
>   at 
> org.apache.cassandra.db.ArrayBackedSortedColumns.delete(ArrayBackedSortedColumns.java:483)
>   at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:153)
>   at 
> org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:184)
>   at 
> org.apache.cassa

[jira] [Updated] (CASSANDRA-13432) MemtableReclaimMemory can get stuck because of lack of timeout in getTopLevelColumns()

2017-04-13 Thread Corentin Chary (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Corentin Chary updated CASSANDRA-13432:
---
Status: Patch Available  (was: Open)

> MemtableReclaimMemory can get stuck because of lack of timeout in 
> getTopLevelColumns()
> --
>
> Key: CASSANDRA-13432
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13432
> Project: Cassandra
>  Issue Type: Bug
> Environment: cassandra 2.1.15
>Reporter: Corentin Chary
> Fix For: 2.1.x
>
> Attachments: CASSANDRA-13432.patch
>
>
> This might affect 3.x too, I'm not sure.
> {code}
> $ nodetool tpstats
> Pool NameActive   Pending  Completed   Blocked  All 
> time blocked
> MutationStage 0 0   32135875 0
>  0
> ReadStage   114 0   29492940 0
>  0
> RequestResponseStage  0 0   86090931 0
>  0
> ReadRepairStage   0 0 166645 0
>  0
> CounterMutationStage  0 0  0 0
>  0
> MiscStage 0 0  0 0
>  0
> HintedHandoff 0 0 47 0
>  0
> GossipStage   0 0 188769 0
>  0
> CacheCleanupExecutor  0 0  0 0
>  0
> InternalResponseStage 0 0  0 0
>  0
> CommitLogArchiver 0 0  0 0
>  0
> CompactionExecutor0 0  86835 0
>  0
> ValidationExecutor0 0  0 0
>  0
> MigrationStage0 0  0 0
>  0
> AntiEntropyStage  0 0  0 0
>  0
> PendingRangeCalculator0 0 92 0
>  0
> Sampler   0 0  0 0
>  0
> MemtableFlushWriter   0 0563 0
>  0
> MemtablePostFlush 0 0   1500 0
>  0
> MemtableReclaimMemory 129534 0
>  0
> Native-Transport-Requests41 0   54819182 0
>   1896
> {code}
> {code}
> "MemtableReclaimMemory:195" - Thread t@6268
>java.lang.Thread.State: WAITING
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>   at 
> org.apache.cassandra.utils.concurrent.WaitQueue$AbstractSignal.awaitUninterruptibly(WaitQueue.java:283)
>   at 
> org.apache.cassandra.utils.concurrent.OpOrder$Barrier.await(OpOrder.java:417)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore$Flush$1.runMayThrow(ColumnFamilyStore.java:1151)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
>   - locked <6e7b1160> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> "SharedPool-Worker-195" - Thread t@989
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.cassandra.db.RangeTombstoneList.addInternal(RangeTombstoneList.java:690)
>   at 
> org.apache.cassandra.db.RangeTombstoneList.insertFrom(RangeTombstoneList.java:650)
>   at 
> org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:171)
>   at 
> org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:143)
>   at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:240)
>   at 
> org.apache.cassandra.db.ArrayBackedSortedColumns.delete(ArrayBackedSortedColumns.java:483)
>   at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:153)
>   at 
> org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:184)
>   at 
> org.apache.ca

[2/3] cassandra git commit: V5 protocol flags decoding broken

2017-04-13 Thread snazy
V5 protocol flags decoding broken

patch by Robert Stupp; reviewed by Stefania for CASSANDRA-13443


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0a438d59
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0a438d59
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0a438d59

Branch: refs/heads/trunk
Commit: 0a438d59e65ee79bca7ffc44b8b958e62448e5c3
Parents: fe8e211
Author: Robert Stupp 
Authored: Thu Apr 13 11:06:29 2017 +0200
Committer: Robert Stupp 
Committed: Thu Apr 13 11:06:29 2017 +0200

--
 CHANGES.txt  | 1 +
 src/java/org/apache/cassandra/cql3/QueryOptions.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/0a438d59/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 7998e10..5516bbd 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.11.0
+ * V5 protocol flags decoding broken (CASSANDRA-13443)
  * Use write lock not read lock for removing sstables from compaction strategies. (CASSANDRA-13422)
  * Use corePoolSize equal to maxPoolSize in JMXEnabledThreadPoolExecutors (CASSANDRA-13329)
  * Avoid rebuilding SASI indexes containing no values (CASSANDRA-12962)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0a438d59/src/java/org/apache/cassandra/cql3/QueryOptions.java
--
diff --git a/src/java/org/apache/cassandra/cql3/QueryOptions.java b/src/java/org/apache/cassandra/cql3/QueryOptions.java
index 1ba8f89..f1787a7 100644
--- a/src/java/org/apache/cassandra/cql3/QueryOptions.java
+++ b/src/java/org/apache/cassandra/cql3/QueryOptions.java
@@ -403,7 +403,7 @@ public abstract class QueryOptions
             ConsistencyLevel consistency = CBUtil.readConsistencyLevel(body);
             EnumSet<Flag> flags = Flag.deserialize(version.isGreaterOrEqualTo(ProtocolVersion.V5)
                                                    ? (int)body.readUnsignedInt()
-                                                   : (int)body.readByte());
+                                                   : (int)body.readUnsignedByte());
 
             List<ByteBuffer> values = Collections.emptyList();
             List<String> names = null;
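Not part of the patch, but a minimal stand-alone sketch of the sign-extension pitfall the one-character fix above addresses. java.io.DataInputStream is used here because its readByte()/readUnsignedByte() pair behaves analogously to the ByteBuf calls in the diff:

{code}
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class SignedFlagByteSketch
{
    public static void main(String[] args) throws IOException
    {
        byte[] frame = { (byte) 0x81 };   // a flags byte with the 0x80 bit set

        // Signed read: the byte is sign-extended when widened to int.
        int signed = new DataInputStream(new ByteArrayInputStream(frame)).readByte();
        // Unsigned read: the byte is zero-extended, yielding the 0-255 value that was on the wire.
        int unsigned = new DataInputStream(new ByteArrayInputStream(frame)).readUnsignedByte();

        System.out.printf("readByte():         %d (0x%08X)%n", signed, signed);     // -127 (0xFFFFFF81)
        System.out.printf("readUnsignedByte(): %d (0x%08X)%n", unsigned, unsigned); // 129  (0x00000081)

        // Code treating the value as a flag bitmask now sees bits 8..31 set as well,
        // even though the wire byte never carried them.
        System.out.println((signed & 0x100) != 0);   // true
        System.out.println((unsigned & 0x100) != 0); // false
    }
}
{code}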



[3/3] cassandra git commit: Merge branch 'cassandra-3.11' into trunk

2017-04-13 Thread snazy
Merge branch 'cassandra-3.11' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7da45312
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7da45312
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7da45312

Branch: refs/heads/trunk
Commit: 7da45312aa7a26070d32ca3d153d86f1ea202a19
Parents: 1f53326 0a438d5
Author: Robert Stupp 
Authored: Thu Apr 13 11:06:37 2017 +0200
Committer: Robert Stupp 
Committed: Thu Apr 13 11:06:37 2017 +0200

--
 CHANGES.txt  | 1 +
 src/java/org/apache/cassandra/cql3/QueryOptions.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/7da45312/CHANGES.txt
--
diff --cc CHANGES.txt
index dd33fcf,5516bbd..dae275f
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,60 -1,5 +1,61 @@@
 +4.0
 + * Change protocol to allow sending key space independent of query string (CASSANDRA-10145)
 + * Make gc_log and gc_warn settable at runtime (CASSANDRA-12661)
 + * Take number of files in L0 in account when estimating remaining compaction tasks (CASSANDRA-13354)
 + * Skip building views during base table streams on range movements (CASSANDRA-13065)
 + * Improve error messages for +/- operations on maps and tuples (CASSANDRA-13197)
 + * Remove deprecated repair JMX APIs (CASSANDRA-11530)
 + * Fix version check to enable streaming keep-alive (CASSANDRA-12929)
 + * Make it possible to monitor an ideal consistency level separate from actual consistency level (CASSANDRA-13289)
 + * Outbound TCP connections ignore internode authenticator (CASSANDRA-13324)
 + * Upgrade junit from 4.6 to 4.12 (CASSANDRA-13360)
 + * Cleanup ParentRepairSession after repairs (CASSANDRA-13359)
 + * Incremental repair not streaming correct sstables (CASSANDRA-13328)
 + * Upgrade the jna version to 4.3.0 (CASSANDRA-13300)
 + * Add the currentTimestamp, currentDate, currentTime and currentTimeUUID functions (CASSANDRA-13132)
 + * Remove config option index_interval (CASSANDRA-10671)
 + * Reduce lock contention for collection types and serializers (CASSANDRA-13271)
 + * Make it possible to override MessagingService.Verb ids (CASSANDRA-13283)
 + * Avoid synchronized on prepareForRepair in ActiveRepairService (CASSANDRA-9292)
 + * Adds the ability to use uncompressed chunks in compressed files (CASSANDRA-10520)
 + * Don't flush sstables when streaming for incremental repair (CASSANDRA-13226)
 + * Remove unused method (CASSANDRA-13227)
 + * Fix minor bugs related to #9143 (CASSANDRA-13217)
 + * Output warning if user increases RF (CASSANDRA-13079)
 + * Remove pre-3.0 streaming compatibility code for 4.0 (CASSANDRA-13081)
 + * Add support for + and - operations on dates (CASSANDRA-11936)
 + * Fix consistency of incrementally repaired data (CASSANDRA-9143)
 + * Increase commitlog version (CASSANDRA-13161)
 + * Make TableMetadata immutable, optimize Schema (CASSANDRA-9425)
 + * Refactor ColumnCondition (CASSANDRA-12981)
 + * Parallelize streaming of different keyspaces (CASSANDRA-4663)
 + * Improved compactions metrics (CASSANDRA-13015)
 + * Speed-up start-up sequence by avoiding un-needed flushes (CASSANDRA-13031)
 + * Use Caffeine (W-TinyLFU) for on-heap caches (CASSANDRA-10855)
 + * Thrift removal (CASSANDRA-5)
 + * Remove pre-3.0 compatibility code for 4.0 (CASSANDRA-12716)
 + * Add column definition kind to dropped columns in schema (CASSANDRA-12705)
 + * Add (automate) Nodetool Documentation (CASSANDRA-12672)
 + * Update bundled cqlsh python driver to 3.7.0 (CASSANDRA-12736)
 + * Reject invalid replication settings when creating or altering a keyspace (CASSANDRA-12681)
 + * Clean up the SSTableReader#getScanner API wrt removal of RateLimiter (CASSANDRA-12422)
 + * Use new token allocation for non bootstrap case as well (CASSANDRA-13080)
 + * Avoid byte-array copy when key cache is disabled (CASSANDRA-13084)
 + * Require forceful decommission if number of nodes is less than replication factor (CASSANDRA-12510)
 + * Allow IN restrictions on column families with collections (CASSANDRA-12654)
 + * Log message size in trace message in OutboundTcpConnection (CASSANDRA-13028)
 + * Add timeUnit Days for cassandra-stress (CASSANDRA-13029)
 + * Add mutation size and batch metrics (CASSANDRA-12649)
 + * Add method to get size of endpoints to TokenMetadata (CASSANDRA-12999)
 + * Expose time spent waiting in thread pool queue (CASSANDRA-8398)
 + * Conditionally update index built status to avoid unnecessary flushes (CASSANDRA-12969)
 + * cqlsh auto completion: refactor definition of compaction strategy options (CASSANDRA-12946)
 + * Add support for arithmetic operators (CASSANDRA-11935)
 + * Add histogram for delay to deli

[1/3] cassandra git commit: V5 protocol flags decoding broken

2017-04-13 Thread snazy
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.11 fe8e21109 -> 0a438d59e
  refs/heads/trunk 1f533260a -> 7da45312a


V5 protocol flags decoding broken

patch by Robert Stupp; reviewed by Stefania for CASSANDRA-13443


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0a438d59
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0a438d59
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0a438d59

Branch: refs/heads/cassandra-3.11
Commit: 0a438d59e65ee79bca7ffc44b8b958e62448e5c3
Parents: fe8e211
Author: Robert Stupp 
Authored: Thu Apr 13 11:06:29 2017 +0200
Committer: Robert Stupp 
Committed: Thu Apr 13 11:06:29 2017 +0200

--
 CHANGES.txt  | 1 +
 src/java/org/apache/cassandra/cql3/QueryOptions.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/0a438d59/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 7998e10..5516bbd 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.11.0
+ * V5 protocol flags decoding broken (CASSANDRA-13443)
  * Use write lock not read lock for removing sstables from compaction strategies. (CASSANDRA-13422)
  * Use corePoolSize equal to maxPoolSize in JMXEnabledThreadPoolExecutors (CASSANDRA-13329)
  * Avoid rebuilding SASI indexes containing no values (CASSANDRA-12962)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0a438d59/src/java/org/apache/cassandra/cql3/QueryOptions.java
--
diff --git a/src/java/org/apache/cassandra/cql3/QueryOptions.java b/src/java/org/apache/cassandra/cql3/QueryOptions.java
index 1ba8f89..f1787a7 100644
--- a/src/java/org/apache/cassandra/cql3/QueryOptions.java
+++ b/src/java/org/apache/cassandra/cql3/QueryOptions.java
@@ -403,7 +403,7 @@ public abstract class QueryOptions
             ConsistencyLevel consistency = CBUtil.readConsistencyLevel(body);
             EnumSet<Flag> flags = Flag.deserialize(version.isGreaterOrEqualTo(ProtocolVersion.V5)
                                                    ? (int)body.readUnsignedInt()
-                                                   : (int)body.readByte());
+                                                   : (int)body.readUnsignedByte());
 
             List<ByteBuffer> values = Collections.emptyList();
             List<String> names = null;



[jira] [Updated] (CASSANDRA-13443) V5 protocol flags decoding broken

2017-04-13 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-13443:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thank you!

Committed as 
[0a438d59e65ee79bca7ffc44b8b958e62448e5c3|https://github.com/apache/cassandra/commit/0a438d59e65ee79bca7ffc44b8b958e62448e5c3]
 to [cassandra-3.11|https://github.com/apache/cassandra/tree/cassandra-3.11]


> V5 protocol flags decoding broken
> -
>
> Key: CASSANDRA-13443
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13443
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.11.0, 4.0
>
>
> Since native protocol version 5 we deserialize the flags in 
> {{org.apache.cassandra.cql3.QueryOptions.Codec#decode}} as follows:
> {code}
> EnumSet flags = 
> Flag.deserialize(version.isGreaterOrEqualTo(ProtocolVersion.V5)
>? 
> (int)body.readUnsignedInt()
>: (int)body.readByte());
> {code}
> This works until the highest bit (0x80) is not used. {{readByte}} must be 
> changed to {{readUnsignedByte}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CASSANDRA-13445) validation executor thread is stuck

2017-04-13 Thread Roland Otta (JIRA)
Roland Otta created CASSANDRA-13445:
---

 Summary: validation executor thread is stuck
 Key: CASSANDRA-13445
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13445
 Project: Cassandra
  Issue Type: Bug
  Components: Compaction
 Environment: cassandra 3.10
Reporter: Roland Otta


We have the following issue on our 3.10 development cluster.

Sometimes the repairs (it is a full repair in that case) hang because of a stuck validation compaction.

nodetool compactionstats says
a1bb45c0-1fc6-11e7-81de-0fb0b3f5a345 Validation  bds  ad_event 805955242 841258085 bytes 95.80%
and there is no more progress at this percentage.

I checked the logs on the affected node and could not find any suspicious errors.

A thread dump shows that the validation executor thread is always repeating stuff in org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235).

Here is the full stack trace:

{noformat}
com.github.benmanes.caffeine.cache.BoundedLocalCache$$Lambda$64/2098345091.accept(Unknown Source)
com.github.benmanes.caffeine.cache.BoundedBuffer$RingBuffer.drainTo(BoundedBuffer.java:104)
com.github.benmanes.caffeine.cache.StripedBuffer.drainTo(StripedBuffer.java:160)
com.github.benmanes.caffeine.cache.BoundedLocalCache.drainReadBuffer(BoundedLocalCache.java:964)
com.github.benmanes.caffeine.cache.BoundedLocalCache.maintenance(BoundedLocalCache.java:918)
com.github.benmanes.caffeine.cache.BoundedLocalCache.performCleanUp(BoundedLocalCache.java:903)
com.github.benmanes.caffeine.cache.BoundedLocalCache$PerformCleanupTask.run(BoundedLocalCache.java:2680)
com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
com.github.benmanes.caffeine.cache.BoundedLocalCache.scheduleDrainBuffers(BoundedLocalCache.java:875)
com.github.benmanes.caffeine.cache.BoundedLocalCache.afterRead(BoundedLocalCache.java:748)
com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1783)
com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:59)
org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:88)
org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:66)
org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:420)
org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:610)
org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:575)
org.apache.cassandra.db.rows.UnfilteredSerializer$$Lambda$84/898489541.accept(Unknown Source)
org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1222)
org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1177)
org.apache.cassandra.db.Columns.apply(Columns.java:377)
org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:571)
org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:440)
org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:95)
org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:73)
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:100)
org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:374)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:186)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:155)
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterato

[jira] [Commented] (CASSANDRA-13257) Add repair streaming preview

2017-04-13 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967447#comment-15967447
 ] 

Marcus Eriksson commented on CASSANDRA-13257:
-

code LGTM, just a few small comments;
* {{\-p}} (short for {{--preview}}) clashes with {{-p}} for port
* logging - we should make it clear that we are doing a preview repair, perhaps 
replace the prefix {{\[repair #{}\] ...}} with {{\[preview repair #{}\]}}?
* Use {{FBUtilities.prettyPrintMemory}} when displaying the result?
* Seems we still insert into {{system_distributed.repair_history}} in a few 
places during a preview, we should probably avoid that
* Log the result as well as outputting it to stdout - if we ctrl+c the command 
we could still read the result
* A few dtests running these new commands so we don't break them in the future

> Add repair streaming preview
> 
>
> Key: CASSANDRA-13257
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13257
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Streaming and Messaging
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
> Fix For: 4.0
>
>
> It would be useful to be able to estimate the amount of repair streaming that 
> needs to be done, without actually doing any streaming. Our main motivation 
> for having something like this is validating CASSANDRA-9143 in production, 
> but I’d imagine it could also be a useful tool in troubleshooting.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


cassandra git commit: Pending repair info was added in 4.0

2017-04-13 Thread marcuse
Repository: cassandra
Updated Branches:
  refs/heads/trunk 7da45312a -> 02ded0194


Pending repair info was added in 4.0

Patch by marcuse; reviewed by Blake Eggleston for CASSANDRA-13420


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/02ded019
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/02ded019
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/02ded019

Branch: refs/heads/trunk
Commit: 02ded019475efdd1331ac908c33dd05d42ba9368
Parents: 7da4531
Author: Marcus Eriksson 
Authored: Wed Apr 5 09:29:07 2017 +0200
Committer: Marcus Eriksson 
Committed: Thu Apr 13 13:38:49 2017 +0200

--
 .../org/apache/cassandra/io/sstable/format/big/BigFormat.java   | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/02ded019/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
--
diff --git a/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java b/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
index 9b0b5c5..a58b201 100644
--- a/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
+++ b/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
@@ -119,9 +119,8 @@ public class BigFormat implements SSTableFormat
 // store rows natively
 // mb (3.0.7, 3.7): commit log lower bound included
 // mc (3.0.8, 3.9): commit log intervals included
-// md (3.0.9, 3.10): pending repair session included
 
-// na (4.0.0): uncompressed chunks
+// na (4.0.0): uncompressed chunks, pending repair session
 //
 // NOTE: when adding a new version, please add that to LegacySSTableTest, too.
 
@@ -142,7 +141,7 @@ public class BigFormat implements SSTableFormat
 hasCommitLogLowerBound = version.compareTo("mb") >= 0;
 hasCommitLogIntervals = version.compareTo("mc") >= 0;
 hasMaxCompressedLength = version.compareTo("na") >= 0;
-hasPendingRepair = version.compareTo("md") >= 0;
+hasPendingRepair = version.compareTo("na") >= 0;
 }
 
 @Override



[jira] [Updated] (CASSANDRA-13420) Pending repair info was added in 4.0

2017-04-13 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-13420:

   Resolution: Fixed
Fix Version/s: (was: 4.x)
   4.0
   Status: Resolved  (was: Ready to Commit)

> Pending repair info was added in 4.0
> 
>
> Key: CASSANDRA-13420
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13420
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 4.0
>
>
> Pending repair information was actually added in 4.0
> https://github.com/krummas/cassandra/commits/marcuse/pendingrepairversion



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13420) Pending repair info was added in 4.0

2017-04-13 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967459#comment-15967459
 ] 

Marcus Eriksson commented on CASSANDRA-13420:
-

and committed, thanks

> Pending repair info was added in 4.0
> 
>
> Key: CASSANDRA-13420
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13420
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 4.0
>
>
> Pending repair information was actually added in 4.0
> https://github.com/krummas/cassandra/commits/marcuse/pendingrepairversion



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-13 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967484#comment-15967484
 ] 

Christian Esken commented on CASSANDRA-13265:
-

There were different reasons why the build failed, e.g. somehow Eclipse did not 
pick up the build parameters for 2.2 after "ant generate-eclipse-files" and the 
build was done with Java 8 language level (lambdas). Looks like building and 
testing in Eclipse alone is not enough, so I redid everything manually in the 
console and fixed the issues. As you recommended, I have created branches that 
follow your naming  (cassandra-13625-3.0) with squashed commits. The new 
branches are:

https://github.com/christian-esken/cassandra/commits/cassandra-13625-2.2
https://github.com/christian-esken/cassandra/commits/cassandra-13625-3.11
https://github.com/christian-esken/cassandra/commits/cassandra-13625-3.0
https://github.com/christian-esken/cassandra/commits/cassandra-13625-trunk

About CHANGES.TXT: I added changes to  all branches where in the appropriate 
versions. Please check, as the naming conventions within Cassandra are still 
not clear to me(e.g. there exists a 3.11 branch, a 3.0.11 release and a 3.11.0 
changelog entry).

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 3.0.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going into details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A thread dump in this situation showed 324 threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of this? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> threads fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: only after 262508 locking operations can it progress with actually 
> writing to the Queue.
> - Reading: is also blocked, as 324 threads try to do iterator.next() and 
> fully lock the Queue.
> This means: writing blocks the Queue for reading, and readers might even be 
> starved, which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
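To make the locking pattern described in the report concrete, here is a small stand-alone sketch. It assumes the backlog behaves like a java.util.concurrent.LinkedBlockingQueue, whose iterator acquires both the put lock and the take lock on every next()/remove(); the thread count, queue size and timeout below are made-up placeholders, not values from the report.

{code}
import java.util.Iterator;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class BacklogScanContentionSketch
{
    public static void main(String[] args)
    {
        LinkedBlockingQueue<long[]> backlog = new LinkedBlockingQueue<>();
        for (int i = 0; i < 250_000; i++)
            backlog.offer(new long[]{ System.nanoTime() });

        // Many "expiration" threads, each walking the queue through its iterator.
        // LinkedBlockingQueue's iterator fully locks the queue on each next()/remove(),
        // so these scans serialize against every offer() and poll() as well.
        for (int t = 0; t < 64; t++)
        {
            Thread scanner = new Thread(() -> {
                while (true)
                    for (Iterator<long[]> it = backlog.iterator(); it.hasNext(); )
                        if (System.nanoTime() - it.next()[0] > TimeUnit.SECONDS.toNanos(10))
                            it.remove();   // drop an "expired" entry
            });
            scanner.setDaemon(true);
            scanner.start();
        }

        // A single writer now has to wait for those scans before each offer() succeeds.
        long start = System.nanoTime();
        for (int i = 0; i < 10_000; i++)
            backlog.offer(new long[]{ System.nanoTime() });
        System.out.printf("10k offers took %d ms while 64 threads scan the backlog%n",
                          TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start));
    }
}
{code}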


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-13 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967484#comment-15967484
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 4/13/17 11:59 AM:
---

There were different reasons why the build failed, e.g. somehow Eclipse did not 
pick up the build parameters for 2.2 after "ant generate-eclipse-files" and the 
build was done with Java 8 language level (lambdas). Looks like building and 
testing in Eclipse alone is not enough, so I redid everything manually in the 
console and fixed the issues. As you recommended, I have created branches that 
follow your naming  (cassandra-13625-3.0) with squashed commits. The new 
branches are:

https://github.com/christian-esken/cassandra/commits/cassandra-13625-2.2
https://github.com/christian-esken/cassandra/commits/cassandra-13625-3.11
https://github.com/christian-esken/cassandra/commits/cassandra-13625-3.0
https://github.com/christian-esken/cassandra/commits/cassandra-13625-trunk

About CHANGES.TXT: I added changes in the "matching" release versions that were 
listed in the individual branches. Please check, as the naming conventions 
within Cassandra are still not clear to me (e.g. there exists a 3.11 branch, a 
3.0.11 release and a 3.11.0 changelog entry).


was (Author: cesken):
There were different reasons why the build failed, e.g. somehow Eclipse did not 
pick up the build parameters for 2.2 after "ant generate-eclipse-files" and the 
build was done with Java 8 language level (lambdas). Looks like building and 
testing in Eclipse alone is not enough, so I redid everything manually in the 
console and fixed the issues. As you recommended, I have created branches that 
follow your naming  (cassandra-13625-3.0) with squashed commits. The new 
branches are:

https://github.com/christian-esken/cassandra/commits/cassandra-13625-2.2
https://github.com/christian-esken/cassandra/commits/cassandra-13625-3.11
https://github.com/christian-esken/cassandra/commits/cassandra-13625-3.0
https://github.com/christian-esken/cassandra/commits/cassandra-13625-trunk

About CHANGES.TXT: I added changes to  all branches where in the appropriate 
versions. Please check, as the naming conventions within Cassandra are still 
not clear to me(e.g. there exists a 3.11 branch, a 3.0.11 release and a 3.11.0 
changelog entry).

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 3.0.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going into details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A thread dump in this situation showed 324 threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of this? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> threads fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: only after 262508 locking operations can it progress with actually 
> writing to the Queue.
> - Reading: is also blocked, as 324 threads try to do iterator.next() and 
> fully lock the Queue.
> This means: writing blocks the Queue for reading, and readers might even be 
> starved, which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-8457) nio MessagingService

2017-04-13 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967513#comment-15967513
 ] 

Jason Brown commented on CASSANDRA-8457:


[~aweisberg] and I spoke with Scott Mitchell of the netty team and we have a 
potential solution better than what I put in the last comment. The TL;DR is to 
add an iterator to netty's {{ChannelOutboundBuffer}} and walk the "queue" 
looking for expired items. It requires a patch to netty (which I'm working on), 
but it should be straightforward. More details soon...

> nio MessagingService
> 
>
> Key: CASSANDRA-8457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jonathan Ellis
>Assignee: Jason Brown
>Priority: Minor
>  Labels: netty, performance
> Fix For: 4.x
>
>
> Thread-per-peer (actually two each incoming and outbound) is a big 
> contributor to context switching, especially for larger clusters.  Let's look 
> at switching to nio, possibly via Netty.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-8457) nio MessagingService

2017-04-13 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967568#comment-15967568
 ] 

Sylvain Lebresne commented on CASSANDRA-8457:
-

bq. I think it's important that a single slow node or network issue resulting 
in a socket that isn't writable shouldn't allow an arbitrary amount of data to 
collect on the heap. Right now there is nothing that can drop the data in that 
scenario.

I don't necessarily disagree with that somewhat general statement, but I'm far 
from convinced that checking for expired messages is the right tool for the job 
in the first place. The fact is that expiration is time-based and the default 
timeouts are multiples of seconds, so there is plenty of time for messages to 
accumulate and blow the heap without any of them being droppable. On top of 
that, not all messages have timeouts, which actually makes sense because the 
message timeout isn't a back-pressure mechanism; it's about how long we're 
willing to wait for an answer to a request message, and hence one-way messages 
have no reason to have such a timeout. And that's part of the point: I dislike 
using a concept that isn't meant to be related to back-pressure to do 
back-pressure, especially when it's as flawed as this one. Users shouldn't have 
to worry that nodes could OOM because they set their write timeout high; it's 
just not intuitive.

Don't get me wrong, I don't disagree that some back-pressure mechanism should 
be added for that problem, but it should be based more on the amount of message 
data (or, at the very least, the number of such messages) in the Netty queue. 
Surely we're not the only ones facing this problem, though; doesn't Netty 
already have a standard way to deal with messages piling up in its queues?

> nio MessagingService
> 
>
> Key: CASSANDRA-8457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jonathan Ellis
>Assignee: Jason Brown
>Priority: Minor
>  Labels: netty, performance
> Fix For: 4.x
>
>
> Thread-per-peer (actually two each incoming and outbound) is a big 
> contributor to context switching, especially for larger clusters.  Let's look 
> at switching to nio, possibly via Netty.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-8457) nio MessagingService

2017-04-13 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967586#comment-15967586
 ] 

Jason Brown commented on CASSANDRA-8457:


bq. doesn't Netty already have a standard way to deal with that problem of 
messages piling up in its queues?

Netty has a high/low water mark mechanism that looks at the number of bytes in 
the channel and sends a "writability changed" event through the channel once one 
of those thresholds has been reached. I'm currently using that feature to know 
when we've hit a decent amount of buffered data before we explicitly call flush.

We could expand this to say "if there's greater than  number of 
bytes in the channel, drop older messages"

> nio MessagingService
> 
>
> Key: CASSANDRA-8457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jonathan Ellis
>Assignee: Jason Brown
>Priority: Minor
>  Labels: netty, performance
> Fix For: 4.x
>
>
> Thread-per-peer (actually two each incoming and outbound) is a big 
> contributor to context switching, especially for larger clusters.  Let's look 
> at switching to nio, possibly via Netty.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Issue Comment Deleted] (CASSANDRA-8457) nio MessagingService

2017-04-13 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-8457:
---
Comment: was deleted

(was: bq. doesn't Netty already have a standard way to deal with that problem 
of messages piling up in its queues?

Netty has a high/low water mark mechanism that looks at the number of bytes in 
the channel and sends a "writablility changed" event through channel once one 
of those thresholds has been reached. I'm currently using that feature to know 
when we've hit a decent amount of buffered data before we explicitly call flush.

We could expand this to say "if there's greater than  number of 
bytes in the channel, drop older messages")

> nio MessagingService
> 
>
> Key: CASSANDRA-8457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jonathan Ellis
>Assignee: Jason Brown
>Priority: Minor
>  Labels: netty, performance
> Fix For: 4.x
>
>
> Thread-per-peer (actually two each incoming and outbound) is a big 
> contributor to context switching, especially for larger clusters.  Let's look 
> at switching to nio, possibly via Netty.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CASSANDRA-13446) CQLSSTableWriter takes 100% CPU when the buffer_size_in_mb is larger than 64MB

2017-04-13 Thread xiangdong Huang (JIRA)
xiangdong Huang created CASSANDRA-13446:
---

 Summary: CQLSSTableWriter takes 100% CPU when the 
buffer_size_in_mb is larger than 64MB 
 Key: CASSANDRA-13446
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13446
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: Windows 10, 8GB memory, i7 CPU
Reporter: xiangdong Huang
 Attachments: csv2sstable.java, pom.xml, test.csv

I want to use CQLSSTableWriter to load large amounts of data as SSTables; 
however, the CPU cost and the speed are not good.
```java
CQLSSTableWriter writer = CQLSSTableWriter.builder()
        .inDirectory(new File("output" + j))
        .forTable(SCHEMA)
        .withBufferSizeInMB(Integer.parseInt(System.getProperty("buffer_size_in_mb", "256"))) // FIXME!! if the size is 64, it is ok; if it is 128 or larger, boom!!
        .using(INSERT_STMT)
        .withPartitioner(new Murmur3Partitioner())
        .build();
```
If the `buffer_size_in_mb` is less than 64MB on my PC, everything is ok: the 
CPU utilization is about 60% and the memory is about 3GB (why 3GB? Luckily, I 
can bear that...). The process creates 24MB sstables (I think it is because the 
sstable compresses data) one by one.

However, if the `buffer_size_in_mb` is greater, e.g., 128MB on my PC, the CPU 
utilization is about 70% and the memory is still about 3GB.
When the CQLSSTableWriter receives 128MB of data, it begins to flush the data as 
an sstable. At this time, the bad thing comes:
CQLSSTableWriter.addRow() becomes very slow, and NO SSTABLE IS WRITTEN. The 
Windows task manager shows the disk I/O for this process is 0.0 MB/s. No file 
appears in the output folder (sometimes a _zero-KB mc-1-big-Data.db_ and a 
_zero-KB mc-1-big-Index.db_ appear, and some transaction log file comes and 
disappears...). At this time, the process uses 99% CPU and the memory is a 
little larger than 3GB.
A long time later, the process crashes because of "GC overhead...", and there 
is still no sstable file built.

When I use jprofile 10 to check what uses so much CPU, it says 
CQLSSTableWriter.addRow() takes about 99% of the CPU.

I have no idea how to optimize the process, because Cassandra's SSTable writing 
process is so complex...

The important thing is, a 64MB buffer size is too small in production 
environments: it creates many 24MB SSTables, but we want a large sstable which 
can hold all the data in the batch load process.

Now I wonder whether Spark and MapReduce work well with Cassandra, because when 
I glance at the source code, I notice that they also use CQLSSTableWriter to 
save output data.

The Cassandra version is 3.10. The DataStax driver (for typec) is 3.2.0.

The attachment is my test program and the CSV data.
A complete test program can be found at:
https://bitbucket.org/jixuan1989/csv2sstable


 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
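For readers who want to reproduce this kind of loader, a minimal, self-contained sketch follows; the schema, insert statement and row values are hypothetical placeholders (not the reporter's attachment), and the builder calls assume the CQLSSTableWriter API as it exists in 3.10.

{code}
import java.io.File;
import java.io.IOException;

import org.apache.cassandra.io.sstable.CQLSSTableWriter;

public class BulkWriteSketch
{
    // Hypothetical table and statement, only for illustration.
    private static final String SCHEMA =
        "CREATE TABLE ks.events (id text, ts bigint, value double, PRIMARY KEY (id, ts))";
    private static final String INSERT_STMT =
        "INSERT INTO ks.events (id, ts, value) VALUES (?, ?, ?)";

    public static void main(String[] args) throws IOException
    {
        // How much data is buffered in memory before a new sstable is flushed.
        int bufferMb = Integer.parseInt(System.getProperty("buffer_size_in_mb", "64"));

        CQLSSTableWriter writer = CQLSSTableWriter.builder()
                                                  .inDirectory(new File("output"))
                                                  .forTable(SCHEMA)
                                                  .using(INSERT_STMT)
                                                  .withBufferSizeInMB(bufferMb)
                                                  .build();
        try
        {
            for (long ts = 0; ts < 1_000_000; ts++)
                writer.addRow("sensor-1", ts, Math.random());   // one CQL row per call
        }
        finally
        {
            writer.close();   // flushes the remaining buffer and finalizes the sstable components
        }
    }
}
{code}

Each time roughly buffer_size_in_mb of row data has accumulated, the writer closes the current sstable and starts a new one, which is why a small buffer produces the many small sstables described in the report.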


[jira] [Updated] (CASSANDRA-13446) CQLSSTableWriter takes 100% CPU when the buffer_size_in_mb is larger than 64MB

2017-04-13 Thread xiangdong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiangdong Huang updated CASSANDRA-13446:

Description: 
I want to use CQLSSTableWriter to load large amounts of data as SSTables, 
however the CPU cost and the speed is not good.
```
CQLSSTableWriter writer = CQLSSTableWriter.builder()
.inDirectory(new File("output"+j))
.forTable(SCHEMA)

.withBufferSizeInMB(Integer.parseInt(System.getProperty("buffer_size_in_mb", 
"256")))//FIXME!! if the size is 64, it is ok, if it is 128 or larger, boom!!
.using(INSERT_STMT)
.withPartitioner(new Murmur3Partitioner()).build();
```
if the `buffer_size_in_mb` is less than 64MB in my  PC, everything is ok: the 
CPU utilization is about 60% and the memory is about 3GB (why 3GB? Luckly, I 
can bear that...).  The process creates 24MB per sstable (I think it is because 
sstable compresses data) one by one.

However, if the `buffer_size_in_mb` is greater, e.g., 128MB on my PC,  The CPU 
utilization is about 70%, the memory is still about 3GB.
When the CQLSSTableWriter receives 128MB data, it begins to flush data as a 
sstable. At this time, the bad thing comes:
CQLSSTableWriter.addRow() becomes very slow, and!! NO SSTABLE IS WRITTEN. 
Windows task manager shows the disk I/O for this process is 0.0 MB/s.  There is 
no file appears in the output folder (Sometimes a _zero-KB mc-1-big-Data.db_ 
and a _zero-KB mc-1-big-Index.db_ appear, and some transaction log file comes 
and disappears..). At this time, the process spends 99% CPU! and the memory is 
a little larger than 3GB
Long long time later, the process crashes because of "GC overhead...", and 
there is still no sstable file built.

When I use jprofile 10 to check who uses so much CPU, it says 
CQLSSTableWriter.addRow() takes about 99% CPU

I have no idea to optimize the process, because Cassandra's SStable writing 
process is so complex...

The important thing is, 64MB buffer size is too small in production 
environments: it creates many 24MB SSTables, but we want a large sstable which 
can hold all the data in the batch load process. 

Now I wonder whether Spark and MapReduce work well with Cassandra, because when 
I have a glance of the source code, I notice that they also use 
CQLSSTableWriter to save output data

The  cassandra version is 3.10. The datastax driver (for typec) is 3.2.0.

The attachment is my test program and the csv data. 
A complete test program can be found from: 
https://bitbucket.org/jixuan1989/csv2sstable


 

  was:
I want to use CQLSSTableWriter to load large amounts of data as SSTables, 
however the CPU cost and the speed is not good.
```java
CQLSSTableWriter writer = CQLSSTableWriter.builder()
.inDirectory(new File("output"+j))
.forTable(SCHEMA)

.withBufferSizeInMB(Integer.parseInt(System.getProperty("buffer_size_in_mb", 
"256")))//FIXME!! if the size is 64, it is ok, if it is 128 or larger, boom!!
.using(INSERT_STMT)
.withPartitioner(new Murmur3Partitioner()).build();
```
if the `buffer_size_in_mb` is less than 64MB in my  PC, everything is ok: the 
CPU utilization is about 60% and the memory is about 3GB (why 3GB? Luckly, I 
can bear that...).  The process creates 24MB per sstable (I think it is because 
sstable compresses data) one by one.

However, if the `buffer_size_in_mb` is greater, e.g., 128MB on my PC,  The CPU 
utilization is about 70%, the memory is still about 3GB.
When the CQLSSTableWriter receives 128MB data, it begins to flush data as a 
sstable. At this time, the bad thing comes:
CQLSSTableWriter.addRow() becomes very slow, and!! NO SSTABLE IS WRITTEN. 
Windows task manager shows the disk I/O for this process is 0.0 MB/s.  There is 
no file appears in the output folder (Sometimes a _zero-KB mc-1-big-Data.db_ 
and a _zero-KB mc-1-big-Index.db_ appear, and some transaction log file comes 
and disappears..). At this time, the process spends 99% CPU! and the memory is 
a little larger than 3GB
Long long time later, the process crashes because of "GC overhead...", and 
there is still no sstable file built.

When I use jprofile 10 to check who uses so much CPU, it says 
CQLSSTableWriter.addRow() takes about 99% CPU

I have no idea to optimize the process, because Cassandra's SStable writing 
process is so complex...

The important thing is, 64MB buffer size is too small in production 
environments: it creates many 24MB SSTables, but we want a large sstable which 
can hold all the data in the batch load process. 

Now I wonder whether Spark and MapReduce work well with Cassandra, because when 
I have a glance of the source code, I notice that they also use 
CQLSSTableWriter to save output data

The  cassandra version is 3.10. The datastax driver (for typec) is 3.2.0.

The at

[jira] [Updated] (CASSANDRA-13446) CQLSSTableWriter takes 100% CPU when the buffer_size_in_mb is larger than 64MB

2017-04-13 Thread xiangdong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiangdong Huang updated CASSANDRA-13446:

Description: 
I want to use CQLSSTableWriter to load large amounts of data as SSTables, 
however the CPU cost and the speed is not good.
```
CQLSSTableWriter writer = CQLSSTableWriter.builder()
.inDirectory(new File("output"+j))
.forTable(SCHEMA)

.withBufferSizeInMB(Integer.parseInt(System.getProperty("buffer_size_in_mb", 
"256")))//FIXME!! if the size is 64, it is ok, if it is 128 or larger, boom!!
.using(INSERT_STMT)
.withPartitioner(new Murmur3Partitioner()).build();
```
if the `buffer_size_in_mb` is less than 64MB in my  PC, everything is ok: the 
CPU utilization is about 60% and the memory is about 3GB (why 3GB? Luckly, I 
can bear that...).  The process creates 24MB per sstable (I think it is because 
sstable compresses data) one by one.

However, if the `buffer_size_in_mb` is greater, e.g., 128MB on my PC,  The CPU 
utilization is about 70%, the memory is still about 3GB.
When the CQLSSTableWriter receives 128MB data, it begins to flush data as a 
sstable. At this time, the bad thing comes:
CQLSSTableWriter.addRow() becomes very slow, and NO SSTABLE IS WRITTEN. Windows 
task manager shows the disk I/O for this process is 0.0 MB/s.  There is no file 
appears in the output folder (Sometimes a _zero-KB mc-1-big-Data.db_ and a 
_zero-KB mc-1-big-Index.db_ appear, and some transaction log file comes and 
disappears..). At this time, the process spends 99% CPU! and the memory is a 
little larger than 3GB
Long long time later, the process crashes because of "GC overhead...", and 
there is still no sstable file built.

When I use jprofile 10 to check who uses so much CPU, it says 
CQLSSTableWriter.addRow() takes about 99% CPU

I have no idea to optimize the process, because Cassandra's SStable writing 
process is so complex...

The important thing is, 64MB buffer size is too small in production 
environments: it creates many 24MB SSTables, but we want a large sstable which 
can hold all the data in the batch load process. 

Now I wonder whether Spark and MapReduce work well with Cassandra, because when 
I have a glance of the source code, I notice that they also use 
CQLSSTableWriter to save output data

The  cassandra version is 3.10. The datastax driver (for typec) is 3.2.0.

The attachment is my test program and the csv data. 
A complete test program can be found from: 
https://bitbucket.org/jixuan1989/csv2sstable


 

  was:
I want to use CQLSSTableWriter to load large amounts of data as SSTables, 
however the CPU cost and the speed is not good.
```
CQLSSTableWriter writer = CQLSSTableWriter.builder()
.inDirectory(new File("output"+j))
.forTable(SCHEMA)

.withBufferSizeInMB(Integer.parseInt(System.getProperty("buffer_size_in_mb", 
"256")))//FIXME!! if the size is 64, it is ok, if it is 128 or larger, boom!!
.using(INSERT_STMT)
.withPartitioner(new Murmur3Partitioner()).build();
```
if the `buffer_size_in_mb` is less than 64MB in my  PC, everything is ok: the 
CPU utilization is about 60% and the memory is about 3GB (why 3GB? Luckly, I 
can bear that...).  The process creates 24MB per sstable (I think it is because 
sstable compresses data) one by one.

However, if the `buffer_size_in_mb` is greater, e.g., 128MB on my PC,  The CPU 
utilization is about 70%, the memory is still about 3GB.
When the CQLSSTableWriter receives 128MB data, it begins to flush data as a 
sstable. At this time, the bad thing comes:
CQLSSTableWriter.addRow() becomes very slow, and!! NO SSTABLE IS WRITTEN. 
Windows task manager shows the disk I/O for this process is 0.0 MB/s.  There is 
no file appears in the output folder (Sometimes a _zero-KB mc-1-big-Data.db_ 
and a _zero-KB mc-1-big-Index.db_ appear, and some transaction log file comes 
and disappears..). At this time, the process spends 99% CPU! and the memory is 
a little larger than 3GB
Long long time later, the process crashes because of "GC overhead...", and 
there is still no sstable file built.

When I use jprofile 10 to check who uses so much CPU, it says 
CQLSSTableWriter.addRow() takes about 99% CPU

I have no idea to optimize the process, because Cassandra's SStable writing 
process is so complex...

The important thing is, 64MB buffer size is too small in production 
environments: it creates many 24MB SSTables, but we want a large sstable which 
can hold all the data in the batch load process. 

Now I wonder whether Spark and MapReduce work well with Cassandra, because when 
I have a glance of the source code, I notice that they also use 
CQLSSTableWriter to save output data

The  cassandra version is 3.10. The datastax driver (for typec) is 3.2.0.

The attachme

[jira] [Updated] (CASSANDRA-13446) CQLSSTableWriter takes 100% CPU when the buffer_size_in_mb is larger than 64MB

2017-04-13 Thread xiangdong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiangdong Huang updated CASSANDRA-13446:

Description: 
I want to use CQLSSTableWriter to load large amounts of data as SSTables, 
however the CPU cost and the speed is not good.
```
CQLSSTableWriter writer = CQLSSTableWriter.builder()
.inDirectory(new File("output"+j))
.forTable(SCHEMA)

.withBufferSizeInMB(Integer.parseInt(System.getProperty("buffer_size_in_mb", 
"256")))//FIXME!! if the size is 64, it is ok, if it is 128 or larger, boom!!
.using(INSERT_STMT)
.withPartitioner(new Murmur3Partitioner()).build();
```
if the `buffer_size_in_mb` is less than 64MB in my  PC, everything is ok: the 
CPU utilization is about 60% and the memory is about 3GB (why 3GB? Luckly, I 
can bear that...).  The process creates 24MB per sstable (I think it is because 
sstable compresses data) one by one.

However, if the `buffer_size_in_mb` is greater, e.g., 128MB on my PC,  The CPU 
utilization is about 70%, the memory is still about 3GB.
When the CQLSSTableWriter receives 128MB data, it begins to flush data as a 
sstable. At this time, the bad thing comes:
CQLSSTableWriter.addRow() becomes very slow, and NO SSTABLE IS WRITTEN. Windows 
task manager shows the disk I/O for this process is 0.0 MB/s.  There is no file 
appears in the output folder (Sometimes a _zero-KB mc-1-big-Data.db_ and a 
_zero-KB mc-1-big-Index.db_ appear, and some transaction log file comes and 
disappears..). At this time, the process spends 99% CPU! and the memory is a 
little larger than 3GB
Long long time later, the process crashes because of "GC overhead...", and 
there is still no sstable file built.

When I use jprofile 10 to check who uses so much CPU, it says 
CQLSSTableWriter.addRow() takes about 99% CPU, other threads (Thread-1 and 
ScheduledTasks are waiting...)

I have no idea to optimize the process, because Cassandra's SStable writing 
process is so complex...

The important thing is, 64MB buffer size is too small in production 
environments: it creates many 24MB SSTables, but we want a large sstable which 
can hold all the data in the batch load process. 

Now I wonder whether Spark and MapReduce work well with Cassandra, because when 
I have a glance of the source code, I notice that they also use 
CQLSSTableWriter to save output data

The  cassandra version is 3.10. The datastax driver (for typec) is 3.2.0.

The attachment is my test program and the csv data. 
A complete test program can be found from: 
https://bitbucket.org/jixuan1989/csv2sstable


 

  was:
I want to use CQLSSTableWriter to load large amounts of data as SSTables, 
however the CPU cost and the speed is not good.
```
CQLSSTableWriter writer = CQLSSTableWriter.builder()
.inDirectory(new File("output"+j))
.forTable(SCHEMA)

.withBufferSizeInMB(Integer.parseInt(System.getProperty("buffer_size_in_mb", 
"256")))//FIXME!! if the size is 64, it is ok, if it is 128 or larger, boom!!
.using(INSERT_STMT)
.withPartitioner(new Murmur3Partitioner()).build();
```
if the `buffer_size_in_mb` is less than 64MB in my  PC, everything is ok: the 
CPU utilization is about 60% and the memory is about 3GB (why 3GB? Luckly, I 
can bear that...).  The process creates 24MB per sstable (I think it is because 
sstable compresses data) one by one.

However, if the `buffer_size_in_mb` is greater, e.g., 128MB on my PC,  The CPU 
utilization is about 70%, the memory is still about 3GB.
When the CQLSSTableWriter receives 128MB data, it begins to flush data as a 
sstable. At this time, the bad thing comes:
CQLSSTableWriter.addRow() becomes very slow, and NO SSTABLE IS WRITTEN. Windows 
task manager shows the disk I/O for this process is 0.0 MB/s.  There is no file 
appears in the output folder (Sometimes a _zero-KB mc-1-big-Data.db_ and a 
_zero-KB mc-1-big-Index.db_ appear, and some transaction log file comes and 
disappears..). At this time, the process spends 99% CPU! and the memory is a 
little larger than 3GB
Long long time later, the process crashes because of "GC overhead...", and 
there is still no sstable file built.

When I use jprofile 10 to check who uses so much CPU, it says 
CQLSSTableWriter.addRow() takes about 99% CPU

I have no idea to optimize the process, because Cassandra's SStable writing 
process is so complex...

The important thing is, 64MB buffer size is too small in production 
environments: it creates many 24MB SSTables, but we want a large sstable which 
can hold all the data in the batch load process. 

Now I wonder whether Spark and MapReduce work well with Cassandra, because when 
I have a glance of the source code, I notice that they also use 
CQLSSTableWriter to save output data

The  cassandra version is 3.

[jira] [Commented] (CASSANDRA-8457) nio MessagingService

2017-04-13 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967621#comment-15967621
 ] 

Jason Brown commented on CASSANDRA-8457:


bq. doesn't Netty already have a standard way to deal with that problem of 
messages piling up in its queues?

Netty has a high/low water mark mechanism that looks at the number of bytes in 
the channel and sends a "writability changed" event through the channel once one 
of those thresholds has been reached. I'm currently using that feature in 
{{MessageOutHandler#channelWritabilityChanged()}} to know when we've hit a 
decent amount of buffered data before we explicitly call flush. Beyond this, I 
do not think netty has any other explicit back-pressure mechanism built in 
(like a handler or something similar).

We could expand our use of the high/low water mark to say "if there's greater 
than  number of bytes in the channel, drop 'some' messages". If 
we want to drop older messages for which we feel the client has (or reasonably 
will have) timed out, we'll have to do something like what I've proposed in my 
most recent comments (and which I'm working on right now). This message 
expiration behavior can then become not only about timeouts (I still think 
there's reasonable use in that), but also about protecting the size of data in 
the channel.

One other thing we can do is bound the number of tasks that can be queued 
into the channel 
([{{SingleThreadEventLoop#DEFAULT_MAX_PENDING_TASKS}}|https://github.com/netty/netty/blob/4.1/transport/src/main/java/io/netty/channel/SingleThreadEventLoop.java#L35]).
 I quickly traced the netty code, and I *think* a 
{{RejectedExecutionException}} is thrown when you try to add a message to a 
channel which is filled to its capacity. I'm not sure we want to use this as 
the only back-pressure mechanism but, as unbounded queues are awful (the 
default queue size is {{Integer.MAX_VALUE}}), it might not be a bad idea to 
bound this to at least something sane (16k-32k as an upper bound can't be 
unreasonable for a single channel). This, of course, would require us to be more 
resilient to dropped messages on the enqueuing side, which is probably a good 
idea anyway.
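For reference, a minimal sketch of the high/low water mark mechanism being discussed, using public Netty 4.1 APIs; the thresholds and the handler's reaction below are illustrative placeholders, not what Cassandra does or will do.

{code}
import io.netty.bootstrap.Bootstrap;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;
import io.netty.channel.ChannelOption;
import io.netty.channel.WriteBufferWaterMark;

public class WritabilitySketch extends ChannelInboundHandlerAdapter
{
    // Hypothetical thresholds: the channel becomes unwritable above 1 MiB of pending bytes,
    // and writable again once the backlog drains below 256 KiB.
    static final WriteBufferWaterMark WATER_MARK = new WriteBufferWaterMark(256 * 1024, 1024 * 1024);

    static void configure(Bootstrap bootstrap)
    {
        bootstrap.option(ChannelOption.WRITE_BUFFER_WATER_MARK, WATER_MARK);
    }

    @Override
    public void channelWritabilityChanged(ChannelHandlerContext ctx) throws Exception
    {
        if (!ctx.channel().isWritable())
        {
            // Above the high water mark: flush what is pending and have the producer
            // stop enqueueing (or start dropping old/expired messages).
            ctx.flush();
        }
        // Once below the low water mark again, isWritable() returns true and writing can resume.
        super.channelWritabilityChanged(ctx);
    }
}
{code}

The handler has to be added to the channel pipeline to receive the writability events; what to do while the channel is unwritable (block, buffer, or drop) is exactly the policy question being debated above.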

> nio MessagingService
> 
>
> Key: CASSANDRA-8457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jonathan Ellis
>Assignee: Jason Brown
>Priority: Minor
>  Labels: netty, performance
> Fix For: 4.x
>
>
> Thread-per-peer (actually two each incoming and outbound) is a big 
> contributor to context switching, especially for larger clusters.  Let's look 
> at switching to nio, possibly via Netty.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13066) Fast streaming with materialized views

2017-04-13 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967642#comment-15967642
 ] 

Paulo Motta commented on CASSANDRA-13066:
-

While it may make sense to pursue this optimization, I'm not sure adding a 
{{mv_fast_stream}} option is the best way to expose this to general usage, for 
the following reasons:
a) It has a limited scope and requires users to know the streaming internals of 
MVs to enable it, so it's not very friendly.
b) It has significant foot-shooting potential when users enable this and 
perform partial writes or updates to existing rows, so users may enable it 
thinking fast=good without thinking of the consequences.

It basically boils down to this Sylvain's comment on CASSANDRA-9779:

bq. It seems clear to me that this will add complexity from the user point of 
view (it's a new concept that will either have good footshooting potential (if 
we were to just trust the user to insert only without checking it) and be 
annoying to use (if we force all columns every time)), so it sounds to me like 
we would need to demonstrate fairly big performance benefits to be worth doing 
(keep in mind that once we add such thing, we can't easily remove it, even if 
the improvement become obsolete).

With this said, since this would only be applicable to append-only MVs, I'd be 
more in favor of providing the whole feature set of append-only MVs instead, 
which would include this and other optimizations (such as skipping the 
read-before-write) and also enforce the append-only contract defined at MV 
creation, being much safer and having better-defined semantics for users.

> Fast streaming with materialized views
> --
>
> Key: CASSANDRA-13066
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13066
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benjamin Roth
>Assignee: Benjamin Roth
> Fix For: 4.0
>
>
> I propose adding a configuration option to send streams of tables with MVs 
> not through the regular write path.
> This may be either a global option or better a CF option.
> Background:
> A repair of a CF with an MV that is much out of sync creates many streams. 
> These streams all go through the regular write path to assert local 
> consistency of the MV. This again causes a read before write for every single 
> mutation which again puts a lot of pressure on the node - much more than 
> simply streaming the SSTable down.
> In some cases this can be avoided. Instead of only repairing the base table, 
> all base + mv tables would have to be repaired. But this can break eventual 
> consistency between base table and MV. The proposed behaviour is always safe, 
> when having append-only MVs. It also works when using CL_QUORUM writes but it 
> cannot be absolutely guaranteed, that a quorum write is applied atomically, 
> so this can also lead to inconsistencies, if a quorum write is started but 
> one node dies in the middle of a request.
> So, this proposal can help a lot in some situations but also can break 
> consistency in others. That's why it should be left upon the operator if that 
> behaviour is appropriate for individual use cases.
> This issue came up here:
> https://issues.apache.org/jira/browse/CASSANDRA-12888?focusedCommentId=15736599&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15736599



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13066) Fast streaming with materialized views

2017-04-13 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-13066:

Status: Open  (was: Patch Available)

> Fast streaming with materialized views
> --
>
> Key: CASSANDRA-13066
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13066
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benjamin Roth
>Assignee: Benjamin Roth
> Fix For: 4.0
>
>
> I propose adding a configuration option to send streams of tables with MVs 
> not through the regular write path.
> This may be either a global option or better a CF option.
> Background:
> A repair of a CF with an MV that is much out of sync creates many streams. 
> These streams all go through the regular write path to assert local 
> consistency of the MV. This again causes a read before write for every single 
> mutation which again puts a lot of pressure on the node - much more than 
> simply streaming the SSTable down.
> In some cases this can be avoided. Instead of only repairing the base table, 
> all base + mv tables would have to be repaired. But this can break eventual 
> consistency between base table and MV. The proposed behaviour is always safe, 
> when having append-only MVs. It also works when using CL_QUORUM writes but it 
> cannot be absolutely guaranteed, that a quorum write is applied atomically, 
> so this can also lead to inconsistencies, if a quorum write is started but 
> one node dies in the middle of a request.
> So, this proposal can help a lot in some situations but also can break 
> consistency in others. That's why it should be left upon the operator if that 
> behaviour is appropriate for individual use cases.
> This issue came up here:
> https://issues.apache.org/jira/browse/CASSANDRA-12888?focusedCommentId=15736599&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15736599



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13066) Fast streaming with materialized views

2017-04-13 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967642#comment-15967642
 ] 

Paulo Motta edited comment on CASSANDRA-13066 at 4/13/17 2:28 PM:
--

While it may make sense to pursue this optimization, I'm not sure adding a 
{{mv_fast_stream}} option is the best way to expose this to general usage, for 
the following reasons:
a) It has a limited scope and requires users to know the streaming internals of 
MVs to enable it, so it's not very friendly.
b) It has significant foot-shooting potential: users may enable it thinking 
fast=good and then perform partial writes or updates to existing rows without 
thinking of the consequences.

It basically boils down to this comment of Sylvain's on CASSANDRA-9779:

bq. It seems clear to me that this will add complexity from the user point of 
view (it's a new concept that will either have good footshooting potential (if 
we were to just trust the user to insert only without checking it) and be 
annoying to use (if we force all columns every time)), so it sounds to me like 
we would need to demonstrate fairly big performance benefits to be worth doing 
(keep in mind that once we add such thing, we can't easily remove it, even if 
the improvement become obsolete).

With this said, since this would only be applicable to append-only MVs, I'd be 
more in favor of providing the whole feature set of append-only MVs instead, 
which would include this and other optimizations (such as skipping the 
read-before-write) and also enforce the append-only contract defined at MV 
creation, being much safer and having better defined semantics for users.


was (Author: pauloricardomg):
While it may make sense to pursue this optimization, I'm not sure adding a 
{{mv_fast_stream}} option is the best way to expose this to general usage for 
the following reasons:
a) It has a limited scope requiring users to know streaming internals of MVs to 
enable it, so it's not very friendly.
b) It has has a significant foot-shooting potential, when users enable this and 
perform partial writes or updates to existing rows, so users may enable it 
thinking fast=good without thinking of the consequences.

It basically boils down to this Sylvain's comment on CASSANDRA-9779:

bq. It seems clear to me that this will add complexity from the user point of 
view (it's a new concept that will either have good footshooting potential (if 
we were to just trust the user to insert only without checking it) and be 
annoying to use (if we force all columns every time)), so it sounds to me like 
we would need to demonstrate fairly big performance benefits to be worth doing 
(keep in mind that once we add such thing, we can't easily remove it, even if 
the improvement become obsolete).

With this said, since this would only be applicable to append-only MVs so I'd 
be more in favor of providing the whole feature set of append-only MVs instead 
which would include this and other optimizations (such as skipping 
read-before-write) and also enforce the append-only contract defined on MV 
creation, being much safer and having a more well defined semantics to users.

> Fast streaming with materialized views
> --
>
> Key: CASSANDRA-13066
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13066
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benjamin Roth
>Assignee: Benjamin Roth
> Fix For: 4.0
>
>
> I propose adding a configuration option to send streams of tables with MVs 
> not through the regular write path.
> This may be either a global option or better a CF option.
> Background:
> A repair of a CF with an MV that is much out of sync creates many streams. 
> These streams all go through the regular write path to assert local 
> consistency of the MV. This again causes a read before write for every single 
> mutation which again puts a lot of pressure on the node - much more than 
> simply streaming the SSTable down.
> In some cases this can be avoided. Instead of only repairing the base table, 
> all base + mv tables would have to be repaired. But this can break eventual 
> consistency between base table and MV. The proposed behaviour is always safe, 
> when having append-only MVs. It also works when using CL_QUORUM writes but it 
> cannot be absolutely guaranteed, that a quorum write is applied atomically, 
> so this can also lead to inconsistencies, if a quorum write is started but 
> one node dies in the middle of a request.
> So, this proposal can help a lot in some situations but also can break 
> consistency in others. That's why it should be left upon the operator if that 
> behaviour is appropriate for individual use cases.
> This issue came up here:
> https://issues.apache.org/jira/browse/CASSANDRA-

[jira] [Updated] (CASSANDRA-13446) CQLSSTableWriter takes 100% CPU when the buffer_size_in_mb is larger than 64MB

2017-04-13 Thread xiangdong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiangdong Huang updated CASSANDRA-13446:

Description: 
I want to use CQLSSTableWriter to load large amounts of data as SSTables; 
however, the CPU cost is high and the speed is not good.
{code}
CQLSSTableWriter writer = CQLSSTableWriter.builder()
    .inDirectory(new File("output" + j))
    .forTable(SCHEMA)
    .withBufferSizeInMB(Integer.parseInt(System.getProperty("buffer_size_in_mb", "256"))) // FIXME!! if the size is 64, it is ok; if it is 128 or larger, boom!!
    .using(INSERT_STMT)
    .withPartitioner(new Murmur3Partitioner())
    .build();
{code}
If `buffer_size_in_mb` is 64MB or less on my PC, everything is OK: the 
CPU utilization is about 60% and the memory is about 3GB (why 3GB? Luckily, I 
can bear that...). The process creates one ~24MB SSTable at a time (I think the 
size is because the SSTable compresses the data).

However, if `buffer_size_in_mb` is larger, e.g. 128MB on my PC, the CPU 
utilization is about 70% and the memory is still about 3GB.
Once the CQLSSTableWriter has received 128MB of data, it begins to flush it as an 
SSTable. At this point the bad thing happens:
CQLSSTableWriter.addRow() becomes very slow, and NO SSTABLE IS WRITTEN. Windows 
task manager shows 0.0 MB/s disk I/O for the process. No file 
appears in the output folder (sometimes a _zero-KB mc-1-big-Data.db_ and a 
_zero-KB mc-1-big-Index.db_ appear, and a transaction log file comes and 
disappears). At this time, the process uses 99% CPU and the memory is a 
little larger than 3GB.
A long time later, the process crashes because of "GC overhead...", and 
there is still no SSTable file built.

When I use JProfiler 10 to check what uses so much CPU, it shows 
CQLSSTableWriter.addRow() taking about 99% CPU while the other threads 
(Thread-1 and ScheduledTasks) are waiting.

I have no idea how to optimize this, because Cassandra's SSTable writing 
process is so complex...

The important point is that a 64MB buffer size is too small for production 
environments: it creates many 24MB SSTables, but we want a large SSTable which 
can hold all the data of the batch load process. 

Now I wonder whether Spark and MapReduce work well with Cassandra, because when 
I glance at the source code, I notice that they also use CQLSSTableWriter to 
save output data.

The Cassandra version is 3.10. The DataStax driver (for typec) is 3.2.0.

The attachment is my test program and the CSV data. 
A complete test program can be found at: 
https://bitbucket.org/jixuan1989/csv2sstable

Update:
I found a similar issue on Stack Overflow but no good solution:
 
http://stackoverflow.com/questions/28506947/loading-large-row-data-into-cassandra-using-java-and-cqlsstablewriter
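
For reference, a self-contained sketch of the kind of loader used here (the 
schema, insert statement and row loop are illustrative placeholders, not the 
attached test program):

{code}
import java.io.File;

import org.apache.cassandra.dht.Murmur3Partitioner;
import org.apache.cassandra.io.sstable.CQLSSTableWriter;

public class BulkWriteRepro
{
    // placeholder schema / statement, just to make the sketch compile
    static final String SCHEMA =
        "CREATE TABLE ks.events (id bigint PRIMARY KEY, payload text)";
    static final String INSERT_STMT =
        "INSERT INTO ks.events (id, payload) VALUES (?, ?)";

    public static void main(String[] args) throws Exception
    {
        int bufferMb = Integer.parseInt(System.getProperty("buffer_size_in_mb", "256"));
        CQLSSTableWriter writer = CQLSSTableWriter.builder()
                                                  .inDirectory(new File("output"))
                                                  .forTable(SCHEMA)
                                                  .using(INSERT_STMT)
                                                  .withBufferSizeInMB(bufferMb)   // 64 is fine, 128+ triggers the problem
                                                  .withPartitioner(new Murmur3Partitioner())
                                                  .build();
        try
        {
            for (long i = 0; i < 10_000_000L; i++)
                writer.addRow(i, "row-" + i);   // addRow() slows to a crawl once the large buffer flushes
        }
        finally
        {
            writer.close();
        }
    }
}
{code}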

  was:
I want to use CQLSSTableWriter to load large amounts of data as SSTables, 
however the CPU cost and the speed is not good.
```
CQLSSTableWriter writer = CQLSSTableWriter.builder()
.inDirectory(new File("output"+j))
.forTable(SCHEMA)

.withBufferSizeInMB(Integer.parseInt(System.getProperty("buffer_size_in_mb", 
"256")))//FIXME!! if the size is 64, it is ok, if it is 128 or larger, boom!!
.using(INSERT_STMT)
.withPartitioner(new Murmur3Partitioner()).build();
```
if the `buffer_size_in_mb` is less than 64MB in my  PC, everything is ok: the 
CPU utilization is about 60% and the memory is about 3GB (why 3GB? Luckly, I 
can bear that...).  The process creates 24MB per sstable (I think it is because 
sstable compresses data) one by one.

However, if the `buffer_size_in_mb` is greater, e.g., 128MB on my PC,  The CPU 
utilization is about 70%, the memory is still about 3GB.
When the CQLSSTableWriter receives 128MB data, it begins to flush data as a 
sstable. At this time, the bad thing comes:
CQLSSTableWriter.addRow() becomes very slow, and NO SSTABLE IS WRITTEN. Windows 
task manager shows the disk I/O for this process is 0.0 MB/s.  There is no file 
appears in the output folder (Sometimes a _zero-KB mc-1-big-Data.db_ and a 
_zero-KB mc-1-big-Index.db_ appear, and some transaction log file comes and 
disappears..). At this time, the process spends 99% CPU! and the memory is a 
little larger than 3GB
Long long time later, the process crashes because of "GC overhead...", and 
there is still no sstable file built.

When I use jprofile 10 to check who uses so much CPU, it says 
CQLSSTableWriter.addRow() takes about 99% CPU, other threads (Thread-1 and 
ScheduledTasks are waiting...)

I have no idea to optimize the process, because Cassandra's SStable writing 
process is so complex...

The important thing is, 64MB buffer size is too small in production 
environments: it creates many 24MB SSTables, but we want a large sstable which 
can hold all the data

[jira] [Commented] (CASSANDRA-13444) Fast and garbage-free Streaming Histogram

2017-04-13 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967726#comment-15967726
 ] 

Jeff Jirsa commented on CASSANDRA-13444:


Description looks great - will try to review in the very near future
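
(For context on improvement #1 in the description quoted below, a minimal, 
stand-alone illustration of the computeIfAbsent-based "add-or-accumulate" 
pattern; the map name and value type are made up for the sketch and are not the 
actual StreamingHistogram fields.)

{code}
import java.util.HashMap;
import java.util.Map;

// A mutable one-element array is used as the value so computeIfAbsent can both
// create the slot and let the caller accumulate into it without a second put().
class SpoolSketch
{
    private final Map<Integer, long[]> spool = new HashMap<>();

    // old style: get -> null-check -> put, i.e. two map operations on a miss
    void addChained(int point, long count)
    {
        long[] cur = spool.get(point);
        if (cur == null)
        {
            cur = new long[1];
            spool.put(point, cur);
        }
        cur[0] += count;
    }

    // new style: one computeIfAbsent call covers both the hit and the miss path
    void addAccumulating(int point, long count)
    {
        spool.computeIfAbsent(point, k -> new long[1])[0] += count;
    }
}
{code}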

> Fast and garbage-free Streaming Histogram
> -
>
> Key: CASSANDRA-13444
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13444
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Fuud
> Attachments: results.csv, results.xlsx
>
>
> StreamingHistogram is a cause of high CPU usage and GC pressure.
> It was improved in CASSANDRA-13038 by introducing an intermediate buffer that 
> tries to accumulate distinct values into the big map before merging them into 
> the smaller one.
> But that was not enough for TTLs distributed over a large time range. Rounding 
> (also introduced in 13038) can help, but it reduces histogram precision, 
> especially when the TTLs are not distributed uniformly.
> There are several improvements that can help reduce CPU and GC usage. They are 
> all included in the pull request as separate revisions, so you can test them 
> independently.
> Improvements list:
> # Use Map.computeIfAbsent instead of the get -> checkIfNull -> put chain. This 
> way the "add-or-accumulate" operation takes one map operation instead of two. 
> This method (default-defined in the Map interface) is overridden in HashMap but 
> not in TreeMap, so I changed the spool type to HashMap.
> # As we round incoming values to _roundSeconds_, we can also round the value on 
> merge. This increases the hit rate for bin operations.
> # Because we only insert integers into the histogram and round values to 
> integers, we can use the *int* type everywhere.
> # The histogram spends a huge amount of time merging values. In the merge 
> method, most of the time is taken by finding the nearest points. This can be 
> eliminated by holding an additional TreeSet of differences, sorted from 
> smallest to greatest.
> # Because we know the maximum size of the _bin_ and _differences_ maps, we can 
> replace them with sorted arrays. Searching can be done with _Arrays.binarySearch_ 
> and insertions/deletions with _System.arraycopy_. This also lets us merge some 
> operations into one.
> # Because the spool map is also bounded, we can replace it with an 
> open-addressing primitive map. This finally reduces the allocation rate to zero.
> You can see the gain from each step in the attached file. The first number is 
> the time for one benchmark invocation and the second is the allocation rate in 
> MB per operation.
> Depending on the payload, time is reduced by up to 90%.
> Overall gain:
> ||Payload/SpoolSize||metric||original||optimized||% of original||
> |secondInMonth/0|time ms/op|10747,684|5545,063|51,6|
> |secondInMonth/0|allocation Mb/op|2441,38858|0,002105713|0|
> |secondInMonth/1000|time ms/op|8988,578|5791,179|64,4|
> |secondInMonth/1000|allocation Mb/op|2440,951141|0,017715454|0|
> |secondInMonth/1|time ms/op|10711,671|5765,243|53,8|
> |secondInMonth/1|allocation Mb/op|2437,022537|0,264083862|0|
> |secondInMonth/10|time ms/op|13001,841|5638,069|43,4|
> |secondInMonth/10|allocation Mb/op|2396,947113|2,003662109|0,1|
> |secondInDay/0|time ms/op|10381,833|5497,804|53|
> |secondInDay/0|allocation Mb/op|2441,166107|0,002105713|0|
> |secondInDay/1000|time ms/op|8522,157|5929,871|69,6|
> |secondInDay/1000|allocation Mb/op|1973,112381|0,017715454|0|
> |secondInDay/1|time ms/op|10234,978|5480,077|53,5|
> |secondInDay/1|allocation Mb/op|2306,057404|0,262969971|0|
> |secondInDay/10|time ms/op|2971,178|139,079|4,7|
> |secondInDay/10|allocation Mb/op|172,1276245|2,001721191|1,2|
> |secondIn3Hour/0|time ms/op|10663,123|5605,672|52,6|
> |secondIn3Hour/0|allocation Mb/op|2439,456818|0,002105713|0|
> |secondIn3Hour/1000|time ms/op|9029,788|5838,618|64,7|
> |secondIn3Hour/1000|allocation Mb/op|2331,839249|0,180664063|0|
> |secondIn3Hour/1|time ms/op|4862,409|89,001|1,8|
> |secondIn3Hour/1|allocation Mb/op|965,4871887|0,251711652|0|
> |secondIn3Hour/10|time ms/op|1484,454|95,044|6,4|
> |secondIn3Hour/10|allocation Mb/op|153,2464722|2,001712809|1,3|
> |secondInMin/0|time ms/op|875,118|424,11|48,5|
> |secondInMin/0|allocation Mb/op|610,3554993|0,001776123|0|
> |secondInMin/1000|time ms/op|568,7|84,208|14,8|
> |secondInMin/1000|allocation Mb/op|0,007598114|0,01810023|238,2|
> |secondInMin/1|time ms/op|573,595|83,862|14,6|
> |secondInMin/1|allocation Mb/op|0,007597351|0,252473872|3323,2|

[jira] [Commented] (CASSANDRA-13431) Streaming error occurred org.apache.cassandra.io.FSReadError: java.io.IOException: Broken pipe

2017-04-13 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967768#comment-15967768
 ] 

Paulo Motta commented on CASSANDRA-13431:
-

Broken pipe means the other end closed the connection. What was the streaming 
error on {{123.120.56.71}}?

> Streaming error occurred org.apache.cassandra.io.FSReadError: 
> java.io.IOException: Broken pipe
> --
>
> Key: CASSANDRA-13431
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13431
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
> Environment: ubuntu, cassandra 2.2.7, AWS EC2
>Reporter: krish
>  Labels: features, patch, performance
> Fix For: 2.2.7
>
>
> I am trying to add a node to the cluster. 
> Adding the new node to the cluster fails with a broken pipe; Cassandra fails 
> within 2 minutes of starting. 
> I removed the node from the ring. Adding it back fails as well. 
> OS info:  4.4.0-59-generic #80-Ubuntu SMP x86_64 x86_64 x86_64 GNU/Linux.
> ERROR [STREAM-OUT-/123.120.56.71] 2017-04-10 23:46:15,410 
> StreamSession.java:532 - [Stream #cbb7a150-1e47-11e7-a556-a98ec456f4de] 
> Streaming error occurred
> org.apache.cassandra.io.FSReadError: java.io.IOException: Broken pipe
> at 
> org.apache.cassandra.io.util.ChannelProxy.transferTo(ChannelProxy.java:144) 
> ~[apache-cassandra-2.2.7.jar:2.2.7]
> at 
> org.apache.cassandra.streaming.compress.CompressedStreamWriter$1.apply(CompressedStreamWriter.java:91)
>  ~[apache-cassandra-2.2.7.jar:2.2.7]
> at 
> org.apache.cassandra.streaming.compress.CompressedStreamWriter$1.apply(CompressedStreamWriter.java:88)
>  ~[apache-cassandra-2.2.7.jar:2.2.7]
> at 
> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.applyToChannel(BufferedDataOutputStreamPlus.java:297)
>  ~[apache-cassandra-2.2.7.jar:2.2.7]
> at 
> org.apache.cassandra.streaming.compress.CompressedStreamWriter.write(CompressedStreamWriter.java:87)
>  ~[apache-cassandra-2.2.7.jar:2.2.7]
> at 
> org.apache.cassandra.streaming.messages.OutgoingFileMessage.serialize(OutgoingFileMessage.java:90)
>  ~[apache-cassandra-2.2.7.jar:2.2.7]
> at 
> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:48)
>  ~[apache-cassandra-2.2.7.jar:2.2.7]
> at 
> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:40)
>  ~[apache-cassandra-2.2.7.jar:2.2.7]
> at 
> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:47)
>  ~[apache-cassandra-2.2.7.jar:2.2.7]
> at 
> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:389)
>  ~[apache-cassandra-2.2.7.jar:2.2.7]
> at 
> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:361)
>  ~[apache-cassandra-2.2.7.jar:2.2.7]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
> Caused by: java.io.IOException: Broken pipe
> at sun.nio.ch.FileChannelImpl.transferTo0(Native Method) 
> ~[na:1.8.0_101]
> at 
> sun.nio.ch.FileChannelImpl.transferToDirectlyInternal(FileChannelImpl.java:428)
>  ~[na:1.8.0_101]
> at 
> sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:493) 
> ~[na:1.8.0_101]
> at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:608) 
> ~[na:1.8.0_101]
> at 
> org.apache.cassandra.io.util.ChannelProxy.transferTo(ChannelProxy.java:140) 
> ~[apache-cassandra-2.2.7.jar:2.2.7]
> ... 11 common frames omitted
> INFO  [STREAM-OUT-/123.120.56.71] 2017-04-10 23:46:15,424 
> StreamResultFuture.java:183 - [Stream #cbb7a150-1e47-11e7-a556-a98ec456f4de] 
> Session with /123.120.56.71 is complete
> WARN  [STREAM-OUT-/123.120.56.71] 2017-04-10 23:46:15,425 
> StreamResultFuture.java:210 - [Stream #cbb7a150-1e47-11e7-a556-a98ec456f4de] 
> Stream failed



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13445) validation executor thread is stuck

2017-04-13 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967800#comment-15967800
 ] 

Paulo Motta commented on CASSANDRA-13445:
-

cc [~barnie] since this seems to be related to CASSANDRA-5863

> validation executor thread is stuck
> ---
>
> Key: CASSANDRA-13445
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13445
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: cassandra 3.10
>Reporter: Roland Otta
>
> we have the following issue on our 3.10 development cluster.
> sometimes the repairs (it is a full repair in that case) hang because
> of a stuck validation compaction.
> nodetool compactionstats says 
> a1bb45c0-1fc6-11e7-81de-0fb0b3f5a345 Validation  bds  ad_event
> 805955242 841258085 bytes 95.80% 
> and there is no more progress at this percentage.
> i checked the logs on the affected node and could not find any
> suspicious errors.
> a thread dump shows that the validation executor thread is always repeating 
> stuff in 
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
> here is the full stack trace
> {noformat}
> com.github.benmanes.caffeine.cache.BoundedLocalCache$$Lambda$64/2098345091.accept(Unknown
>  Source)
> com.github.benmanes.caffeine.cache.BoundedBuffer$RingBuffer.drainTo(BoundedBuffer.java:104)
> com.github.benmanes.caffeine.cache.StripedBuffer.drainTo(StripedBuffer.java:160)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.drainReadBuffer(BoundedLocalCache.java:964)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.maintenance(BoundedLocalCache.java:918)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.performCleanUp(BoundedLocalCache.java:903)
> com.github.benmanes.caffeine.cache.BoundedLocalCache$PerformCleanupTask.run(BoundedLocalCache.java:2680)
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.scheduleDrainBuffers(BoundedLocalCache.java:875)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.afterRead(BoundedLocalCache.java:748)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1783)
> com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
> com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
> org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
> org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:59)
> org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:88)
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:66)
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
> org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:420)
> org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
> org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:610)
> org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:575)
> org.apache.cassandra.db.rows.UnfilteredSerializer$$Lambda$84/898489541.accept(Unknown
>  Source)
> org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1222)
> org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1177)
> org.apache.cassandra.db.Columns.apply(Columns.java:377)
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:571)
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:440)
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:95)
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:73)
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:100)
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> org.apache.cassandra.utils.MergeIterator$Candidate.advance(M

[jira] [Commented] (CASSANDRA-10145) Change protocol to allow sending key space independent of query string

2017-04-13 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967851#comment-15967851
 ] 

Jeff Jirsa commented on CASSANDRA-10145:


[~snazy] [~sandman] - I deleted my comment because I hit submit before I had 
finished reading the code. You can still find it if you click the 'all' link in 
JIRA, but it's not a big deal. I think the points are valid, but they're not 
critical. The style issue is objectively wrong, but it's fairly minor in the 
big scheme of things - could be ninja'd certainly. The other two (ctor/ternary) 
are just opinions that can be ignored. 



> Change protocol to allow sending key space independent of query string
> --
>
> Key: CASSANDRA-10145
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10145
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Vishy Kasar
>Assignee: Sandeep Tamhankar
>  Labels: client-impacting, protocolv5
> Fix For: 4.0
>
> Attachments: 10145-trunk.txt
>
>
> Currently the keyspace is either embedded in the query string or set through "use 
> keyspace" on a connection by the client driver. 
> There are practical use cases where the client has the query and the keyspace 
> independently. For that scenario to work, they have to create 
> one client session per keyspace or resort to some string-replace 
> hackery.
> It would be nice if the protocol allowed sending the keyspace separately from the 
> query. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-13 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967857#comment-15967857
 ] 

Ariel Weisberg commented on CASSANDRA-13265:


OK, don't forget to get set up with CircleCI and post the block with the test 
results.

Also you transposed 13625 and 13265 :-)

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 3.0.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  
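
(The quoted description explains how hundreds of threads iterating the backlog 
for expiration end up locking each other out. Purely as an illustration of the 
general direction, and not the patch under review here, a minimal sketch of a 
single-expirer guard; all names are made up for the sketch.)

{code}
import java.util.Iterator;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Only one thread at a time walks the backlog to expire messages; the others
// simply skip expiration instead of piling up on the queue's iterator.
class BacklogSketch
{
    static final class Queued
    {
        final long expiresAtNanos;
        Queued(long expiresAtNanos) { this.expiresAtNanos = expiresAtNanos; }
    }

    private final ConcurrentLinkedQueue<Queued> backlog = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean expiring = new AtomicBoolean();

    void add(Queued message)
    {
        backlog.add(message);
    }

    void expire(long nowNanos)
    {
        if (!expiring.compareAndSet(false, true))
            return;                                   // another thread is already expiring
        try
        {
            for (Iterator<Queued> it = backlog.iterator(); it.hasNext(); )
            {
                if (it.next().expiresAtNanos <= nowNanos)
                    it.remove();
                else
                    break;                            // roughly FIFO: stop at the first live entry
            }
        }
        finally
        {
            expiring.set(false);
        }
    }
}
{code}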



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-13 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964943#comment-15964943
 ] 

Ariel Weisberg edited comment on CASSANDRA-13265 at 4/13/17 4:48 PM:
-

There seem to be some build issues in various branches? Maybe because I rebased?

You should register with CircleCI so it will automatically build and run the 
unit tests for you out of your repo when you commit. When you rebase there will 
be a circle.yml in each branch that will automatically have it run the build.

||Code|utests|dtests||
|[2.2|https://github.com/aweisberg/cassandra/pull/new/cassandra-13265-2.2]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13265-2%2E2]||
|[3.0|https://github.com/aweisberg/cassandra/pull/new/cassandra-13265-3.0]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13265-3%2E0]||
|[3.11|https://github.com/aweisberg/cassandra/pull/new/cassandra-13265-3.11]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13265-3%2E11]||
|[trunk|https://github.com/aweisberg/cassandra/pull/new/cassandra-13265-trunk]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13265-trunk]||



was (Author: aweisberg):
There seem to be some build issues in various branches? Maybe because I rebased?

You should register with CircleCI so it will automatically build and run the 
unit tests for you out of your repo when you commit. When you rebase there will 
be a circle.yml in each branch that will automatically have it run the build.

||Code|utests|dtests||
|[2.2|https://github.com/aweisberg/cassandra/pull/new/cassandra-13265-2.2]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13265-2%2E2]||
|[3.0|https://github.com/aweisberg/cassandra/pull/new/cassandra-13265-3.0]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13265-3%2E0]||
|[3.11|https://github.com/aweisberg/cassandra/pull/new/cassandra-13265-3.11]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13265-3%2E11]||
|[trunk|https://github.com/aweisberg/cassandra/pull/new/cassandra-13265-trunk]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13625-trunk]||


> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 3.0.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-13 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967870#comment-15967870
 ] 

Ariel Weisberg commented on CASSANDRA-13265:


Never mind, I'll run them. I have to anyway for the dtests, until we get the 
dtests running in CircleCI. I updated my copies. 

For CHANGES.txt, the entry should go at the top of the list of entries for the 
version the change is for. I don't know why.

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 3.0.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13445) validation executor thread is stuck

2017-04-13 Thread Ben Manes (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968065#comment-15968065
 ] 

Ben Manes commented on CASSANDRA-13445:
---

If you can make a reproducible unit test, that would help. The lambda should be 
non-blocking (the onAccess(node) method), as it only increments a counter and 
reorders an entry in a linked list. Those data structures are not concurrent and 
have no blocking behavior. The other possibility is that it is infinitely 
looping in BoundedBuffer because somehow the head overlapped the tail index (it 
is a single-consumer / multi-producer queue). But the loop breaks if it reads a 
null slot, assuming the entry isn't visible yet, so that race would be benign if 
it occurred. So, glancing at the code, nothing stands out, and a failing unit 
test would be very helpful.
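
(A sketch of the kind of harness such a unit test could start from: several 
threads hammering get() on a small, bounded Caffeine LoadingCache so the read 
buffer and maintenance paths run constantly. It is only a starting point under 
those assumptions, not a confirmed reproducer of this hang.)

{code}
import java.util.concurrent.ThreadLocalRandom;

import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;

public class CacheStressSketch
{
    public static void main(String[] args) throws InterruptedException
    {
        // small maximum size so evictions and read-buffer drains happen constantly
        LoadingCache<Long, byte[]> cache = Caffeine.newBuilder()
                                                   .maximumSize(512)
                                                   .build(key -> new byte[4096]);

        Runnable reader = () -> {
            ThreadLocalRandom rnd = ThreadLocalRandom.current();
            while (!Thread.currentThread().isInterrupted())
                cache.get(rnd.nextLong(0, 10_000));   // mostly misses, forcing loads + evictions
        };

        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++)
        {
            threads[i] = new Thread(reader, "reader-" + i);
            threads[i].start();
        }

        Thread.sleep(60_000);                         // run for a minute, then stop
        for (Thread t : threads)
            t.interrupt();
        for (Thread t : threads)
            t.join();
    }
}
{code}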

> validation executor thread is stuck
> ---
>
> Key: CASSANDRA-13445
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13445
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: cassandra 3.10
>Reporter: Roland Otta
>
> we have the following issue on our 3.10 development cluster.
> sometimes the repairs (it is a full repair in that case) hang because
> of a stuck validation compaction.
> nodetool compactionstats says 
> a1bb45c0-1fc6-11e7-81de-0fb0b3f5a345 Validation  bds  ad_event
> 805955242 841258085 bytes 95.80% 
> and there is no more progress at this percentage.
> i checked the logs on the affected node and could not find any
> suspicious errors.
> a thread dump shows that the validation executor thread is always repeating 
> stuff in 
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
> here is the full stack trace
> {noformat}
> com.github.benmanes.caffeine.cache.BoundedLocalCache$$Lambda$64/2098345091.accept(Unknown
>  Source)
> com.github.benmanes.caffeine.cache.BoundedBuffer$RingBuffer.drainTo(BoundedBuffer.java:104)
> com.github.benmanes.caffeine.cache.StripedBuffer.drainTo(StripedBuffer.java:160)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.drainReadBuffer(BoundedLocalCache.java:964)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.maintenance(BoundedLocalCache.java:918)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.performCleanUp(BoundedLocalCache.java:903)
> com.github.benmanes.caffeine.cache.BoundedLocalCache$PerformCleanupTask.run(BoundedLocalCache.java:2680)
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.scheduleDrainBuffers(BoundedLocalCache.java:875)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.afterRead(BoundedLocalCache.java:748)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1783)
> com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
> com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
> org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
> org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:59)
> org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:88)
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:66)
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
> org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:420)
> org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
> org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:610)
> org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:575)
> org.apache.cassandra.db.rows.UnfilteredSerializer$$Lambda$84/898489541.accept(Unknown
>  Source)
> org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1222)
> org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1177)
> org.apache.cassandra.db.Columns.apply(Columns.java:377)
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:571)
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:440)
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:95)
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:73)
> org.ap

[jira] [Created] (CASSANDRA-13447) dtest failure in ttl_test.TestTTL.collection_list_ttl_test

2017-04-13 Thread Sean McCarthy (JIRA)
Sean McCarthy created CASSANDRA-13447:
-

 Summary: dtest failure in ttl_test.TestTTL.collection_list_ttl_test
 Key: CASSANDRA-13447
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13447
 Project: Cassandra
  Issue Type: Bug
  Components: Testing
Reporter: Sean McCarthy
 Attachments: node1_debug.log, node1_gc.log, node1.log

example failure:

http://cassci.datastax.com/job/cassandra-2.2_offheap_dtest/487/testReport/ttl_test/TestTTL/collection_list_ttl_test

{code}
Error Message

Error from server: code=2200 [Invalid query] message="Attempted to set an 
element on a list which is null"
{code}{code}
Stacktrace

  File "/usr/lib/python2.7/unittest/case.py", line 329, in run
testMethod()
  File "/home/automaton/cassandra-dtest/ttl_test.py", line 264, in 
collection_list_ttl_test
""")
  File "/home/automaton/venv/src/cassandra-driver/cassandra/cluster.py", line 
2018, in execute
return self.execute_async(query, parameters, trace, custom_payload, 
timeout, execution_profile, paging_state).result()
  File "/home/automaton/venv/src/cassandra-driver/cassandra/cluster.py", line 
3822, in result
raise self._final_exception
{code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CASSANDRA-13448) Possible divide by 0 in 2i

2017-04-13 Thread Jeff Jirsa (JIRA)
Jeff Jirsa created CASSANDRA-13448:
--

 Summary: Possible divide by 0 in 2i
 Key: CASSANDRA-13448
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13448
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jeff Jirsa
Assignee: Jeff Jirsa
Priority: Minor
 Fix For: 3.0.x, 3.11.x, 4.x


Possible divide by zero issue in  
{{org.apache.cassandra.index.SecondaryIndexManager.calculateIndexingPageSize}} 
if {{columnsPerRow}} evaluates to 0. 
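
(A minimal sketch of the sort of guard that avoids the exception; the method 
shape and names are illustrative, not the actual {{calculateIndexingPageSize}} 
body.)

{code}
public final class PageSizeSketch
{
    // Clamp the per-row column count so a table with only primary-key columns
    // (columnsPerRow == 0) still yields a positive page size instead of
    // throwing an ArithmeticException on the division.
    static int pageSizeFor(int targetBytesPerPage, int meanBytesPerColumn, int columnsPerRow)
    {
        int bytesPerRow = Math.max(1, columnsPerRow) * Math.max(1, meanBytesPerColumn);
        return Math.max(1, targetBytesPerPage / bytesPerRow);
    }

    public static void main(String[] args)
    {
        System.out.println(pageSizeFor(32 * 1024 * 1024, 64, 0));   // no division by zero
    }
}
{code}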



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13448) Possible divide by 0 in 2i

2017-04-13 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13448:
---
Description: Possible divide by zero issue in  
{{org.apache.cassandra.index.SecondaryIndexManager.calculateIndexingPageSize}} 
if {{columnsPerRow}} evaluates to 0 (table without non-primary-key columns and 
a secondary index will throw an exception when that index is rebuilt ).  (was: 
Possible divide by zero issue in  
{{org.apache.cassandra.index.SecondaryIndexManager.calculateIndexingPageSize}} 
if {{columnsPerRow}} evaluates to 0. )

> Possible divide by 0 in 2i
> --
>
> Key: CASSANDRA-13448
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13448
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jeff Jirsa
>Assignee: Jeff Jirsa
>Priority: Minor
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> Possible divide by zero issue in  
> {{org.apache.cassandra.index.SecondaryIndexManager.calculateIndexingPageSize}}
>  if {{columnsPerRow}} evaluates to 0 (table without non-primary-key columns 
> and a secondary index will throw an exception when that index is rebuilt ).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CASSANDRA-13448) Possible divide by 0 in 2i

2017-04-13 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa resolved CASSANDRA-13448.

Resolution: Duplicate

> Possible divide by 0 in 2i
> --
>
> Key: CASSANDRA-13448
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13448
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jeff Jirsa
>Assignee: Jeff Jirsa
>Priority: Minor
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> Possible divide by zero issue in  
> {{org.apache.cassandra.index.SecondaryIndexManager.calculateIndexingPageSize}}
>  if {{columnsPerRow}} evaluates to 0 (table without non-primary-key columns 
> and a secondary index will throw an exception when that index is rebuilt ).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13430) Cleanup isIncremental/repairedAt usage

2017-04-13 Thread Blake Eggleston (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-13430:

 Reviewer: Marcus Eriksson
Fix Version/s: 4.0
   Status: Patch Available  (was: Open)

| [trunk|https://github.com/bdeggleston/cassandra/tree/13430] | 
[utests|https://circleci.com/gh/bdeggleston/cassandra/4] |

I ran the repair dtests locally with no issues

> Cleanup isIncremental/repairedAt usage
> --
>
> Key: CASSANDRA-13430
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13430
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
> Fix For: 4.0
>
>
> Post CASSANDRA-9143, there's no longer a reason to pass around 
> {{isIncremental}} or {{repairedAt}} in streaming sessions, as well as some 
> places in repair. The {{pendingRepair}} & {{repairedAt}} values should only 
> be set at the beginning/finalize stages of incremental repair and just follow 
> sstables around as they're streamed. Keeping these values with sstables also 
> fixes an edge case where you could leak repaired data back into unrepaired if 
> you run full and incremental repairs concurrently.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13445) validation executor thread is stuck

2017-04-13 Thread Roland Otta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968234#comment-15968234
 ] 

Roland Otta commented on CASSANDRA-13445:
-

BTW, the issue also occurs when trying to scrub the SSTable via nodetool scrub.

After that I have 2 hanging compactions at the same stage:

{noformat}
id   compaction type keyspace table
completed total unit  progress
2505fe00-207d-11e7-ad57-c9e86a8710f5 Validation  bds  ad_event 
805955242 841258085 bytes 95.80%  
6c6654a0-208e-11e7-ad57-c9e86a8710f5 Scrub   bds  ad_event 
805961728 841258085 bytes 95.80% 
{noformat}

The stack trace also looks quite similar:

{noformat}
com.github.benmanes.caffeine.cache.BoundedLocalCache$$Lambda$65/60401277.accept(Unknown
 Source)
com.github.benmanes.caffeine.cache.BoundedBuffer$RingBuffer.drainTo(BoundedBuffer.java:104)
com.github.benmanes.caffeine.cache.StripedBuffer.drainTo(StripedBuffer.java:160)
com.github.benmanes.caffeine.cache.BoundedLocalCache.drainReadBuffer(BoundedLocalCache.java:964)
com.github.benmanes.caffeine.cache.BoundedLocalCache.maintenance(BoundedLocalCache.java:918)
com.github.benmanes.caffeine.cache.BoundedLocalCache.performCleanUp(BoundedLocalCache.java:903)
com.github.benmanes.caffeine.cache.BoundedLocalCache$PerformCleanupTask.run(BoundedLocalCache.java:2680)
com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
com.github.benmanes.caffeine.cache.BoundedLocalCache.scheduleDrainBuffers(BoundedLocalCache.java:875)
com.github.benmanes.caffeine.cache.BoundedLocalCache.afterRead(BoundedLocalCache.java:748)
com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1783)
com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
org.apache.cassandra.io.util.LimitingRebufferer.rebuffer(LimitingRebufferer.java:54)
org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:59)
org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:88)
org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:66)
org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:420)
org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:610)
org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:575)
org.apache.cassandra.db.rows.UnfilteredSerializer$$Lambda$85/168219100.accept(Unknown
 Source)
org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1222)
org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1177)
org.apache.cassandra.db.Columns.apply(Columns.java:377)
org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:571)
org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:440)
org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:95)
org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:73)
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
org.apache.cassandra.db.compaction.Scrubber$RowMergingSSTableIterator.next(Scrubber.java:503)
org.apache.cassandra.db.compaction.Scrubber$RowMergingSSTableIterator.next(Scrubber.java:481)
org.apache.cassandra.db.compaction.Scrubber$OrderCheckerIterator.computeNext(Scrubber.java:609)
org.apache.cassandra.db.compaction.Scrubber$OrderCheckerIterator.computeNext(Scrubber.java:526)
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133)
org.apache.cassandra.db.ColumnIndex.buildRowIndex(ColumnIndex.java:110)
org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:173)
org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:135)
org.apache.cassandra.io

[jira] [Commented] (CASSANDRA-13307) The specification of protocol version in cqlsh means the python driver doesn't automatically downgrade protocol version.

2017-04-13 Thread Matt Byrd (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968247#comment-15968247
 ] 

Matt Byrd commented on CASSANDRA-13307:
---

Hey [~michaelsembwever], did you still want me to take a look? 
Sounds like the failures can be explained by flakiness? 

> The specification of protocol version in cqlsh means the python driver 
> doesn't automatically downgrade protocol version.
> 
>
> Key: CASSANDRA-13307
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13307
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Matt Byrd
>Assignee: Matt Byrd
>Priority: Minor
>  Labels: doc-impacting
> Fix For: 3.11.x
>
>
> Hi,
> Looks like we've regressed on the issue described in:
> https://issues.apache.org/jira/browse/CASSANDRA-9467
> That is, we're no longer able to connect from newer cqlsh versions
> (e.g. trunk) to older versions of Cassandra that speak a lower protocol
> version (e.g. 2.1 with protocol version 3).
> The problem seems to be that we're relying on the client's ability to
> automatically downgrade the protocol version, implemented in Cassandra here:
> https://issues.apache.org/jira/browse/CASSANDRA-12838
> and utilised in the python client here:
> https://datastax-oss.atlassian.net/browse/PYTHON-240
> The problem, however, comes from:
> https://datastax-oss.atlassian.net/browse/PYTHON-537
> "Don't downgrade protocol version if explicitly set"
> (included when we bumped the python driver from 3.5.0 to 3.7.0 as part of
> fixing https://issues.apache.org/jira/browse/CASSANDRA-11534),
> since we do explicitly specify the protocol version in bin/cqlsh.py.
> I've got a patch which adds an option to explicitly specify the protocol
> version (for those who want to do that) and otherwise defaults to not
> setting the protocol version, i.e. using the protocol version of the client
> we ship, which should by default match the server's.
> Then it should downgrade gracefully, as was intended.
> Let me know if that seems reasonable.
> Thanks,
> Matt



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13445) validation executor thread is stuck

2017-04-13 Thread Roland Otta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968234#comment-15968234
 ] 

Roland Otta edited comment on CASSANDRA-13445 at 4/13/17 9:37 PM:
--

Btw: the issue also occurs when trying to scrub the sstable via nodetool scrub.

After that I have 2 hanging compactions at the same stage:

{noformat}
id                                   compaction type keyspace table    completed total     unit  progress
2505fe00-207d-11e7-ad57-c9e86a8710f5 Validation      bds      ad_event 805955242 841258085 bytes 95.80%
6c6654a0-208e-11e7-ad57-c9e86a8710f5 Scrub           bds      ad_event 805961728 841258085 bytes 95.80%
{noformat}

Also, the stack trace looks quite similar:

{noformat}
com.github.benmanes.caffeine.cache.BoundedLocalCache$$Lambda$65/60401277.accept(Unknown
 Source)
com.github.benmanes.caffeine.cache.BoundedBuffer$RingBuffer.drainTo(BoundedBuffer.java:104)
com.github.benmanes.caffeine.cache.StripedBuffer.drainTo(StripedBuffer.java:160)
com.github.benmanes.caffeine.cache.BoundedLocalCache.drainReadBuffer(BoundedLocalCache.java:964)
com.github.benmanes.caffeine.cache.BoundedLocalCache.maintenance(BoundedLocalCache.java:918)
com.github.benmanes.caffeine.cache.BoundedLocalCache.performCleanUp(BoundedLocalCache.java:903)
com.github.benmanes.caffeine.cache.BoundedLocalCache$PerformCleanupTask.run(BoundedLocalCache.java:2680)
com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
com.github.benmanes.caffeine.cache.BoundedLocalCache.scheduleDrainBuffers(BoundedLocalCache.java:875)
com.github.benmanes.caffeine.cache.BoundedLocalCache.afterRead(BoundedLocalCache.java:748)
com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1783)
com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
org.apache.cassandra.io.util.LimitingRebufferer.rebuffer(LimitingRebufferer.java:54)
org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:59)
org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:88)
org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:66)
org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:420)
org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:610)
org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:575)
org.apache.cassandra.db.rows.UnfilteredSerializer$$Lambda$85/168219100.accept(Unknown
 Source)
org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1222)
org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1177)
org.apache.cassandra.db.Columns.apply(Columns.java:377)
org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:571)
org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:440)
org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:95)
org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:73)
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
org.apache.cassandra.db.compaction.Scrubber$RowMergingSSTableIterator.next(Scrubber.java:503)
org.apache.cassandra.db.compaction.Scrubber$RowMergingSSTableIterator.next(Scrubber.java:481)
org.apache.cassandra.db.compaction.Scrubber$OrderCheckerIterator.computeNext(Scrubber.java:609)
org.apache.cassandra.db.compaction.Scrubber$OrderCheckerIterator.computeNext(Scrubber.java:526)
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133)
org.apache.cassandra.db.ColumnIndex.buildRowIndex(ColumnIndex.java:110)
org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:173)
org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTable


[jira] [Commented] (CASSANDRA-13445) validation executor thread is stuck

2017-04-13 Thread Ben Manes (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968288#comment-15968288
 ] 

Ben Manes commented on CASSANDRA-13445:
---

Perhaps this was supposed to be {{!=}}, since {{reference()}} increments
the count or returns {{null}} if it is zero?
{code}
do
    buf = cache.get(new Key(source, pageAlignedPos)).reference();
while (buf == null);
{code}
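
For readers following along, below is a minimal, self-contained sketch of the retry pattern in question. The names ({{RefCountedBuffer}}, {{chunkCache}}, {{rebuffer()}}) are hypothetical stand-ins, not the actual ChunkCache internals, and the sketch removes a dead entry explicitly before retrying, whereas the real code relies on the cache itself to replace released entries.
{code}
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a cached, reference-counted chunk.
final class RefCountedBuffer
{
    private final ByteBuffer buffer = ByteBuffer.allocate(4096);
    private final AtomicInteger refCount = new AtomicInteger(1);

    // Increments the count and returns the buffer, or returns null if the
    // entry has already been released (count reached zero).
    ByteBuffer reference()
    {
        while (true)
        {
            int count = refCount.get();
            if (count == 0)
                return null;                          // released; caller must retry
            if (refCount.compareAndSet(count, count + 1))
                return buffer;
        }
    }

    void release()
    {
        refCount.decrementAndGet();
    }
}

final class RebufferExample
{
    static final ConcurrentHashMap<Long, RefCountedBuffer> chunkCache = new ConcurrentHashMap<>();

    static ByteBuffer rebuffer(long pageAlignedPos)
    {
        while (true)
        {
            RefCountedBuffer entry = chunkCache.computeIfAbsent(pageAlignedPos, pos -> new RefCountedBuffer());
            ByteBuffer buf = entry.reference();
            if (buf != null)
                return buf;                           // got a live reference
            chunkCache.remove(pageAlignedPos, entry); // entry was released concurrently; drop it and retry
        }
    }
}
{code}
If a cache kept handing back the same released entry, a loop like this would spin forever, which is one possible explanation for the busy validation thread seen in the dumps above.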

> validation executor thread is stuck
> ---
>
> Key: CASSANDRA-13445
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13445
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: cassandra 3.10
>Reporter: Roland Otta
>
> We have the following issue on our 3.10 development cluster.
> Sometimes the repairs (a full repair in this case) hang because
> of a stuck validation compaction.
> nodetool compactionstats says
> a1bb45c0-1fc6-11e7-81de-0fb0b3f5a345 Validation  bds  ad_event
> 805955242 841258085 bytes 95.80%
> and there is no more progress at this percentage.
> I checked the logs on the affected node and could not find any
> suspicious errors.
> A thread dump shows that the validation executor thread is always repeating
> stuff in
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
> Here is the full stack trace:
> {noformat}
> com.github.benmanes.caffeine.cache.BoundedLocalCache$$Lambda$64/2098345091.accept(Unknown
>  Source)
> com.github.benmanes.caffeine.cache.BoundedBuffer$RingBuffer.drainTo(BoundedBuffer.java:104)
> com.github.benmanes.caffeine.cache.StripedBuffer.drainTo(StripedBuffer.java:160)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.drainReadBuffer(BoundedLocalCache.java:964)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.maintenance(BoundedLocalCache.java:918)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.performCleanUp(BoundedLocalCache.java:903)
> com.github.benmanes.caffeine.cache.BoundedLocalCache$PerformCleanupTask.run(BoundedLocalCache.java:2680)
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.scheduleDrainBuffers(BoundedLocalCache.java:875)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.afterRead(BoundedLocalCache.java:748)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1783)
> com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
> com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
> org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
> org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:59)
> org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:88)
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:66)
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
> org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:420)
> org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
> org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:610)
> org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:575)
> org.apache.cassandra.db.rows.UnfilteredSerializer$$Lambda$84/898489541.accept(Unknown
>  Source)
> org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1222)
> org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1177)
> org.apache.cassandra.db.Columns.apply(Columns.java:377)
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:571)
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:440)
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:95)
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:73)
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:100)
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowI

[jira] [Assigned] (CASSANDRA-10735) Support netty openssl (netty-tcnative) for client encryption

2017-04-13 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown reassigned CASSANDRA-10735:
---

Assignee: Jason Brown  (was: Norman Maurer)

> Support netty openssl (netty-tcnative) for client encryption
> 
>
> Key: CASSANDRA-10735
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10735
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Andy Tolbert
>Assignee: Jason Brown
> Fix For: 3.11.x
>
> Attachments: nettysslbench.png, nettysslbench_small.png, 
> nettyssl-bench.tgz, netty-ssl-trunk.tgz, sslbench12-03.png
>
>
> The java-driver recently added support for using netty openssl via 
> [netty-tcnative|http://netty.io/wiki/forked-tomcat-native.html] in 
> [JAVA-841|https://datastax-oss.atlassian.net/browse/JAVA-841], this shows a 
> very measured improvement (numbers incoming on that ticket).   It seems 
> likely that this can offer improvement if implemented C* side as well.
> Since netty-tcnative has platform specific requirements, this should not be 
> made the default, but rather be an option that one can use.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-10735) Support netty openssl (netty-tcnative) for client encryption

2017-04-13 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968420#comment-15968420
 ] 

Jason Brown commented on CASSANDRA-10735:
-

Once CASSANDRA-8457 is committed, this will be very easy for me to do (especially
as the changes to {{SSLFactory}} in CASSANDRA-8457 will have 90% of the
functionality needed here).

> Support netty openssl (netty-tcnative) for client encryption
> 
>
> Key: CASSANDRA-10735
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10735
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Andy Tolbert
>Assignee: Jason Brown
> Fix For: 4.0
>
> Attachments: nettysslbench.png, nettysslbench_small.png, 
> nettyssl-bench.tgz, netty-ssl-trunk.tgz, sslbench12-03.png
>
>
> The java-driver recently added support for using netty openssl via 
> [netty-tcnative|http://netty.io/wiki/forked-tomcat-native.html] in 
> [JAVA-841|https://datastax-oss.atlassian.net/browse/JAVA-841], this shows a 
> very measured improvement (numbers incoming on that ticket).   It seems 
> likely that this can offer improvement if implemented C* side as well.
> Since netty-tcnative has platform specific requirements, this should not be 
> made the default, but rather be an option that one can use.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-10735) Support netty openssl (netty-tcnative) for client encryption

2017-04-13 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-10735:

Fix Version/s: (was: 3.11.x)
   4.0

> Support netty openssl (netty-tcnative) for client encryption
> 
>
> Key: CASSANDRA-10735
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10735
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Andy Tolbert
>Assignee: Jason Brown
> Fix For: 4.0
>
> Attachments: nettysslbench.png, nettysslbench_small.png, 
> nettyssl-bench.tgz, netty-ssl-trunk.tgz, sslbench12-03.png
>
>
> The java-driver recently added support for using netty openssl via 
> [netty-tcnative|http://netty.io/wiki/forked-tomcat-native.html] in 
> [JAVA-841|https://datastax-oss.atlassian.net/browse/JAVA-841], this shows a 
> very measured improvement (numbers incoming on that ticket).   It seems 
> likely that this can offer improvement if implemented C* side as well.
> Since netty-tcnative has platform specific requirements, this should not be 
> made the default, but rather be an option that one can use.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-8457) nio MessagingService

2017-04-13 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968472#comment-15968472
 ] 

Jason Brown commented on CASSANDRA-8457:


So, [~aweisberg] and I spent some time talking offline about the expiring 
messages on the outbound side, and came up with the following: 

1. Run a periodic, scheduled task in each channel that checks to make sure the
channel is making progress wrt sending bytes. If we fail to see any progress
being made after some number of seconds, we should close the connection/socket
and throw away the messages (a rough sketch follows at the end of this comment).
2. repurpose the high/low water mark (and arguably use it more correctly) to 
indicate when we should stop writing messages to the channel (at the 
{{ChannelWriter}} layer). Currently, I'm just using the water mark to indicate 
when we should flush, but a simple check elsewhere would accomplish the same 
thing. Instead, the water marks should indicate when we really shouldn't write 
to the channel anymore, and either queue up those messages in something like 
{{OutboundMessageConnection#backlog}} or perhaps drop them (I'd prefer to 
queue).
3. When we've exceeded the high water mark, we can disable reading incoming
messages from the same peer (achievable by disabling auto read for the
channel). This would prevent the current node from executing more work on
behalf of a peer to which we cannot send any data. Then, when the channel drops
below the low water mark (and the channel is 'writable'), we re-enable netty
auto read on the read channels for the peer.

1 and 2 are reasonably easy to do (and I'll do them asap), but I'd prefer to 
defer 3 until later as it has a lot of races and other complexities/subtleties 
I'd like to put off for the scope of this ticket (especially as sockets are not 
bidirectional yet). Thoughts?

Note: items 1 & 2 are significantly simpler than my earlier comments wrt 
message expiration, so please disregard them for now.
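
To make item 1 concrete, here is a minimal sketch of such a progress watchdog. The {{OutboundChannel}} interface and its methods are hypothetical stand-ins, not the actual {{ChannelWriter}}/Netty types; the only assumptions are that the channel can report a monotonically increasing count of bytes written and whether it still has messages queued.
{code}
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical abstraction over an outbound connection; not the real API.
interface OutboundChannel
{
    long bytesWritten();          // monotonically increasing count of bytes flushed to the socket
    boolean hasPendingMessages(); // true if there is still something queued to send
    void closeAndDropBacklog();   // close the socket and throw away the queued messages
}

final class ProgressWatchdog implements Runnable
{
    private final OutboundChannel channel;
    private final long stallTimeoutNanos;
    private long lastObservedBytes = -1;
    private long lastProgressNanos = System.nanoTime();

    ProgressWatchdog(OutboundChannel channel, long stallTimeoutSeconds)
    {
        this.channel = channel;
        this.stallTimeoutNanos = TimeUnit.SECONDS.toNanos(stallTimeoutSeconds);
    }

    @Override
    public void run()
    {
        long now = System.nanoTime();
        long bytes = channel.bytesWritten();

        if (bytes != lastObservedBytes || !channel.hasPendingMessages())
        {
            // Either some bytes went out since the last check, or there is nothing to send.
            lastObservedBytes = bytes;
            lastProgressNanos = now;
            return;
        }

        // Messages are queued but no bytes have moved for the whole timeout window.
        if (now - lastProgressNanos >= stallTimeoutNanos)
            channel.closeAndDropBacklog();
    }

    static void schedule(ScheduledExecutorService executor, OutboundChannel channel)
    {
        // Check once per second; declare the connection stalled after 10 seconds
        // without progress (both values are arbitrary for the sketch).
        executor.scheduleWithFixedDelay(new ProgressWatchdog(channel, 10), 1, 1, TimeUnit.SECONDS);
    }
}
{code}
In a real implementation the check would presumably run on the channel's event loop rather than a shared executor, but the shape of the test is the same.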


> nio MessagingService
> 
>
> Key: CASSANDRA-8457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jonathan Ellis
>Assignee: Jason Brown
>Priority: Minor
>  Labels: netty, performance
> Fix For: 4.x
>
>
> Thread-per-peer (actually two each incoming and outbound) is a big 
> contributor to context switching, especially for larger clusters.  Let's look 
> at switching to nio, possibly via Netty.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CASSANDRA-13449) Cannot replace address with a node that is already bootstrapped

2017-04-13 Thread Vinod (JIRA)
Vinod created CASSANDRA-13449:
-

 Summary: Cannot replace address with a node that is already 
bootstrapped
 Key: CASSANDRA-13449
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13449
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: ubuntu 16
Reporter: Vinod
 Fix For: 3.9


One of the existing nodes in a 5-node Cassandra (3.9) cluster went down and
fails to come up even on restart.

I noticed the node was down and tried to restart it using the command

service cassandra restart

But the node fails to come up, and I see the below exception in system.log:

ERROR [main] 2017-04-14 10:03:49,959 CassandraDaemon.java:747 - Exception encountered during startup
java.lang.RuntimeException: Cannot replace address with a node that is already bootstrapped
at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:752) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:648) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:548) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:385) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:601) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:730) [apache-cassandra-3.9.jar:3.9]
WARN [StorageServiceShutdownHook] 2017-04-14 10:03:49,963 Gossiper.java:1508 - No local state or state is in silent shutdown, not announcing shutdown
WARN [StorageServiceShutdownHook] 2017-04-14 10:51:49,539 Gossiper.java:1508 - No local state or state is in silent shutdown, not announcing shutdown


 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13307) The specification of protocol version in cqlsh means the python driver doesn't automatically downgrade protocol version.

2017-04-13 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968688#comment-15968688
 ] 

mck commented on CASSANDRA-13307:
-

{quote}Hey mck Did you still want me to take a look? {quote}
No [~mbyrd], flakiness on that particular build configuration on the ASF
Jenkins is to blame.

I will push the commit as soon as trunk is green again.

> The specification of protocol version in cqlsh means the python driver 
> doesn't automatically downgrade protocol version.
> 
>
> Key: CASSANDRA-13307
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13307
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Matt Byrd
>Assignee: Matt Byrd
>Priority: Minor
>  Labels: doc-impacting
> Fix For: 3.11.x
>
>
> Hi,
> Looks like we've regressed on the issue described in:
> https://issues.apache.org/jira/browse/CASSANDRA-9467
> That is, we're no longer able to connect from newer cqlsh versions
> (e.g. trunk) to older versions of Cassandra that speak a lower protocol
> version (e.g. 2.1 with protocol version 3).
> The problem seems to be that we're relying on the client's ability to
> automatically downgrade the protocol version, implemented in Cassandra here:
> https://issues.apache.org/jira/browse/CASSANDRA-12838
> and utilised in the python client here:
> https://datastax-oss.atlassian.net/browse/PYTHON-240
> The problem, however, comes from:
> https://datastax-oss.atlassian.net/browse/PYTHON-537
> "Don't downgrade protocol version if explicitly set"
> (included when we bumped the python driver from 3.5.0 to 3.7.0 as part of
> fixing https://issues.apache.org/jira/browse/CASSANDRA-11534),
> since we do explicitly specify the protocol version in bin/cqlsh.py.
> I've got a patch which adds an option to explicitly specify the protocol
> version (for those who want to do that) and otherwise defaults to not
> setting the protocol version, i.e. using the protocol version of the client
> we ship, which should by default match the server's.
> Then it should downgrade gracefully, as was intended.
> Let me know if that seems reasonable.
> Thanks,
> Matt



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)