[jira] [Commented] (CASSANDRA-7739) cassandra-stress: cannot handle "value-less" tables

2015-11-16 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15008195#comment-15008195
 ] 

ZhaoYang commented on CASSANDRA-7739:
-

I think the problem is that the stress tool always executes an UPDATE query. 
The good thing about an UPDATE query is that it also supports counters, but 
UPDATE doesn't work with an all-key table.
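
To illustrate the point, here is a minimal sketch (in Python, not the actual StressProfile Java code) of how a statement builder could fall back to INSERT when a table has no value columns. The function name and its arguments are hypothetical, and counter columns, which would need {{v = v + ?}} style assignments, are not handled.

```python
def build_write_statement(table, key_columns, value_columns):
    """Build a CQL write statement for a hypothetical table description.

    Uses UPDATE when there is at least one non-key column to set, and
    falls back to INSERT when the table consists of key columns only,
    because an UPDATE with an empty SET clause is not valid CQL.
    """
    if value_columns:
        assignments = ", ".join(f"{c} = ?" for c in value_columns)
        predicates = " AND ".join(f"{c} = ?" for c in key_columns)
        return f"UPDATE {table} SET {assignments} WHERE {predicates}"
    # All-key table: there is nothing to SET, so write the key itself.
    columns = ", ".join(key_columns)
    markers = ", ".join("?" for _ in key_columns)
    return f"INSERT INTO {table} ({columns}) VALUES ({markers})"
```

With no value columns this yields an INSERT over the key columns only, which avoids the empty SET clause that the parser rejects at 'WHERE'.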

> cassandra-stress: cannot handle "value-less" tables
> ---
>
> Key: CASSANDRA-7739
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7739
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>  Labels: lhf, stress
> Fix For: 2.1.x
>
>
> Given a table that has only primary-key columns, cassandra-stress fails with 
> this exception.
> The bug is that 
> https://github.com/apache/cassandra/blob/trunk/tools/stress/src/org/apache/cassandra/stress/StressProfile.java#L281
>  always adds the {{SET}} even if there are no "value columns" to update.
> {noformat}
> Exception in thread "main" java.lang.RuntimeException: 
> InvalidRequestException(why:line 1:24 no viable alternative at input 'WHERE')
>   at 
> org.apache.cassandra.stress.StressProfile.getInsert(StressProfile.java:352)
>   at 
> org.apache.cassandra.stress.settings.SettingsCommandUser$1.get(SettingsCommandUser.java:66)
>   at 
> org.apache.cassandra.stress.settings.SettingsCommandUser$1.get(SettingsCommandUser.java:62)
>   at 
> org.apache.cassandra.stress.operations.SampledOpDistributionFactory$1.get(SampledOpDistributionFactory.java:76)
>   at 
> org.apache.cassandra.stress.StressAction$Consumer.<init>(StressAction.java:248)
>   at org.apache.cassandra.stress.StressAction.run(StressAction.java:188)
>   at org.apache.cassandra.stress.StressAction.warmup(StressAction.java:92)
>   at org.apache.cassandra.stress.StressAction.run(StressAction.java:62)
>   at org.apache.cassandra.stress.Stress.main(Stress.java:109)
> Caused by: InvalidRequestException(why:line 1:24 no viable alternative at 
> input 'WHERE')
>   at 
> org.apache.cassandra.thrift.Cassandra$prepare_cql3_query_result$prepare_cql3_query_resultStandardScheme.read(Cassandra.java:52282)
>   at 
> org.apache.cassandra.thrift.Cassandra$prepare_cql3_query_result$prepare_cql3_query_resultStandardScheme.read(Cassandra.java:52259)
>   at 
> org.apache.cassandra.thrift.Cassandra$prepare_cql3_query_result.read(Cassandra.java:52198)
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>   at 
> org.apache.cassandra.thrift.Cassandra$Client.recv_prepare_cql3_query(Cassandra.java:1797)
>   at 
> org.apache.cassandra.thrift.Cassandra$Client.prepare_cql3_query(Cassandra.java:1783)
>   at 
> org.apache.cassandra.stress.util.SimpleThriftClient.prepare_cql3_query(SimpleThriftClient.java:79)
>   at 
> org.apache.cassandra.stress.StressProfile.getInsert(StressProfile.java:348)
>   ... 8 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10422) Avoid anticompaction when doing subrange repair

2015-11-16 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-10422:

Reviewer: Marcus Eriksson

> Avoid anticompaction when doing subrange repair
> ---
>
> Key: CASSANDRA-10422
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10422
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Marcus Eriksson
>Assignee: Ariel Weisberg
> Fix For: 3.0.1, 3.1, 2.1.x, 2.2.x
>
>
> If we do split the owned range in say 1000 parts, and then do one repair 
> each, we could potentially anticompact every sstable 1000 times (ie, we 
> anticompact the repaired range out 1000 times). We should avoid 
> anticompacting at all in these cases.





[jira] [Comment Edited] (CASSANDRA-9043) Improve COPY command to work with Counter columns

2015-11-16 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15008121#comment-15008121
 ] 

ZhaoYang edited comment on CASSANDRA-9043 at 11/17/15 7:19 AM:
---

This is the patch for Cassandra-2.1.8 and trunk


was (Author: jasonstack):
This is the patch for Cassandra-2.1.8

> Improve COPY command to work with Counter columns
> -
>
> Key: CASSANDRA-9043
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9043
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Sebastian Estevez
>Assignee: ZhaoYang
>  Labels: lhf
> Fix For: 2.1.8
>
> Attachments: CASSANDRA-9043(2.1.8).patch, CASSANDRA-9043-trunk.patch
>
>
> Noticed today that the copy command doesn't work with counter column tables.
> This makes sense given that we need to use UPDATE instead of INSERT with 
> counters.
> Given that we're making improvements in the COPY command in 3.0 with 
> CASSANDRA-7405, can we also tweak it to work with counters?





[jira] [Commented] (CASSANDRA-10464) "nodetool compactionhistory" output should be sorted on compacted_at column and the timestamp shown in human readable format

2015-11-16 Thread Michael Edge (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15008177#comment-15008177
 ] 

Michael Edge commented on CASSANDRA-10464:
--

Patch attached. I made the changes in CompactionHistory.java to ensure only 
NodeTool is impacted. It would have been easier to change 
CompactionHistoryTabularData.java, since this is where the data retrieved from 
the system table is formatted, but I was unsure whether this would impact any 
other consumers or JMX users/apps.
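
For readers who have not opened the patch, the requested behavior can be sketched roughly as follows. This is a Python illustration, not the Java code in CompactionHistory.java; the row shape and the assumption that compacted_at holds epoch milliseconds are mine.

```python
from datetime import datetime, timezone


def format_history(rows):
    """Sort rows by compacted_at and render the timestamp readably.

    `rows` is a hypothetical iterable of (name, compacted_at_millis)
    tuples; compacted_at is assumed to be epoch milliseconds.
    """
    ordered = sorted(rows, key=lambda row: row[1])  # oldest first, newest last
    return [
        (
            name,
            datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
            .strftime("%Y-%m-%dT%H:%M:%S"),
        )
        for name, millis in ordered
    ]
```

Sorting ascending puts the most recently finished compaction last, which is the ordering the ticket asks for.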

My first Cassandra patch - please be gentle...

> "nodetool compactionhistory" output should be sorted on compacted_at column 
> and the timestamp shown in human readable format
> 
>
> Key: CASSANDRA-10464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10464
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Wei Deng
>Priority: Minor
> Fix For: 3.x
>
> Attachments: CASSANDRA-10464-CompactionHistory.patch
>
>
> "nodetool compactionhistory" (introduced in CASSANDRA-5078) is a useful tool 
> for Cassandra DBAs. However, the current output limits its usefulness without 
> some additional parsing.
> We should improve it in the following two areas:
> 1. The output should be sorted on the compacted_at column, so that the most 
> recently finished compaction will show up last (which is what the DBAs would 
> expect);
> 2. The compacted_at column should be printed as a human-readable timestamp.





[jira] [Updated] (CASSANDRA-10464) "nodetool compactionhistory" output should be sorted on compacted_at column and the timestamp shown in human readable format

2015-11-16 Thread Michael Edge (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Edge updated CASSANDRA-10464:
-
Attachment: CASSANDRA-10464-CompactionHistory.patch

> "nodetool compactionhistory" output should be sorted on compacted_at column 
> and the timestamp shown in human readable format
> 
>
> Key: CASSANDRA-10464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10464
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Wei Deng
>Priority: Minor
> Fix For: 3.x
>
> Attachments: CASSANDRA-10464-CompactionHistory.patch
>
>
> "nodetool compactionhistory" (introduced in CASSANDRA-5078) is a useful tool 
> for Cassandra DBAs. However, the current output limits its usefulness without 
> some additional parsing.
> We should improve it in the following two areas:
> 1. The output should be sorted on the compacted_at column, so that the most 
> recently finished compaction will show up last (which is what the DBAs would 
> expect);
> 2. The compacted_at column should be printed as a human-readable timestamp.





[jira] [Updated] (CASSANDRA-9043) Improve COPY command to work with Counter columns

2015-11-16 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-9043:

Attachment: CASSANDRA-9043(2.1.8).patch
CASSANDRA-9043-trunk.patch

> Improve COPY command to work with Counter columns
> -
>
> Key: CASSANDRA-9043
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9043
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Sebastian Estevez
>Assignee: ZhaoYang
>  Labels: lhf
> Fix For: 2.1.8
>
> Attachments: CASSANDRA-9043(2.1.8).patch, CASSANDRA-9043-trunk.patch
>
>
> Noticed today that the copy command doesn't work with counter column tables.
> This makes sense given that we need to use UPDATE instead of INSERT with 
> counters.
> Given that we're making improvements in the COPY command in 3.0 with 
> CASSANDRA-7405, can we also tweak it to work with counters?





[jira] [Updated] (CASSANDRA-9043) Improve COPY command to work with Counter columns

2015-11-16 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-9043:

Attachment: (was: CASSANDRA-9043.patch)

> Improve COPY command to work with Counter columns
> -
>
> Key: CASSANDRA-9043
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9043
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Sebastian Estevez
>Assignee: ZhaoYang
>  Labels: lhf
> Fix For: 2.1.8
>
> Attachments: CASSANDRA-9043(2.1.8).patch, CASSANDRA-9043-trunk.patch
>
>
> Noticed today that the copy command doesn't work with counter column tables.
> This makes sense given that we need to use UPDATE instead of INSERT with 
> counters.
> Given that we're making improvements in the COPY command in 3.0 with 
> CASSANDRA-7405, can we also tweak it to work with counters?





[jira] [Updated] (CASSANDRA-9043) Improve COPY command to work with Counter columns

2015-11-16 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-9043:

Attachment: CASSANDRA-9043.patch

> Improve COPY command to work with Counter columns
> -
>
> Key: CASSANDRA-9043
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9043
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Sebastian Estevez
>Assignee: ZhaoYang
>  Labels: lhf
> Fix For: 2.1.8
>
> Attachments: CASSANDRA-9043.patch
>
>
> Noticed today that the copy command doesn't work with counter column tables.
> This makes sense given that we need to use UPDATE instead of INSERT with 
> counters.
> Given that we're making improvements in the COPY command in 3.0 with 
> CASSANDRA-7405, can we also tweak it to work with counters?





[jira] [Assigned] (CASSANDRA-9043) Improve COPY command to work with Counter columns

2015-11-16 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang reassigned CASSANDRA-9043:
---

Assignee: ZhaoYang

> Improve COPY command to work with Counter columns
> -
>
> Key: CASSANDRA-9043
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9043
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Sebastian Estevez
>Assignee: ZhaoYang
>  Labels: lhf
>
> Noticed today that the copy command doesn't work with counter column tables.
> This makes sense given that we need to use UPDATE instead of INSERT with 
> counters.
> Given that we're making improvements in the COPY command in 3.0 with 
> CASSANDRA-7405, can we also tweak it to work with counters?





[jira] [Commented] (CASSANDRA-10632) sstableutil tests failing

2015-11-16 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007942#comment-15007942
 ] 

Joel Knighton commented on CASSANDRA-10632:
---

[~mambocab]

There's another round of test fixes before we will find any C* bugs here.

If you look at the Windows failures, they're actually occurring in 
{{_check_files}} before a cleanup is even attempted.

It expects no temporary files at this point, which is an error, as we know 
there should be some. If we look at how this list of temporary files is 
created, we see it is the set difference of all files and final files.  From 
debug output, we see there are 75 files total and 72 final files, so this 
difference should find three expected files.

If we look at all files and final files after the Windows path handling, it is 
clear a bad and terrifying thing has happened. After Windows file mangling, we 
get paths like:

{code}
dtest: DEBUG: ['g-crc.db', 'g-data.db', 'g-digest.crc32', 'g-filter.db', 
'g-index.db', 'g-statistics.db', 'g-summary.db', 'g-toc.txt', 'g-crc.db', 
'g-data.db', 'g-digest.crc32', 'g-filter.db', 'g-index.db', 'g-statistics.db', 
'g-summary.db', 'g-toc.txt', 'g-crc.db', 'g-data.db', 'g-digest.crc32', 
'g-filter.db', 'g-index.db', 'g-statistics.db', 'g-summary.db', 'g-toc.txt', 
'4-big-crc.db', '4-big-data.db', '4-big-digest.crc32', '4-big-filter.db', 
'4-big-index.db', '4-big-statistics.db', '4-big-summary.db', '4-big-toc.txt', 
'g-crc.db', 'g-data.db', 'g-digest.crc32', 'g-filter.db', 'g-index.db', 
'g-statistics.db', 'g-summary.db', 'g-toc.txt', 'g-crc.db', 'g-data.db', 
'g-digest.crc32', 'g-filter.db', 'g-index.db', 'g-statistics.db', 
'g-summary.db', 'g-toc.txt', '7-big-crc.db', '7-big-data.db', 
'7-big-digest.crc32', '7-big-filter.db', '7-big-index.db', 
'7-big-statistics.db', '7-big-summary.db', '7-big-toc.txt', 'g-crc.db', 
'g-data.db', 'g-digest.crc32', 'g-filter.db', 'g-index.db', 'g-statistics.db', 
'g-summary.db', 'g-toc.txt', 'g-crc.db', 'g-data.db', 'g-digest.crc32', 
'g-filter.db', 'g-index.db', 'g-statistics.db', 'g-summary.db', 'g-toc.txt']
{code}

This is because lstrip doesn't strip a prefix string from the left; it in fact 
keeps stripping leading characters as long as the next character is present in 
the string given as an argument.  {{_strip_common_prefix}} needs to be fixed to 
use a different method to strip strings from the left based on absolute match - 
I suggest just removing the number of characters that is the length of the 
common prefix.

After that, in the comparison, {{expected_tmpfiles}} in {{_check_files}} isn't 
processed the same way, so comparisons will still fail. 
{{_strip_common_prefix}} doesn't work here, since it will likely only contain 
files for one sstable, so the common prefix will match more than expected.

Once these are fixed, I can take another look if there still seem to be any 
problems.
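
A small sketch of the lstrip behavior described above, together with the length-based alternative. The path and prefix are made up for illustration, and {{strip_prefix}} is a hypothetical helper, not the dtest's {{_strip_common_prefix}}.

```python
def strip_prefix(path, prefix):
    """Remove `prefix` from the start of `path` only on an exact match."""
    if path.startswith(prefix):
        return path[len(prefix):]
    return path


# Made-up example values.
prefix = "data/big/"
path = "data/big/big-1-data.db"

# lstrip treats its argument as a set of characters, so it keeps eating
# 'b', 'i' and 'g' from the file name after the real prefix has ended.
mangled = path.lstrip(prefix)

# Slicing by the prefix length removes exactly the prefix and nothing more.
clean = strip_prefix(path, prefix)
```

Here {{mangled}} ends up as {{-1-data.db}} while {{clean}} is {{big-1-data.db}}, which is the same kind of damage visible in the debug output above.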





> sstableutil tests failing
> -
>
> Key: CASSANDRA-10632
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10632
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Jim Witschey
>Assignee: Jim Witschey
> Fix For: 3.0.1, 3.1
>
>
> {{sstableutil_test.py:SSTableUtilTest.abortedcompaction_test}} and 
> {{sstableutil_test.py:SSTableUtilTest.compaction_test}} fail on Windows:
> http://cassci.datastax.com/view/win32/job/cassandra-3.0_dtest_win32/100/testReport/sstableutil_test/SSTableUtilTest/abortedcompaction_test/
> http://cassci.datastax.com/view/win32/job/cassandra-3.0_dtest_win32/100/testReport/sstableutil_test/SSTableUtilTest/compaction_test/
> This is a pretty simple failure -- looks like the underlying behavior is ok, 
> but string comparison fails when the leading {{d}} in the filename is 
> lowercase as returned by {{sstableutil}} (see the [{{_invoke_sstableutil}} 
> test 
> function|https://github.com/riptano/cassandra-dtest/blob/master/sstableutil_test.py#L128]),
>  but uppercase as returned by {{glob.glob}} (see the [{{_get_sstable_files}} 
> test 
> function|https://github.com/riptano/cassandra-dtest/blob/master/sstableutil_test.py#L160]).
> Do I understand correctly that Windows filenames are case-insensitive, 
> including the drive portion? If that's the case, then we can just lowercase 
> the file names in the test helper functions above when the tests are run on 
> Windows. [~JoshuaMcKenzie] can you confirm? I'll fix this in the tests if so. 
> If I'm wrong, and something in {{sstableutil}} needs to be fixed, could you 
> find an assignee?





[jira] [Commented] (CASSANDRA-10677) Improve performance of folderSize function

2015-11-16 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007936#comment-15007936
 ] 

Stefania commented on CASSANDRA-10677:
--

I've rebased the patch on 3.0 and added a unit test, see [link 
attached|https://github.com/stef1927/cassandra/tree/10677-3.0]. 

Started following CI jobs:

http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-10677-3.0-testall/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-10677-testall/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-10677-3.0-windows-utest_win32/

Because the code is only used by unit tests or _nodetool listsnapshots_, dtests 
will not exercise this code, so I did not launch them.

I've repeated some basic benchmarking, confirming the initial observation that 
the new method is about twice as fast as before. It also correctly handles 
invalid parameters such as files or non-existent folders, whereas the old 
implementation would have thrown a null pointer exception. Another difference 
is that it does not follow symbolic links, which I believe is the correct thing 
to do.

One more observation is that neither the new code nor the old code includes the 
folder descriptors in the space calculations, so the value returned is 
slightly less than what's returned by {{du -sb}}. This can easily be rectified 
with the new implementation provided by this patch, but it is not presently done 
since I did not want to alter existing behavior.

If unit tests complete without problems we can commit this, I will post another 
update then.
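
As a rough illustration of the semantics described above (a single tree walk, symbolic links not followed, directory entries themselves not counted), here is a Python analogue. It is not the Java patch, which uses Files.walkFileTree, and returning 0 for invalid input is an assumption of this sketch rather than something confirmed by the patch.

```python
import os


def folder_size(path):
    """Sum the sizes of regular files under `path` without following symlinks.

    Returns 0 when `path` is a file or does not exist (an assumption made
    for this sketch). Directory entries themselves are not counted, so the
    result can be slightly smaller than what `du -sb` reports.
    """
    if not os.path.isdir(path):
        return 0
    total = 0
    for root, _dirs, files in os.walk(path, followlinks=False):
        for name in files:
            full = os.path.join(root, name)
            if os.path.islink(full):
                continue  # do not count the target of a symbolic link
            try:
                total += os.path.getsize(full)
            except OSError:
                pass  # the file disappeared between listing and stat
    return total
```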

> Improve performance of folderSize function
> --
>
> Key: CASSANDRA-10677
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10677
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
> Environment: Ubuntu 14. JDK 7
>Reporter: Briareus
>Priority: Minor
>  Labels: patch, performance
> Fix For: 3.x
>
> Attachments: 
> Optimized_folderSize_function_to_use_Java_7_nio_walkFileTree_method_.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The FileUtils.folderSize function recursively traverses the directory tree 
> using the listFiles method. This is no longer efficient, as Java 7 offers the 
> much better Files.walkFileTree method. It makes the method about twice as 
> fast according to my tests. 





[jira] [Commented] (CASSANDRA-10477) java.lang.AssertionError in StorageProxy.submitHint

2015-11-16 Thread Hao Bryan Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007923#comment-15007923
 ] 

Hao Bryan Cheng commented on CASSANDRA-10477:
-

Just observed this issue again. The node was undergoing anticompaction when it 
occurred, and it once again brought the ring to a halt.

Couldn't get all the required information due to the urgency of the situation, 
but did confirm that nodetool status reported the node as up with no issue (on 
another node).

I have fresh logs to offer out-of-band to anyone who is investigating this 
issue- feel free to email or ping here.

> java.lang.AssertionError in StorageProxy.submitHint
> ---
>
> Key: CASSANDRA-10477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10477
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
> Environment: CentOS 6, Oracle JVM 1.8.45
>Reporter: Severin Leonhardt
>Assignee: Ariel Weisberg
> Fix For: 2.1.x
>
>
> A few days after updating from 2.0.15 to 2.1.9 we have the following log 
> entry on 2 of 5 machines:
> {noformat}
> ERROR [EXPIRING-MAP-REAPER:1] 2015-10-07 17:01:08,041 
> CassandraDaemon.java:223 - Exception in thread 
> Thread[EXPIRING-MAP-REAPER:1,5,main]
> java.lang.AssertionError: /192.168.11.88
> at 
> org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:949) 
> ~[apache-cassandra-2.1.9.jar:2.1.9]
> at 
> org.apache.cassandra.net.MessagingService$5.apply(MessagingService.java:383) 
> ~[apache-cassandra-2.1.9.jar:2.1.9]
> at 
> org.apache.cassandra.net.MessagingService$5.apply(MessagingService.java:363) 
> ~[apache-cassandra-2.1.9.jar:2.1.9]
> at org.apache.cassandra.utils.ExpiringMap$1.run(ExpiringMap.java:98) 
> ~[apache-cassandra-2.1.9.jar:2.1.9]
> at 
> org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_45]
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
> [na:1.8.0_45]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  [na:1.8.0_45]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  [na:1.8.0_45]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_45]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_45]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> {noformat}
> 192.168.11.88 is the broadcast address of the local machine.
> When this is logged the read request latency of the whole cluster becomes 
> very bad, from 6 ms/op to more than 100 ms/op according to OpsCenter. Clients 
> get a lot of timeouts. We need to restart the affected Cassandra node to get 
> back normal read latencies. It seems write latency is not affected.
> Disabling hinted handoff using {{nodetool disablehandoff}} only prevents the 
> assert from being logged. At some point the read latency becomes bad again. 
> Restarting the node where hinted handoff was disabled results in the read 
> latency being better again.





[jira] [Resolved] (CASSANDRA-9710) Stress tool cannot control insert batch size

2015-11-16 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang resolved CASSANDRA-9710.
-
Resolution: Fixed

Batch size can be configured in stress.yaml.

> Stress tool cannot control insert batch size
> 
>
> Key: CASSANDRA-9710
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9710
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: ZhaoYang
>
> When a large CF with ~100 columns is defined and the stress tool is run to 
> insert data into Cassandra, it reports that the default batch limit is 
> exceeded. There should be a config to control insert batch size.





[jira] [Updated] (CASSANDRA-10711) NoSuchElementException when executing empty batch.

2015-11-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-10711:

Description: 
After upgrading to C* 3.0, executing an empty batch fails:
{code}
java.util.NoSuchElementException: null
at java.util.ArrayList$Itr.next(ArrayList.java:854) ~[na:1.8.0_60]
at 
org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:737)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.cql3.statements.BatchStatement.executeWithoutConditions(BatchStatement.java:356)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:337)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:323)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.cql3.QueryProcessor.processBatch(QueryProcessor.java:490) 
~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.cql3.QueryProcessor.processBatch(QueryProcessor.java:480) 
~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.transport.messages.BatchMessage.execute(BatchMessage.java:217)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507)
 [apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401)
 [apache-cassandra-3.0.0.jar:3.0.0]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_60]
at 
org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
 [apache-cassandra-3.0.0.jar:3.0.0]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
[apache-cassandra-3.0.0.jar:3.0.0]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
{code}

  was:
After upgrading to C* 3.0, executing an empty batch fails:

java.util.NoSuchElementException: null
at java.util.ArrayList$Itr.next(ArrayList.java:854) ~[na:1.8.0_60]
at 
org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:737)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.cql3.statements.BatchStatement.executeWithoutConditions(BatchStatement.java:356)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:337)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:323)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.cql3.QueryProcessor.processBatch(QueryProcessor.java:490) 
~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.cql3.QueryProcessor.processBatch(QueryProcessor.java:480) 
~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.transport.messages.BatchMessage.execute(BatchMessage.java:217)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507)
 [apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401)
 [apache-cassandra-3.0.0.jar:3.0.0]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_60]
at 
org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
 [apache-cassandra-3.0.0.jar:3.0.0]
at org.apache.cassandra.concurrent.SEPW

[jira] [Updated] (CASSANDRA-10711) NoSuchElementException when executing empty batch.

2015-11-16 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-10711:

Component/s: CQL

> NoSuchElementException when executing empty batch.
> --
>
> Key: CASSANDRA-10711
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10711
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
> Environment: Cassandra 3.0, OSS 42.1
>Reporter: Jaroslav Kamenik
>  Labels: triaged
> Fix For: 3.0.1, 3.1
>
>
> After upgrading to C* 3.0, executing an empty batch fails:
> {code}
> java.util.NoSuchElementException: null
> at java.util.ArrayList$Itr.next(ArrayList.java:854) ~[na:1.8.0_60]
> at 
> org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:737)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.executeWithoutConditions(BatchStatement.java:356)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:337)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:323)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processBatch(QueryProcessor.java:490)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processBatch(QueryProcessor.java:480)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.transport.messages.BatchMessage.execute(BatchMessage.java:217)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507)
>  [apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401)
>  [apache-cassandra-3.0.0.jar:3.0.0]
> at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_60]
> at 
> org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
>  [apache-cassandra-3.0.0.jar:3.0.0]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [apache-cassandra-3.0.0.jar:3.0.0]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
> {code}





[jira] [Commented] (CASSANDRA-10249) Make buffered read size configurable

2015-11-16 Thread Al Tobey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007642#comment-15007642
 ] 

Al Tobey commented on CASSANDRA-10249:
--

It looks like 2.2 has moved DEFAULT_BUFFER_SIZE to 4K with a max of 64K making 
this patch irrelevant for 2.2.

> Make buffered read size configurable
> 
>
> Key: CASSANDRA-10249
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10249
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Albert P Tobey
>Assignee: Albert P Tobey
> Fix For: 2.1.x, 2.2.x, 3.0.x
>
> Attachments: Screenshot 2015-09-11 09.32.04.png, Screenshot 
> 2015-09-11 09.34.10.png, patched-2.1.9-dstat-lvn10.png, 
> stock-2.1.9-dstat-lvn10.png, yourkit-screenshot.png
>
>
> On read workloads, Cassandra 2.1 reads drastically more data than it emits 
> over the network. This causes problems throughout the system by wasting disk 
> IO and causing unnecessary GC.
> I have reproduced the issue on clusters and locally with a single instance. 
> The only requirement to reproduce the issue is enough data to blow through 
> the page cache. The default schema and data size with cassandra-stress are 
> sufficient for exposing the issue.
> With stock 2.1.9 I regularly observed anywhere from a 300:1 to a 500:1 
> disk:network ratio. That is to say, for 1MB/s of network IO, Cassandra was 
> doing 300-500MB/s of disk reads, saturating the drive.
> After applying this patch for standard IO mode 
> https://gist.github.com/tobert/10c307cf3709a585a7cf the ratio fell to around 
> 100:1 on my local test rig. Latency improved considerably and GC became a lot 
> less frequent.
> I tested with 512-byte reads as well, but got the same performance, which 
> makes sense since all HDDs and SSDs made in the last few years have a 4K block 
> size (many of them lie and say 512).
> I'm re-running the numbers now and will post them tomorrow.
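
The effect of the read buffer size on the disk:network ratio can be sketched with some simple arithmetic. This is only an illustration, not a measurement: the function names, the 256-byte payload, and the 64 KiB and 4 KiB buffer sizes are assumptions, and the model ignores the page cache, compression, and index reads.

```python
import math


def bytes_read_from_disk(payload_bytes: int, buffer_bytes: int) -> int:
    """Bytes fetched when every read is rounded up to whole buffers."""
    if payload_bytes <= 0 or buffer_bytes <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(payload_bytes / buffer_bytes) * buffer_bytes


def amplification(payload_bytes: int, buffer_bytes: int) -> float:
    """Ratio of bytes read from disk to bytes actually needed."""
    return bytes_read_from_disk(payload_bytes, buffer_bytes) / payload_bytes


if __name__ == "__main__":
    # Hypothetical 256-byte row read through two illustrative buffer sizes.
    for buffer_bytes in (64 * 1024, 4 * 1024):
        ratio = amplification(256, buffer_bytes)
        print(f"buffer={buffer_bytes} bytes -> {ratio:.0f}:1")
```

Under these assumptions a smaller buffer cuts the bytes read per small row in proportion to the buffer size, which is the direction of the improvement reported above.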





[jira] [Updated] (CASSANDRA-10711) NoSuchElementException when executing empty batch.

2015-11-16 Thread Andrew Hust (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Hust updated CASSANDRA-10711:

Labels: triaged  (was: )

> NoSuchElementException when executing empty batch.
> --
>
> Key: CASSANDRA-10711
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10711
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0, OSS 42.1
>Reporter: Jaroslav Kamenik
>  Labels: triaged
> Fix For: 3.0.1, 3.1
>
>
> After upgrading to C* 3.0, executing an empty batch fails:
> java.util.NoSuchElementException: null
> at java.util.ArrayList$Itr.next(ArrayList.java:854) ~[na:1.8.0_60]
> at 
> org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:737)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.executeWithoutConditions(BatchStatement.java:356)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:337)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:323)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processBatch(QueryProcessor.java:490)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processBatch(QueryProcessor.java:480)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.transport.messages.BatchMessage.execute(BatchMessage.java:217)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507)
>  [apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401)
>  [apache-cassandra-3.0.0.jar:3.0.0]
> at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_60]
> at 
> org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
>  [apache-cassandra-3.0.0.jar:3.0.0]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [apache-cassandra-3.0.0.jar:3.0.0]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
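
The top frame of the trace is an iterator being advanced over an empty list. A minimal Python sketch of the same failure mode and of a guard against it follows; the function names are hypothetical and this is not the fix that went into Cassandra, only an illustration of the pattern.

```python
def first_mutation(mutations):
    """Take the first element unconditionally; fails on an empty batch."""
    # next() on an exhausted iterator raises StopIteration, the Python
    # counterpart of Java's NoSuchElementException.
    return next(iter(mutations))


def execute_batch(mutations):
    """Guarded variant: an empty batch is treated as a no-op."""
    if not mutations:
        return 0  # nothing to apply, so nothing to iterate over
    first_mutation(mutations)  # safe: at least one mutation exists
    # A real implementation would apply the mutations; here we only count them.
    return len(mutations)


if __name__ == "__main__":
    print(execute_batch([]))            # empty batch is accepted
    print(execute_batch(["m1", "m2"]))  # non-empty batch is processed
```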





[jira] [Created] (CASSANDRA-10716) cleanup - row key range like repair

2015-11-16 Thread Constance Eustace (JIRA)
Constance Eustace created CASSANDRA-10716:
-

 Summary: cleanup - row key range like repair
 Key: CASSANDRA-10716
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10716
 Project: Cassandra
  Issue Type: New Feature
  Components: Tools
Reporter: Constance Eustace
Priority: Minor


Although the need probably isn't the same as reducing the Merkle tree 
size/scope streaming problem, it would be nice to do subrange cleans so we can 
gauge statistical samples of disk space savings, or split it into many subtasks 
so we can track progress, or gradually perform it over time to reduce I/O 
impacts.






[jira] [Updated] (CASSANDRA-10716) nodetool cleanup - row key subrange cleanup like repair -st and -et

2015-11-16 Thread Constance Eustace (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Constance Eustace updated CASSANDRA-10716:
--
Summary: nodetool cleanup - row key subrange cleanup like repair -st and 
-et  (was: cleanup - row key range like repair)

> nodetool cleanup - row key subrange cleanup like repair -st and -et
> ---
>
> Key: CASSANDRA-10716
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10716
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Constance Eustace
>Priority: Minor
>
> Although the need probably isn't the same as reducing the Merkle tree 
> size/scope streaming problem, it would be nice to do subrange cleans so we 
> can gauge statistical samples of disk space savings, or split it into many 
> subtasks so we can track progress, or gradually perform it over time to 
> reduce I/O impacts.
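
The subrange idea can be sketched as a token-range splitter. This is a sketch under stated assumptions: it assumes the Murmur3Partitioner token space (-2^63 to 2^63-1), the function name is hypothetical, and passing the resulting bounds to cleanup is the feature being requested here, not something the tool is known to support.

```python
MIN_TOKEN = -(2 ** 63)     # lowest Murmur3Partitioner token
MAX_TOKEN = 2 ** 63 - 1    # highest Murmur3Partitioner token


def split_token_range(start: int, end: int, parts: int):
    """Split (start, end] into `parts` contiguous, near-equal subranges."""
    if end <= start:
        raise ValueError("end must be greater than start")
    width = end - start
    if parts < 1 or parts > width:
        raise ValueError("parts must be between 1 and the range width")
    # Integer arithmetic keeps the bounds exact even for 64-bit tokens.
    bounds = [start + (width * i) // parts for i in range(parts + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(parts)]


if __name__ == "__main__":
    for start_token, end_token in split_token_range(MIN_TOKEN, MAX_TOKEN, 4):
        print(start_token, end_token)
```

Each (start, end) pair could then be processed as its own subtask, which is what would allow progress tracking and spreading the I/O over time.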





[jira] [Commented] (CASSANDRA-10711) NoSuchElementException when executing empty batch.

2015-11-16 Thread Andrew Hust (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007570#comment-15007570
 ] 

Andrew Hust commented on CASSANDRA-10711:
-

- confirmed that 2.2 {{73a730f926d25a7d4f693507937b8565b701259c}} does not 
throw error
- confirmed both 3.0 {{c0480d8bbddf111e4cd7c67ef7c0daeec3ece2dc}} and trunk 
{{0010fce6d2c9a811eb66de077b69a83dce29a6ff}} throw same 
{{NoSuchElementException}} in cqlsh
- added [dtest|https://github.com/riptano/cassandra-dtest/pull/662] to verify 
fix when made

> NoSuchElementException when executing empty batch.
> --
>
> Key: CASSANDRA-10711
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10711
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0, OSS 42.1
>Reporter: Jaroslav Kamenik
> Fix For: 3.0.1, 3.1
>
>
> After upgrading to C* 3.0, executing an empty batch fails:
> java.util.NoSuchElementException: null
> at java.util.ArrayList$Itr.next(ArrayList.java:854) ~[na:1.8.0_60]
> at 
> org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:737)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.executeWithoutConditions(BatchStatement.java:356)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:337)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:323)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processBatch(QueryProcessor.java:490)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processBatch(QueryProcessor.java:480)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.transport.messages.BatchMessage.execute(BatchMessage.java:217)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507)
>  [apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401)
>  [apache-cassandra-3.0.0.jar:3.0.0]
> at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_60]
> at 
> org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
>  [apache-cassandra-3.0.0.jar:3.0.0]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [apache-cassandra-3.0.0.jar:3.0.0]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]





[jira] [Commented] (CASSANDRA-10326) Performance is worse in 3.0

2015-11-16 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007450#comment-15007450
 ] 

Ariel Weisberg commented on CASSANDRA-10326:


I am trying to run [this client 
workload|http://cstar.datastax.com/graph?stats=518e5484-5ee3-11e5-b421-42010af0688f&metric=99.9th_latency&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=865.37&ymin=0&ymax=158.51]
 on my laptop (server is on another box) and it is burning through the entire 
CPU. I profiled with Flight Recorder and it looks like it spends a lot of time 
in o.a.c.stress.generate.PartitionIterator$MultiRowIterator.

It's not light on GC load either, with frequent pauses of several hundred 
milliseconds.

I guess I will have to track down some beefier hardware, but it does make me 
wonder what is going on when we take measurements.

> Performance is worse in 3.0
> ---
>
> Key: CASSANDRA-10326
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10326
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benedict
>Assignee: Ariel Weisberg
> Fix For: 3.0.x
>
>
> Performance is generally turning out to be worse after 8099, despite a number 
> of unrelated performance enhancements being delivered. This isn't entirely 
> unexpected, given a great deal of time was spent optimising the old code, 
> however things appear worse than we had hoped.
> My expectation was that workloads making extensive use of CQL constructs 
> would be faster post-8099, however the latest tests performed with very large 
> CQL rows, including use of collections, still exhibit performance below that 
> of 2.1 and 2.2. 
> Eventually, as the dataset size grows large enough and the locality of access 
> is just right, the reduction in size of our dataset will yield a window 
> during which some users will perform better due simply to improved page cache 
> hit rates. We seem to see this in some of the tests. However we should be at 
> least as fast (and really faster) off the bat.
> The following are some large partition benchmark results, with as many as 40K 
> rows per partition, running LCS. There are a number of parameters we can 
> modify to see how behaviour changes and under what scenarios we might still 
> be faster, but the picture painted isn't brilliant, and is consistent, so we 
> should really try and figure out what's up before GA.
> [trades-with-flags (collections), 
> blade11b|http://cstar.datastax.com/graph?stats=f0a17292-5a13-11e5-847a-42010af0688f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=4387.02&ymin=0&ymax=122951.4]
> [trades-with-flags (collections), 
> blade11|http://cstar.datastax.com/graph?stats=e250-5a13-11e5-ae0d-42010af0688f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=4424.75&ymin=0&ymax=130158.6]
> [trades (no collections), 
> blade11|http://cstar.datastax.com/graph?stats=9b7da48e-570c-11e5-90fe-42010af0688f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=2682.46&ymin=0&ymax=142547.9]
> [~slebresne]: will you have time to look into this before GA?





[jira] [Updated] (CASSANDRA-10715) Filtering on NULL returns ReadFailure exception

2015-11-16 Thread Kishan Karunaratne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kishan Karunaratne updated CASSANDRA-10715:
---
Description: 
This is an issue I first noticed through the C# driver, but I was able to repro 
on cqlsh, leading me to believe this is a Cassandra bug.

Given the following schema:
{noformat}
CREATE TABLE "TestKeySpace_4928dc892922"."coolMovies" (
unique_movie_title text,
movie_maker text,
director text,
list list,
"mainGuy" text,
"yearMade" int,
PRIMARY KEY ((unique_movie_title, movie_maker), director)
) WITH CLUSTERING ORDER BY (director ASC)
{noformat}

Executing a SELECT with FILTERING on a non-PK column, using a NULL as the 
argument:
{noformat}
SELECT "mainGuy", "movie_maker", "unique_movie_title", "list", "director", 
"yearMade" FROM "coolMovies" WHERE "mainGuy" = null ALLOW FILTERING
{noformat}

returns a ReadFailure exception:
{noformat}
cqlsh:TestKeySpace_4c8f2cf8d5cc> SELECT "mainGuy", "movie_maker", 
"unique_movie_title", "list", "director", "yearMade" FROM "coolMovies" WHERE 
"mainGuy" = null ALLOW FILTERING;
Traceback (most recent call last):
  File "C:\Users\Kishan\.ccm\repository\3.0.0\bin\\cqlsh.py", line 1216, in perform_simple_statement
    result = future.result()
  File "C:\Users\Kishan\.ccm\repository\3.0.0\bin\..\lib\cassandra-driver-internal-only-3.0.0a3.post0-3f15725.zip\cassandra-driver-3.0.0a3.post0-3f15725\cassandra\cluster.py", line 3118, in result
    raise self._final_exception
ReadFailure: code=1300 [Replica(s) failed to execute read] message="Operation failed - received 0 responses and 1 failures" info={'failures': 1, 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
{noformat}

Cassandra log shows:
{noformat}
WARN  [SharedPool-Worker-2] 2015-11-16 13:51:00,259 
AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread 
Thread[SharedPool-Worker-2,10,main]: {}
java.lang.AssertionError: null
at 
org.apache.cassandra.db.filter.RowFilter$SimpleExpression.isSatisfiedBy(RowFilter.java:581)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.db.filter.RowFilter$CQLFilter$1IsSatisfiedFilter.applyToRow(RowFilter.java:243)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.db.transform.BaseRows.applyOne(BaseRows.java:95) 
~[apache-cassandra-3.0.0.jar:3.0.0]
at org.apache.cassandra.db.transform.BaseRows.add(BaseRows.java:86) 
~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.db.transform.UnfilteredRows.add(UnfilteredRows.java:21) 
~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.db.transform.Transformation.add(Transformation.java:136) 
~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.db.transform.Transformation.apply(Transformation.java:102) 
~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.db.filter.RowFilter$CQLFilter$1IsSatisfiedFilter.applyToPartition(RowFilter.java:233)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.db.filter.RowFilter$CQLFilter$1IsSatisfiedFilter.applyToPartition(RowFilter.java:227)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:76)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:293)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:136)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:128)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:123)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:65) 
~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:288) 
~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1692)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2346)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_60]
at 
org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
 ~[apache-cassandra-3.0.0.jar:3.0.0]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
[apache-cassandra-3.0.0.jar:3.0.0]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
{noformat}
In C* < 3.0.0 (such as 2.2.3)

[jira] [Commented] (CASSANDRA-10714) tcp retransmission issue seen in cassandra cluster

2015-11-16 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007390#comment-15007390
 ] 

Michael Shuler commented on CASSANDRA-10714:


This could be simple network issues with a NIC, switch port, or neighbor, so 
TCP does its job and retransmits packets. You might see error|dropped|overruns 
values in ifconfig, as well. If this persists on a particular node in your 
cluster, it may be prudent to destroy that node and launch another one in the 
hopes that you get a more stable server.

> tcp retransmission issue seen in cassandra cluster
> --
>
> Key: CASSANDRA-10714
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10714
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Streaming and Messaging
>Reporter: Jeff Liu
>
> I have been seeing TCP packet retransmissions in various stacks in our 
> environment (AWS with no VPC). I'm currently using the hsha RPC server type on 
> Cassandra 2.1.6. The information captured by Wireshark shows that the 
> retransmissions happened both client-to-server and server-to-server. 
> Even within a cluster that doesn't have any client traffic, sporadic 
> retransmission still happens.
> It's pretty easy to reproduce this issue by watching "netstat -s | grep 
> retrans". 
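
Because the counters printed by netstat are cumulative, the useful number is the change between two samples. A minimal sketch of that comparison follows. It assumes counter lines of the form "<count> <description>"; the sample text and function names are made up for illustration, and real netstat output varies by platform and version.

```python
import re


def retransmit_counters(netstat_output: str) -> dict:
    """Extract '<count> <description>' lines that mention retransmission."""
    counters = {}
    for line in netstat_output.splitlines():
        if "retrans" not in line.lower():
            continue
        match = re.match(r"\s*(\d+)\s+(.+)", line)
        if match:
            counters[match.group(2).strip()] = int(match.group(1))
    return counters


def counter_deltas(before: dict, after: dict) -> dict:
    """Growth of each counter between two samples."""
    return {name: after[name] - before.get(name, 0) for name in after}


# Illustrative text only, shaped like the assumed line format.
SAMPLE = """\
Tcp:
    1200 segments retransmitted
    15 fast retransmits
"""

if __name__ == "__main__":
    print(retransmit_counters(SAMPLE))
```

A counter that keeps growing on one node while staying flat on its peers would point at that node's NIC or network path rather than at Cassandra.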





[jira] [Updated] (CASSANDRA-10715) Filtering on NULL returns ReadFailure exception

2015-11-16 Thread Kishan Karunaratne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kishan Karunaratne updated CASSANDRA-10715:
---
Description: 
This is an issue I first noticed through the C# driver, but I was able to repro 
on cqlsh, leading me to believe this is a Cassandra bug.

Given the following schema:
{noformat}
CREATE TABLE "TestKeySpace_4928dc892922"."coolMovies" (
unique_movie_title text,
movie_maker text,
director text,
list list,
"mainGuy" text,
"yearMade" int,
PRIMARY KEY ((unique_movie_title, movie_maker), director)
) WITH CLUSTERING ORDER BY (director ASC)
{noformat}

Executing a SELECT with FILTERING on a non-PK column, using a NULL as the 
argument:
{noformat}
SELECT "mainGuy", "movie_maker", "unique_movie_title", "list", "director", 
"yearMade" FROM "coolMovies" WHERE "mainGuy" = null ALLOW FILTERING
{noformat}

returns a ReadFailure exception:
{noformat}
cqlsh:TestKeySpace_4c8f2cf8d5cc> SELECT "mainGuy", "movie_maker", 
"unique_movie_title", "list", "director", "yearMade" FROM "coolMovies" WHERE 
"mainGuy" = null ALLOW FILTERING;
Traceback (most recent call last):
  File "C:\Users\Kishan\.ccm\repository\3.0.0\bin\\cqlsh.py", line 1216, in perform_simple_statement
    result = future.result()
  File "C:\Users\Kishan\.ccm\repository\3.0.0\bin\..\lib\cassandra-driver-internal-only-3.0.0a3.post0-3f15725.zip\cassandra-driver-3.0.0a3.post0-3f15725\cassandra\cluster.py", line 3118, in result
    raise self._final_exception
ReadFailure: code=1300 [Replica(s) failed to execute read] message="Operation failed - received 0 responses and 1 failures" info={'failures': 1, 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
{noformat}

In C* < 3.0.0 (such as 2.2.3), this same query correctly returns:
{noformat}
cqlsh:TestKeySpace_3231cd551e49> SELECT "mainGuy", "movie_maker", 
"unique_movie_title", "list", "director", "yearMade" FROM "coolMovies" WHERE 
"mainGuy" = null ALLOW FILTERING;
InvalidRequest: code=2200 [Invalid query] message="Unsupported null value for indexed column mainGuy"
{noformat}

Not sure if related, but using a value for the argument instead of null returns 
0 rows in 3.0.0, but correctly returns an InvalidRequest exception in C* 2.2.3:

{noformat}
SELECT "mainGuy", "movie_maker", "unique_movie_title", "list", "director", 
"yearMade" FROM "coolMovies" WHERE "yearMade" = 100 ALLOW FILTERING
{noformat}

In C* 2.2.3:
{noformat}
cqlsh:TestKeySpace_4928dc892922> SELECT "mainGuy", "movie_maker", 
"unique_movie_title", "list", "director", "yearMade" FROM "coolMovies" WHERE 
"yearMade" = 100 ALLOW FILTERING;
InvalidRequest: code=2200 [Invalid query] message="No secondary indexes on the restricted columns support the provided operators: "
{noformat}

  was:
This is an issue I first noticed through the C# driver, but I was able to repro 
on cqlsh, leading me to believe this is a Cassandra bug.

Given the following schema:
{noformat}
CREATE TABLE "TestKeySpace_4928dc892922"."coolMovies" (
unique_movie_title text,
movie_maker text,
director text,
list list,
"mainGuy" text,
"yearMade" int,
PRIMARY KEY ((unique_movie_title, movie_maker), director)
) WITH CLUSTERING ORDER BY (director ASC)
{noformat}

Executing a SELECT with FILTERING on a non-PK column, using a NULL as the 
argument:
{noformat}
SELECT "mainGuy", "movie_maker", "unique_movie_title", "list", "director", 
"yearMade" FROM "coolMovies" WHERE "mainGuy" = null ALLOW FILTERING
{noformat}

returns a ReadFailure exception:
{noformat}
cqlsh:TestKeySpace_4c8f2cf8d5cc> SELECT "mainGuy", "movie_maker", 
"unique_movie_title", "list", "director", "yearMade" FROM "coolMovies" WHERE 
"mainGuy" = null ALLOW FILTERING;
Traceback (most recent call last):
  File "C:\Users\Kishan\.ccm\repository\3.0.0\bin\\cqlsh.py", line 1216, in perform_simple_statement
    result = future.result()
  File "C:\Users\Kishan\.ccm\repository\3.0.0\bin\..\lib\cassandra-driver-internal-only-3.0.0a3.post0-3f15725.zip\cassandra-driver-3.0.0a3.post0-3f15725\cassandra\cluster.py", line 3118, in result
    raise self._final_exception
ReadFailure: code=1300 [Replica(s) failed to execute read] message="Operation failed - received 0 responses and 1 failures" info={'failures': 1, 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
{noformat}

In C* < 3.0.0 (such as 2.2.3), this same query correctly returns:
{noformat}
cqlsh:TestKeySpace_3231cd551e49> SELECT "mainGuy", "movie_maker", 
"unique_movie_title", "list", "director", "yearMade" FROM "coolMovies" WHERE 
"mainGuy" = null ALLOW FILTERING;
InvalidRequest: code=2200 [Invalid query] message="Unsupported null value for indexed column mainGuy"
{noformat}


> Filtering on NULL returns ReadFailure exception
> ---
>
> 

[jira] [Updated] (CASSANDRA-10715) Filtering on NULL returns ReadFailure exception

2015-11-16 Thread Kishan Karunaratne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kishan Karunaratne updated CASSANDRA-10715:
---
Environment: C* 3.0.0 | cqlsh | C# driver 3.0.0beta2 | Windows 2012 R2  
(was: C* 3.0.0 | cqlsh | C# driver 3.0.0beta2)

> Filtering on NULL returns ReadFailure exception
> ---
>
> Key: CASSANDRA-10715
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10715
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 3.0.0 | cqlsh | C# driver 3.0.0beta2 | Windows 2012 R2
>Reporter: Kishan Karunaratne
>
> This is an issue I first noticed through the C# driver, but I was able to 
> repro on cqlsh, leading me to believe this is a Cassandra bug.
> Given the following schema:
> {noformat}
> CREATE TABLE "TestKeySpace_4928dc892922"."coolMovies" (
> unique_movie_title text,
> movie_maker text,
> director text,
> list list,
> "mainGuy" text,
> "yearMade" int,
> PRIMARY KEY ((unique_movie_title, movie_maker), director)
> ) WITH CLUSTERING ORDER BY (director ASC)
> {noformat}
> Executing a SELECT with FILTERING on a non-PK column, using a NULL as the 
> argument:
> {noformat}
> SELECT "mainGuy", "movie_maker", "unique_movie_title", "list", "director", 
> "yearMade" FROM "coolMovies" WHERE "mainGuy" = null ALLOW FILTERING
> {noformat}
> returns a ReadFailure exception:
> {noformat}
> cqlsh:TestKeySpace_4c8f2cf8d5cc> SELECT "mainGuy", "movie_maker", 
> "unique_movie_title", "list", "director", "yearMade" FROM "coolMovies" WHERE 
> "mainGuy" = null ALLOW FILTERING;
> Traceback (most recent call last):
>   File "C:\Users\Kishan\.ccm\repository\3.0.0\bin\\cqlsh.py", line 1216, in perform_simple_statement
>     result = future.result()
>   File "C:\Users\Kishan\.ccm\repository\3.0.0\bin\..\lib\cassandra-driver-internal-only-3.0.0a3.post0-3f15725.zip\cassandra-driver-3.0.0a3.post0-3f15725\cassandra\cluster.py", line 3118, in result
>     raise self._final_exception
> ReadFailure: code=1300 [Replica(s) failed to execute read] message="Operation failed - received 0 responses and 1 failures" info={'failures': 1, 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
> {noformat}
> In C* < 3.0.0 (such as 2.2.3), this same query correctly returns:
> {noformat}
> cqlsh:TestKeySpace_3231cd551e49> SELECT "mainGuy", "movie_maker", 
> "unique_movie_title", "list", "director", "yearMade" FROM "coolMovies" WHERE 
> "mainGuy" = null ALLOW FILTERING;
> InvalidRequest: code=2200 [Invalid query] message="Unsupported null value for indexed column mainGuy"
> {noformat}





[jira] [Created] (CASSANDRA-10715) Filtering on NULL returns ReadFailure exception

2015-11-16 Thread Kishan Karunaratne (JIRA)
Kishan Karunaratne created CASSANDRA-10715:
--

 Summary: Filtering on NULL returns ReadFailure exception
 Key: CASSANDRA-10715
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10715
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 3.0.0 | cqlsh | C# driver 3.0.0beta2
Reporter: Kishan Karunaratne


This is an issue I first noticed through the C# driver, but I was able to repro 
on cqlsh, leading me to believe this is a Cassandra bug.

Given the following schema:
{noformat}
CREATE TABLE "TestKeySpace_4928dc892922"."coolMovies" (
unique_movie_title text,
movie_maker text,
director text,
list list,
"mainGuy" text,
"yearMade" int,
PRIMARY KEY ((unique_movie_title, movie_maker), director)
) WITH CLUSTERING ORDER BY (director ASC)
{noformat}

Executing a SELECT with FILTERING on a non-PK column, using a NULL as the 
argument:
{noformat}
SELECT "mainGuy", "movie_maker", "unique_movie_title", "list", "director", 
"yearMade" FROM "coolMovies" WHERE "mainGuy" = null ALLOW FILTERING
{noformat}

returns a ReadFailure exception:
{noformat}
cqlsh:TestKeySpace_4c8f2cf8d5cc> SELECT "mainGuy", "movie_maker", 
"unique_movie_title", "list", "director", "yearMade" FROM "coolMovies" WHERE 
"mainGuy" = null ALLOW FILTERING;
Traceback (most recent call last):
  File "C:\Users\Kishan\.ccm\repository\3.0.0\bin\\cqlsh.py", line 1216, in perform_simple_statement
    result = future.result()
  File "C:\Users\Kishan\.ccm\repository\3.0.0\bin\..\lib\cassandra-driver-internal-only-3.0.0a3.post0-3f15725.zip\cassandra-driver-3.0.0a3.post0-3f15725\cassandra\cluster.py", line 3118, in result
    raise self._final_exception
ReadFailure: code=1300 [Replica(s) failed to execute read] message="Operation failed - received 0 responses and 1 failures" info={'failures': 1, 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
{noformat}

In C* < 3.0.0 (such as 2.2.3), this same query correctly returns:
{noformat}
cqlsh:TestKeySpace_3231cd551e49> SELECT "mainGuy", "movie_maker", 
"unique_movie_title", "list", "director", "yearMade" FROM "coolMovies" WHERE 
"mainGuy" = null ALLOW FILTERING;
InvalidRequest: code=2200 [Invalid query] message="Unsupported null value for indexed column mainGuy"
{noformat}
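
Until the server rejects such queries again, a client could refuse null restriction values before sending a filtering query. The following is only a sketch of that kind of guard: the function name and the dict-based representation of the WHERE clause are assumptions made for illustration, not part of any driver API.

```python
def check_filter_values(restrictions: dict) -> None:
    """Reject null restriction values before a filtering query is sent."""
    for column, value in restrictions.items():
        if value is None:
            # Mirrors the style of rejection reported for C* 2.2.3.
            raise ValueError(f"Unsupported null value for column {column}")


if __name__ == "__main__":
    check_filter_values({"yearMade": 100})  # accepted
    try:
        check_filter_values({"mainGuy": None})
    except ValueError as error:
        print(error)
```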





[jira] [Commented] (CASSANDRA-10422) Avoid anticompaction when doing subrange repair

2015-11-16 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007325#comment-15007325
 ] 

Ariel Weisberg commented on CASSANDRA-10422:


[~krummas] this is ready for review again.

> Avoid anticompaction when doing subrange repair
> ---
>
> Key: CASSANDRA-10422
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10422
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Marcus Eriksson
>Assignee: Ariel Weisberg
> Fix For: 3.0.1, 3.1, 2.1.x, 2.2.x
>
>
> If we split the owned range into, say, 1000 parts and then run one repair on 
> each, we could potentially anticompact every sstable 1000 times (i.e., we 
> anticompact the repaired range out 1000 times). We should avoid 
> anticompacting at all in these cases.





[jira] [Comment Edited] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2015-11-16 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007275#comment-15007275
 ] 

Ariel Weisberg edited comment on CASSANDRA-7217 at 11/16/15 9:06 PM:
-

Created https://datastax-oss.atlassian.net/browse/JAVA-992 for the suspected 
Java client driver issue.


was (Author: aweisberg):
Created https://datastax-oss.atlassian.net/browse/JAVA-992 for the Java 
suspected client driver issue.

> Native transport performance (with cassandra-stress) drops precipitously past 
> around 1000 threads
> -
>
> Key: CASSANDRA-7217
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7217
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benedict
>Assignee: Ariel Weisberg
>  Labels: performance, stress, triaged
> Fix For: 3.0.1, 3.1
>
> Attachments: 2000-threads.svg, 500-threads.svg, FakeQuerySystem.java, 
> stub_server.diff
>
>
> This is obviously bad. Let's figure out why it's happening and put a stop to 
> it.





[jira] [Comment Edited] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2015-11-16 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007258#comment-15007258
 ] 

Ariel Weisberg edited comment on CASSANDRA-7217 at 11/16/15 9:01 PM:
-

I was able to narrow this down to a configuration issue with the driver 
combined with less than perfect behavior if you don't run with this 
configuration. If I increase the maximum number of pending requests per 
connection from 128 to 256 then the performance at 1250 threads goes back to 
normal.

For stress we can do something smarter when setting this tunable to reflect the 
number of available threads. Generally if we have a thread submitting requests 
we would want it to default to having a pending request against the server 
otherwise all you are really benchmarking is the driver's ability to deal with 
pending requests.

Then there is the separate driver issue of the degradation in performance when 
the number of pending requests is not high enough. I wouldn't expect that kind 
of drop off. Whether the request is pending at the client or languishing in a 
TCP buffer in the server shouldn't really matter. I haven't looked, but my 
guess is that when the driver reaches the limit the thread submitting a request 
goes to sleep, and then it is woken up again. This means that every request has 
to flow through some extra scheduling points per request to account for this.

A better way is to always flatten the serialized request to a shared buffer and 
when the connection is ready to accept more work the network thread can wake up 
and write multiple requests to the server at once.
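
One way stress could derive the tunable from the thread count, as a rough sketch only: give every submitting thread room for one pending request. PendingRequestSizing, its parameters and the connection count used in the usage note are assumptions for illustration; this does not call the driver's API.

```java
/** Hypothetical sizing helper, for illustration only; not part of cassandra-stress or the driver. */
final class PendingRequestSizing {

    /**
     * Returns a per-connection pending-request limit large enough that every
     * submitting thread can have one request in flight, never going below `floor`.
     */
    static int maxPendingPerConnection(int threads, int connections, int floor) {
        if (threads <= 0 || connections <= 0 || floor <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        // Ceiling division in long arithmetic to avoid int overflow.
        int needed = (int) (((long) threads + connections - 1) / connections);
        return Math.max(floor, needed);
    }
}
```

For example, assuming 8 connections (an illustrative number, not one taken from this ticket), 1250 threads would need a limit of 157, which is above the 128 mentioned in the comment.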


was (Author: aweisberg):
I was able to narrow this down to a configuration issue with the driver 
combined with less than perfect behavior if you don't run with this 
configuration. If I increase the maximum number of pending requests per 
connection from 128 to 256 then the performance at 1250 threads goes back to 
normal.

For stress we can do something smarter when setting this tunable to reflect the 
number of available threads. Generally if we have a thread submitting requests 
we would want it to default to having a pending request against the server 
otherwise all you are really benchmarking is the driver's ability to deal with 
pending requests.

Then there is separate driver issue of the degradation in performance when the 
number of pending requests is not high enough. I wouldn't expect that kind of 
drop off. Whether the request is pending at the client or languishing in a TCP 
buffer in the server shouldn't really matter. I haven't looked, but my guess is 
that when the driver reaches the limit the thread submitting a request goes to 
sleep, and then it is woken up again. This means that every request has to flow 
through some extra scheduling points per request to account for this.

A better way is to always flatten the serialized request to a shared buffer and 
when the connection is ready to accept more work the network thread can wake up 
and write multiple requests to the server at once.

> Native transport performance (with cassandra-stress) drops precipitously past 
> around 1000 threads
> -
>
> Key: CASSANDRA-7217
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7217
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benedict
>Assignee: Ariel Weisberg
>  Labels: performance, stress, triaged
> Fix For: 3.0.1, 3.1
>
> Attachments: 2000-threads.svg, 500-threads.svg, FakeQuerySystem.java, 
> stub_server.diff
>
>
> This is obviously bad. Let's figure out why it's happening and put a stop to 
> it.





[jira] [Commented] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2015-11-16 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007275#comment-15007275
 ] 

Ariel Weisberg commented on CASSANDRA-7217:
---

Created https://datastax-oss.atlassian.net/browse/JAVA-992 for the suspected 
Java client driver issue.

> Native transport performance (with cassandra-stress) drops precipitously past 
> around 1000 threads
> -
>
> Key: CASSANDRA-7217
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7217
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benedict
>Assignee: Ariel Weisberg
>  Labels: performance, stress, triaged
> Fix For: 3.0.1, 3.1
>
> Attachments: 2000-threads.svg, 500-threads.svg, FakeQuerySystem.java, 
> stub_server.diff
>
>
> This is obviously bad. Let's figure out why it's happening and put a stop to 
> it.





[jira] [Comment Edited] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2015-11-16 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007258#comment-15007258
 ] 

Ariel Weisberg edited comment on CASSANDRA-7217 at 11/16/15 8:29 PM:
-

I was able to narrow this down to a configuration issue with the driver 
combined with less than perfect behavior if you don't run with this 
configuration. If I increase the maximum number of pending requests per 
connection from 128 to 256 then the performance at 1250 threads goes back to 
normal.

For stress we can do something smarter when setting this tunable to reflect the 
number of available threads. Generally if we have a thread submitting requests 
we would want it to default to having a pending request against the server 
otherwise all you are really benchmarking is the driver's ability to deal with 
pending requests.

Then there is the separate driver issue of the degradation in performance when 
the number of pending requests is not high enough. I wouldn't expect that kind 
of drop off. Whether the request is pending at the client or languishing in a 
TCP buffer in the server shouldn't really matter. I haven't looked, but my 
guess is that when the driver reaches the limit the thread submitting a request 
goes to sleep, and then it is woken up again. This means that every request has 
to flow through some extra scheduling points per request to account for this.

A better way is to always flatten the serialized request to a shared buffer and 
when the connection is ready to accept more work the network thread can wake up 
and write multiple requests to the server at once.


was (Author: aweisberg):
I was able to narrow this down to a configuration issue with the driver 
combined with less than perfect behavior if you don't run with this 
configuration. If I increase the maximum number of pending requests per 
connection from 128 to 256 then the performance at 1250 threads goes back to 
normal.

For stress we can do something smarter when setting this tunable to reflect the 
number of available threads. Generally if we have a thread submitting requests 
we would want it to default to having a pending request against the server 
otherwise all you are really benchmarking is the driver's ability to deal with 
pending requests.

Then there is separate driver issue of the degradation in performance when the 
number of pending requests is not high enough. I wouldn't expect that kind of 
drop off. Whether the request is pending at the client or languishing in a TCP 
buffer in the server shouldn't really matter. I haven't looked, but my guess is 
that when the driver reaches the limit the thread submitting a requests goes to 
sleep, and then it is woken up again. This means that every request has to flow 
through some extra scheduling points per request to account for this.

A better way is to always flatten the serialized request to a shared buffer and 
when the connection is ready to accept more work the network thread can wake up 
and write multiple requests to the server at once.

> Native transport performance (with cassandra-stress) drops precipitously past 
> around 1000 threads
> -
>
> Key: CASSANDRA-7217
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7217
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benedict
>Assignee: Ariel Weisberg
>  Labels: performance, stress, triaged
> Fix For: 3.0.1, 3.1
>
> Attachments: 2000-threads.svg, 500-threads.svg, FakeQuerySystem.java, 
> stub_server.diff
>
>
> This is obviously bad. Let's figure out why it's happening and put a stop to 
> it.





[jira] [Commented] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2015-11-16 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007258#comment-15007258
 ] 

Ariel Weisberg commented on CASSANDRA-7217:
---

I was able to narrow this down to a configuration issue with the driver 
combined with less than perfect behavior if you don't run with this 
configuration. If I increase the maximum number of pending requests per 
connection from 128 to 256 then the performance at 1250 threads goes back to 
normal.

For stress we can do something smarter when setting this tunable to reflect the 
number of available threads. Generally if we have a thread submitting requests 
we would want it to default to having a pending request against the server 
otherwise all you are really benchmarking is the driver's ability to deal with 
pending requests.

Then there is the separate driver issue of the degradation in performance when 
the number of pending requests is not high enough. I wouldn't expect that kind 
of drop off. Whether the request is pending at the client or languishing in a 
TCP buffer in the server shouldn't really matter. I haven't looked, but my 
guess is that when the driver reaches the limit the thread submitting a request 
goes to sleep, and then it is woken up again. This means that every request has 
to flow through some extra scheduling points per request to account for this.

A better way is to always flatten the serialized request to a shared buffer and 
when the connection is ready to accept more work the network thread can wake up 
and write multiple requests to the server at once.
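
A minimal sketch of that idea, under stated assumptions: submitting threads append serialized requests to a shared queue without blocking, and a single network thread drains several of them into one buffer when the connection can take more work. BatchingWriter and its methods are hypothetical, and a concurrent queue stands in for the shared buffer; this is not how the driver is implemented.

```java
import java.io.ByteArrayOutputStream;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical sketch, for illustration only; not driver code. */
final class BatchingWriter {
    // Shared by all submitting threads; lock-free, so submitters never sleep here.
    private final ConcurrentLinkedQueue<byte[]> pending = new ConcurrentLinkedQueue<>();

    /** Called by submitting threads with an already serialized request. */
    void submit(byte[] serializedRequest) {
        pending.add(serializedRequest);
    }

    /**
     * Called by the single network thread when the connection is ready.
     * Concatenates up to maxRequests queued requests into one buffer so they
     * can go out in a single write.
     */
    byte[] drain(int maxRequests) {
        ByteArrayOutputStream batch = new ByteArrayOutputStream();
        for (int i = 0; i < maxRequests; i++) {
            byte[] next = pending.poll();
            if (next == null) {
                break; // queue is empty
            }
            batch.write(next, 0, next.length);
        }
        return batch.toByteArray();
    }
}
```

The point of the design is that hitting the per-connection limit delays the write rather than parking and waking the submitting thread.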

> Native transport performance (with cassandra-stress) drops precipitously past 
> around 1000 threads
> -
>
> Key: CASSANDRA-7217
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7217
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benedict
>Assignee: Ariel Weisberg
>  Labels: performance, stress, triaged
> Fix For: 3.0.1, 3.1
>
> Attachments: 2000-threads.svg, 500-threads.svg, FakeQuerySystem.java, 
> stub_server.diff
>
>
> This is obviously bad. Let's figure out why it's happening and put a stop to 
> it.





[jira] [Created] (CASSANDRA-10714) tcp retransmission issue seen in cassandra cluster

2015-11-16 Thread Jeff Liu (JIRA)
Jeff Liu created CASSANDRA-10714:


 Summary: tcp retransmission issue seen in cassandra cluster
 Key: CASSANDRA-10714
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10714
 Project: Cassandra
  Issue Type: Improvement
  Components: Streaming and Messaging
Reporter: Jeff Liu


I have been seeing TCP packet retransmission issues in various stacks in our 
environment (AWS with no VPC). I'm currently using the hsha rpc server type on 
Cassandra 2.1.6. The information captured by Wireshark shows that 
retransmission happens both client-to-server and server-to-server. 
Even within a cluster that doesn't have any client traffic, sporadic 
retransmission still happens.

It's pretty easy to reproduce this issue by watching "netstat -s | grep 
retrans". 
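
To compare two snapshots of that output rather than eyeballing them, a small sketch like the following could sum the counters. It assumes lines of the form "<count> <text containing retrans>", as some netstat versions print them; RetransCounter is a hypothetical name, and lines in other formats (for example "Name: value") are deliberately ignored.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser, for illustration only; handles leading-count lines only. */
final class RetransCounter {
    // A leading integer, then any text on the same line that mentions "retrans".
    private static final Pattern LINE = Pattern.compile(
            "^[ \\t]*(\\d+)[ \\t]+.*retrans.*$",
            Pattern.MULTILINE | Pattern.CASE_INSENSITIVE);

    /** Sums the leading counters of all lines that mention "retrans". */
    static long sum(String netstatOutput) {
        long total = 0;
        Matcher m = LINE.matcher(netstatOutput);
        while (m.find()) {
            total += Long.parseLong(m.group(1));
        }
        return total;
    }

    /** Retransmissions that occurred between two snapshots of the same host. */
    static long delta(String before, String after) {
        return sum(after) - sum(before);
    }
}
```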





[jira] [Commented] (CASSANDRA-8879) Alter table on compact storage broken

2015-11-16 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007209#comment-15007209
 ] 

Aleksey Yeschenko commented on CASSANDRA-8879:
--

2.1, 2.2, and 3.0 versions:

||branch||testall||dtest||
|[8879-2.1|https://github.com/iamaleksey/cassandra/tree/8879-2.1]|[testall|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-8879-2.1-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-8879-2.1-dtest]|
|[8879-2.2|https://github.com/iamaleksey/cassandra/tree/8879-2.2]|[testall|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-8879-2.2-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-8879-2.2-dtest]|
|[8879-3.0|https://github.com/iamaleksey/cassandra/tree/8879-3.0]|[testall|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-8879-3.0-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-8879-3.0-dtest]|

> Alter table on compact storage broken
> -
>
> Key: CASSANDRA-8879
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8879
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Nick Bailey
>Assignee: Aleksey Yeschenko
>Priority: Minor
> Fix For: 2.1.x
>
> Attachments: 8879-2.0.txt
>
>
> In 2.0 HEAD, alter table on compact storage tables seems to be broken. With 
> the following table definition, altering the column breaks cqlsh and 
> generates a stack trace in the log.
> {noformat}
> CREATE TABLE settings (
>   key blob,
>   column1 blob,
>   value blob,
>   PRIMARY KEY ((key), column1)
> ) WITH COMPACT STORAGE
> {noformat}
> {noformat}
> cqlsh:OpsCenter> alter table settings ALTER column1 TYPE ascii ;
> TSocket read 0 bytes
> cqlsh:OpsCenter> DESC TABLE settings;
> {noformat}
> {noformat}
> ERROR [Thrift:7] 2015-02-26 17:20:24,640 CassandraDaemon.java (line 199) Exception in thread Thread[Thrift:7,5,main]
> java.lang.AssertionError
>   at org.apache.cassandra.cql3.statements.AlterTableStatement.announceMigration(AlterTableStatement.java:198)
>   at org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:79)
>   at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:158)
>   at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:175)
>   at org.apache.cassandra.thrift.CassandraServer.execute_cql3_query(CassandraServer.java:1958)
>   at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4486)
>   at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4470)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:204)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:724)
> {noformat}





[jira] [Updated] (CASSANDRA-10713) Gate keeper to do rate limiter and qps cap

2015-11-16 Thread Jeff Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Liu updated CASSANDRA-10713:
-
Description: 
As Cassandra becomes the primary NoSQL data store in more and more production 
environments, have we thought about adding gate keeper features like a rate 
limiter and a QPS cap, which would give a better integration experience when 
deploying Cassandra in a large services infrastructure?

Reliability has become more and more important these days for service 
providers. Cassandra, together with other SQL or NoSQL solutions, provides the 
data layer that powers the complete application stack. In today's distributed 
system frameworks, dependencies across application layers have been largely 
reduced. However, as the central data repository, the data store sits on the 
critical path and becomes more fragile, especially in real-time, large-scale 
systems. Companies like Twitter and Facebook have built gate keepers 
internally to protect their data store systems. What should we do for Cassandra?

Thoughts?



  was:
As Cassandra becomes the primary NoSQL data store in more and more production 
environments, have we thought about adding gate keeper features like a rate 
limiter and a QPS cap, which would give a better integration experience when 
deploying Cassandra in a large services infrastructure?

Thoughts?


> Gate keeper to do rate limiter and qps cap
> --
>
> Key: CASSANDRA-10713
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10713
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Jeff Liu
>
> As Cassandra becomes the primary NoSQL data store in more and more production 
> environments, have we thought about adding gate keeper features like a rate 
> limiter and a QPS cap, which would give a better integration experience when 
> deploying Cassandra in a large services infrastructure?
> Reliability has become more and more important these days for service 
> providers. Cassandra, together with other SQL or NoSQL solutions, provides 
> the data layer that powers the complete application stack. In today's 
> distributed system frameworks, dependencies across application layers have 
> been largely reduced. However, as the central data repository, the data store 
> sits on the critical path and becomes more fragile, especially in 
> real-time, large-scale systems. Companies like Twitter and Facebook have built 
> gate keepers internally to protect their data store systems. What should we do 
> for Cassandra?
> Thoughts?





[jira] [Created] (CASSANDRA-10713) Gate keeper to do rate limiter and qps cap

2015-11-16 Thread Jeff Liu (JIRA)
Jeff Liu created CASSANDRA-10713:


 Summary: Gate keeper to do rate limiter and qps cap
 Key: CASSANDRA-10713
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10713
 Project: Cassandra
  Issue Type: Improvement
  Components: Configuration
Reporter: Jeff Liu


As Cassandra becomes the primary NoSQL data store in more and more production 
environments, have we thought about adding gate keeper features like a rate 
limiter and a QPS cap, which would give a better integration experience when 
deploying Cassandra in a large services infrastructure?

Thoughts?
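
As a starting point for discussion, a QPS cap of the kind described could be sketched as a token bucket. This is only an illustration with hypothetical names (QpsCap, tryAcquire); it is not an existing Cassandra feature, and the caller passes in the clock value so that the behaviour is easy to reason about.

```java
/** Hypothetical token-bucket style QPS cap, for illustration only. */
final class QpsCap {
    private final long intervalNanos;   // cost of one request, in nanoseconds of credit
    private final long maxCreditNanos;  // burst allowance
    private long creditNanos;
    private long lastNanos;

    /** Starts with a full burst allowance. Integer division rounds the interval down. */
    QpsCap(long permitsPerSecond, long burst, long nowNanos) {
        if (permitsPerSecond <= 0 || permitsPerSecond > 1_000_000_000L || burst <= 0) {
            throw new IllegalArgumentException("invalid rate or burst");
        }
        this.intervalNanos = 1_000_000_000L / permitsPerSecond;
        this.maxCreditNanos = Math.multiplyExact(burst, intervalNanos);
        this.creditNanos = maxCreditNanos;
        this.lastNanos = nowNanos;
    }

    /** Returns true if the request may proceed now, false if it should be rejected or delayed. */
    synchronized boolean tryAcquire(long nowNanos) {
        long elapsed = Math.max(0L, nowNanos - lastNanos);
        lastNanos = Math.max(lastNanos, nowNanos);
        // Accrue credit for elapsed time, capped at the burst allowance (overflow-safe).
        creditNanos = elapsed >= maxCreditNanos - creditNanos
                ? maxCreditNanos
                : creditNanos + elapsed;
        if (creditNanos >= intervalNanos) {
            creditNanos -= intervalNanos;
            return true;
        }
        return false;
    }
}
```

In real use the caller would pass System.nanoTime(); whether rejected requests are dropped, queued or answered with an overload error is a separate design question.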





[jira] [Commented] (CASSANDRA-9644) DTCS configuration proposals for handling consequences of repairs

2015-11-16 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007138#comment-15007138
 ] 

Philip Thompson commented on CASSANDRA-9644:


[~krummas], FYI, while running cstar perf jobs against the 9644 branch you gave 
me, I have seen pending compactions sit at ~2 for hours, with no compactions 
occurring. There may be a bug in that calculation.

> DTCS configuration proposals for handling consequences of repairs
> -
>
> Key: CASSANDRA-9644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9644
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Antti Nissinen
>Assignee: Marcus Eriksson
>  Labels: compaction, dtcs
> Fix For: 2.1.x, 3.x
>
> Attachments: node0_20150621_1646_time_graph.txt, 
> node0_20150621_2320_time_graph.txt, node0_20150623_1526_time_graph.txt, 
> node1_20150621_1646_time_graph.txt, node1_20150621_2320_time_graph.txt, 
> node1_20150623_1526_time_graph.txt, node2_20150621_1646_time_graph.txt, 
> node2_20150621_2320_time_graph.txt, node2_20150623_1526_time_graph.txt, 
> nodetool status infos.txt, sstable_compaction_trace.txt, 
> sstable_compaction_trace_snipped.txt, sstable_counts.jpg
>
>
> This document raises some issues seen when DTCS is used to compact time 
> series data in a three-node cluster. DTCS is currently configured with a 
> few parameters, which keeps the configuration fairly simple but might 
> cause problems in certain special cases, such as recovering from the flood of 
> small SSTables caused by a repair operation. We suggest some ideas that 
> might be a starting point for further discussion. The following sections 
> contain:
> - Description of the cassandra setup
> - Feeding process of the data
> - Failure testing
> - Issues caused by the repair operations for the DTCS
> - Proposal for the DTCS configuration parameters
> Attachments are included to support the discussion and there is a separate 
> section giving explanation for those.
> Cassandra setup and data model
> - The cluster is composed of three nodes running Cassandra 2.1.2. The 
> replication factor is two, and read and write consistency levels are ONE.
> - Data is time series data. Data is saved so that one row contains a certain 
> time span of data for a given metric (20 days in this case). The row key 
> contains the start time of the time span and the metric name. 
> The column name gives the offset from the beginning of the time span. The 
> column timestamp is set to the sum of the timestamp from the row key and the 
> offset (the actual timestamp of the data point). The data model 
> is analogous to the KairosDB implementation.
> - The average sampling interval is 10 seconds, varying significantly from 
> metric to metric.
> - 100 000 metrics are fed to Cassandra.
> - max_sstable_age_days is set to 5 days (the objective is to keep SSTable files 
> at a manageable size, around 50 GB).
> - TTL is not in use in the test.
> Procedure for the failure test.
> - Data is first dumped to Cassandra for 11 days, and then the data dumping is 
> stopped so that DTCS has a chance to finish all compactions. Data is 
> dumped with "fake timestamps" so that the column timestamp is set when data is 
> written to Cassandra.
> - One of the nodes is taken down, and new data is dumped on top of the earlier 
> data, covering a couple of hours' worth of data (faked timestamps).
> - Dumping is stopped and the node is kept down for a few hours.
> - The node is brought back up and "nodetool repair" is run on the node that was 
> down.
> Consequences
> - The repair operation leads to a massive number of new SSTables far back in 
> the history. The new SSTables cover time spans similar to those of the files 
> that were created by DTCS before the shutdown of one of the nodes.
> - To be able to compact the small files, max_sstable_age_days would have to be 
> increased to allow compaction to handle them. However, in a 
> practical case the time window would grow so large that the generated files 
> would be huge, which is not desirable. The compaction also combines one 
> very large file with a bunch of small files in several phases, which is not 
> efficient. Generating really large files may also lead to out-of-disk-space 
> problems.
> - See the list of time graphs later in the document.
> Improvement proposals for the DTCS configuration
> Below is a list of desired properties for the configuration. Current 
> parameters are mentioned if available.
> - Initial window size (currently:base_time_seconds)
> - The amount of similar size windows for the bucketing (currently: 
> min_threshold)
> - The multiplier for the window size when increased (currently: 
> min_threshold). This we would like to be independent from

[jira] [Commented] (CASSANDRA-9258) Range movement causes CPU & performance impact

2015-11-16 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007120#comment-15007120
 ] 

Dikang Gu commented on CASSANDRA-9258:
--

[~iamaleksey] yes, I'm working on it, I'm trying to implement the pendingRanges 
based on the IntervalTree, will try to send out a patch this week.

> Range movement causes CPU & performance impact
> --
>
> Key: CASSANDRA-9258
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9258
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 2.1.4
>Reporter: Rick Branson
>Assignee: Dikang Gu
> Fix For: 2.1.x
>
>
> Observing big CPU & latency regressions when doing range movements on 
> clusters with many tens of thousands of vnodes. See CPU usage increase by 
> ~80% when a single node is being replaced.
> Top methods are:
> 1) Ljava/math/BigInteger;.compareTo in 
> Lorg/apache/cassandra/dht/ComparableObjectToken;.compareTo 
> 2) Lcom/google/common/collect/AbstractMapBasedMultimap;.wrapCollection in 
> Lcom/google/common/collect/AbstractMapBasedMultimap$AsMap$AsMapIterator;.next
> 3) Lorg/apache/cassandra/db/DecoratedKey;.compareTo in 
> Lorg/apache/cassandra/dht/Range;.contains
> Here's a sample stack from a thread dump:
> {code}
> "Thrift:50673" daemon prio=10 tid=0x7f2f20164800 nid=0x3a04af runnable 
> [0x7f2d878d]
>java.lang.Thread.State: RUNNABLE
>   at org.apache.cassandra.dht.Range.isWrapAround(Range.java:260)
>   at org.apache.cassandra.dht.Range.contains(Range.java:51)
>   at org.apache.cassandra.dht.Range.contains(Range.java:110)
>   at 
> org.apache.cassandra.locator.TokenMetadata.pendingEndpointsFor(TokenMetadata.java:916)
>   at 
> org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:775)
>   at 
> org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:541)
>   at 
> org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:616)
>   at 
> org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:1101)
>   at 
> org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:1083)
>   at 
> org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:976)
>   at 
> org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3996)
>   at 
> org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3980)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:205)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745){code}
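
The stack above sits in a per-write scan over pending ranges. As a rough illustration of why an indexed lookup helps, here is a minimal sketch using a sorted map. It is a simpler stand-in for an interval tree and only handles non-overlapping, non-wrapping (left, right] ranges; PendingRangeIndex and its methods are hypothetical, not the TokenMetadata API.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical index, for illustration only; assumes ranges do not overlap or wrap around. */
final class PendingRangeIndex {
    // Exclusive left bound -> (inclusive right bound, endpoint name).
    private final TreeMap<Long, Map.Entry<Long, String>> byLeft = new TreeMap<>();

    void add(long left, long right, String endpoint) {
        if (right <= left) {
            throw new IllegalArgumentException("wrap-around ranges are not supported in this sketch");
        }
        byLeft.put(left, new SimpleImmutableEntry<>(right, endpoint));
    }

    /** Returns the endpoint whose (left, right] range contains the token, or null. O(log n) per lookup. */
    String endpointFor(long token) {
        // The only possible match is the range with the greatest left bound strictly below the token.
        Map.Entry<Long, Map.Entry<Long, String>> candidate = byLeft.lowerEntry(token);
        if (candidate == null) {
            return null;
        }
        Map.Entry<Long, String> rightAndEndpoint = candidate.getValue();
        return token <= rightAndEndpoint.getKey() ? rightAndEndpoint.getValue() : null;
    }
}
```

Real pending ranges can overlap and wrap around the ring, which is why a proper interval structure is needed there; the sketch only shows the difference between scanning every range and doing a logarithmic lookup.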





[jira] [Commented] (CASSANDRA-10712) Performance dramatically impacted by using client SSL

2015-11-16 Thread Andy Tolbert (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15006983#comment-15006983
 ] 

Andy Tolbert commented on CASSANDRA-10712:
--

Apologies for the huge images (thumbnail functionality doesn't seem to be 
working).  If more detail or testing is needed, let me know and I'll be happy 
to provide more.

> Performance dramatically impacted by using client SSL
> -
>
> Key: CASSANDRA-10712
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10712
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Andy Tolbert
> Attachments: bench-graph.png, cpu_metrics.png, ssl-bench.tgz
>
>
> Latency & throughput achieved via cassandra-stress are dramatically 
> impacted when using SSL (about 5-6x).
> !bench-graph.png!
> I haven't done much analysis of this yet, but one observation is that I do 
> notice a dramatic increase in context switches while running with SSL on the 
> server side.  In the charts below, the left-most data in a chart represents 
> running without SSL, and the right-most data represents running with SSL.  
> You'll observe that on the C* node, while the CPU utilization is down 
> when running stress with SSL, the context switches are higher.
> !cpu_metrics.png!
> Attached is [^ssl-bench.tgz] that includes output for each run, the 
> parameters used and an html file that is the output of running stress with 
> [CASSANDRA-7918].
> If you need some ready made keystore / truststore files to work with, some 
> can be found 
> [here|https://github.com/datastax/java-driver/tree/2.1/driver-core/src/test/resources].





[jira] [Updated] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2015-11-16 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-7217:
--
Attachment: stub_server.diff

It was easier to stub out the server first than to stub out the client library 
so I did that first. See attached diff. I have ExecuteMessage return a void 
result immediately.

On my setup with the server stubbed out the client node maxes out CPU at 400% 
(4 cores) and does 100k operations/second with 500 threads. I increased to 2000 
threads and utilization reported by top decreased to 270% (don't believe top, 
it's saturated) and throughput decreased to 30k.

At 1000 threads I still get 100k. I do see the drop at 1250 threads. So yes, it 
exists, but it might be an issue with the client library or how the client 
library chooses to present load to the server. I'll dig a bit into how the 
client library works to see if I can explain it.

I personally don't necessarily see this as a bug. If you want to concurrently 
execute more than 1000 requests you should not use thread per request on one 
node. That said we do have an interest in having it work as well as possible 
since people are going to do it anyways and we might as well pave the way 
modulo how much time we want to invest.

I am going to experiment with having stress use two (or N) instances of the 
client library to see if reduced contention in the client will ameliorate the 
drop off at 1250 threads. If that helps it may just be a matter of making sure 
the client library can operate as shared nothing shards internally so it can be 
made to have locality and scale up.

In the past I have found that a single global client instance with global locks 
doesn't scale, but I also had limited success with running multiple instances. 
It helps, but not to the point you get linear scale up.
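
For the experiment with N client instances, the routing could look roughly like the sketch below: each stress thread is pinned to one instance so that instances share nothing. ShardedClients is a hypothetical name, and the client type is left generic rather than tied to the real driver classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical sketch, for illustration only: N independent client instances, one per shard. */
final class ShardedClients<C> {
    private final List<C> shards;

    ShardedClients(int n, Supplier<C> factory) {
        if (n <= 0) {
            throw new IllegalArgumentException("need at least one instance");
        }
        List<C> created = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            created.add(factory.get()); // each instance gets its own connections and locks
        }
        this.shards = created;
    }

    /** Always maps the same thread id to the same instance. */
    C forThread(long threadId) {
        return shards.get((int) Math.floorMod(threadId, (long) shards.size()));
    }
}
```

A stress thread would look up its instance once, for example with its own thread id, and keep using it for the whole run.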

> Native transport performance (with cassandra-stress) drops precipitously past 
> around 1000 threads
> -
>
> Key: CASSANDRA-7217
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7217
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benedict
>Assignee: Ariel Weisberg
>  Labels: performance, stress, triaged
> Fix For: 3.0.1, 3.1
>
> Attachments: 2000-threads.svg, 500-threads.svg, FakeQuerySystem.java, 
> stub_server.diff
>
>
> This is obviously bad. Let's figure out why it's happening and put a stop to 
> it.





[jira] [Updated] (CASSANDRA-10712) Performance dramatically impacted by using client SSL

2015-11-16 Thread Andy Tolbert (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Tolbert updated CASSANDRA-10712:
-
Description: 
Latency & throughput achieved via cassandra-stress are dramatically 
impacted when using SSL (about 5-6x).

!bench-graph.png!

I haven't done much analysis of this yet, but one observation is that I do 
notice a dramatic increase in context switches while running with SSL on the 
server side.  In the charts below, the left-most data in a chart represents 
running without SSL, and the right-most data represents running with SSL.  
You'll observe that on the C* node, while the CPU utilization is down when 
running stress with SSL, the context switches are higher.

!cpu_metrics.png!

Attached is [^ssl-bench.tgz] that includes output for each run, the parameters 
used and an html file that is the output of running stress with 
[CASSANDRA-7918].

If you need some ready made keystore / truststore files to work with, some can 
be found 
[here|https://github.com/datastax/java-driver/tree/2.1/driver-core/src/test/resources].

  was:
Latency & throughput achieved via cassandra-stress are dramatically 
impacted when using SSL (about 5-6x).

!bench-graph.png|thumbnail!

I haven't done much analysis of this yet, but one observation is that I do 
notice a dramatic increase in context switches while running with SSL on the 
server side.  In the charts below, the left-most data in a chart represents 
running without SSL, and the right-most data represents running with SSL.  
You'll observe that on the C* node, while the CPU utilization is down when 
running stress with SSL, the context switches are higher.

!cpu_metrics.png|thumbnail!

Attached is [^ssl-bench.tgz] that includes output for each run, the parameters 
used and an html file that is the output of running stress with 
[CASSANDRA-7918].

If you need some ready made keystore / truststore files to work with, some can 
be found 
[here|https://github.com/datastax/java-driver/tree/2.1/driver-core/src/test/resources].


> Performance dramatically impacted by using client SSL
> -
>
> Key: CASSANDRA-10712
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10712
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Andy Tolbert
> Attachments: bench-graph.png, cpu_metrics.png, ssl-bench.tgz
>
>
> Latency & throughput achieved via cassandra-stress are dramatically 
> impacted when using SSL (about 5-6x).
> !bench-graph.png!
> I haven't done much analysis of this yet, but one observation is that I do 
> notice a dramatic increase in context switches while running with SSL on the 
> server side.  In the charts below, the left-most data in a chart represents 
> running without SSL, and the right-most data represents running with SSL.  
> You'll observe that on the C* node, while the CPU utilization is down 
> when running stress with SSL, the context switches are higher.
> !cpu_metrics.png!
> Attached is [^ssl-bench.tgz] that includes output for each run, the 
> parameters used and an html file that is the output of running stress with 
> [CASSANDRA-7918].
> If you need some ready made keystore / truststore files to work with, some 
> can be found 
> [here|https://github.com/datastax/java-driver/tree/2.1/driver-core/src/test/resources].





[jira] [Created] (CASSANDRA-10712) Performance dramatically impacted by using client SSL

2015-11-16 Thread Andy Tolbert (JIRA)
Andy Tolbert created CASSANDRA-10712:


 Summary: Performance dramatically impacted by using client SSL
 Key: CASSANDRA-10712
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10712
 Project: Cassandra
  Issue Type: Bug
Reporter: Andy Tolbert
 Attachments: bench-graph.png, cpu_metrics.png, ssl-bench.tgz

Latency & throughput achieved via cassandra-stress are dramatically 
impacted when using SSL (about 5-6x).

!bench-graph.png|thumbnail!

I haven't done much analysis of this yet, but one observation is that I do 
notice a dramatic increase in context switches while running with SSL on the 
server side.  In the charts below, the left-most data in a chart represents 
running without SSL, and the right-most data represents running with SSL.  
You'll observe that on the C* node, while the CPU utilization is down when 
running stress with SSL, the context switches are higher.

!cpu_metrics.png|thumbnail!

Attached is [^ssl-bench.tgz] that includes output for each run, the parameters 
used and an html file that is the output of running stress with 
[CASSANDRA-7918].

If you need some ready made keystore / truststore files to work with, some can 
be found 
[here|https://github.com/datastax/java-driver/tree/2.1/driver-core/src/test/resources].





[jira] [Updated] (CASSANDRA-10688) Stack overflow from SSTableReader$InstanceTidier.runOnClose in Leak Detector

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10688:
--
Fix Version/s: 3.1

> Stack overflow from SSTableReader$InstanceTidier.runOnClose in Leak Detector
> 
>
> Key: CASSANDRA-10688
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10688
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jeremiah Jordan
> Fix For: 3.0.1, 3.1
>
>
> Running some tests against cassandra-3.0 
> 9fc957cf3097e54ccd72e51b2d0650dc3e83eae0
> The tests are just running cassandra-stress write and read while adding and 
> removing nodes from the cluster.  After the test runs when I go back through 
> logs I find the following Stackoverflow fairly often:
> ERROR [Strong-Reference-Leak-Detector:1] 2015-11-11 00:04:10,638  
> Ref.java:413 - Stackoverflow [private java.lang.Runnable 
> org.apache.cassandra.io.sstable.format.SSTableReader$InstanceTidier.runOnClose,
>  final java.lang.Runnable 
> org.apache.cassandra.io.sstable.format.SSTableReader$DropPageCache.andThen, 
> final org.apache.cassandra.cache.InstrumentingCache 
> org.apache.cassandra.io.sstable.SSTableRewriter$InvalidateKeys.cache, private 
> final org.apache.cassandra.cache.ICache 
> org.apache.cassandra.cache.InstrumentingCache.map, private final 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap 
> org.apache.cassandra.cache.ConcurrentLinkedHashCache.map, final 
> com.googlecode.concurrentlinkedhashmap.LinkedDeque 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap.evictionDeque, 
> com.googlecode.concurrentlinkedhashmap.Linked 
> com.googlecode.concurrentlinkedhashmap.LinkedDeque.first, 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next, 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next, 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next, 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next, 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next, 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next, 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next, 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next, 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next, 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node 
> ... (repeated a whole bunch more)  
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next, 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next, 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next, 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next, 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next, 
> final java.lang.Object 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.key, 
> public final byte[] org.apache.cassandra.cache.KeyCacheKey.key





[jira] [Updated] (CASSANDRA-10642) cqlsh COPY bulk round trip dtest flaps on Windows

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10642:
--
Fix Version/s: 3.0.1

> cqlsh COPY bulk round trip dtest flaps on Windows
> -
>
> Key: CASSANDRA-10642
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10642
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Jim Witschey
>Assignee: Stefania
> Fix For: 3.0.1, 3.1
>
>
> {{cqlsh_tests/cqlsh_copy_tests.py:CqlshCopyTest.test_bulk_round_trip}} flaps 
> on Windows under both C* 3.0 and 2.2:
> http://cassci.datastax.com/view/win32/job/cassandra-3.0_dtest_win32/100/testReport/junit/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip/history/
> http://cassci.datastax.com/view/win32/job/cassandra-2.2_dtest_win32/127/testReport/junit/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip/
> It fails because, after round-tripping with cqlsh COPY, it fails to find as 
> many values as expected with {{SELECT COUNT (\*)}}. The stderr from cqlsh 
> includes the following error:
> {code}
> (EE)  (EE)  :2:(EE)   out waiting for replica nodes' responses] message="Operation timed out - 
> received only 0 responses.">(EE)  :2:Aborting import at record #190. 
> Previously inserted records are still present, and some records after that 
> may be present as well.(EE)  :2:(EE)   [Unavailable exception] message="Cannot achieve consistency level ONE">(EE)  
> :2:Aborting import at record #637. Previously inserted records are 
> still present, and some records after that may be present as well.(EE)  
> {code}
> So, it looks like the load step may be timing out. See this run for the error:
> http://cassci.datastax.com/view/win32/job/cassandra-3.0_dtest_win32/100/testReport/junit/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip/





[jira] [Updated] (CASSANDRA-10644) multiple repair dtest fails under Windows

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10644:
--
Fix Version/s: 3.0.1

> multiple repair dtest fails under Windows
> -
>
> Key: CASSANDRA-10644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10644
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Jim Witschey
>Assignee: Paulo Motta
> Fix For: 2.2.4, 3.0.1, 3.1
>
>
> {{incremental_repair_test.py:TestIncRepair.multiple_repair_test}} flaps on 
> CassCI Windows runs on C* 3.0:
> http://cassci.datastax.com/view/win32/job/cassandra-3.0_dtest_win32/100/testReport/junit/incremental_repair_test/TestIncRepair/multiple_repair_test/history/
> The error is {{An existing connection was forcibly closed by the remote 
> host}}, and happens consistently in the failing runs:
> http://cassci.datastax.com/view/win32/job/cassandra-3.0_dtest_win32/100/testReport/junit/incremental_repair_test/TestIncRepair/multiple_repair_test/
> http://cassci.datastax.com/view/win32/job/cassandra-3.0_dtest_win32/72/testReport/junit/incremental_repair_test/TestIncRepair/multiple_repair_test/
> [~yukim] Can you have a look? I feel like you're more likely than anyone else 
> to understand the streaming error. In particular: is this what happens when a 
> node goes down? This could be an environment error, rather than a C* bug.





[jira] [Updated] (CASSANDRA-10647) manual 2i rebuilding dtest failure on Windows

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10647:
--
Fix Version/s: (was: 3.1)

> manual 2i rebuilding dtest failure on Windows
> -
>
> Key: CASSANDRA-10647
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10647
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Jim Witschey
>Assignee: Sam Tunnicliffe
>
> {{secondary_indexes_test.py:TestSecondaryIndexes.test_manual_rebuild_index}} 
> failed once on CassCI running C* 3.0 under Windows:
> http://cassci.datastax.com/view/win32/job/cassandra-3.0_dtest_win32/101/testReport/secondary_indexes_test/TestSecondaryIndexes/test_manual_rebuild_index/history/
> It fails here:
> https://github.com/riptano/cassandra-dtest/blob/master/secondary_indexes_test.py#L294
> with the error {{1 != 0}}. [~beobal] IIRC you understand these tests. Any 
> idea why it'd flap like this? or could this be a genuine regression?





[jira] [Updated] (CASSANDRA-10646) crash_during_decommission_test dtest fails on windows

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10646:
--
Fix Version/s: 3.0.1

> crash_during_decommission_test dtest fails on windows
> -
>
> Key: CASSANDRA-10646
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10646
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Jim Witschey
>Assignee: Paulo Motta
> Fix For: 3.0.1, 3.1
>
>
> {{topology_test.py:TestTopology.crash_during_decommission_test}} flaps on 
> C* 3.0 on Windows:
> http://cassci.datastax.com/view/win32/job/cassandra-3.0_dtest_win32/100/testReport/topology_test/TestTopology/crash_during_decommission_test/history/
> Since this test raises 2 errors on failure, there are 2 histories on CassCI 
> for it:
> http://cassci.datastax.com/view/win32/job/cassandra-3.0_dtest_win32/100/testReport/topology_test/TestTopology/crash_during_decommission_test_2/history/
> It looks like it fails because of contention over the temporary file where 
> {{cassandra.env}} is stored:
> http://cassci.datastax.com/view/win32/job/cassandra-3.0_dtest_win32/101/testReport/junit/topology_test/TestTopology/crash_during_decommission_test/
> Looks like this happens when {{nodetool status}} is called, since 
> {{nodetool}} sources {{cassandra-env.sh}}.





[jira] [Updated] (CASSANDRA-10674) Materialized View SSTable streaming/leaving status race on decommission

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10674:
--
Fix Version/s: 3.0.1

> Materialized View SSTable streaming/leaving status race on decommission
> ---
>
> Key: CASSANDRA-10674
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10674
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination, Distributed Metadata
>Reporter: Joel Knighton
> Fix For: 3.0.1, 3.1
>
> Attachments: leaving-node-debug.log, receiving-node-debug.log
>
>
> On decommission of a node in a cluster with materialized views, it is 
> possible for the decommissioning node to begin streaming sstables for an MV 
> base table before the receiving node is aware of the leaving status.
> The materialized view base/view replica pairing checks pending endpoints to 
> handle the case when an sstable is received from a leaving node; without the 
> leaving message, this check breaks and an exception is thrown. The streamed 
> sstable is never applied.
> Logs from a decommissioning node and a node receiving such a stream are 
> attached.





[jira] [Updated] (CASSANDRA-10664) Fix failing tests for 3.1

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10664:
--
Fix Version/s: 3.0.1

> Fix failing tests for 3.1
> -
>
> Key: CASSANDRA-10664
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10664
> Project: Cassandra
>  Issue Type: Test
>Reporter: Sylvain Lebresne
> Fix For: 3.0.1, 3.1
>
>
> This is the continuation of CASSANDRA-10166, just a meta-ticket to group all 
> tickets related to fixing any unit test or dtest.





[jira] [Updated] (CASSANDRA-10680) Deal with small compression chunk size better during streaming plan setup

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10680:
--
Fix Version/s: 3.0.1

> Deal with small compression chunk size better during streaming plan setup
> -
>
> Key: CASSANDRA-10680
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10680
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jeff Jirsa
>Assignee: Yuki Morishita
> Fix For: 3.0.1, 3.1, 2.1.x, 2.2.x
>
>
> For clusters using small compression chunk size and terabytes of data, the 
> streaming plan calculations will instantiate hundreds of millions of 
> compressionmetadata$chunk objects, which will create unreasonable amounts of 
> heap pressure. Rather than instantiating all of those at once, streaming 
> should instantiate only as many as needed for a single file per table at a 
> time.
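
The "only as many as needed for a single file at a time" suggestion can be sketched as follows. This is a minimal illustration under stated assumptions: `chunks_for_file` and `plan_stream` are hypothetical names, chunks are modelled as plain `(offset, length)` tuples, and nothing here reflects Cassandra's actual compression metadata classes.

```python
def chunks_for_file(file_length, chunk_size):
    """Lazily yield (offset, length) pairs for one file's compression chunks."""
    offset = 0
    while offset < file_length:
        length = min(chunk_size, file_length - offset)
        yield (offset, length)
        offset += length


def plan_stream(files, chunk_size):
    """Walk files one at a time so only one file's chunks are alive at once.

    `files` is a list of (name, length) pairs; returns the total chunk count.
    """
    total = 0
    for _name, file_length in files:
        # Materialize chunks for this file only; the list is dropped
        # before the next file is processed.
        per_file = list(chunks_for_file(file_length, chunk_size))
        total += len(per_file)
    return total
```

With a small chunk size the per-file list is still large, but peak memory is bounded by the largest single file rather than by the whole data set.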





[jira] [Updated] (CASSANDRA-10666) jmx_test.TestJMX.test_compactionstats is flapping

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10666:
--
Fix Version/s: 3.0.1

> jmx_test.TestJMX.test_compactionstats is flapping
> -
>
> Key: CASSANDRA-10666
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10666
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Sylvain Lebresne
>Assignee: Jim Witschey
> Fix For: 3.0.1, 3.1
>
>
> See the [history for that 
> test|http://cassci.datastax.com/job/cassandra-3.0_dtest/335/testReport/junit/jmx_test/TestJMX/test_compactionstats/history/].
>  On each failure there is something about a problem with {{jolokia}} so 
> that's probably a test environment problem. Still needs to be fixed.





[jira] [Updated] (CASSANDRA-10668) bootstrap_test.TestBootstrap.resumable_bootstrap_test is failing

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10668:
--
Fix Version/s: 3.0.1

> bootstrap_test.TestBootstrap.resumable_bootstrap_test is failing
> 
>
> Key: CASSANDRA-10668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10668
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Sylvain Lebresne
>Assignee: Yuki Morishita
> Fix For: 3.0.1, 3.1
>
>
> From the [test 
> history|http://cassci.datastax.com/job/cassandra-3.0_dtest/335/testReport/junit/bootstrap_test/TestBootstrap/resumable_bootstrap_test/history/],
>  it seems the test has been flappy for a while, but it's been constantly 
> failing for the last few builds.





[jira] [Updated] (CASSANDRA-10696) Audit jmx_test.py

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10696:
--
Fix Version/s: 3.0.1

> Audit jmx_test.py
> -
>
> Key: CASSANDRA-10696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10696
> Project: Cassandra
>  Issue Type: Test
>Reporter: Philip Thompson
>Assignee: Jim Witschey
> Fix For: 3.0.1, 3.1
>
>
> It seems that some/many of the jmx tests are not effectively testing the 
> tickets they're targeted for. Someone should go through these and refactor 
> them.





[jira] [Updated] (CASSANDRA-10704) remove cobertura from build file

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10704:
--
Fix Version/s: 3.0.1

> remove cobertura from build file
> 
>
> Key: CASSANDRA-10704
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10704
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Russ Hatch
>Assignee: Russ Hatch
>Priority: Minor
> Fix For: 3.0.1, 3.1
>
> Attachments: trunk-10704.txt
>
>
> Since the project has adopted Jacoco, I don't believe the cobertura tasks are 
> in use any longer, and it's not certain if they still function. I don't think 
> there's any benefit from trying to keep both coverage tools working, and also 
> have the impression that cobertura development has slowed (or been slow to 
> support new versions of java). Jacoco's other advantage is its simpler usage 
> (via a java agent), as compared to cobertura's offline code instrumentation 
> requiring more complexity.





[jira] [Updated] (CASSANDRA-10711) NoSuchElementException when executing empty batch.

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10711:
--
Fix Version/s: 3.0.1

> NoSuchElementException when executing empty batch.
> --
>
> Key: CASSANDRA-10711
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10711
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0, OSS 42.1
>Reporter: Jaroslav Kamenik
> Fix For: 3.0.1, 3.1
>
>
> After upgrading to C* 3.0, it fails when executing an empty batch:
> java.util.NoSuchElementException: null
> at java.util.ArrayList$Itr.next(ArrayList.java:854) ~[na:1.8.0_60]
> at 
> org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:737)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.executeWithoutConditions(BatchStatement.java:356)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:337)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:323)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processBatch(QueryProcessor.java:490)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processBatch(QueryProcessor.java:480)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.transport.messages.BatchMessage.execute(BatchMessage.java:217)
>  ~[apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507)
>  [apache-cassandra-3.0.0.jar:3.0.0]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401)
>  [apache-cassandra-3.0.0.jar:3.0.0]
> at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_60]
> at 
> org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
>  [apache-cassandra-3.0.0.jar:3.0.0]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [apache-cassandra-3.0.0.jar:3.0.0]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]





[jira] [Updated] (CASSANDRA-10640) hadoop splits are calculated incorrectly

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10640:
--
Summary: hadoop splits are calculated incorrectly  (was: hadoop splits are 
calculated wrong)

> hadoop splits are calculated incorrectly
> 
>
> Key: CASSANDRA-10640
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10640
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alex Liu
>Assignee: Aleksey Yeschenko
> Fix For: 2.2.4, 3.0.1, 3.1
>
> Attachments: 10640.txt
>
>
> A typo at line 
> https://github.com/apache/cassandra/blob/cassandra-2.2/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java#L216
> where getEnd should be used



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10541) cqlshlib tests cannot run on Windows

2015-11-16 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-10541:

 Labels: cqlsh windows  (was: cqlsh)
Component/s: Tools

> cqlshlib tests cannot run on Windows
> 
>
> Key: CASSANDRA-10541
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10541
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Benjamin Lerer
>Assignee: Paulo Motta
>Priority: Minor
>  Labels: cqlsh, windows
>
> If I try to run the {{cqlshlib}} tests on Windows, I get the following error:
> {quote}
> ==
> ERROR: Failure: AttributeError ('module' object has no attribute 'symlink')
> --
> Traceback (most recent call last):
>   File "C:\Python27\lib\site-packages\nose\loader.py", line 414, in 
> loadTestsFromName
> addr.filename, addr.module)
>   File "C:\Python27\lib\site-packages\nose\importer.py", line 47, in 
> importFromPath
> return self.importFromDir(dir_path, fqname)
>   File "C:\Python27\lib\site-packages\nose\importer.py", line 94, in 
> importFromDir
> mod = load_module(part_fqname, fh, filename, desc)
>   File "[...]\pylib\cqlshlib\test\__init__.py", line 17, in 
> from .cassconnect import create_test_db, remove_test_db
>   File "[...]\pylib\cqlshlib\test\cassconnect.py", line 22, in 
> from .basecase import cql, cqlsh, cqlshlog, TEST_HOST, TEST_PORT, rundir
>   File "[...]\pylib\cqlshlib\test\basecase.py", line 43, in 
> os.symlink(path_to_cqlsh, modulepath)
> AttributeError: 'module' object has no attribute 'symlink'
> --
> Ran 1 test in 0.002s
> FAILED (errors=1)
> {quote}
> The problem comes from the fact that Windows has no support for symlinks.
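
One possible workaround for the `os.symlink` failure in the traceback, sketched under assumptions: `link_or_copy` is a hypothetical helper, not the fix committed for this ticket. It falls back to copying the file when `os.symlink` is missing (as on Python 2.7 for Windows) or raises.

```python
import os
import shutil


def link_or_copy(src, dst):
    """Symlink src to dst when possible, otherwise copy the file.

    Returns "symlink" or "copy" to report which path was taken.
    """
    symlink = getattr(os, "symlink", None)  # absent on some platforms
    if symlink is not None:
        try:
            symlink(src, dst)
            return "symlink"
        except (OSError, NotImplementedError):
            # For example, insufficient privilege to create links on Windows.
            pass
    shutil.copyfile(src, dst)
    return "copy"
```

A copy does not track later changes to the source file, which may or may not matter for the test setup.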





[jira] [Updated] (CASSANDRA-10494) Move static JVM options to jvm.options file

2015-11-16 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-10494:

Component/s: Configuration

> Move static JVM options to jvm.options file
> ---
>
> Key: CASSANDRA-10494
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10494
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Minor
> Fix For: 3.2
>
>
> CASSANDRA-10403 moved gc and heap options to conf/jvm.options file. Move 
> remaining static JVM options from cassandra-env to jvm.options file.





[jira] [Updated] (CASSANDRA-10632) sstableutil tests failing

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10632:
--
Fix Version/s: 3.0.1

> sstableutil tests failing
> -
>
> Key: CASSANDRA-10632
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10632
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Jim Witschey
>Assignee: Jim Witschey
> Fix For: 3.0.1, 3.1
>
>
> {{sstableutil_test.py:SSTableUtilTest.abortedcompaction_test}} and 
> {{sstableutil_test.py:SSTableUtilTest.compaction_test}} fail on Windows:
> http://cassci.datastax.com/view/win32/job/cassandra-3.0_dtest_win32/100/testReport/sstableutil_test/SSTableUtilTest/abortedcompaction_test/
> http://cassci.datastax.com/view/win32/job/cassandra-3.0_dtest_win32/100/testReport/sstableutil_test/SSTableUtilTest/compaction_test/
> This is a pretty simple failure -- looks like the underlying behavior is ok, 
> but string comparison fails when the leading {{d}} in the filename is 
> lowercase as returned by {{sstableutil}} (see the [{{_invoke_sstableutil}} 
> test 
> function|https://github.com/riptano/cassandra-dtest/blob/master/sstableutil_test.py#L128]),
>  but uppercase as returned by {{glob.glob}} (see the [{{_get_sstable_files}} 
> test 
> function|https://github.com/riptano/cassandra-dtest/blob/master/sstableutil_test.py#L160]).
> Do I understand correctly that Windows filenames are case-insensitive, 
> including the drive portion? If that's the case, then we can just lowercase 
> the file names in the test helper functions above when the tests are run on 
> Windows. [~JoshuaMcKenzie] can you confirm? I'll fix this in the tests if so. 
> If I'm wrong, and something in {{sstableutil}} needs to be fixed, could you 
> find an assignee?
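
The proposed test-side fix could look roughly like this. It is a sketch only: `normalize_paths` is a hypothetical helper and not code from `sstableutil_test.py`. It uses `ntpath.normcase` so the Windows behaviour (lowercasing, including the drive letter) can be exercised on any platform.

```python
import ntpath


def normalize_paths(paths, on_windows):
    """Return sorted paths, case-folded only when comparing Windows paths."""
    if on_windows:
        # normcase lowercases the path and converts "/" to "\\".
        return sorted(ntpath.normcase(p) for p in paths)
    return sorted(paths)
```

Applying the same normalization to both the `sstableutil` output and the `glob.glob` results would make the comparison insensitive to a `d:` versus `D:` drive prefix.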





[jira] [Updated] (CASSANDRA-10612) Protocol v3 upgrade tests on 2.1->3.0 path fail when compaction is interrupted

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10612:
--
Fix Version/s: 3.0.1

> Protocol v3 upgrade tests on 2.1->3.0 path fail when compaction is interrupted
> --
>
> Key: CASSANDRA-10612
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10612
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Jim Witschey
>Assignee: Russ Hatch
> Fix For: 3.0.1, 3.1
>
>
> The following tests in the upgrade_through_versions dtest suite fail:
> * 
> upgrade_through_versions_test.py:TestRandomPartitionerUpgrade.rolling_upgrade_test
> * 
> upgrade_through_versions_test.py:TestRandomPartitionerUpgrade.rolling_upgrade_with_internode_ssl_test
> * 
> upgrade_through_versions_test.py:TestUpgradeThroughVersions.rolling_upgrade_with_internode_ssl_test
> * 
> upgrade_through_versions_test.py:TestUpgradeThroughVersions.rolling_upgrade_test
> * 
> upgrade_through_versions_test.py:TestUpgrade_from_2_1_latest_tag_to_cassandra_3_0_HEAD.rolling_upgrade_test
> * 
> upgrade_through_versions_test.py:TestUpgrade_from_2_1_latest_tag_to_cassandra_3_0_HEAD.rolling_upgrade_with_internode_ssl_test
> * 
> upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_1_HEAD_to_cassandra_3_0_latest_tag.rolling_upgrade_test
> * 
> upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_1_HEAD_to_cassandra_3_0_latest_tag.rolling_upgrade_with_internode_ssl_test
> * 
> upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_1_HEAD_to_cassandra_3_0_HEAD.rolling_upgrade_test
> See this report:
> http://cassci.datastax.com/view/Upgrades/job/cassandra_upgrade_2.1_to_3.0_proto_v3/10/testReport/
> They fail with the following error:
> {code}
> A subprocess has terminated early. Subprocess statuses: Process-41 (is_alive: 
> True), Process-42 (is_alive: False), Process-43 (is_alive: True), Process-44 
> (is_alive: False), attempting to terminate remaining subprocesses now.
> {code}
> and with logs that look like this:
> {code}
> Unexpected error in node1 node log: ['ERROR [SecondaryIndexManagement:1] 
> 2015-10-27 00:06:52,335 CassandraDaemon.java:195 - Exception in thread 
> Thread[SecondaryIndexManagement:1,5,main] java.lang.RuntimeException: 
> java.util.concurrent.ExecutionException: 
> org.apache.cassandra.db.compaction.CompactionInterruptedException: Compaction 
> interrupted: Secondary index 
> build@41202370-7c3e-11e5-9331-6bb6e58f8b1b(upgrade, cf, 578160/1663620)bytes
> at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:368) 
> ~[main/:na]
> at 
> org.apache.cassandra.index.internal.CassandraIndex.buildBlocking(CassandraIndex.java:688)
>  ~[main/:na]
> at 
> org.apache.cassandra.index.internal.CassandraIndex.lambda$getBuildIndexTask$206(CassandraIndex.java:658)
>  ~[main/:na]
> at 
> org.apache.cassandra.index.internal.CassandraIndex$$Lambda$151/1841229245.call(Unknown
>  Source) ~[na:na]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_51]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_51]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_51]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51] Caused by: 
> java.util.concurrent.ExecutionException: 
> org.apache.cassandra.db.compaction.CompactionInterruptedException: Compaction 
> interrupted: Secondary index 
> build@41202370-7c3e-11e5-9331-6bb6e58f8b1b(upgrade, cf, 
> 578160/1663620)bytes
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[na:1.8.0_51]
> at java.util.concurrent.FutureTask.get(FutureTask.java:192) ~[na:1.8.0_51]
> at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:364) 
> ~[main/:na]
> ... 7 common frames omitted Caused by: 
> org.apache.cassandra.db.compaction.CompactionInterruptedException: Compaction 
> interrupted: Secondary index 
> build@41202370-7c3e-11e5-9331-6bb6e58f8b1b(upgrade, cf, 578160/1663620)bytes
> at 
> org.apache.cassandra.index.SecondaryIndexBuilder.build(SecondaryIndexBuilder.java:67)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.compaction.CompactionManager$11.run(CompactionManager.java:1269)
>  ~[main/:na]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_51]
> ... 4 common frames omitted', 'ERROR [HintsDispatcher:2] 2015-10-27 
> 00:08:48,520 CassandraDaemon.java:195 - Exception in thread 
> Thread[HintsDispatcher:2,1,main]', 'ERROR [HintsDispatcher:2] 2015-10-27 
> 00:11:58,336 CassandraDaemon.java:195 - Exception in thread 
> Thread[HintsDispatcher:2,1,main]']
> {code}





[jira] [Updated] (CASSANDRA-10613) Upgrade test on 2.1->3.0 path fails with NPE in getExistingFiles (likely known bug)

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10613:
--
Fix Version/s: 3.0.1

> Upgrade test on 2.1->3.0 path fails with NPE in getExistingFiles (likely 
> known bug)
> ---
>
> Key: CASSANDRA-10613
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10613
> Project: Cassandra
> Issue Type: Sub-task
> Reporter: Jim Witschey
> Assignee: Russ Hatch
> Fix For: 3.0.1, 3.1
>
>
> In this job:
> http://cassci.datastax.com/view/Upgrades/job/cassandra_upgrade_2.1_to_3.0_proto_v3/10/
> The following tests fail due to an NPE in 
> {{org.apache.cassandra.db.lifecycle.LogRecord.getExistingFiles}}:
> upgrade_through_versions_test.py:TestUpgrade_from_3_0_latest_tag_to_3_0_HEAD.bootstrap_test
> upgrade_through_versions_test.py:TestUpgrade_from_3_0_latest_tag_to_3_0_HEAD.rolling_upgrade_test
> upgrade_through_versions_test.py:TestUpgrade_from_3_0_latest_tag_to_3_0_HEAD.parallel_upgrade_with_internode_ssl_test
> upgrade_through_versions_test.py:TestUpgrade_from_3_0_latest_tag_to_3_0_HEAD.rolling_upgrade_with_internode_ssl_test
> upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_1_HEAD_to_cassandra_3_0_HEAD.rolling_upgrade_with_internode_ssl_test
> upgrade_through_versions_test.py:TestUpgrade_from_3_0_latest_tag_to_3_0_HEAD.parallel_upgrade_test
> I believe this is likely happening because of CASSANDRA-10602, so let's hold 
> off on messing with this until that's merged.





[jira] [Updated] (CASSANDRA-10639) Commitlog compression test fails on Windows

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10639:
--
Fix Version/s: 3.0.1

> Commitlog compression test fails on Windows
> ---
>
> Key: CASSANDRA-10639
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10639
> Project: Cassandra
> Issue Type: Sub-task
> Components: Local Write-Read Paths
> Reporter: Jim Witschey
> Assignee: Joshua McKenzie
> Fix For: 3.0.1, 3.1
>
>
> {{commitlog_test.py:TestCommitLog.test_compression_error}} fails on Windows 
> under CassCI. It fails in a number of different ways. Here, it looks like 
> reading the CRC fails:
> http://cassci.datastax.com/view/win32/job/cassandra-3.0_dtest_win32/100/testReport/commitlog_test/TestCommitLog/test_compression_error/
> Here, I believe it fails when trying to validate the CRC header:
> http://cassci.datastax.com/view/win32/job/cassandra-3.0_dtest_win32/99/testReport/commitlog_test/TestCommitLog/test_compression_error/
> https://github.com/riptano/cassandra-dtest/blob/master/commitlog_test.py#L497
> Here's another failure where the header has a {{Q}} written in it instead of 
> a closing brace:
> http://cassci.datastax.com/view/win32/job/cassandra-3.0_dtest_win32/91/testReport/junit/commitlog_test/TestCommitLog/test_compression_error/
> https://github.com/riptano/cassandra-dtest/blob/master/commitlog_test.py#L513
> [~bdeggleston] Do I remember correctly that you wrote this test? Can you take 
> this on?





[jira] [Updated] (CASSANDRA-10592) IllegalArgumentException in DataOutputBuffer.reallocate

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10592:
--
Fix Version/s: 3.0.1

> IllegalArgumentException in DataOutputBuffer.reallocate
> ---
>
> Key: CASSANDRA-10592
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10592
> Project: Cassandra
> Issue Type: Bug
> Components: Compaction, Local Write-Read Paths, Streaming and Messaging
> Reporter: Sebastian Estevez
> Assignee: Ariel Weisberg
> Fix For: 3.0.1, 3.1, 2.2.x
>
>
> CORRECTION-
> It turns out the exception occurs when running a read using a thrift jdbc 
> driver. Once you have loaded the data with stress below, run 
> SELECT * FROM "autogeneratedtest"."transaction_by_retailer" using this tool - 
> http://www.aquafold.com/aquadatastudio_downloads.html
>  
> The exception:
> {code}
> WARN  [SharedPool-Worker-1] 2015-10-22 12:58:20,792 
> AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[SharedPool-Worker-1,5,main]: {}
> java.lang.RuntimeException: java.lang.IllegalArgumentException
>   at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2366)
>  ~[main/:na]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_60]
>   at 
> org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
>  ~[main/:na]
>   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [main/:na]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
> Caused by: java.lang.IllegalArgumentException: null
>   at java.nio.ByteBuffer.allocate(ByteBuffer.java:334) ~[na:1.8.0_60]
>   at 
> org.apache.cassandra.io.util.DataOutputBuffer.reallocate(DataOutputBuffer.java:63)
>  ~[main/:na]
>   at 
> org.apache.cassandra.io.util.DataOutputBuffer.doFlush(DataOutputBuffer.java:57)
>  ~[main/:na]
>   at 
> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:132)
>  ~[main/:na]
>   at 
> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:151)
>  ~[main/:na]
>   at 
> org.apache.cassandra.utils.ByteBufferUtil.writeWithVIntLength(ByteBufferUtil.java:296)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.marshal.AbstractType.writeValue(AbstractType.java:374)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.BufferCell$Serializer.serialize(BufferCell.java:263)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:183)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:108)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:96)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:132)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:87)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:77)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:381)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:136)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:128)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:123)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:65) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:289) 
> ~[main/:na]
>   at 
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1697)
>  ~[main/:na]
>   at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2362)
>  ~[main/:na]
>   ... 4 common frames omitted
> {code}
> I was running this command:
> {code}
> tools/bin/cassandra-stress user 
> profile=~/Desktop/startup/stress/stress.yaml n=10 ops\(insert=1\) -rate 
> threads=30
> {code}
> Here's the stress.yaml UPDATED!
> {code}
> ### DML ### THIS IS UNDER CONSTRUCTION!!!
> # Keyspace Name
> keyspace: autogeneratedtest
> # The CQL for creating a keyspace (optional if it already exists)
> keyspace_definition: |
>   CREATE KEYSPACE autogeneratedtest

[jira] [Updated] (CASSANDRA-10610) Protocol upgrade tests for bootstrapping failing on 2.1->3.0 path

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10610:
--
Fix Version/s: 3.0.1

> Protocol upgrade tests for bootstrapping failing on 2.1->3.0 path
> -
>
> Key: CASSANDRA-10610
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10610
> Project: Cassandra
> Issue Type: Sub-task
> Reporter: Jim Witschey
> Assignee: Russ Hatch
> Fix For: 3.0.1, 3.1
>
>
> This ticket is for a couple of failures on the protocol v3 upgrade test job on 
> CassCI:
> http://cassci.datastax.com/view/Upgrades/job/cassandra_upgrade_2.1_to_3.0_proto_v3/10/testReport/
> The failures are:
> * 
> [upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_1_HEAD_to_cassandra_3_0_HEAD.bootstrap_multidc_test|http://cassci.datastax.com/view/Upgrades/job/cassandra_upgrade_2.1_to_3.0_proto_v3/10/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_1_HEAD_to_cassandra_3_0_HEAD/bootstrap_multidc_test/]
> * 
> [upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_1_HEAD_to_cassandra_3_0_latest_tag.bootstrap_multidc_test|http://cassci.datastax.com/view/Upgrades/job/cassandra_upgrade_2.1_to_3.0_proto_v3/10/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_1_HEAD_to_cassandra_3_0_latest_tag/bootstrap_multidc_test/]
> They fail with the following error:
> {code}
> code=1000 [Unavailable exception] message="Cannot achieve consistency level 
> ALL" info={'required_replicas': 3, 'alive_replicas': 0, 'consistency': 'ALL'}
> {code}
> Assigning [~rhatch], since you're the most likely to understand what's going 
> on here.





[jira] [Updated] (CASSANDRA-10625) Problem of year 10000: Dates too far in the future can be saved but not read back using cqlsh

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10625:
--
Fix Version/s: 3.0.1

> Problem of year 10000: Dates too far in the future can be saved but not read 
> back using cqlsh
> -
>
> Key: CASSANDRA-10625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10625
> Project: Cassandra
> Issue Type: Bug
> Reporter: Piotr Kołaczkowski
> Priority: Minor
> Fix For: 3.0.1, 3.1
>
>
> {noformat}
> cqlsh> insert into test.timestamp_test (pkey, ts) VALUES (1, '9999-12-31 23:59:59+0000');
> cqlsh> select * from test.timestamp_test ;
>  pkey | ts
> ------+--------------------------
>     1 | 9999-12-31 23:59:59+0000
> (1 rows)
> cqlsh> insert into test.timestamp_test (pkey, ts) VALUES (1, '10000-01-01 00:00:01+0000');
> cqlsh> select * from test.timestamp_test ;
> Traceback (most recent call last):
>   File "bin/../resources/cassandra/bin/cqlsh", line 1112, in 
> perform_simple_statement
> rows = self.session.execute(statement, trace=self.tracing_enabled)
>   File 
> "/home/pkolaczk/Projekty/DataStax/bdp/resources/cassandra/bin/../zipfiles/cassandra-driver-internal-only-2.7.2.zip/cassandra-driver-2.7.2/cassandra/cluster.py",
>  line 1602, in execute
> result = future.result()
>   File 
> "/home/pkolaczk/Projekty/DataStax/bdp/resources/cassandra/bin/../zipfiles/cassandra-driver-internal-only-2.7.2.zip/cassandra-driver-2.7.2/cassandra/cluster.py",
>  line 3347, in result
> raise self._final_exception
> OverflowError: date value out of range
> {noformat}
> The connection is broken afterwards:
> {noformat}
> cqlsh> insert into test.timestamp_test (pkey, ts) VALUES (1, '10000-01-01 00:00:01+0000');
> NoHostAvailable: ('Unable to complete the operation against any hosts', 
> {: ConnectionShutdown('Connection to 127.0.0.1 is 
> defunct',)})
> {noformat}
> Expected behaviors (one of):
> - disallow inserting dates later than 9999-12-31 and document the 
> limitation
> - handle all dates up to Java Date(MAX_LONG) for writing and reading
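
The {{OverflowError: date value out of range}} in the traceback above is the error Python's {{datetime}} arithmetic raises once a result would pass year 9999. A minimal sketch of a guarded conversion follows; it assumes the client turns a millisecond timestamp into a {{datetime}} by adding a {{timedelta}} to the epoch, and the function name is hypothetical rather than taken from cqlsh or the driver.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def timestamp_ms_to_datetime(ms):
    """Convert milliseconds since the Unix epoch to a naive UTC datetime.

    Returns None instead of raising when the value falls outside the range
    Python's datetime can represent (years 1 through 9999).
    """
    try:
        return EPOCH + timedelta(milliseconds=ms)
    except OverflowError:
        # Raised with "date value out of range" once the year would pass 9999.
        return None


# Last whole second that datetime can represent, in epoch milliseconds.
last_ok_ms = int((datetime(9999, 12, 31, 23, 59, 59) - EPOCH).total_seconds()) * 1000

inside = timestamp_ms_to_datetime(last_ok_ms)          # 9999-12-31 23:59:59
outside = timestamp_ms_to_datetime(last_ok_ms + 2000)  # would be year 10000
```

A caller that receives None could fall back to printing the raw millisecond value, which would keep the connection usable for later statements.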





[jira] [Updated] (CASSANDRA-10563) Integrate new upgrade test into dtest upgrade suite

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10563:
--
Fix Version/s: 3.0.1

> Integrate new upgrade test into dtest upgrade suite
> ---
>
> Key: CASSANDRA-10563
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10563
> Project: Cassandra
> Issue Type: Improvement
> Components: Testing
> Reporter: Jim Witschey
> Assignee: Jim Witschey
> Fix For: 3.0.1, 3.1
>
>
> This is a follow-up ticket for CASSANDRA-10360, specifically [~slebresne]'s 
> comment here:
> https://issues.apache.org/jira/browse/CASSANDRA-10360?focusedCommentId=14966539&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14966539
> These tests should be incorporated into the [{{upgrade_tests}} in 
> dtest|https://github.com/riptano/cassandra-dtest/tree/master/upgrade_tests]. 
> I'll take this on; [~nutbunnies] is also a good person for it, but I'll 
> likely get to it first.





[jira] [Updated] (CASSANDRA-10515) Commit logs back up with move to 2.1.10

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10515:
--
Fix Version/s: 3.0.1

> Commit logs back up with move to 2.1.10
> ---
>
> Key: CASSANDRA-10515
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10515
> Project: Cassandra
> Issue Type: Bug
> Components: Streaming and Messaging
> Environment: redhat 6.5, cassandra 2.1.10
> Reporter: Jeff Griffith
> Assignee: Branimir Lambov
> Labels: commitlog, triage
> Fix For: 3.0.1, 3.1, 2.1.x, 2.2.x
>
> Attachments: C5commitLogIncrease.jpg, CASSANDRA-19579.jpg, 
> CommitLogProblem.jpg, CommitLogSize.jpg, 
> MultinodeCommitLogGrowth-node1.tar.gz, RUN3tpstats.jpg, cassandra.yaml, 
> cfstats-clean.txt, stacktrace.txt, system.log.clean
>
>
> After upgrading from cassandra 2.0.x to 2.1.10, we began seeing problems 
> where some nodes break the 12G commit log max we configured and go as high as 
> 65G or more before it restarts. Once it reaches the state of more than 12G 
> commit log files, "nodetool compactionstats" hangs. Eventually C* restarts 
> without errors (not sure yet whether it is crashing but I'm checking into it) 
> and the cleanup occurs and the commit logs shrink back down again. Here is 
> the nodetool compactionstats immediately after restart.
> {code}
> jgriffith@prod1xc1.c2.bf1:~$ ndc
> pending tasks: 2185
> compaction type   keyspace   table     completed          total   unit    progress
>      Compaction   SyncCore   *cf1*   61251208033   170643574558   bytes     35.89%
>      Compaction   SyncCore   *cf2*   19262483904    19266079916   bytes     99.98%
>      Compaction   SyncCore   *cf3*    6592197093     6592316682   bytes    100.00%
>      Compaction   SyncCore   *cf4*    3411039555     3411039557   bytes    100.00%
>      Compaction   SyncCore   *cf5*    2879241009     2879487621   bytes     99.99%
>      Compaction   SyncCore   *cf6*   21252493623    21252635196   bytes    100.00%
>      Compaction   SyncCore   *cf7*   81009853587    81009854438   bytes    100.00%
>      Compaction   SyncCore   *cf8*    3005734580     3005768582   bytes    100.00%
> Active compaction remaining time :n/a
> {code}
> I was also doing periodic "nodetool tpstats" which were working but not being 
> logged in system.log on the StatusLogger thread until after the compaction 
> started working again.





[jira] [Updated] (CASSANDRA-10407) Benchmark and evaluate CASSANDRA-8894 improvements

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10407:
--
Fix Version/s: 3.0.1

> Benchmark and evaluate CASSANDRA-8894 improvements
> --
>
> Key: CASSANDRA-10407
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10407
> Project: Cassandra
> Issue Type: Test
> Reporter: Aleksey Yeschenko
> Assignee: Alan Boudreault
> Fix For: 3.0.1, 3.1
>
> Attachments: 3_0_head_mmap_wo_ra.nps, 3_0_head_std_wo_ra.nps, 
> 8894_tiny.yaml, allocateDirect.png, flight-recordings.tar.gz, 
> reBufferStandardTime.png, size.png, test-with-8894-tiny.json, 
> test-without-8894-tiny.json
>
>
> The original ticket (CASSANDRA-8894) was committed to 3.0 alpha1 two months 
> ago. We need to get proper performance tests before GA.
> See [~benedict]'s 
> [comment|https://issues.apache.org/jira/browse/CASSANDRA-8894?focusedCommentId=14631203&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14631203]
>  for more details.





[jira] [Updated] (CASSANDRA-10422) Avoid anticompaction when doing subrange repair

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10422:
--
Fix Version/s: 3.0.1

> Avoid anticompaction when doing subrange repair
> ---
>
> Key: CASSANDRA-10422
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10422
> Project: Cassandra
> Issue Type: Bug
> Components: Compaction
> Reporter: Marcus Eriksson
> Assignee: Ariel Weisberg
> Fix For: 3.0.1, 3.1, 2.1.x, 2.2.x
>
>
> If we split the owned range into, say, 1000 parts and then run one repair on 
> each, we could potentially anticompact every sstable 1000 times (i.e., we 
> anticompact the repaired range out 1000 times). We should avoid 
> anticompacting at all in these cases.





[jira] [Updated] (CASSANDRA-10374) List and Map values incorrectly limited to 64k size

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10374:
--
Fix Version/s: 3.0.1

> List and Map values incorrectly limited to 64k size
> ---
>
> Key: CASSANDRA-10374
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10374
> Project: Cassandra
> Issue Type: Bug
> Reporter: Tyler Hobbs
> Assignee: Benjamin Lerer
> Priority: Minor
> Fix For: 3.0.1, 3.1, 2.1.x, 2.2.x
>
>
> With the v3 native protocol, we switched from encoding collection element 
> sizes with shorts to ints.  However, in {{Lists.java}} and {{Maps.java}}, we 
> still validate that list and map values are smaller than 
> {{MAX_UNSIGNED_SHORT}}.
> Map keys and set elements are stored in the cell name, so they're implicitly 
> limited to the cell name size limit of 64k.  However, for non-frozen 
> collections, this limitation should not apply, so we probably don't want to 
> perform this check here for those either.
> The fix should include tests where we exceed the 64k limit for frozen and 
> non-frozen collections.  In the case of non-frozen lists and maps, we should 
> verify that the 64k cell-name size limit is enforced in a friendly way.
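
To illustrate the size difference described above, here is a minimal sketch of the two length prefixes: a 2-byte unsigned short before v3 and a 4-byte int from v3 onward. {{encode_element}} is a hypothetical helper for illustration only, not the code in {{Lists.java}} or {{Maps.java}}.

```python
import struct

MAX_UNSIGNED_SHORT = 0xFFFF  # 65535, the largest length a 2-byte prefix can hold


def encode_element(value, protocol_version):
    """Length-prefix one collection element.

    Illustrative only: a 2-byte unsigned length before v3 and a 4-byte
    signed length from v3 onward, both big-endian.
    """
    if protocol_version >= 3:
        return struct.pack('>i', len(value)) + value
    if len(value) > MAX_UNSIGNED_SHORT:
        raise ValueError('element of %d bytes does not fit a 2-byte length'
                         % len(value))
    return struct.pack('>H', len(value)) + value


big = b'x' * (MAX_UNSIGNED_SHORT + 1)  # one byte over the old limit
encoded_v3 = encode_element(big, 3)    # fits with the 4-byte prefix
```

With a 4-byte prefix the encoding itself no longer imposes the 64k bound, which is why a leftover {{MAX_UNSIGNED_SHORT}} check on values would be stricter than the protocol requires.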





[jira] [Updated] (CASSANDRA-10512) We do not save upsampled index summaries

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10512:
--
Fix Version/s: 3.0.1

> We do not save upsampled index summaries
> ---
>
> Key: CASSANDRA-10512
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10512
> Project: Cassandra
> Issue Type: Bug
> Components: Local Write-Read Paths
> Reporter: Benedict
> Assignee: Ariel Weisberg
> Fix For: 3.0.1, 3.1, 2.1.x, 2.2.x
>
>
> If we downsample an index summary, we overwrite the existing summary, despite 
> downsampling being inexpensive. However on upsampling (which is expensive) we 
> do not, so that on restart all of our index summaries are the smallest they 
> have ever been adjusted to.
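
The behavior described above can be sketched with a toy model. This is not Cassandra code; the class, its fields, and the numeric levels are hypothetical and only show why a restart loses an upsample.

```python
class SummaryStore:
    """Toy model of the behavior described above; not Cassandra code."""

    def __init__(self, level):
        self.in_memory = level
        self.on_disk = level

    def resample(self, new_level):
        if new_level < self.in_memory:
            # Downsampling: the new, smaller summary overwrites the saved one.
            self.on_disk = new_level
        # Upsampling changes only the in-memory copy, which is the bug.
        self.in_memory = new_level

    def restart(self):
        # After a restart only the saved summary is available.
        self.in_memory = self.on_disk


store = SummaryStore(128)
store.resample(32)    # downsample: saved
store.resample(128)   # upsample: not saved
before_restart = store.in_memory
store.restart()
after_restart = store.in_memory
```

In this model the expensive upsample back to 128 is thrown away on restart, and the node comes back with the downsampled level of 32.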





[jira] [Updated] (CASSANDRA-10476) Fix upgrade paging dtest failures on 2.2->3.0 path

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10476:
--
Fix Version/s: 3.0.1

> Fix upgrade paging dtest failures on 2.2->3.0 path
> --
>
> Key: CASSANDRA-10476
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10476
> Project: Cassandra
> Issue Type: Sub-task
> Reporter: Jim Witschey
> Assignee: Benjamin Lerer
> Fix For: 3.0.1, 3.1
>
>
> EDIT: this list of failures is no longer current; see comments for current 
> failures.
> The following upgrade tests for paging features fail or flap on the upgrade 
> path from 2.2 to 3.0:
> - {{upgrade_tests/paging_test.py:TestPagingData.static_columns_paging_test}}
> - 
> {{upgrade_tests/paging_test.py:TestPagingSize.test_undefined_page_size_default}}
> - 
> {{upgrade_tests/paging_test.py:TestPagingSize.test_with_more_results_than_page_size}}
> - 
> {{upgrade_tests/paging_test.py:TestPagingWithDeletions.test_failure_threshold_deletions}}
> - 
> {{upgrade_tests/paging_test.py:TestPagingWithDeletions.test_multiple_cell_deletions}}
> - 
> {{upgrade_tests/paging_test.py:TestPagingWithDeletions.test_single_cell_deletions}}
> - 
> {{upgrade_tests/paging_test.py:TestPagingWithDeletions.test_single_row_deletions}}
> - 
> {{upgrade_tests/paging_test.py:TestPagingDatasetChanges.test_cell_TTL_expiry_during_paging/}}
> I've grouped them all together because I don't know how to tell if they're 
> related; once someone triages them, it may be appropriate to break this out 
> into multiple tickets.
> The failures can be found here:
> http://cassci.datastax.com/view/Upgrades/job/storage_engine_upgrade_dtest-22_tarball-30_HEAD/44/testReport/upgrade_tests.paging_test/TestPagingData/static_columns_paging_test/history/
> http://cassci.datastax.com/view/Upgrades/job/storage_engine_upgrade_dtest-22_tarball-30_HEAD/44/testReport/upgrade_tests.paging_test/TestPagingSize/test_undefined_page_size_default/history/
> http://cassci.datastax.com/view/Upgrades/job/storage_engine_upgrade_dtest-22_tarball-30_HEAD/42/testReport/upgrade_tests.paging_test/TestPagingSize/test_with_more_results_than_page_size/history/
> http://cassci.datastax.com/view/Upgrades/job/storage_engine_upgrade_dtest-22_tarball-30_HEAD/44/testReport/upgrade_tests.paging_test/TestPagingWithDeletions/test_failure_threshold_deletions/history/
> http://cassci.datastax.com/view/Upgrades/job/storage_engine_upgrade_dtest-22_tarball-30_HEAD/44/testReport/upgrade_tests.paging_test/TestPagingWithDeletions/test_multiple_cell_deletions/history/
> http://cassci.datastax.com/view/Upgrades/job/storage_engine_upgrade_dtest-22_tarball-30_HEAD/44/testReport/upgrade_tests.paging_test/TestPagingWithDeletions/test_single_cell_deletions/history/
> http://cassci.datastax.com/view/Upgrades/job/storage_engine_upgrade_dtest-22_tarball-30_HEAD/44/testReport/upgrade_tests.paging_test/TestPagingWithDeletions/test_single_row_deletions/history/
> http://cassci.datastax.com/view/Upgrades/job/storage_engine_upgrade_dtest-22_tarball-30_HEAD/44/testReport/upgrade_tests.paging_test/TestPagingDatasetChanges/test_cell_TTL_expiry_during_paging/
> Once [this dtest PR|https://github.com/riptano/cassandra-dtest/pull/586] is 
> merged, these tests should also run with this upgrade path on normal 3.0 
> jobs. Until then, you can run them with the following command:
> {code}
> SKIP=false CASSANDRA_VERSION=binary:2.2.0 UPGRADE_TO=git:cassandra-3.0 
> nosetests 
> upgrade_tests/paging_test.py:TestPagingData.static_columns_paging_test 
> upgrade_tests/paging_test.py:TestPagingSize.test_undefined_page_size_default 
> upgrade_tests/paging_test.py:TestPagingSize.test_with_more_results_than_page_size
>  
> upgrade_tests/paging_test.py:TestPagingWithDeletions.test_failure_threshold_deletions
>  
> upgrade_tests/paging_test.py:TestPagingWithDeletions.test_multiple_cell_deletions
>  
> upgrade_tests/paging_test.py:TestPagingWithDeletions.test_single_cell_deletions
>  
> upgrade_tests/paging_test.py:TestPagingWithDeletions.test_single_row_deletions
> upgrade_tests/paging_test.py:TestPagingDatasetChanges.test_cell_TTL_expiry_during_paging
> {code}





[jira] [Updated] (CASSANDRA-10340) Stress should exit with non-zero status after failure

2015-11-16 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-10340:

   Assignee: (was: Paulo Motta)
 Labels: lhf stress  (was: stress)
Component/s: Tools

> Stress should exit with non-zero status after failure
> -
>
> Key: CASSANDRA-10340
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10340
> Project: Cassandra
> Issue Type: Improvement
> Components: Tools
> Reporter: Paulo Motta
> Priority: Minor
> Labels: lhf, stress
>
> Currently, stress always exits with success status, even after a failure. 
> In order to be able to rely on stress exit status during dtests it would be 
> nice if it exited with a non-zero status after failures.
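
A minimal sketch of how a test could rely on the exit status once stress reports failures this way. {{run_and_check}} is a hypothetical helper, not part of dtest or ccm, and the Python interpreter stands in for a stress invocation so that the example is self-contained.

```python
import subprocess
import sys


def run_and_check(cmd):
    """Run a command and raise if it exits with a non-zero status."""
    status = subprocess.call(cmd)
    if status != 0:
        raise RuntimeError('%r exited with status %d' % (cmd, status))
    return status


# Stand-ins for a stress run: the Python interpreter exiting with 0 or 1.
ok_cmd = [sys.executable, '-c', 'import sys; sys.exit(0)']
failing_cmd = [sys.executable, '-c', 'import sys; sys.exit(1)']
```

With the behavior described in the ticket, a check like this can never fire for stress, because the process reports success even when operations failed.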





[jira] [Commented] (CASSANDRA-9556) Add newer data types to cassandra stress (e.g. decimal, dates, UDTs)

2015-11-16 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15006931#comment-15006931
 ] 

Robert Stupp commented on CASSANDRA-9556:
-

FTR The "mismatch" between Cassandra's documented "wire transport" 
representation of DATE as a 32 bit int and the Java driver's (and Java 
agnostic) representation as a LocalDate is intentional.
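
For reference, a minimal sketch of the 32-bit wire form mentioned above. It assumes, per my reading of the native protocol specification, that the value is an unsigned day count with the Unix epoch centered at 2^31; the function names are hypothetical and are not driver or stress APIs.

```python
import struct
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)
CENTER = 2 ** 31  # assumed wire value that corresponds to 1970-01-01


def encode_date(d):
    """Encode a date as a 4-byte big-endian unsigned day count."""
    return struct.pack('>I', (d - EPOCH).days + CENTER)


def decode_date(raw):
    """Decode 4 bytes back into a date (only within Python's date range)."""
    (days,) = struct.unpack('>I', raw)
    return EPOCH + timedelta(days=days - CENTER)


epoch_bytes = encode_date(EPOCH)
round_trip = decode_date(encode_date(date(2015, 11, 16)))
```

The integer can express dates far outside what a calendar type such as Python's {{date}} supports, which is one reason a client-side date object and the wire integer are not interchangeable.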

> Add newer data types to cassandra stress (e.g. decimal, dates, UDTs)
> 
>
> Key: CASSANDRA-9556
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9556
> Project: Cassandra
> Issue Type: Improvement
> Components: Tools
> Reporter: Jeremy Hanna
> Assignee: ZhaoYang
> Priority: Minor
> Labels: stress
>
> Currently you can't define a data model with decimal types and use Cassandra 
> stress with it.  Also, I imagine that holds true with other newer data types 
> such as the new date and time types.  Besides that, now that data models are 
> including user defined types, we should allow users to create those 
> structures with stress as well.  Perhaps we could split out the UDTs into a 
> different ticket if it holds the other types up.





[jira] [Updated] (CASSANDRA-10341) Streaming does not guarantee cache invalidation

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10341:
--
Fix Version/s: 3.0.1

> Streaming does not guarantee cache invalidation
> ---
>
> Key: CASSANDRA-10341
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10341
> Project: Cassandra
> Issue Type: Bug
> Reporter: Benedict
> Assignee: Paulo Motta
> Fix For: 2.1.12, 2.2.4, 3.0.1, 3.1
>
>
> Looking at the code, we attempt to invalidate the row cache for any rows we 
> receive via streaming, however we invalidate them immediately, before the new 
> data is available. So, if it is requested (which is likely if it is "hot") in 
> the interval, it will be re-cached and not invalidated.
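
The ordering problem described above can be sketched with two plain maps. This is a simplified, single-threaded illustration and not Cassandra's row cache; every name in it is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class InvalidationOrderSketch {
    final Map<String, String> store = new HashMap<>(); // stands in for readable sstable data
    final Map<String, String> cache = new HashMap<>(); // stands in for the row cache

    /** Read-through: populate the cache from the store on a miss. */
    String read(String key) {
        return cache.computeIfAbsent(key, store::get);
    }

    /** Invalidate first, publish later: a read in the interval re-caches the old value. */
    String invalidateThenPublish(String key, String streamed) {
        cache.remove(key);
        read(key);                 // a "hot" key is requested before the new data is visible
        store.put(key, streamed);
        return read(key);          // still the old value
    }

    /** Publish first, then invalidate: the next read sees the streamed value. */
    String publishThenInvalidate(String key, String streamed) {
        store.put(key, streamed);
        read(key);
        cache.remove(key);
        return read(key);
    }
}
```

With an existing value "old" for a key, the first method keeps returning "old" after streaming "new", while the second returns "new".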





[jira] [Updated] (CASSANDRA-10350) cqlsh describe keyspace output no longers keeps indexes in sorted order

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10350:
--
Fix Version/s: 3.0.1

> cqlsh describe keyspace output no longers keeps indexes in sorted order
> ---
>
> Key: CASSANDRA-10350
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10350
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Andrew Hust
>Priority: Minor
>  Labels: cqlsh
> Fix For: 3.0.1, 3.1
>
>
> cqlsh command {{describe keyspace }} no longer keeps indexes in alpha 
> sorted order.  This was caught with a dtest on 
> [cassci|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/cqlsh_tests.cqlsh_tests/TestCqlsh/test_describe/].
> Tested on: C* {{b4544846def2bdd00ff841c7e3d9f2559620827b}}
> Can be reproduced with the following:
> {code}
> ccm stop
> ccm remove describe_order
> ccm create -n 1 -v git:cassandra-2.2 describe_order
> ccm start
> cat << EOF | ccm node1 cqlsh
> CREATE KEYSPACE ks1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
> USE ks1;
> CREATE TABLE ks1.test (id int, col int, val text, val2 text, val3 text, PRIMARY KEY(id, col));
> CREATE INDEX ix0 ON ks1.test (col);
> CREATE INDEX ix3 ON ks1.test (val3);
> CREATE INDEX ix2 ON ks1.test (val2);
> CREATE INDEX ix1 ON ks1.test (val);
> DESCRIBE KEYSPACE ks1;
> EOF
> ccm stop
> ccm setdir -v git:cassandra-3.0
> ccm start
> sleep 15
> cat << EOF | ccm node1 cqlsh
> DESCRIBE KEYSPACE ks1;
> EOF
> ccm stop
> {code}
> Output on <= cassandra-2.2:
> {code}
> CREATE INDEX ix0 ON ks1.test (col);
> CREATE INDEX ix1 ON ks1.test (val);
> CREATE INDEX ix2 ON ks1.test (val2);
> CREATE INDEX ix3 ON ks1.test (val3);
> {code}
> Output on cassandra-3.0:
> {code}
> CREATE INDEX ix2 ON ks1.test (val2);
> CREATE INDEX ix3 ON ks1.test (val3);
> CREATE INDEX ix0 ON ks1.test (col);
> CREATE INDEX ix1 ON ks1.test (val);
> {code}
> //CC [~enigmacurry]
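
For illustration, the expected ordering amounts to sorting the statements by index name. cqlsh itself is written in Python, so the Java below is only a sketch of the comparison; the class and helper names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class IndexOrderSketch {
    /** Extracts the index name, i.e. the third token of "CREATE INDEX <name> ON ...". */
    static String indexName(String stmt) {
        return stmt.trim().split("\\s+")[2];
    }

    /** Returns a copy of the statements sorted alphabetically by index name. */
    static List<String> sortedByName(List<String> stmts) {
        List<String> copy = new ArrayList<>(stmts);
        copy.sort(Comparator.comparing(IndexOrderSketch::indexName));
        return copy;
    }
}
```

Applied to the four statements in the cassandra-3.0 output, this yields the order ix0, ix1, ix2, ix3 shown for cassandra-2.2.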





[jira] [Updated] (CASSANDRA-9830) Option to disable bloom filter in highest level of LCS sstables

2015-11-16 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-9830:
---
Component/s: Compaction

> Option to disable bloom filter in highest level of LCS sstables
> ---
>
> Key: CASSANDRA-9830
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9830
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction
>Reporter: Jonathan Ellis
>Assignee: Paulo Motta
>Priority: Minor
>  Labels: performance
> Fix For: 3.x
>
>
> We expect about 90% of data to be in the highest level of LCS in a fully 
> populated series.  (See also CASSANDRA-9829.)
> Thus if the user is primarily asking for data (partitions) that has actually 
> been inserted, the bloom filter on the highest level only helps reject 
> sstables about 10% of the time.
> We should add an option that suppresses bloom filter creation on top-level 
> sstables.  This will dramatically reduce memory usage for LCS and may even 
> improve performance as we no longer check a low-value filter.
> (This is also an idea from RocksDB.)
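
The 90% figure can be checked with a little arithmetic. The sketch below assumes each level holds a fixed multiple (here 10x) of the previous one and that all levels are full; L0 is ignored. The names are hypothetical.

```java
final class LcsTopLevelSketch {
    /** Fraction of all data that sits in the highest of `levels` full levels. */
    static double topLevelFraction(int levels, int fanout) {
        double total = 0;
        double size = 1;
        for (int i = 1; i <= levels; i++) {
            size *= fanout;   // capacity of level i relative to a unit size
            total += size;
        }
        return size / total;  // `size` now holds the top level's capacity
    }
}
```

With a fanout of 10 the fraction approaches 0.9 as levels are added, which matches the estimate in the description: the top-level filter can only reject a read for the remaining roughly 10% of the data.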





[jira] [Updated] (CASSANDRA-10059) Test Coverage and related bug-fixes for AbstractBTreePartition and hierarchy

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10059:
--
Fix Version/s: 3.0.1

> Test Coverage and related bug-fixes for AbstractBTreePartition and hierarchy
> 
>
> Key: CASSANDRA-10059
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10059
> Project: Cassandra
>  Issue Type: Test
>Reporter: Benedict
>Assignee: Branimir Lambov
> Fix For: 3.0.1, 3.1
>
>
> Follow up to CASSANDRA-9932. The test coverage for AbstractBTreePartition and 
> its hierarchy is entirely indirect. That is not to say it is not covered, but 
> we may have some unexplored behaviour. Coverage for BTree is also missing 
> around a couple of edges, and the gaps should be filled in.





[jira] [Updated] (CASSANDRA-10200) NetworkTopologyStrategy.calculateNaturalEndpoints is rather inefficient

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10200:
--
Fix Version/s: (was: 3.0.x)
   3.0.1

> NetworkTopologyStrategy.calculateNaturalEndpoints is rather inefficient
> ---
>
> Key: CASSANDRA-10200
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10200
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Minor
> Fix For: 3.0.1, 3.1
>
>
> The method is much more complicated than it needs to be and creates too many 
> maps and sets. The code is easy to simplify if we use the known number of 
> racks and nodes per datacentre to choose what to do in advance.





[jira] [Updated] (CASSANDRA-10143) Apparent counter overcount during certain network partitions

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10143:
--
Fix Version/s: 3.0.1

> Apparent counter overcount during certain network partitions
> 
>
> Key: CASSANDRA-10143
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10143
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Joel Knighton
>Assignee: Aleksey Yeschenko
> Fix For: 3.0.1, 3.1, 2.1.x, 2.2.x
>
>
> This issue is reproducible in this [Jepsen 
> Test|https://github.com/riptano/jepsen/blob/f45f5320db608d48de2c02c871aecc4910f4d963/cassandra/test/cassandra/counter_test.clj#L16].
> The test starts a five-node cluster and issues increments by one against a 
> single counter. It then checks that the counter is in the range [OKed 
> increments, OKed increments + Write Timeouts] at each read. Increments are 
> issued at CL.ONE and reads at CL.ALL.  Throughout the test, network failures 
> are induced that create halved network partitions. A halved network partition 
> splits the cluster into three connected nodes and two connected nodes, 
> randomly.
> This test started failing; bisects showed that it was actually a test change 
> that caused this failure. When the network partitions are induced in a cycle 
> of 15s healthy/45s partitioned or 20s healthy/45s partitioned, the test 
> fails. When network partitions are induced in a cycle of 15s healthy/60s 
> partitioned, 20s healthy/45s partitioned, or 20s healthy/60s partitioned, the 
> test passes.
> There is nothing unusual in the logs of the nodes for the failed tests. The 
> results are very reproducible.
> One noticeable trend is that more reads seem to get serviced during the 
> failed tests.
> Most testing has been done in 2.1.8 - the same issue appears to be present in 
> 2.2/3.0/trunk, but I haven't spent as much time reproducing.
> Ideas?
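
The invariant the test checks at each read can be written down directly. This is a sketch of the bound only, not the Jepsen test code (which is Clojure); the names are hypothetical.

```java
final class CounterBoundsSketch {
    /**
     * A read is acceptable if it lies in [acked, acked + timedOut]:
     * every acknowledged increment must be visible, and each timed-out
     * increment may or may not have been applied.
     */
    static boolean withinBounds(long observed, long acked, long timedOut) {
        return observed >= acked && observed <= acked + timedOut;
    }
}
```

An overcount is a read above the upper bound, for example observing 13 after 10 acknowledged increments and 2 write timeouts.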





[jira] [Updated] (CASSANDRA-9984) Improve error reporting for malformed schemas in stress profile

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-9984:
-
Fix Version/s: 3.0.1

> Improve error reporting for malformed schemas in stress profile
> ---
>
> Key: CASSANDRA-9984
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9984
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jim Witschey
>Assignee: T Jake Luciani
>Priority: Trivial
> Fix For: 3.0.1, 3.1
>
>
> See this gist:
> https://gist.github.com/mambocab/a78fae8c356223245c63
> for an example of a profile that triggers the bug when used as a stress 
> profile on trunk. It contains a number of old, now unused, configuration 
> options in the table schema. The error raised when this schema is executed 
> isn't propagated because of improper error handling.
> To reproduce this error with CCM you can save the file in the gist above as 
> {{8-columns.yaml}} and run
> {code}
> ccm create -v git:trunk reproduce-error -n 1
> ccm start --wait-for-binary-proto
> ccm stress user profile=8-columns.yaml ops\(insert=1\) n=5K
> {code}





[jira] [Updated] (CASSANDRA-9669) If sstable flushes complete out of order, on restart we can fail to replay necessary commit log records

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-9669:
-
Fix Version/s: 3.0.1

> If sstable flushes complete out of order, on restart we can fail to replay 
> necessary commit log records
> ---
>
> Key: CASSANDRA-9669
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9669
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
>Reporter: Benedict
>Assignee: Benedict
>Priority: Critical
>  Labels: correctness
> Fix For: 2.1.12, 2.2.4, 3.0.1, 3.1
>
>
> While {{postFlushExecutor}} ensures it never expires CL entries out-of-order, 
> on restart we simply take the maximum replay position of any sstable on disk, 
> and ignore anything prior. 
> It is quite possible for there to be two flushes triggered for a given table, 
> and for the second to finish first by virtue of containing a much smaller 
> quantity of live data (or perhaps the disk is just under less pressure). If 
> we crash before the first sstable has been written, then on restart the data 
> it would have represented will disappear, since we will not replay the CL 
> records.
> This looks to be a bug present since time immemorial, and also seems pretty 
> serious.
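
A small sketch of why taking the maximum is unsafe. It models each flushed sstable as covering a half-open range of commit log positions and compares the maximum with a more conservative choice. This is an illustration under those assumptions, not the fix that was committed; all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ReplayPositionSketch {
    /** A flushed sstable covering commit log positions [start, end). */
    static final class Flushed {
        final long start, end;
        Flushed(long start, long end) { this.start = start; this.end = end; }
    }

    /** Behaviour described in the ticket: skip everything before the largest flushed position. */
    static long maxPosition(List<Flushed> onDisk) {
        long max = 0;
        for (Flushed f : onDisk) max = Math.max(max, f.end);
        return max;
    }

    /** Conservative alternative: skip only the prefix that is covered without gaps. */
    static long contiguousPosition(List<Flushed> onDisk) {
        List<Flushed> sorted = new ArrayList<>(onDisk);
        sorted.sort(Comparator.comparingLong((Flushed f) -> f.start));
        long covered = 0;
        for (Flushed f : sorted) {
            if (f.start > covered) break;          // gap: an earlier flush never completed
            covered = Math.max(covered, f.end);
        }
        return covered;
    }
}
```

If the first flush covers [0, 100) and the second covers [100, 120), and only the second reached disk before the crash, the maximum is 120 and the records in [0, 100) are never replayed, whereas the contiguous prefix is 0.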





[jira] [Updated] (CASSANDRA-9384) Update jBCrypt dependency to version 0.4

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-9384:
-
Fix Version/s: 3.0.1

> Update jBCrypt dependency to version 0.4
> 
>
> Key: CASSANDRA-9384
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9384
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sam Tunnicliffe
>Assignee: Marko Denda
> Fix For: 2.1.12, 2.2.4, 3.0.1, 3.1
>
>
> https://bugzilla.mindrot.org/show_bug.cgi?id=2097
> Although the bug tracker lists it as NEW/OPEN, the release notes for 0.4 
> indicate that this is now fixed, so we should update.
> Thanks to [~Bereng] for identifying the issue.





[jira] [Updated] (CASSANDRA-9510) assassinating an unknown endpoint could npe

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-9510:
-
Fix Version/s: 3.0.1

> assassinating an unknown endpoint could npe
> ---
>
> Key: CASSANDRA-9510
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9510
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Dave Brosius
>Assignee: Dave Brosius
>Priority: Trivial
> Fix For: 3.0.1, 3.1
>
> Attachments: assissinate_unknown.txt
>
>
> If the code assassinates an unknown endpoint, it doesn't generate a 'tokens' 
> collection. It then calls
> {{epState.addApplicationState(ApplicationState.STATUS, StorageService.instance.valueFactory.left(tokens, computeExpireTime()));}}
> and {{left(null, time)}} will throw an NPE.
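
One way to avoid this kind of failure is to substitute an empty collection when no tokens are known. The sketch below uses a hypothetical stand-in for `valueFactory.left` and does not reflect what the attached patch actually does.

```java
import java.util.Collection;
import java.util.Collections;

final class AssassinateGuardSketch {
    /** Hypothetical stand-in for valueFactory.left(tokens, expireTime); it dereferences tokens. */
    static String left(Collection<String> tokens, long expireTime) {
        return "LEFT," + tokens.size() + "," + expireTime;
    }

    /** Guarded variant: an unknown endpoint is treated as owning no tokens. */
    static String leftOrEmpty(Collection<String> tokens, long expireTime) {
        Collection<String> safe = (tokens == null) ? Collections.<String>emptyList() : tokens;
        return left(safe, expireTime);
    }
}
```

Calling `left(null, time)` throws a NullPointerException, while `leftOrEmpty(null, time)` produces a value with zero tokens.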





[jira] [Updated] (CASSANDRA-9069) debug-cql broken in trunk

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-9069:
-
Fix Version/s: (was: 3.1)

> debug-cql broken in trunk
> -
>
> Key: CASSANDRA-9069
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9069
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
>
> {{debug-cql}} is broken on trunk.
> At startup it just says:
> {code}
> Error: Exception thrown by the agent : java.lang.NullPointerException
> {code}
> That exception originates from JMX agent (which cannot bind).
> It can be reproduced by starting C* locally and starting {{debug-cql}}.
> Workaround is to comment out sourcing of {{cassandra-env.sh}}.





[jira] [Updated] (CASSANDRA-9357) LongSharedExecutorPoolTest.testPromptnessOfExecution fails in 2.1

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-9357:
-
Fix Version/s: 3.0.1

> LongSharedExecutorPoolTest.testPromptnessOfExecution fails in 2.1
> -
>
> Key: CASSANDRA-9357
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9357
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Michael Shuler
> Fix For: 3.0.1, 3.1
>
> Attachments: system.log
>
>
> {noformat}
> [junit] Testsuite: org.apache.cassandra.concurrent.LongSharedExecutorPoolTest
> [junit] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.353 sec
> [junit] 
> [junit] - Standard Output ---
> [junit] Completed 0K batches with 0.0M events
> [junit] Running for 120s with load multiplier 0.5
> [junit] -  ---
> [junit] Testcase: testPromptnessOfExecution(org.apache.cassandra.concurrent.LongSharedExecutorPoolTest): FAILED
> [junit] null
> [junit] junit.framework.AssertionFailedError
> [junit] at org.apache.cassandra.concurrent.LongSharedExecutorPoolTest.testPromptnessOfExecution(LongSharedExecutorPoolTest.java:215)
> [junit] at org.apache.cassandra.concurrent.LongSharedExecutorPoolTest.testPromptnessOfExecution(LongSharedExecutorPoolTest.java:104)
> [junit] 
> [junit] 
> [junit] Test org.apache.cassandra.concurrent.LongSharedExecutorPoolTest FAILED
> {noformat}





[jira] [Updated] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-7217:
-
Fix Version/s: 3.0.1

> Native transport performance (with cassandra-stress) drops precipitously past 
> around 1000 threads
> -
>
> Key: CASSANDRA-7217
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7217
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benedict
>Assignee: Ariel Weisberg
>  Labels: performance, stress, triaged
> Fix For: 3.0.1, 3.1
>
> Attachments: 2000-threads.svg, 500-threads.svg, FakeQuerySystem.java
>
>
> This is obviously bad. Let's figure out why it's happening and put a stop to 
> it.





[jira] [Updated] (CASSANDRA-8927) Mark libjna-java + libjna-jni as incompatible in debian package

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-8927:
-
Fix Version/s: 3.0.1

> Mark libjna-java + libjna-jni as incompatible in debian package
> ---
>
> Key: CASSANDRA-8927
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8927
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
> Environment: Debian
>Reporter: Robert Stupp
>Assignee: Michael Shuler
>Priority: Minor
> Fix For: 3.0.1, 3.1
>
>
> Current Debian (Wheezy) might bring {{libjna-java}} in version 3.2.7-4, which 
> has an incompatible {{libjnidispatch.so}} because since C* 2.1 we use JNA 4.0.0 
> (the native stuff changed):
> jna.jar includes all binaries for all supported platforms - so there's no 
> need for libjna installed separately.
> Since CASSANDRA-8714 has been committed, the incompatibility manifests in 
> {{java.lang.NoClassDefFoundError: Could not initialize class 
> com.sun.jna.Native}} (which is caused by outdated libjna-java installed via 
> apt).
> Note: Debian jessie adds new package {{libjna-jni}} (4.1.0-1) in addition to 
> {{libjna-java}} (4.1.0-1) - both contain the {{libjnidispatch.so}}. Although 
> these seem to work, we might hit the same issue when there's a need to 
> upgrade JNA to 4.2.x sometime.





[jira] [Updated] (CASSANDRA-8965) Cassandra retains a file handle to the directory its writing to for each writer instance

2015-11-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-8965:
-
Fix Version/s: 3.0.1

> Cassandra retains a file handle to the directory its writing to for each 
> writer instance
> 
>
> Key: CASSANDRA-8965
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8965
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benedict
>Priority: Trivial
> Fix For: 3.0.1, 3.1, 2.1.x, 2.2.x
>
>
> We could either share this amongst the CF object, or have a shared 
> ref-counted cache that opens a reference and shares it amongst all writer 
> instances, closing it once they all close.
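
The second option, a shared ref-counted cache, could look roughly like the sketch below. It tracks a boolean in place of a real file handle and every name is hypothetical; it only illustrates the open-on-first-acquire, close-on-last-release bookkeeping.

```java
import java.util.HashMap;
import java.util.Map;

final class SharedDirHandleSketch {
    /** Stand-in for an open directory handle plus its reference count. */
    static final class Entry {
        int refs;
        boolean open = true;
    }

    private final Map<String, Entry> handles = new HashMap<>();

    /** Opens the handle on first use and shares it with later callers. */
    synchronized Entry acquire(String dir) {
        Entry e = handles.get(dir);
        if (e == null) {
            e = new Entry();
            handles.put(dir, e);
        }
        e.refs++;
        return e;
    }

    /** Closes the handle once the last writer has released it. */
    synchronized void release(String dir) {
        Entry e = handles.get(dir);
        if (e == null) throw new IllegalStateException("not acquired: " + dir);
        if (--e.refs == 0) {
            e.open = false;
            handles.remove(dir);
        }
    }

    synchronized int openCount() {
        return handles.size();
    }
}
```

Two writers on the same directory then share one entry, and the entry is closed only after both have released it.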



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9800) 2.2 eclipse-warnings

2015-11-16 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-9800:
--
Component/s: Testing

> 2.2 eclipse-warnings
> 
>
> Key: CASSANDRA-9800
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9800
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Michael Shuler
>Assignee: Ariel Weisberg
> Fix For: 2.2.x
>
>
> commit 05615a754e5bbb5299d51470a2ccdb70a5b0
> Date:   Mon Jul 13 17:53:42 2015 -0400
> If you wish to look at the latest output, check the 
> {{eclipse_compiler_checks.txt}} artifact saved on the latest build:
> http://cassci.datastax.com/job/cassandra-2.2_eclipse-warnings/lastBuild/
> Output of current 2.2 HEAD eclipse-warnings:
> {noformat}
> # 7/14/15 12:21:30 AM UTC
> # Eclipse Compiler for Java(TM) v20150120-1634, 3.10.2, Copyright IBM Corp 
> 2000, 2013. All rights reserved.
> incorrect classpath: 
> /mnt/data/jenkins/workspace/cassandra-2.2_eclipse-warnings/build/cobertura/classes
> --
> 1. ERROR in 
> /mnt/data/jenkins/workspace/cassandra-2.2_eclipse-warnings/src/java/org/apache/cassandra/tools/SSTableExport.java
>  (at line 315)
>   ISSTableScanner scanner = reader.getScanner();
>   ^^^
> Resource 'scanner' should be managed by try-with-resource
> --
> --
> 2. ERROR in 
> /mnt/data/jenkins/workspace/cassandra-2.2_eclipse-warnings/src/java/org/apache/cassandra/db/compaction/LeveledCompactionStrategy.java
>  (at line 247)
>   scanners.add(new LeveledScanner(intersecting, range));
>^^^
> Potential resource leak: '<unassigned Closeable value>' may not be closed
> --
> --
> 3. ERROR in 
> /mnt/data/jenkins/workspace/cassandra-2.2_eclipse-warnings/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
>  (at line 819)
>   ISSTableScanner scanner = cleanupStrategy.getScanner(sstable, 
> getRateLimiter());
>   ^^^
> Resource 'scanner' should be managed by try-with-resource
> --
> --
> 4. ERROR in 
> /mnt/data/jenkins/workspace/cassandra-2.2_eclipse-warnings/src/java/org/apache/cassandra/db/WindowsFailedSnapshotTracker.java
>  (at line 55)
>   BufferedReader reader = new BufferedReader(new 
> FileReader(TODELETEFILE));
>  ^^
> Resource 'reader' should be managed by try-with-resource
> --
> 5. ERROR in 
> /mnt/data/jenkins/workspace/cassandra-2.2_eclipse-warnings/src/java/org/apache/cassandra/db/WindowsFailedSnapshotTracker.java
>  (at line 55)
>   BufferedReader reader = new BufferedReader(new 
> FileReader(TODELETEFILE));
>  ^^
> Resource 'reader' should be managed by try-with-resource
> --
> --
> 6. ERROR in 
> /mnt/data/jenkins/workspace/cassandra-2.2_eclipse-warnings/src/java/org/apache/cassandra/io/util/SegmentedFile.java
>  (at line 183)
>   ChannelProxy channelCopy = getChannel(path);
>^^^
> Resource 'channelCopy' should be managed by try-with-resource
> --
> 7. ERROR in 
> /mnt/data/jenkins/workspace/cassandra-2.2_eclipse-warnings/src/java/org/apache/cassandra/io/util/SegmentedFile.java
>  (at line 186)
>   return complete(channelCopy, overrideLength);
>   ^
> Potential resource leak: 'channelCopy' may not be closed at this location
> --
> 7 problems (7 errors)
> {noformat}
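
Most of the errors above ask for the resource to be managed by try-with-resources. A generic sketch of that pattern follows; it reads from a string instead of a file and is not taken from the Cassandra sources named in the output.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class TryWithResourcesSketch {
    /** Counts lines; the reader is closed automatically, even if readLine() throws. */
    static int countLines(String text) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            int n = 0;
            while (reader.readLine() != null) n++;
            return n;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Where a resource is deliberately handed to a caller instead of being closed locally, this pattern does not apply directly and the warning has to be handled another way.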





[jira] [Comment Edited] (CASSANDRA-9355) RecoveryManagerTruncateTest fails in test-compression

2015-11-16 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15006842#comment-15006842
 ] 

Ariel Weisberg edited comment on CASSANDRA-9355 at 11/16/15 4:43 PM:
-

We could try that. It's a pretty invasive change in that tests that weren't 
expecting a sync would be getting a sync and might no longer test what they 
were expecting to test.

Alternate version
|[2.1 
code|https://github.com/apache/cassandra/compare/apache:cassandra-2.1...aweisberg:CASSANDRA-9355-2.1-v2?expand=1]|[utests|http://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-CASSANDRA-9355-2.1-v2-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-CASSANDRA-9355-2.1-v2-dtest/]|




> RecoveryManagerTruncateTest fails in test-compression
> -
>
> Key: CASSANDRA-9355
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9355
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
> Environment: 2.1 commit: ac70e37
>Reporter: Michael Shuler
>Assignee: Ariel Weisberg
> Fix For: 2.1.x
>
> Attachments: system.log
>
>
> {noformat}
> $ ant test-compression -Dtest.name=RecoveryManagerTruncateTest
> ...
> [junit] Testsuite: org.apache.cassandra.db.RecoveryManagerTruncateTest
> [junit] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.221 sec
> [junit] 
> [junit] Testcase: testTruncatePointInTimeReplayList(org.apache.cassandra.db.RecoveryManagerTruncateTest): FAILED
> [junit] 
> [junit] junit.framework.AssertionFailedError: 
> [junit] at org.apache.cassandra.db.RecoveryManagerTruncateTest.testTruncatePointInTimeReplayList(RecoveryManagerTruncateTest.java:159)
> [junit] 
> [junit] 
> [junit] Test org.apache.cassandra.db.RecoveryManagerTruncateTest FAILED
> {noformat}
> system.log from just this failed test attached.





[2/3] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.1

2015-11-16 Thread snazy
Merge branch 'cassandra-3.0' into cassandra-3.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/36e76771
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/36e76771
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/36e76771

Branch: refs/heads/trunk
Commit: 36e7677147607236ab651014ebbff4df052e76be
Parents: b3231e9 c0480d8
Author: Robert Stupp 
Authored: Mon Nov 16 17:40:05 2015 +0100
Committer: Robert Stupp 
Committed: Mon Nov 16 17:40:05 2015 +0100

--
 .../cassandra/cql3/validation/operations/AggregationTest.java  | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




[jira] [Updated] (CASSANDRA-9630) Killing cassandra process results in unclosed connections

2015-11-16 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-9630:
---
Component/s: Streaming and Messaging
 Distributed Metadata

> Killing cassandra process results in unclosed connections
> -
>
> Key: CASSANDRA-9630
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9630
> Project: Cassandra
>  Issue Type: Bug
>  Components: Distributed Metadata, Streaming and Messaging
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Minor
> Fix For: 3.x
>
>
> After upgrading Cassandra from 2.0.12 to 2.0.15, whenever we killed a 
> cassandra process (with SIGTERM), some other nodes maintained a connection 
> with the killed node in the CLOSE_WAIT state on port 7000 for about 5-20 
> minutes.
> So, when we started the killed node again, other nodes could not establish a 
> handshake because of the connections in the CLOSE_WAIT state, so they 
> remained in the DOWN state to each other until the initial connection expired.
> The problem did not happen if I ran a nodetool disablegossip before killing 
> the node.
> I was able to fix this issue by reverting the CASSANDRA-8336 commits 
> (including CASSANDRA-9238). After reverting this, cassandra now closes 
> connections correctly when killed with -TERM, but leaves connections in the 
> CLOSE_WAIT state if I run nodetool disablethrift before killing the nodes.
> I did not try to reproduce the problem in a clean environment.





[2/2] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.1

2015-11-16 Thread snazy
Merge branch 'cassandra-3.0' into cassandra-3.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/36e76771
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/36e76771
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/36e76771

Branch: refs/heads/cassandra-3.1
Commit: 36e7677147607236ab651014ebbff4df052e76be
Parents: b3231e9 c0480d8
Author: Robert Stupp 
Authored: Mon Nov 16 17:40:05 2015 +0100
Committer: Robert Stupp 
Committed: Mon Nov 16 17:40:05 2015 +0100

--
 .../cassandra/cql3/validation/operations/AggregationTest.java  | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




cassandra git commit: ninja-fix javac warning

2015-11-16 Thread snazy
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 c2320c92f -> c0480d8bb


ninja-fix javac warning


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c0480d8b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c0480d8b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c0480d8b

Branch: refs/heads/cassandra-3.0
Commit: c0480d8bbddf111e4cd7c67ef7c0daeec3ece2dc
Parents: c2320c9
Author: Robert Stupp 
Authored: Mon Nov 16 17:39:13 2015 +0100
Committer: Robert Stupp 
Committed: Mon Nov 16 17:39:13 2015 +0100

--
 .../cassandra/cql3/validation/operations/AggregationTest.java  | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c0480d8b/test/unit/org/apache/cassandra/cql3/validation/operations/AggregationTest.java
--
diff --git a/test/unit/org/apache/cassandra/cql3/validation/operations/AggregationTest.java b/test/unit/org/apache/cassandra/cql3/validation/operations/AggregationTest.java
index 1a532ac..a1fb68b 100644
--- a/test/unit/org/apache/cassandra/cql3/validation/operations/AggregationTest.java
+++ b/test/unit/org/apache/cassandra/cql3/validation/operations/AggregationTest.java
@@ -1371,7 +1371,7 @@ public class AggregationTest extends CQLTester
  "INITCOND null");
 
 assertRows(execute("SELECT initcond FROM system_schema.aggregates WHERE keyspace_name=? AND aggregate_name=?", KEYSPACE, shortFunctionName(aggregation)),
-   row(null));
+   row((Object) null));
 
 assertRows(execute("SELECT " + aggregation + "(b) FROM %s"),
row(set(7, 8, 9)));
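
The cast matters because of how Java resolves a bare `null` passed to a varargs method. The sketch below assumes `row` is declared with an `Object...` parameter; `arity` is a hypothetical stand-in used only to show the difference.

```java
final class VarargsNullSketch {
    /** Returns -1 when the varargs array itself is null, otherwise its length. */
    static int arity(Object... values) {
        return values == null ? -1 : values.length;
    }

    /** One null element: the compiler wraps it as new Object[] { null }. */
    static int oneNullElement() {
        return arity((Object) null);
    }

    /** A null array: no wrapping happens, the method receives null itself. */
    static int nullArray() {
        return arity((Object[]) null);
    }
}
```

A bare `arity(null)` compiles as the null-array form and makes javac warn about a non-varargs call with an inexact argument type; the `(Object)` cast states the one-element intent and silences the warning.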



[jira] [Updated] (CASSANDRA-6992) Bootstrap on vnodes clusters can cause stampeding/storm behavior

2015-11-16 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-6992:
---
Component/s: Streaming and Messaging

> Bootstrap on vnodes clusters can cause stampeding/storm behavior
> 
>
> Key: CASSANDRA-6992
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6992
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination, Streaming and Messaging
> Environment: Various vnodes-enabled clusters in EC2, m1.xlarge and 
> hi1.4xlarge, ~3000-8000 tokens.
>Reporter: Rick Branson
>Assignee: Paulo Motta
>Priority: Minor
>
> Assuming this is an issue with vnodes clusters because 
> SSTableReader#getPositionsForRanges is more expensive to compute with 256x 
> the ranges, but could be wrong. On even well-provisioned hosts, this can 
> cause a severe spike in network throughput & CPU utilization from a storm of 
> flushes, which impacts long-tail times pretty badly. On weaker hosts (like 
> m1.xlarge with ~500GB of data), it can result in minutes of churn while the 
> node gets through StreamOut#createPendingFiles. This *might* be better in 
> 2.0, but it's probably still reproducible because the bootstrapping node 
> sends out all of its streaming requests at once. 
> I'm thinking that this could be staggered at the bootstrapping node to avoid 
> the simultaneous spike across the whole cluster. Not sure on how to stagger 
> it besides something very naive like one-at-a-time with a pause. Maybe this 
> should also be throttled in StreamOut#createPendingFiles on the out-streaming 
> host? Any thoughts?
> From the stack dump of one of our weaker nodes that was struggling for a few 
> minutes just starting the StreamOut:
> "MiscStage:1" daemon prio=10 tid=0x0292f000 nid=0x688 runnable [0x7f7b03df6000]
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.cassandra.utils.ByteBufferUtil.readShortLength(ByteBufferUtil.java:361)
>         at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:371)
>         at org.apache.cassandra.io.sstable.IndexHelper$IndexInfo.deserialize(IndexHelper.java:187)
>         at org.apache.cassandra.db.RowIndexEntry$Serializer.deserialize(RowIndexEntry.java:125)
>         at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:889)
>         at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:790)
>         at org.apache.cassandra.io.sstable.SSTableReader.getPositionsForRanges(SSTableReader.java:730)
>         at org.apache.cassandra.streaming.StreamOut.createPendingFiles(StreamOut.java:172)
>         at org.apache.cassandra.streaming.StreamOut.transferSSTables(StreamOut.java:157)
>         at org.apache.cassandra.streaming.StreamOut.transferRanges(StreamOut.java:148)
>         at org.apache.cassandra.streaming.StreamOut.transferRanges(StreamOut.java:116)
>         at org.apache.cassandra.streaming.StreamRequestVerbHandler.doVerb(StreamRequestVerbHandler.java:44)
>         at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>         at java.lang.Thread.run(Thread.java:662)
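
The "one-at-a-time with a pause" idea from the description can be sketched roughly as follows. This is only an illustration of the naive staggering the reporter mentions, not Cassandra's streaming code: StaggeredDispatcher, dispatch and the peer names are hypothetical, and each Runnable stands in for whatever actually sends a stream request.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; none of these names exist in Cassandra.
final class StaggeredDispatcher
{
    /**
     * Runs each request in order on the calling thread, sleeping pauseMillis
     * between consecutive requests. Returns how many requests were dispatched.
     */
    static int dispatch(List<Runnable> requests, long pauseMillis)
    {
        int sent = 0;
        for (Runnable request : requests)
        {
            if (sent > 0 && pauseMillis > 0)
            {
                try
                {
                    Thread.sleep(pauseMillis);
                }
                catch (InterruptedException e)
                {
                    // Preserve the interrupt and stop dispatching early.
                    Thread.currentThread().interrupt();
                    return sent;
                }
            }
            request.run();
            sent++;
        }
        return sent;
    }

    public static void main(String[] args)
    {
        List<Runnable> requests = new ArrayList<>();
        for (String peer : new String[] { "peer-a", "peer-b", "peer-c" })
            requests.add(() -> System.out.println("requesting stream from " + peer));
        dispatch(requests, 100);
    }
}
```

A fixed pause is the simplest possible policy; it spreads the load over time but does nothing to limit the cost of each individual request on the out-streaming host, which is the second question the description raises.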



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9748) Can't see other nodes when using multiple network interfaces

2015-11-16 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-9748:
---
   Priority: Minor  (was: Major)
Component/s: Streaming and Messaging

> Can't see other nodes when using multiple network interfaces
> 
>
> Key: CASSANDRA-9748
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9748
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Streaming and Messaging
> Environment: Cassandra 2.0.16; multi-DC configuration
>Reporter: Roman Bielik
>Assignee: Paulo Motta
>Priority: Minor
>  Labels: docs-impacting
> Fix For: 2.1.x, 2.2.x, 3.0.x
>
> Attachments: system_node1.log, system_node2.log
>
>
> The idea is to set up a multi-DC environment across 2 different networks based 
> on the following configuration recommendations:
> http://docs.datastax.com/en/cassandra/2.0/cassandra/configuration/configMultiNetworks.html
> Each node has 2 network interfaces. One is used as a private network (DC1: 
> 10.0.1.x and DC2: 10.0.2.x). The second is a "public" network where all 
> nodes can see each other (this one has a higher latency). 
> Using the following settings in cassandra.yaml:
> *seeds:* public IP (same as used in broadcast_address)
> *listen_address:* private IP
> *broadcast_address:* public IP
> *rpc_address:* 0.0.0.0
> *endpoint_snitch:* GossipingPropertyFileSnitch
> _(tried different combinations with no luck)_
> No firewall and no SSL/encryption used.
> The problem is that nodes do not see each other (a gossip problem I guess). 
> The nodetool ring/status shows only the local node but not the other ones 
> (even from the same DC).
> When I set listen_address to public IP, then everything works fine, but that 
> is not the required configuration.
> _Note: Not using EC2 cloud!_
> netstat -anp | grep -E "(7199|9160|9042|7000)"
> tcp        0      0 0.0.0.0:7199       0.0.0.0:*          LISTEN      3587/java
> tcp        0      0 10.0.1.1:9160      0.0.0.0:*          LISTEN      3587/java
> tcp        0      0 10.0.1.1:9042      0.0.0.0:*          LISTEN      3587/java
> tcp        0      0 10.0.1.1:7000      0.0.0.0:*          LISTEN      3587/java
> tcp        0      0 127.0.0.1:7199     127.0.0.1:52874    ESTABLISHED 3587/java
> tcp        0      0 10.0.1.1:7199      10.0.1.1:39650     ESTABLISHED 3587/java





[jira] [Updated] (CASSANDRA-10592) IllegalArgumentException in DataOutputBuffer.reallocate

2015-11-16 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-10592:
---
Component/s: Streaming and Messaging
 Local Write-Read Paths
 Compaction

> IllegalArgumentException in DataOutputBuffer.reallocate
> ---
>
> Key: CASSANDRA-10592
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10592
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction, Local Write-Read Paths, Streaming and 
> Messaging
>Reporter: Sebastian Estevez
>Assignee: Ariel Weisberg
> Fix For: 3.1, 2.2.x
>
>
> CORRECTION-
> It turns out the exception occurs when running a read using a thrift jdbc 
> driver. Once you have loaded the data with stress below, run 
> SELECT * FROM "autogeneratedtest"."transaction_by_retailer" using this tool - 
> http://www.aquafold.com/aquadatastudio_downloads.html
>  
> The exception:
> {code}
> WARN  [SharedPool-Worker-1] 2015-10-22 12:58:20,792 AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-1,5,main]: {}
> java.lang.RuntimeException: java.lang.IllegalArgumentException
>   at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2366) ~[main/:na]
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_60]
>   at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) ~[main/:na]
>   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [main/:na]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
> Caused by: java.lang.IllegalArgumentException: null
>   at java.nio.ByteBuffer.allocate(ByteBuffer.java:334) ~[na:1.8.0_60]
>   at org.apache.cassandra.io.util.DataOutputBuffer.reallocate(DataOutputBuffer.java:63) ~[main/:na]
>   at org.apache.cassandra.io.util.DataOutputBuffer.doFlush(DataOutputBuffer.java:57) ~[main/:na]
>   at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:132) ~[main/:na]
>   at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:151) ~[main/:na]
>   at org.apache.cassandra.utils.ByteBufferUtil.writeWithVIntLength(ByteBufferUtil.java:296) ~[main/:na]
>   at org.apache.cassandra.db.marshal.AbstractType.writeValue(AbstractType.java:374) ~[main/:na]
>   at org.apache.cassandra.db.rows.BufferCell$Serializer.serialize(BufferCell.java:263) ~[main/:na]
>   at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:183) ~[main/:na]
>   at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:108) ~[main/:na]
>   at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:96) ~[main/:na]
>   at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:132) ~[main/:na]
>   at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:87) ~[main/:na]
>   at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:77) ~[main/:na]
>   at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:381) ~[main/:na]
>   at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:136) ~[main/:na]
>   at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:128) ~[main/:na]
>   at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:123) ~[main/:na]
>   at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:65) ~[main/:na]
>   at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:289) ~[main/:na]
>   at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1697) ~[main/:na]
>   at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2362) ~[main/:na]
>   ... 4 common frames omitted
> {code}
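
For readers of the trace above: java.nio.ByteBuffer.allocate throws IllegalArgumentException when it is given a negative capacity. The sketch below shows one generic way a negative capacity can come about, namely int overflow while doubling a buffer size. Whether DataOutputBuffer.reallocate actually computed its new capacity this way is an assumption made for illustration and is not established by this ticket; NegativeCapacityDemo and its methods are hypothetical names.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; not part of Cassandra.
final class NegativeCapacityDemo
{
    // Doubles a capacity with plain int arithmetic, which wraps to a
    // negative value once the input reaches 2^30.
    static int doubled(int capacity)
    {
        return capacity * 2;
    }

    // Returns true if ByteBuffer.allocate rejects the given capacity.
    static boolean allocateRejects(int capacity)
    {
        try
        {
            ByteBuffer.allocate(capacity);
            return false;
        }
        catch (IllegalArgumentException e)
        {
            return true;
        }
    }

    public static void main(String[] args)
    {
        int next = doubled(1 << 30);
        System.out.println("doubled capacity: " + next);
        System.out.println("allocate rejects it: " + allocateRejects(next));
    }
}
```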
> I was running this command:
> {code}
> tools/bin/cassandra-stress user 
> profile=~/Desktop/startup/stress/stress.yaml n=10 ops\(insert=1\) -rate 
> threads=30
> {code}
> Here's the stress.yaml UPDATED!
> {code}
> ### DML ### THIS IS UNDER CONSTRUCTION!!!
> # Keyspace Name
> keyspace: autogeneratedtest
> # The CQL for creating a keyspace (optional if it already

[jira] [Updated] (CASSANDRA-10702) Statement concerning default ParallelGCThreads in jvm.options is not correct

2015-11-16 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-10702:

Reviewer: Paulo Motta

> Statement concerning default ParallelGCThreads in jvm.options is not correct
> 
>
> Key: CASSANDRA-10702
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10702
> Project: Cassandra
>  Issue Type: Bug
>  Components: Configuration
>Reporter: Nate McCall
>Assignee: Nate McCall
>Priority: Trivial
> Attachments: 10702.patch
>
>
> from {{jvm.options}}:
> bq. The JVM maximum is 8 PGC threads and 1/4 of that for ConcGC.
> This is not correct. If there are more than eight CPUs, the default becomes 
> 5/8 of the number of CPUs rounded up to the nearest even number (it seems - 
> see below). See {{-XX:ParallelGCThreads=n}} section of 
> http://www.oracle.com/technetwork/articles/java/g1gc-1984535.html 
> Pretty easy to test with > 16 cores (as 5/8 of such is 10): turn on GC 
> logging, leave the defaults, and the G1GC output will show something like:
> {noformat}
> [Parallel Time: 342.6 ms, GC Workers: 16]
> {noformat}
> on a 24 core system in this case. 
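
The rule of thumb described in the ticket can be written out as a small sketch. It encodes only what the description states (5/8 of the CPU count, rounded up to the nearest even number, above eight CPUs) and is not taken from the JVM source. The one-thread-per-CPU branch for eight or fewer CPUs is an assumption added to complete the function, and GcThreadEstimate is a hypothetical name.

```java
// Hypothetical helper that mirrors the ticket's description, not HotSpot's code.
final class GcThreadEstimate
{
    static int estimate(int cpus)
    {
        // Assumption: at or below eight CPUs, one GC thread per CPU.
        if (cpus <= 8)
            return cpus;
        // 5/8 of the CPU count, rounded up to a whole number...
        int fiveEighths = (int) Math.ceil(cpus * 5 / 8.0);
        // ...then bumped to the next even number if it is odd.
        return fiveEighths % 2 == 0 ? fiveEighths : fiveEighths + 1;
    }

    public static void main(String[] args)
    {
        for (int cpus : new int[] { 4, 8, 16, 24 })
            System.out.println(cpus + " CPUs -> " + estimate(cpus) + " GC threads (estimated)");
    }
}
```

With this rule, 16 CPUs give 10 and 24 CPUs give 16, which agrees with the two figures quoted in the description.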





[1/3] cassandra git commit: ninja-fix javac warning

2015-11-16 Thread snazy
Repository: cassandra
Updated Branches:
  refs/heads/trunk 269c5d4f8 -> 0010fce6d


ninja-fix javac warning


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c0480d8b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c0480d8b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c0480d8b

Branch: refs/heads/trunk
Commit: c0480d8bbddf111e4cd7c67ef7c0daeec3ece2dc
Parents: c2320c9
Author: Robert Stupp 
Authored: Mon Nov 16 17:39:13 2015 +0100
Committer: Robert Stupp 
Committed: Mon Nov 16 17:39:13 2015 +0100

--
 .../cassandra/cql3/validation/operations/AggregationTest.java  | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c0480d8b/test/unit/org/apache/cassandra/cql3/validation/operations/AggregationTest.java
--
diff --git a/test/unit/org/apache/cassandra/cql3/validation/operations/AggregationTest.java b/test/unit/org/apache/cassandra/cql3/validation/operations/AggregationTest.java
index 1a532ac..a1fb68b 100644
--- a/test/unit/org/apache/cassandra/cql3/validation/operations/AggregationTest.java
+++ b/test/unit/org/apache/cassandra/cql3/validation/operations/AggregationTest.java
@@ -1371,7 +1371,7 @@ public class AggregationTest extends CQLTester
  "INITCOND null");
 
 assertRows(execute("SELECT initcond FROM system_schema.aggregates WHERE keyspace_name=? AND aggregate_name=?", KEYSPACE, shortFunctionName(aggregation)),
-   row(null));
+   row((Object) null));
 
 assertRows(execute("SELECT " + aggregation + "(b) FROM %s"),
row(set(7, 8, 9)));
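
Some background on why the cast in this diff matters: when a bare null is passed to a method declared with Object... parameters, javac treats it as the array itself and warns that the call is ambiguous. Casting to Object makes it a single null element instead. A minimal sketch, in which row is a hypothetical stand-in with a varargs signature and not the actual CQLTester helper:

```java
import java.util.Arrays;

// Hypothetical demo class; row() only imitates a varargs test helper.
final class VarargsNullDemo
{
    static Object[] row(Object... values)
    {
        return values;
    }

    public static void main(String[] args)
    {
        // Explicit array cast: the varargs array itself is null.
        Object[] asArray = row((Object[]) null);
        // Explicit element cast: a one-element array holding null.
        Object[] asElement = row((Object) null);

        System.out.println("array is null: " + (asArray == null));
        System.out.println("single element: " + Arrays.toString(asElement));
    }
}
```

The cast spells out the intended reading, a row with one null column, and silences the compiler warning.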


