[jira] [Commented] (CASSANDRA-9870) Improve cassandra-stress graphing

2015-08-14 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697322#comment-14697322
 ] 

Shawn Kumar commented on CASSANDRA-9870:


Just a quick update: the code for this lives 
[here|https://github.com/shawnkumar/cstargraph]. I have built off what Ryan had 
already written, but the changes were quite significant, since the code was 
previously limited to displaying raw metrics and organized for that purpose. 
Implemented so far: support for multiple datasets per revision (see the 
lat_all graph), support for baseline-requiring graphs (see throughput % 
improvement), and fixing/rebuilding the existing functions for these graphs 
(i.e. scaling, legends, colouring, etc.). Remaining work includes: boxplot 
support (currently in progress using the d3plus library), logarithmic scaling, 
fleshing out data processing for the remaining graphs, adding legend entries 
and changing line styles for different datasets under the same revision, and 
finally the aesthetic/UI changes - namely the 'aggregating' screen showing all 
graphs. 

 Improve cassandra-stress graphing
 -

 Key: CASSANDRA-9870
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9870
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Benedict
Assignee: Shawn Kumar
 Attachments: reads.svg


 CASSANDRA-7918 introduces graph output from a stress run, but these graphs 
 are a little limited. Attached to the ticket is an example of some improved 
 graphs which can serve as the *basis* for some improvements, which I will 
 briefly describe. They should not be taken as the exact end goal, but we 
 should aim for at least their functionality, preferably with some JavaScript 
 advantages thrown in, such as hiding datasets/graphs for clarity. Any 
 ideas for improvements are *definitely* encouraged.
 Some overarching design principles:
 * Display _on *one* screen_ all of the information necessary to get a good 
 idea of how two or more branches compare to each other. Ideally we will 
 reintroduce this, painting multiple graphs onto one screen, stretched to fit.
 * Axes must be truncated to only the interesting dimensions, to ensure there 
 is no wasted space.
 * Each graph displaying multiple kinds of data should use colour _and shape_ 
 to help easily distinguish the different datasets.
 * Each graph should be tailored to the data it is representing, and we should 
 have multiple views of each dataset.
 The data can roughly be partitioned into three kinds:
 * throughput
 * latency
 * gc
 These can each be viewed in different ways:
 * as a continuous plot of:
 ** raw data
 ** scaled/compared to a base branch, or other metric
 ** cumulatively
 * as box plots
 ** ideally, these will plot median, outer quartiles, outer deciles and 
 absolute limits of the distribution, so the shape of the data can be best 
 understood
 Each compresses the information differently, losing different information, so 
 that collectively they help to understand the data.
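 To make the box-plot description above concrete, here is a minimal Python 
 sketch (the helper name and the nearest-rank percentile choice are mine, not 
 from the ticket) of the seven values such a plot would need: the absolute 
 limits, outer deciles, outer quartiles, and the median.

```python
def boxplot_summary(samples):
    """Return the seven values a box plot as described would need:
    absolute limits, outer deciles (10th/90th), outer quartiles
    (25th/75th), and the median."""
    xs = sorted(samples)
    n = len(xs)

    def percentile(p):
        # Nearest-rank percentile; simple and adequate for a sketch.
        k = max(0, min(n - 1, round(p * (n - 1))))
        return xs[int(k)]

    return {
        "min": xs[0],
        "p10": percentile(0.10),
        "p25": percentile(0.25),
        "median": percentile(0.50),
        "p75": percentile(0.75),
        "p90": percentile(0.90),
        "max": xs[-1],
    }
```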
 Some basic rules for presentation that work well:
 * Latency information should be plotted to a logarithmic scale, to avoid high 
 latencies drowning out low ones
 * GC information should be plotted cumulatively, to avoid differing 
 throughputs giving the impression of worse GC. It should also have a line 
 that is rescaled by the amount of work (number of operations) completed
 * Throughput should be plotted as the actual numbers
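 As a sketch of the GC rule above (the function name is hypothetical, and it 
 assumes per-interval samples of GC time and operations completed), the 
 cumulative line and its work-rescaled companion could be computed like this:

```python
from itertools import accumulate

def gc_series(gc_ms_per_interval, ops_per_interval):
    """Build the two GC lines described above: cumulative GC time, and
    the same total rescaled by the cumulative number of operations
    completed, so a branch with higher throughput is not made to look
    worse simply for doing more work (and hence more GC) in the same
    wall-clock time."""
    cum_gc = list(accumulate(gc_ms_per_interval))
    cum_ops = list(accumulate(ops_per_interval))
    # GC milliseconds per operation completed so far; 0.0 before any ops.
    per_op = [g / o if o else 0.0 for g, o in zip(cum_gc, cum_ops)]
    return cum_gc, per_op
```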
 To walk the graphs top-left to bottom-right, we have:
 * Spot throughput comparison of branches to the baseline branch, as an 
 improvement ratio (which can of course be negative, but is not in this 
 example)
 * Raw throughput of all branches (no baseline)
 * Raw throughput as a box plot
 * Latency percentiles, compared to baseline. The percentage improvement at 
 any point in time vs baseline is calculated, and then multiplied by the 
 overall median for the entire run. This simply permits the non-baseline 
 branches to scatter their wins/loss around a relatively clustered line for 
 each percentile. It's probably the most dishonest graph but comparing 
 something like latency where each data point can have very high variance is 
 difficult, and this gives you an idea of clustering of improvements/losses.
 * Latency percentiles, raw, each with a different shape; lowest percentiles 
 plotted as a solid line as they vary least, with higher percentiles each 
 getting their own subtly different shape to scatter.
 * Latency box plots
 * GC time, plotted cumulatively and also scaled by work done
 * GC Mb, plotted cumulatively and also scaled by work done
 * GC time, raw
 * GC time as a box plot
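 The first and fourth graphs in that walk could be sketched as below. This is 
 only one reading of the description - the rescaled-latency formula in 
 particular is a guess at the "percentage improvement times overall median" 
 trick - and both function names are hypothetical.

```python
import statistics

def throughput_improvement(branch, baseline):
    """Spot throughput comparison against the baseline branch, as an
    improvement ratio per sample (negative when the branch is slower)."""
    return [(b - base) / base for b, base in zip(branch, baseline)]

def rescaled_latency(branch, baseline):
    """One reading of the latency-percentile comparison: the ratio to
    the baseline at each point in time, multiplied by the overall median
    of the baseline run, so each percentile's wins and losses scatter
    around a relatively clustered line."""
    overall_median = statistics.median(baseline)
    return [(b / base) * overall_median for b, base in zip(branch, baseline)]
```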
 These do mostly introduce the concept of a baseline branch. It may be that, 
 ideally, this 

[jira] [Created] (CASSANDRA-9868) Archive commitlogs tests failing

2015-07-22 Thread Shawn Kumar (JIRA)
Shawn Kumar created CASSANDRA-9868:
--

 Summary: Archive commitlogs tests failing
 Key: CASSANDRA-9868
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9868
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Shawn Kumar
Priority: Blocker
 Attachments: commitlog_archiving.properties

A number of archive commitlog dtests (snapshot_tests.py) are failing on trunk 
at the point in the tests where the node is asked to restore data from archived 
commitlogs. It appears that the snapshot functionality works, but the 
[assertion|https://github.com/riptano/cassandra-dtest/blob/master/snapshot_test.py#L312]
 regarding data that should have been restored from archived commitlogs fails. 
I also tested this manually on trunk and could not restore either, so it 
appears not to be just a test issue. I should note that archiving the 
commitlogs seems to work (in that they are actually copied); rather, restoring 
them is the issue. Attached is the commitlog properties file (to show the 
commands used).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9840) global_row_key_cache_test.py fails; loses mutations on cluster restart

2015-07-20 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-9840:
---
Description: This test is currently failing on trunk. I've attached the 
test output and logs. It seems that the failure of the test doesn't necessarily 
have anything to do with global row/key caches - as on the initial loop of the 
test [neither are 
used|https://github.com/riptano/cassandra-dtest/blob/master/global_row_key_cache_test.py#L15]
 and we still hit failure. The test itself fails when a second validation of 
values after a cluster restart fails to capture deletes issued prior to the 
restart and first successful validation. However, if I add flushes prior to 
restarting the cluster the test completes successfully, implying an issue with 
loss of in-memory mutations due to the cluster restart. Initially I had thought 
this might be due to CASSANDRA-9669, but as Benedict pointed out, the fact that 
this test has been succeeding consistently on both 2.1 and 2.2 branch indicates 
there may be another issue at hand.  (was: This test is currently failing on 
trunk. I've attached the test output and logs. It seems that the failure of the 
test doesn't necessarily have anything to do with global row/key caches - as on 
the initial loop of the test [neither are 
used|https://github.com/riptano/cassandra-dtest/blob/master/global_row_key_cache_test.py#L15].
 The test itself fails when a second validation of values after a cluster 
restart fails to capture deletes issued prior to the restart and first 
successful validation. However, if I add flushes prior to restarting the 
cluster the test completes successfully, implying an issue with loss of 
in-memory mutations due to the cluster restart. Initially I had thought this 
might be due to CASSANDRA-9669, but as Benedict pointed out, the fact that this 
test has been succeeding consistently on both 2.1 and 2.2 branch indicates 
there may be another issue at hand.)

 global_row_key_cache_test.py fails; loses mutations on cluster restart
 --

 Key: CASSANDRA-9840
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9840
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Shawn Kumar
Priority: Blocker
 Fix For: 3.0.x

 Attachments: node1.log, node2.log, node3.log, noseout.txt


 This test is currently failing on trunk. I've attached the test output and 
 logs. It seems that the failure of the test doesn't necessarily have anything 
 to do with global row/key caches - as on the initial loop of the test 
 [neither are 
 used|https://github.com/riptano/cassandra-dtest/blob/master/global_row_key_cache_test.py#L15]
  and we still hit failure. The test itself fails when a second validation of 
 values after a cluster restart fails to capture deletes issued prior to the 
 restart and first successful validation. However, if I add flushes prior to 
 restarting the cluster the test completes successfully, implying an issue 
 with loss of in-memory mutations due to the cluster restart. Initially I had 
 thought this might be due to CASSANDRA-9669, but as Benedict pointed out, the 
 fact that this test has been succeeding consistently on both 2.1 and 2.2 
 branch indicates there may be another issue at hand.





[jira] [Updated] (CASSANDRA-9840) global_row_key_cache_test.py fails; loses mutations on cluster restart

2015-07-20 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-9840:
---
Description: This test is currently failing on trunk. I've attached the 
test output and logs. It seems that the failure of the test doesn't necessarily 
have anything to do with global row/key caches - as on the initial loop of the 
test [neither are 
used|https://github.com/riptano/cassandra-dtest/blob/master/global_row_key_cache_test.py#L15].
 The test itself fails when a second validation of values after a cluster 
restart fails to capture deletes issued prior to the restart and first 
successful validation. However, if I add flushes prior to restarting the 
cluster the test completes successfully, implying an issue with loss of 
in-memory mutations due to the cluster restart. Initially I had thought this 
might be due to CASSANDRA-9669, but as Benedict pointed out, the fact that this 
test has been succeeding consistently on both 2.1 and 2.2 branch indicates 
there may be another issue at hand.  (was: This test is currently failing on 
trunk. I've attached the test output and logs. It seems that the failure of the 
test doesn't necessarily have anything to do with global row/key caches - as on 
the initial loop of the test neither are used. The test itself fails when a 
second validation of values after a cluster restart fails to capture deletes 
issued prior to the restart and first successful validation. However, if I add 
flushes prior to restarting the cluster the test completes successfully, 
implying an issue with loss of in-memory mutations due to the cluster restart. 
Initially I had thought this might be due to CASSANDRA-9669, but as Benedict 
pointed out, the fact that this test has been succeeding consistently on both 
2.1 and 2.2 branch indicates there may be another issue at hand.)

 global_row_key_cache_test.py fails; loses mutations on cluster restart
 --

 Key: CASSANDRA-9840
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9840
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Shawn Kumar
Priority: Blocker
 Fix For: 3.0.x

 Attachments: node1.log, node2.log, node3.log, noseout.txt


 This test is currently failing on trunk. I've attached the test output and 
 logs. It seems that the failure of the test doesn't necessarily have anything 
 to do with global row/key caches - as on the initial loop of the test 
 [neither are 
 used|https://github.com/riptano/cassandra-dtest/blob/master/global_row_key_cache_test.py#L15].
  The test itself fails when a second validation of values after a cluster 
 restart fails to capture deletes issued prior to the restart and first 
 successful validation. However, if I add flushes prior to restarting the 
 cluster the test completes successfully, implying an issue with loss of 
 in-memory mutations due to the cluster restart. Initially I had thought this 
 might be due to CASSANDRA-9669, but as Benedict pointed out, the fact that 
 this test has been succeeding consistently on both 2.1 and 2.2 branch 
 indicates there may be another issue at hand.





[jira] [Updated] (CASSANDRA-9840) global_row_key_cache_test.py fails; loses mutations on cluster restart

2015-07-20 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-9840:
---
Description: This test is currently failing on trunk. I've attached the 
test output and logs. It seems that the failure of the test doesn't necessarily 
have anything to do with global row/key caches - as on the initial loop of the 
test neither are used. The test itself fails when a second validation of values 
after a cluster restart fails to capture deletes issued prior to the restart 
and first successful validation. However, if I add flushes prior to restarting 
the cluster the test completes successfully, implying an issue with loss of 
in-memory mutations due to the cluster restart. Initially I had thought this 
might be due to CASSANDRA-9669, but as Benedict pointed out, the fact that this 
test has been succeeding consistently on both 2.1 and 2.2 branch indicates 
there may be another issue at hand.  (was: This test is currently failing on 
trunk. I've attached the test output and logs. The test itself fails when a 
second validation of values after a cluster restart fails to capture deletes 
issued prior to the restart and first successful validation. However, if I add 
flushes prior to restarting the cluster the test completes successfully, 
implying an issue with loss of in-memory mutations due to the cluster restart. 
Initially I had thought this might be due to CASSANDRA-9669, but as Benedict 
pointed out, the fact that this test has been succeeding consistently on both 
2.1 and 2.2 branch indicates there may be another issue at hand.)

 global_row_key_cache_test.py fails; loses mutations on cluster restart
 --

 Key: CASSANDRA-9840
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9840
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Shawn Kumar
Priority: Blocker
 Fix For: 3.0.x

 Attachments: node1.log, node2.log, node3.log, noseout.txt


 This test is currently failing on trunk. I've attached the test output and 
 logs. It seems that the failure of the test doesn't necessarily have anything 
 to do with global row/key caches - as on the initial loop of the test neither 
 are used. The test itself fails when a second validation of values after a 
 cluster restart fails to capture deletes issued prior to the restart and 
 first successful validation. However, if I add flushes prior to restarting 
 the cluster the test completes successfully, implying an issue with loss of 
 in-memory mutations due to the cluster restart. Initially I had thought this 
 might be due to CASSANDRA-9669, but as Benedict pointed out, the fact that 
 this test has been succeeding consistently on both 2.1 and 2.2 branch 
 indicates there may be another issue at hand.





[jira] [Created] (CASSANDRA-9840) global_row_key_cache_test.py fails; loses mutations on cluster restart

2015-07-17 Thread Shawn Kumar (JIRA)
Shawn Kumar created CASSANDRA-9840:
--

 Summary: global_row_key_cache_test.py fails; loses mutations on 
cluster restart
 Key: CASSANDRA-9840
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9840
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Shawn Kumar
Priority: Blocker
 Fix For: 3.0.x
 Attachments: node1.log, node2.log, node3.log, noseout.txt

This test is currently failing on trunk. I've attached the test output and 
logs. The test itself fails when a second validation of values after a cluster 
restart fails to capture deletes issued prior to the restart and first 
successful validation. However, if I add flushes prior to restarting the 
cluster the test completes successfully, implying an issue with loss of 
in-memory mutations due to the cluster restart. Initially I had thought this 
might be due to CASSANDRA-9669, but as Benedict pointed out, the fact that this 
test has been succeeding consistently on both 2.1 and 2.2 branch indicates 
there may be another issue at hand.





[jira] [Commented] (CASSANDRA-9815) Tracing doesn't seem to display tombstones read accurately

2015-07-16 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629954#comment-14629954
 ] 

Shawn Kumar commented on CASSANDRA-9815:


I spoke with Aleksey regarding this, and it looks like this is actually the 
correct behaviour and a consequence of CASSANDRA-9299. That being said, we 
agreed the "Read 0 live and 0 tombstone cells" message should be modified for 
clarity.

 Tracing doesn't seem to display tombstones read accurately
 --

 Key: CASSANDRA-9815
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9815
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Shawn Kumar
Priority: Minor
  Labels: tracing
 Fix For: 2.2.x

 Attachments: tracescreen.png


 It seems that where tracing 
 [once|http://stackoverflow.com/questions/27063508/how-to-get-tombstone-count-for-a-cql-query]
  tracked how many tombstones were read in a query, it no longer does.
 Can reproduce with the following:
 1. Create a simple key, val table.
 2. Insert a couple rows
 3. Flush
 4. Delete a row, add an additional couple rows
 5. Flush
 6. Try to query the deleted row, or SELECT *. 
 In the trace it never mentions reading a tombstoned cell, no matter what. 
 Instead you get a line like the following: "Read 0 live and 0 tombstone cells 
 [SharedPool-Worker-3]". Attached is a screenshot of the trace. 





[jira] [Resolved] (CASSANDRA-9815) Tracing doesn't seem to display tombstones read accurately

2015-07-16 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar resolved CASSANDRA-9815.

   Resolution: Not A Problem
Reproduced In: 2.2.0 rc2, 3.x  (was: 3.x, 2.2.0 rc2)

 Tracing doesn't seem to display tombstones read accurately
 --

 Key: CASSANDRA-9815
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9815
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Shawn Kumar
Priority: Minor
  Labels: tracing
 Fix For: 2.2.x

 Attachments: tracescreen.png


 It seems that where tracing 
 [once|http://stackoverflow.com/questions/27063508/how-to-get-tombstone-count-for-a-cql-query]
  tracked how many tombstones were read in a query, it no longer does.
 Can reproduce with the following:
 1. Create a simple key, val table.
 2. Insert a couple rows
 3. Flush
 4. Delete a row, add an additional couple rows
 5. Flush
 6. Try to query the deleted row, or SELECT *. 
 In the trace it never mentions reading a tombstoned cell, no matter what. 
 Instead you get a line like the following: "Read 0 live and 0 tombstone cells 
 [SharedPool-Worker-3]". Attached is a screenshot of the trace. 





[jira] [Updated] (CASSANDRA-9814) test_scrub_collections_table in scrub_test.py fails; removes sstables

2015-07-15 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-9814:
---
Summary: test_scrub_collections_table in scrub_test.py fails; removes 
sstables  (was: test_scrub_collections_table in scrub_test.py fails - possible 
data loss?)

 test_scrub_collections_table in scrub_test.py fails; removes sstables
 -

 Key: CASSANDRA-9814
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9814
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Shawn Kumar
Priority: Blocker
 Fix For: 3.x

 Attachments: node1.log, out.txt


 The test creates an index on a table with collections and attempts to scrub. 
 The error occurs after the scrub where somehow all relevant sstables are 
 removed, and an assertion in get_sstables fails (due to there not being any 
 sstables). Logs indicate a set of errors under CompactionExecutor related to 
 not being able to read rows. Attached is the test output (out.txt) and the 
 relevant log. I should note that my attempts to replicate this manually weren't 
 successful, so it's possible this is a test issue (though I don't see 
 why). 





[jira] [Updated] (CASSANDRA-9814) test_scrub_collections_table in scrub_test.py fails - possible data loss?

2015-07-15 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-9814:
---
Summary: test_scrub_collections_table in scrub_test.py fails - possible 
data loss?  (was: test_scrub_collections_table in scrub_test.py fails)

 test_scrub_collections_table in scrub_test.py fails - possible data loss?
 -

 Key: CASSANDRA-9814
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9814
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Shawn Kumar
Priority: Blocker
 Fix For: 3.x

 Attachments: node1.log, out.txt


 The test creates an index on a table with collections and attempts to scrub. 
 The error occurs after the scrub where somehow all relevant sstables are 
 removed, and an assertion in get_sstables fails (due to there not being any 
 sstables). Logs indicate a set of errors under CompactionExecutor related to 
 not being able to read rows. Attached is the test output (out.txt) and the 
 relevant log. I should note that my attempts to replicate this manually weren't 
 successful, so it's possible this is a test issue (though I don't see 
 why). 





[jira] [Created] (CASSANDRA-9814) test_scrub_collections_table in scrub_test.py fails

2015-07-15 Thread Shawn Kumar (JIRA)
Shawn Kumar created CASSANDRA-9814:
--

 Summary: test_scrub_collections_table in scrub_test.py fails
 Key: CASSANDRA-9814
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9814
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Shawn Kumar
Priority: Blocker
 Fix For: 3.x
 Attachments: node1.log, out.txt

The test creates an index on a table with collections and attempts to scrub. 
The error occurs after the scrub where somehow all relevant sstables are 
removed, and an assertion in get_sstables fails (due to there not being any 
sstables). Logs indicate a set of errors under CompactionExecutor related to 
not being able to read rows. Attached is the test output (out.txt) and the 
relevant log. I should note that my attempts to replicate this manually weren't 
successful, so it's possible this is a test issue (though I don't see why). 





[jira] [Created] (CASSANDRA-9815) Tracing doesn't seem to display tombstones read accurately

2015-07-15 Thread Shawn Kumar (JIRA)
Shawn Kumar created CASSANDRA-9815:
--

 Summary: Tracing doesn't seem to display tombstones read accurately
 Key: CASSANDRA-9815
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9815
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Shawn Kumar
Priority: Minor
 Attachments: tracescreen.png

It seems that where tracing 
[once|http://stackoverflow.com/questions/27063508/how-to-get-tombstone-count-for-a-cql-query]
 tracked how many tombstones were read in a query, it no longer does.

Can reproduce with the following:
1. Create a simple key, val table.
2. Insert a couple rows
3. Flush
4. Delete a row, add an additional couple rows
5. Flush
6. Try to query the deleted row, or SELECT *. 

In the trace it never mentions reading a tombstoned cell, no matter what. 
Instead you get a line like the following: "Read 0 live and 0 tombstone cells 
[SharedPool-Worker-3]". Attached is a screenshot of the trace. 





[jira] [Comment Edited] (CASSANDRA-9152) Offline tools tests

2015-06-05 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574885#comment-14574885
 ] 

Shawn Kumar edited comment on CASSANDRA-9152 at 6/5/15 5:30 PM:


This is complete, dtest coverage is as follows:
sstablescrubber - scrub_test.py
sstablelevelreset, sstableofflinerelevel, sstableverify - offline_tools_test.py
sstablerepairedset - incremental_repair_test.py
sstableloader - sstable_generation_loading_test.py


was (Author: shawn.kumar):
This is complete, dtest coverage is as follows:
sstablescrubber - scrub_test.py
sstablelevelreset, sstableofflinerelevel, sstableverify - offline_tools_test.py
sstablerepairedset - incremental_repair_test.py

 Offline tools tests
 ---

 Key: CASSANDRA-9152
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9152
 Project: Cassandra
  Issue Type: Test
Reporter: Marcus Eriksson
Assignee: Shawn Kumar
  Labels: retrospective_generated
 Fix For: 2.2.x


 we need more tests of our offline tools: sstablescrubber, sstablelevelreset, 
 sstableofflinerelevel, sstablerepairedset, sstableloader





[jira] [Resolved] (CASSANDRA-9152) Offline tools tests

2015-06-05 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar resolved CASSANDRA-9152.

Resolution: Implemented

This is complete, dtest coverage is as follows:
sstablescrubber - scrub_test.py
sstablelevelreset, sstableofflinerelevel, sstableverify - offline_tools_test.py
sstablerepairedset - incremental_repair_test.py

 Offline tools tests
 ---

 Key: CASSANDRA-9152
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9152
 Project: Cassandra
  Issue Type: Test
Reporter: Marcus Eriksson
Assignee: Shawn Kumar
  Labels: retrospective_generated
 Fix For: 2.2.x


 we need more tests of our offline tools: sstablescrubber, sstablelevelreset, 
 sstableofflinerelevel, sstablerepairedset, sstableloader





[jira] [Updated] (CASSANDRA-5791) A nodetool command to validate all sstables in a node

2015-05-22 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-5791:
---
Tester: Shawn Kumar

 A nodetool command to validate all sstables in a node
 -

 Key: CASSANDRA-5791
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: sankalp kohli
Assignee: Jeff Jirsa
Priority: Minor
 Fix For: 2.2.0 beta 1

 Attachments: cassandra-5791-20150319.diff, 
 cassandra-5791-patch-3.diff, cassandra-5791.patch-2


 Currently there is no nodetool command to validate all sstables on disk. The 
 only way to do this is to run a repair and see if it succeeds. But we cannot 
 repair the system keyspace. 
 Also, we can run upgradesstables, but that rewrites all the sstables. 
 This command should check the hash of all sstables and return whether all 
 data is readable or not. This should NOT care about consistency. 
 The compressed sstables do not have a hash, so I am not sure how it will work there.





[jira] [Assigned] (CASSANDRA-8590) Test repairing large dataset after upgrade

2015-04-27 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar reassigned CASSANDRA-8590:
--

Assignee: Shawn Kumar  (was: Ryan McGuire)

 Test repairing large dataset after upgrade
 --

 Key: CASSANDRA-8590
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8590
 Project: Cassandra
  Issue Type: Test
Reporter: Ryan McGuire
Assignee: Shawn Kumar

 * Write large dataset in multiple tables
 * upgrade
 * replace a few nodes
 * repair in round-robin fashion
 * ensure exit codes of cmd line tools are expected
 * verify data.





[jira] [Commented] (CASSANDRA-8252) dtests that involve topology changes should verify system.peers on all nodes

2015-04-21 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505497#comment-14505497
 ] 

Shawn Kumar commented on CASSANDRA-8252:


There seem to be a number of failures when I tried incorporating this into the 
existing code. To clarify and separate it from other tests, I wrote a separate 
[dtest|https://github.com/riptano/cassandra-dtest/blob/peerstest/peers_test.py] 
to test some basic actions (bootstrapping, removing nodes), and am still 
running into failures on all of these. 

 dtests that involve topology changes should verify system.peers on all nodes
 

 Key: CASSANDRA-8252
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8252
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Brandon Williams
Assignee: Shawn Kumar
 Fix For: 2.0.15, 2.1.5


 This is especially true for replace where I've discovered it's wrong in 
 1.2.19, which is sad because now it's too late to fix.  We've had a lot of 
 problems with incorrect/null system.peers, so after any topology change we 
 should verify it on every live node when everything is finished.





[jira] [Assigned] (CASSANDRA-9178) Test exposed JMX methods

2015-04-20 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar reassigned CASSANDRA-9178:
--

Assignee: Shawn Kumar

 Test exposed JMX methods
 

 Key: CASSANDRA-9178
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9178
 Project: Cassandra
  Issue Type: Test
Reporter: Carl Yeksigian
Assignee: Shawn Kumar

 [~thobbs] added support for JMX testing in dtests, and we have seen issues 
 related to nodetool testing in various different stages of execution. Tests 
 which exercise the different methods which nodetool calls should be added to 
 catch those issues early.





[jira] [Resolved] (CASSANDRA-6335) Hints broken for nodes that change broadcast address

2015-04-20 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar resolved CASSANDRA-6335.

Resolution: Cannot Reproduce

 Hints broken for nodes that change broadcast address
 

 Key: CASSANDRA-6335
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6335
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Rick Branson
Assignee: Shawn Kumar

 When a node changes its broadcast address, the transition process works 
 properly, but hints that are destined for it can't be delivered because of 
 the address change. It produces an exception:
 java.lang.AssertionError: Missing host ID for 10.1.60.22
 at 
 org.apache.cassandra.service.StorageProxy.writeHintForMutation(StorageProxy.java:598)
 at 
 org.apache.cassandra.service.StorageProxy$5.runMayThrow(StorageProxy.java:567)
 at 
 org.apache.cassandra.service.StorageProxy$HintRunnable.run(StorageProxy.java:1679)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
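
The failure mode above can be modeled with a toy lookup table: hints are addressed by endpoint, and delivery asserts that a host ID exists for that endpoint, so a hint recorded under the old broadcast address can no longer be resolved after the change. A minimal sketch; the names below (e.g. TokenMetadataModel) are hypothetical illustrations, not Cassandra internals:

```python
# Toy model: hints keyed by a node's *old* broadcast address cannot be
# delivered once the address-to-host-ID mapping moves to the new address.
class TokenMetadataModel:
    def __init__(self):
        self.host_ids = {}  # endpoint address -> host UUID

    def update_address(self, old, new):
        # The node reannounces itself under its new broadcast address.
        self.host_ids[new] = self.host_ids.pop(old)

    def get_host_id(self, endpoint):
        host_id = self.host_ids.get(endpoint)
        # Mirrors the AssertionError seen in writeHintForMutation.
        assert host_id is not None, "Missing host ID for %s" % endpoint
        return host_id

tm = TokenMetadataModel()
tm.host_ids["10.1.60.22"] = "uuid-1234"
pending_hint_target = "10.1.60.22"       # hint recorded before the change

tm.update_address("10.1.60.22", "10.1.60.99")
try:
    tm.get_host_id(pending_hint_target)  # delivery of the stale hint fails
except AssertionError as e:
    print(e)
```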





[jira] [Updated] (CASSANDRA-6335) Hints broken for nodes that change broadcast address

2015-04-13 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-6335:
---
Tester: Shawn Kumar

 Hints broken for nodes that change broadcast address
 

 Key: CASSANDRA-6335
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6335
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Rick Branson
Assignee: Ryan McGuire

 When a node changes its broadcast address, the transition process works 
 properly, but hints that are destined for it can't be delivered because of 
 the address change. It produces an exception:
 java.lang.AssertionError: Missing host ID for 10.1.60.22
 at 
 org.apache.cassandra.service.StorageProxy.writeHintForMutation(StorageProxy.java:598)
 at 
 org.apache.cassandra.service.StorageProxy$5.runMayThrow(StorageProxy.java:567)
 at 
 org.apache.cassandra.service.StorageProxy$HintRunnable.run(StorageProxy.java:1679)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)





[jira] [Resolved] (CASSANDRA-9056) Tombstoned SSTables are not removed past max_sstable_age_days when using DTCS

2015-04-01 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar resolved CASSANDRA-9056.

   Resolution: Fixed
Fix Version/s: 2.1.4
   2.0.14
   3.0
Reproduced In: 2.0.13, 2.1.3, 3.0  (was: 3.0, 2.1.3, 2.0.13)

 Tombstoned SSTables are not removed past max_sstable_age_days when using DTCS
 -

 Key: CASSANDRA-9056
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9056
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Shawn Kumar
Assignee: Marcus Eriksson
  Labels: compaction, dtcs
 Fix For: 3.0, 2.0.14, 2.1.4


 When using DTCS, tombstoned sstables past max_sstable_age_days are not 
 removed by minor compactions. I was able to reproduce this manually and also 
 wrote a dtest (currently failing) which reproduces this issue: 
 [dtcs_deletion_test|https://github.com/riptano/cassandra-dtest/blob/master/compaction_test.py#L115]
  in compaction_test.py. I tried applying the patch in CASSANDRA-8359 but 
 found that the test still fails with the same issue.





[jira] [Comment Edited] (CASSANDRA-9056) Tombstoned SSTables are not removed past max_sstable_age_days when using DTCS

2015-04-01 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14391165#comment-14391165
 ] 

Shawn Kumar edited comment on CASSANDRA-9056 at 4/1/15 6:36 PM:


It looks like this has been fixed by CASSANDRA-8359, and dtcs_deletion_test is 
now passing on relevant branches with Marcus' changes. I've committed Marcus' 
changes to dtests as well. [~jimplush], I suggest testing with the fix provided 
in CASSANDRA-8359. I will resolve this ticket, but please feel free to post a 
follow-up comment if you continue to have this problem.


was (Author: shawn.kumar):
It looks like this has been fixed by CASSANDRA-8359, and dtcs_deletion_test is 
now passing on relevant branches with Marcus' changes. I've committed Marcus' 
changes to dtests as well. Jim, I suggest testing with the fix provided in 
CASSANDRA-8359. I will resolve this ticket, but please feel free to post a 
follow-up comment if you continue to have this problem.

 Tombstoned SSTables are not removed past max_sstable_age_days when using DTCS
 -

 Key: CASSANDRA-9056
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9056
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Shawn Kumar
Assignee: Marcus Eriksson
  Labels: compaction, dtcs
 Fix For: 3.0, 2.1.4, 2.0.14


 When using DTCS, tombstoned sstables past max_sstable_age_days are not 
 removed by minor compactions. I was able to reproduce this manually and also 
 wrote a dtest (currently failing) which reproduces this issue: 
 [dtcs_deletion_test|https://github.com/riptano/cassandra-dtest/blob/master/compaction_test.py#L115]
  in compaction_test.py. I tried applying the patch in CASSANDRA-8359 but 
 found that the test still fails with the same issue.





[jira] [Commented] (CASSANDRA-9056) Tombstoned SSTables are not removed past max_sstable_age_days when using DTCS

2015-04-01 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14391165#comment-14391165
 ] 

Shawn Kumar commented on CASSANDRA-9056:


It looks like this has been fixed by CASSANDRA-8359, and dtcs_deletion_test is 
now passing on relevant branches with Marcus' changes. I've committed Marcus' 
changes to dtests as well. Jim, I suggest testing with the fix provided in 
CASSANDRA-8359. I will resolve this ticket, but please feel free to post a 
follow-up comment if you continue to have this problem.

 Tombstoned SSTables are not removed past max_sstable_age_days when using DTCS
 -

 Key: CASSANDRA-9056
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9056
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Shawn Kumar
Assignee: Marcus Eriksson
  Labels: compaction, dtcs

 When using DTCS, tombstoned sstables past max_sstable_age_days are not 
 removed by minor compactions. I was able to reproduce this manually and also 
 wrote a dtest (currently failing) which reproduces this issue: 
 [dtcs_deletion_test|https://github.com/riptano/cassandra-dtest/blob/master/compaction_test.py#L115]
  in compaction_test.py. I tried applying the patch in CASSANDRA-8359 but 
 found that the test still fails with the same issue.





[jira] [Created] (CASSANDRA-9056) Tombstoned SSTables are not removed past max_sstable_age_days when using DTCS

2015-03-27 Thread Shawn Kumar (JIRA)
Shawn Kumar created CASSANDRA-9056:
--

 Summary: Tombstoned SSTables are not removed past 
max_sstable_age_days when using DTCS
 Key: CASSANDRA-9056
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9056
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Shawn Kumar


When using DTCS, tombstoned sstables past max_sstable_age_days are not removed 
by minor compactions. I was able to reproduce this manually and also wrote a 
dtest (currently failing) which reproduces this issue: dtcs_deletion_test in 
compaction_test.py. I tried applying the patch in CASSANDRA-8359 but found that 
the test still fails with the same issue. 
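
The reported behaviour can be sketched as a toy model of DTCS candidate selection: sstables whose newest data is older than max_sstable_age_days are excluded from minor compactions, so any tombstones they contain are never purged by them. This is a hypothetical simplification for illustration, not the real compaction strategy code:

```python
import time

DAY = 86400  # seconds per day

def dtcs_candidates(sstables, max_sstable_age_days, now=None):
    """Toy DTCS filter: drop sstables whose newest data is older than
    max_sstable_age_days; only the remainder is eligible for minor
    compaction. (Hypothetical simplification of the real strategy.)"""
    now = time.time() if now is None else now
    cutoff = now - max_sstable_age_days * DAY
    return [s for s in sstables if s["max_timestamp"] >= cutoff]

now = 100 * DAY
sstables = [
    {"name": "old-with-tombstones", "max_timestamp": now - 10 * DAY},
    {"name": "recent",              "max_timestamp": now - 1 * DAY},
]
eligible = dtcs_candidates(sstables, max_sstable_age_days=5, now=now)
# The tombstoned sstable past the age limit never becomes a candidate,
# so minor compaction never removes its tombstones.
print([s["name"] for s in eligible])   # ['recent']
```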





[jira] [Updated] (CASSANDRA-9056) Tombstoned SSTables are not removed past max_sstable_age_days when using DTCS

2015-03-27 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-9056:
---
Description: When using DTCS, tombstoned sstables past max_sstable_age_days 
are not removed by minor compactions. I was able to reproduce this manually and 
also wrote a dtest (currently failing) which reproduces this issue: 
dtcs_deletion_test in compaction_test.py. I tried applying the patch in 
CASSANDRA-8359 but found that the test still fails with the same issue.  (was: 
When using DTCS, tombstoned sstables past max_sstable_age_days are not removed 
by minor compactions. I was able to reproduce this manually and also wrote a 
dtest (currently failing) which reproduces this issue: dtcs_deletion_test in 
compaction_test.py. I tried applying the patch in CASSANDRA-8359 but found that 
the test still fails with the same issue. )

 Tombstoned SSTables are not removed past max_sstable_age_days when using DTCS
 -

 Key: CASSANDRA-9056
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9056
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Shawn Kumar
  Labels: compaction, dtcs

 When using DTCS, tombstoned sstables past max_sstable_age_days are not 
 removed by minor compactions. I was able to reproduce this manually and also 
 wrote a dtest (currently failing) which reproduces this issue: 
 dtcs_deletion_test in compaction_test.py. I tried applying the patch in 
 CASSANDRA-8359 but found that the test still fails with the same issue.





[jira] [Updated] (CASSANDRA-9056) Tombstoned SSTables are not removed past max_sstable_age_days when using DTCS

2015-03-27 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-9056:
---
Description: When using DTCS, tombstoned sstables past max_sstable_age_days 
are not removed by minor compactions. I was able to reproduce this manually and 
also wrote a dtest (currently failing) which reproduces this issue: 
[dtcs_deletion_test|https://github.com/riptano/cassandra-dtest/blob/master/compaction_test.py#L115]
 in compaction_test.py. I tried applying the patch in CASSANDRA-8359 but found 
that the test still fails with the same issue.  (was: When using DTCS, 
tombstoned sstables past max_sstable_age_days are not removed by minor 
compactions. I was able to reproduce this manually and also wrote a dtest 
(currently failing) which reproduces this issue: dtcs_deletion_test in 
compaction_test.py. I tried applying the patch in CASSANDRA-8359 but found that 
the test still fails with the same issue.)

 Tombstoned SSTables are not removed past max_sstable_age_days when using DTCS
 -

 Key: CASSANDRA-9056
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9056
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Shawn Kumar
Assignee: Marcus Eriksson
  Labels: compaction, dtcs

 When using DTCS, tombstoned sstables past max_sstable_age_days are not 
 removed by minor compactions. I was able to reproduce this manually and also 
 wrote a dtest (currently failing) which reproduces this issue: 
 [dtcs_deletion_test|https://github.com/riptano/cassandra-dtest/blob/master/compaction_test.py#L115]
  in compaction_test.py. I tried applying the patch in CASSANDRA-8359 but 
 found that the test still fails with the same issue.





[jira] [Updated] (CASSANDRA-9056) Tombstoned SSTables are not removed past max_sstable_age_days when using DTCS

2015-03-27 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-9056:
---
Assignee: Marcus Eriksson

 Tombstoned SSTables are not removed past max_sstable_age_days when using DTCS
 -

 Key: CASSANDRA-9056
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9056
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Shawn Kumar
Assignee: Marcus Eriksson
  Labels: compaction, dtcs

 When using DTCS, tombstoned sstables past max_sstable_age_days are not 
 removed by minor compactions. I was able to reproduce this manually and also 
 wrote a dtest (currently failing) which reproduces this issue: 
 dtcs_deletion_test in compaction_test.py. I tried applying the patch in 
 CASSANDRA-8359 but found that the test still fails with the same issue.





[jira] [Resolved] (CASSANDRA-8870) Tombstone overwhelming issue aborts client queries

2015-03-12 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar resolved CASSANDRA-8870.

Resolution: Cannot Reproduce

 Tombstone overwhelming issue aborts client queries
 --

 Key: CASSANDRA-8870
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8870
 Project: Cassandra
  Issue Type: Bug
 Environment: cassandra 2.1.2 ubuntu 12.04
Reporter: Jeff Liu

 We are getting client query timeout issues on clients who are trying to 
 query data from the Cassandra cluster. 
 Nodetool status shows that all nodes are still up regardless.
 Logs from client side:
 {noformat}
 com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
 tried for query failed (tried: 
 cass-chisel01.abc01.abc02.abc.abc.com/10.66.182.113:9042 
 (com.datastax.driver.core.TransportException: 
 [cass-chisel01.tgr01.iad02.testd.nestlabs.com/10.66.182.113:9042] Connection 
 has been closed))
 at 
 com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:108) 
 ~[com.datastax.cassandra.cassandra-driver-core-2.1.3.jar:na]
 at 
 com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:179) 
 ~[com.datastax.cassandra.cassandra-driver-core-2.1.3.jar:na]
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  ~[na:1.7.0_55]
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  ~[na:1.7.0_55]
 at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_55]
 {noformat}
 Logs from cassandra/system.log
 {noformat}
 ERROR [HintedHandoff:2] 2015-02-23 23:46:28,410 SliceQueryFilter.java:212 - 
 Scanned over 10 tombstones in system.hints; query aborted (see 
 tombstone_failure_threshold)
 ERROR [HintedHandoff:2] 2015-02-23 23:46:28,417 CassandraDaemon.java:153 - 
 Exception in thread Thread[HintedHandoff:2,1,main]
 org.apache.cassandra.db.filter.TombstoneOverwhelmingException: null
 at 
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:214)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:107)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:81)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:69)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:310)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:60)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1858)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1666)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:385)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:344)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.HintedHandOffManager.access$400(HintedHandOffManager.java:94)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.HintedHandOffManager$5.run(HintedHandOffManager.java:555)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  ~[na:1.7.0_55]
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  ~[na:1.7.0_55]
 at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_55]
 {noformat}





[jira] [Commented] (CASSANDRA-8870) Tombstone overwhelming issue aborts client queries

2015-03-09 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14353202#comment-14353202
 ] 

Shawn Kumar commented on CASSANDRA-8870:


I created a few tests to try to reproduce this problem and to check more 
specifically whether 1) HintedHandoff exhibited any abnormal tombstone 
behaviour, and 2) a TombstoneOverwhelmingException in system.hints would cause 
any other issues (e.g. the NoHostAvailableException). I was not able to 
reproduce problems in either respect. For 2), I was able to artificially cause 
the TombstoneOverwhelmingException by having more hints than the 
tombstone_failure_threshold (and flushing) - but this seems to be expected 
behaviour, and I was still able to connect to the cluster. Jeff, if you have 
any other information about the context of the error (queries, schemas, usage, 
node status), please feel free to share it and I can give it another shot.
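
The artificial reproduction described above (more hints than tombstone_failure_threshold, then a flush and a scan) boils down to the read path counting tombstones while collecting live cells and aborting past the limit. A toy sketch of that check, with hypothetical names rather than SliceQueryFilter itself:

```python
class TombstoneOverwhelmingError(Exception):
    pass

def scan_partition(cells, tombstone_failure_threshold):
    """Toy read path: count tombstones while collecting live cells and
    abort the query once the threshold is crossed, mirroring the
    'query aborted (see tombstone_failure_threshold)' log message."""
    live, tombstones = [], 0
    for cell in cells:
        if cell["deleted"]:
            tombstones += 1
            if tombstones > tombstone_failure_threshold:
                raise TombstoneOverwhelmingError(
                    "Scanned over %d tombstones; query aborted" % tombstones)
        else:
            live.append(cell["value"])
    return live

# Delivered hints become tombstones in system.hints; enough of them
# trips the threshold on the next scan of the partition.
cells = ([{"deleted": True, "value": None}] * 101
         + [{"deleted": False, "value": "hint"}])
try:
    scan_partition(cells, tombstone_failure_threshold=100)
except TombstoneOverwhelmingError as e:
    print(e)
```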

 Tombstone overwhelming issue aborts client queries
 --

 Key: CASSANDRA-8870
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8870
 Project: Cassandra
  Issue Type: Bug
 Environment: cassandra 2.1.2 ubuntu 12.04
Reporter: Jeff Liu

 We are getting client query timeout issues on clients who are trying to 
 query data from the Cassandra cluster. 
 Nodetool status shows that all nodes are still up regardless.
 Logs from client side:
 {noformat}
 com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
 tried for query failed (tried: 
 cass-chisel01.abc01.abc02.abc.abc.com/10.66.182.113:9042 
 (com.datastax.driver.core.TransportException: 
 [cass-chisel01.tgr01.iad02.testd.nestlabs.com/10.66.182.113:9042] Connection 
 has been closed))
 at 
 com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:108) 
 ~[com.datastax.cassandra.cassandra-driver-core-2.1.3.jar:na]
 at 
 com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:179) 
 ~[com.datastax.cassandra.cassandra-driver-core-2.1.3.jar:na]
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  ~[na:1.7.0_55]
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  ~[na:1.7.0_55]
 at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_55]
 {noformat}
 Logs from cassandra/system.log
 {noformat}
 ERROR [HintedHandoff:2] 2015-02-23 23:46:28,410 SliceQueryFilter.java:212 - 
 Scanned over 10 tombstones in system.hints; query aborted (see 
 tombstone_failure_threshold)
 ERROR [HintedHandoff:2] 2015-02-23 23:46:28,417 CassandraDaemon.java:153 - 
 Exception in thread Thread[HintedHandoff:2,1,main]
 org.apache.cassandra.db.filter.TombstoneOverwhelmingException: null
 at 
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:214)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:107)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:81)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:69)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:310)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:60)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1858)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1666)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:385)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:344)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.HintedHandOffManager.access$400(HintedHandOffManager.java:94)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.HintedHandOffManager$5.run(HintedHandOffManager.java:555)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  ~[na:1.7.0_55]
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  ~[na:1.7.0_55]
 at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_55]
 {noformat}





[jira] [Commented] (CASSANDRA-8870) Tombstone overwhelming issue aborts client queries

2015-03-09 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14353236#comment-14353236
 ] 

Shawn Kumar commented on CASSANDRA-8870:


I believe the tombstone errors occur due to hints being deleted from the 
system.hints table after a successful handoff. See 
http://www.datastax.com/dev/blog/modern-hinted-handoff for more info on 
HintedHandoff.

 Tombstone overwhelming issue aborts client queries
 --

 Key: CASSANDRA-8870
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8870
 Project: Cassandra
  Issue Type: Bug
 Environment: cassandra 2.1.2 ubuntu 12.04
Reporter: Jeff Liu

 We are getting client query timeout issues on clients who are trying to 
 query data from the Cassandra cluster. 
 Nodetool status shows that all nodes are still up regardless.
 Logs from client side:
 {noformat}
 com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
 tried for query failed (tried: 
 cass-chisel01.abc01.abc02.abc.abc.com/10.66.182.113:9042 
 (com.datastax.driver.core.TransportException: 
 [cass-chisel01.tgr01.iad02.testd.nestlabs.com/10.66.182.113:9042] Connection 
 has been closed))
 at 
 com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:108) 
 ~[com.datastax.cassandra.cassandra-driver-core-2.1.3.jar:na]
 at 
 com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:179) 
 ~[com.datastax.cassandra.cassandra-driver-core-2.1.3.jar:na]
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  ~[na:1.7.0_55]
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  ~[na:1.7.0_55]
 at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_55]
 {noformat}
 Logs from cassandra/system.log
 {noformat}
 ERROR [HintedHandoff:2] 2015-02-23 23:46:28,410 SliceQueryFilter.java:212 - 
 Scanned over 10 tombstones in system.hints; query aborted (see 
 tombstone_failure_threshold)
 ERROR [HintedHandoff:2] 2015-02-23 23:46:28,417 CassandraDaemon.java:153 - 
 Exception in thread Thread[HintedHandoff:2,1,main]
 org.apache.cassandra.db.filter.TombstoneOverwhelmingException: null
 at 
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:214)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:107)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:81)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:69)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:310)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:60)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1858)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1666)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:385)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:344)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.HintedHandOffManager.access$400(HintedHandOffManager.java:94)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 org.apache.cassandra.db.HintedHandOffManager$5.run(HintedHandOffManager.java:555)
  ~[apache-cassandra-2.1.2.jar:2.1.2]
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  ~[na:1.7.0_55]
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  ~[na:1.7.0_55]
 at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_55]
 {noformat}





[jira] [Commented] (CASSANDRA-8252) dtests that involve topology changes should verify system.peers on all nodes

2015-02-19 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14328118#comment-14328118
 ] 

Shawn Kumar commented on CASSANDRA-8252:


Wasn't able to reproduce this manually; will figure it out and finish up the 
dtest changes.

 dtests that involve topology changes should verify system.peers on all nodes
 

 Key: CASSANDRA-8252
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8252
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Brandon Williams
Assignee: Shawn Kumar
 Fix For: 2.1.4


 This is especially true for replace where I've discovered it's wrong in 
 1.2.19, which is sad because now it's too late to fix.  We've had a lot of 
 problems with incorrect/null system.peers, so after any topology change we 
 should verify it on every live node when everything is finished.





[jira] [Commented] (CASSANDRA-8252) dtests that involve topology changes should verify system.peers on all nodes

2015-02-10 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314492#comment-14314492
 ] 

Shawn Kumar commented on CASSANDRA-8252:


While trying to modify replace_address_test.py to check system.peers, I had 
some difficulty due to unexpected results. I committed a modified version of 
replace_address_test with some debug statements 
[here|https://github.com/riptano/cassandra-dtest/blob/topology/replace_address_test.py]
 to illustrate the changes in the table through the process. Issues I am 
having: the system.peers table varies across runs after replacing (although 
the replaced address does not appear, the replacing node is only rarely 
noticed by nodes 1 and 2), and upon truncating the table and restarting nodes 
I can occasionally see the replaced address (in this case 127.0.0.3). 
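
On the dtest side, the check the ticket asks for can be sketched as a helper that, given each live node's view of system.peers, asserts that exactly the other live addresses appear and that the replaced address is gone. The data shapes below are hypothetical stand-ins for driver result rows; a real dtest would build them by querying each node:

```python
def verify_peers(views, live_addresses, replaced_address=None):
    """views: {node_address: [peer_address, ...]} -- each live node's
    system.peers contents after the topology change settles.
    Asserts every node lists exactly the *other* live nodes and never
    the replaced address. (Sketch; not the actual dtest helper.)"""
    for node, peers in views.items():
        expected = set(live_addresses) - {node}
        assert set(peers) == expected, (
            "%s sees peers %s, expected %s"
            % (node, sorted(peers), sorted(expected)))
        if replaced_address is not None:
            assert replaced_address not in peers, (
                "%s still lists replaced node %s" % (node, replaced_address))

# 127.0.0.4 replaced 127.0.0.3; no node should still list .3.
live = ["127.0.0.1", "127.0.0.2", "127.0.0.4"]
views = {
    "127.0.0.1": ["127.0.0.2", "127.0.0.4"],
    "127.0.0.2": ["127.0.0.1", "127.0.0.4"],
    "127.0.0.4": ["127.0.0.1", "127.0.0.2"],
}
verify_peers(views, live, replaced_address="127.0.0.3")
print("system.peers consistent on all live nodes")
```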

 dtests that involve topology changes should verify system.peers on all nodes
 

 Key: CASSANDRA-8252
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8252
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Brandon Williams
Assignee: Shawn Kumar
 Fix For: 2.1.3, 2.0.13


 This is especially true for replace where I've discovered it's wrong in 
 1.2.19, which is sad because now it's too late to fix.  We've had a lot of 
 problems with incorrect/null system.peers, so after any topology change we 
 should verify it on every live node when everything is finished.





[jira] [Commented] (CASSANDRA-8613) Regression in mixed single and multi-column relation support

2015-01-22 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14287983#comment-14287983
 ] 

Shawn Kumar commented on CASSANDRA-8613:


Managed to reproduce pretty easily on both branches. It seems this particular 
scenario isn't covered in dtests; will add a test to cql_tests to address 
this.

 Regression in mixed single and multi-column relation support
 

 Key: CASSANDRA-8613
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8613
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Tyler Hobbs
Assignee: Benjamin Lerer
 Fix For: 2.1.3, 2.0.13


 In 2.0.6 through 2.0.8, a query like the following was supported:
 {noformat}
 SELECT * FROM mytable WHERE clustering_0 = ? AND (clustering_1, clustering_2) 
  > (?, ?)
 {noformat}
 However, after CASSANDRA-6875, you'll get the following error:
 {noformat}
 Clustering columns may not be skipped in multi-column relations. They should 
 appear in the PRIMARY KEY order. Got (c, d) > (0, 0)
 {noformat}
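
The semantics of the regressed query are lexicographic tuple comparison over the trailing clustering columns once clustering_0 is fixed. A toy evaluation of the mixed relation (assuming the operator stripped from the quoted text was `>`; Python's tuple ordering happens to match CQL's multi-column comparison):

```python
def mixed_relation(rows, c0_value, bound):
    """Toy evaluation of:
      SELECT * WHERE clustering_0 = c0_value
               AND (clustering_1, clustering_2) > bound
    Rows are (c0, c1, c2) tuples; Python tuple ordering is the same
    lexicographic comparison CQL uses for multi-column relations."""
    return [r for r in rows
            if r[0] == c0_value and (r[1], r[2]) > bound]

rows = [(0, 0, 0), (0, 0, 5), (0, 1, 0), (1, 9, 9)]
print(mixed_relation(rows, c0_value=0, bound=(0, 0)))
# -> [(0, 0, 5), (0, 1, 0)]
```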





[jira] [Commented] (CASSANDRA-8248) Possible memory leak

2014-11-17 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14214961#comment-14214961
 ] 

Shawn Kumar commented on CASSANDRA-8248:


Alex, have you determined the issue with resident being larger, and are you 
still seeing this problem? If there are any further details you can provide 
(are you carrying out incremental repairs? do compactions have any effect?), 
they would be much appreciated.
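
The `(deleted)` entries in the /proc maps output quoted in the issue description are consistent with tmplink index files that were unlinked while still memory-mapped: on POSIX systems a mapping keeps the pages (and underlying disk blocks) alive after the file is removed, until the mapping is closed. A minimal demonstration of that mechanism:

```python
import mmap
import os
import tempfile

# Write a small file, map it, then unlink it: the mapping stays valid,
# so the data (and disk space) is retained until the map is closed --
# the same pattern as the '(deleted)' tmplink Index.db entries in
# /proc/<pid>/maps.
fd, path = tempfile.mkstemp()
os.write(fd, b"index-data")
mm = mmap.mmap(fd, 0, access=mmap.ACCESS_READ)  # length 0 = whole file
os.close(fd)
os.remove(path)                 # file now shows up as '(deleted)'...

print(mm[:10])                  # ...but is still readable via the mapping
mm.close()                      # only now can the space be reclaimed
```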

 Possible memory leak 
 -

 Key: CASSANDRA-8248
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8248
 Project: Cassandra
  Issue Type: Bug
Reporter: Alexander Sterligov
Assignee: Shawn Kumar
 Attachments: thread_dump


 Sometimes during repair cassandra starts to consume more memory than expected.
 Total amount of data on node is about 20GB.
 Size of the data directory is 66GB because of snapshots.
 Top reports: 
 {noformat}
   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
 15724 loadbase  20   0  493g  55g  44g S   28 44.2   4043:24 java
 {noformat}
 At the /proc/15724/maps there are a lot of deleted file maps
 {quote}
 7f63a6102000-7f63a6332000 r--s  08:21 9442763
 /ssd/cassandra/data/iss/feedback_history-d32bc7e048c011e49b989bc3e8a5a440/iss-feedback_history-tmplink-ka-328671-Index.db
  (deleted)
 7f63a6332000-7f63a6562000 r--s  08:21 9442763
 /ssd/cassandra/data/iss/feedback_history-d32bc7e048c011e49b989bc3e8a5a440/iss-feedback_history-tmplink-ka-328671-Index.db
  (deleted)
 7f63a6562000-7f63a6792000 r--s  08:21 9442763
 /ssd/cassandra/data/iss/feedback_history-d32bc7e048c011e49b989bc3e8a5a440/iss-feedback_history-tmplink-ka-328671-Index.db
  (deleted)
 7f63a6792000-7f63a69c2000 r--s  08:21 9442763
 /ssd/cassandra/data/iss/feedback_history-d32bc7e048c011e49b989bc3e8a5a440/iss-feedback_history-tmplink-ka-328671-Index.db
  (deleted)
 7f63a69c2000-7f63a6bf2000 r--s  08:21 9442763
 /ssd/cassandra/data/iss/feedback_history-d32bc7e048c011e49b989bc3e8a5a440/iss-feedback_history-tmplink-ka-328671-Index.db
  (deleted)
 7f63a6bf2000-7f63a6e22000 r--s  08:21 9442763
 /ssd/cassandra/data/iss/feedback_history-d32bc7e048c011e49b989bc3e8a5a440/iss-feedback_history-tmplink-ka-328671-Index.db
  (deleted)
 7f63a6e22000-7f63a7052000 r--s  08:21 9442763
 /ssd/cassandra/data/iss/feedback_history-d32bc7e048c011e49b989bc3e8a5a440/iss-feedback_history-tmplink-ka-328671-Index.db
  (deleted)
 7f63a7052000-7f63a7282000 r--s  08:21 9442763
 /ssd/cassandra/data/iss/feedback_history-d32bc7e048c011e49b989bc3e8a5a440/iss-feedback_history-tmplink-ka-328671-Index.db
  (deleted)
 7f63a7282000-7f63a74b2000 r--s  08:21 9442763
 /ssd/cassandra/data/iss/feedback_history-d32bc7e048c011e49b989bc3e8a5a440/iss-feedback_history-tmplink-ka-328671-Index.db
  (deleted)
 7f63a74b2000-7f63a76e2000 r--s  08:21 9442763
 /ssd/cassandra/data/iss/feedback_history-d32bc7e048c011e49b989bc3e8a5a440/iss-feedback_history-tmplink-ka-328671-Index.db
  (deleted)
 7f63a76e2000-7f63a7912000 r--s  08:21 9442763
 /ssd/cassandra/data/iss/feedback_history-d32bc7e048c011e49b989bc3e8a5a440/iss-feedback_history-tmplink-ka-328671-Index.db
  (deleted)
 7f63a7912000-7f63a7b42000 r--s  08:21 9442763
 /ssd/cassandra/data/iss/feedback_history-d32bc7e048c011e49b989bc3e8a5a440/iss-feedback_history-tmplink-ka-328671-Index.db
  (deleted)
 7f63a7b42000-7f63a7d72000 r--s  08:21 9442763
 /ssd/cassandra/data/iss/feedback_history-d32bc7e048c011e49b989bc3e8a5a440/iss-feedback_history-tmplink-ka-328671-Index.db
  (deleted)
 7f63a7d72000-7f63a7fa2000 r--s  08:21 9442763
 /ssd/cassandra/data/iss/feedback_history-d32bc7e048c011e49b989bc3e8a5a440/iss-feedback_history-tmplink-ka-328671-Index.db
  (deleted)
 7f63a7fa2000-7f63a81d2000 r--s  08:21 9442763
 /ssd/cassandra/data/iss/feedback_history-d32bc7e048c011e49b989bc3e8a5a440/iss-feedback_history-tmplink-ka-328671-Index.db
  (deleted)
 7f63a81d2000-7f63a8402000 r--s  08:21 9442763
 /ssd/cassandra/data/iss/feedback_history-d32bc7e048c011e49b989bc3e8a5a440/iss-feedback_history-tmplink-ka-328671-Index.db
  (deleted)
 7f63a8402000-7f63a8622000 r--s  08:21 9442763
 /ssd/cassandra/data/iss/feedback_history-d32bc7e048c011e49b989bc3e8a5a440/iss-feedback_history-tmplink-ka-328671-Index.db
  (deleted)
 7f63a8622000-7f63a8842000 r--s  08:21 9442763
 

[jira] [Assigned] (CASSANDRA-8248) Possible memory leak

2014-11-12 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar reassigned CASSANDRA-8248:
--

Assignee: Shawn Kumar

 Possible memory leak 
 -

 Key: CASSANDRA-8248
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8248
 Project: Cassandra
  Issue Type: Bug
Reporter: Alexander Sterligov
Assignee: Shawn Kumar
 Attachments: thread_dump


 Sometimes during repair Cassandra starts to consume more memory than expected.
 Total amount of data on the node is about 20GB.
 The data directory is 66GB because of snapshots.
 top reports: 
 {quote}
   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
 15724 loadbase  20   0  493g  55g  44g S   28 44.2   4043:24 java
 {quote}
 In /proc/15724/maps there are a lot of deleted file mappings:
 {quote}
 7f63a8622000-7f63a8842000 r--s  08:21 9442763
 /ssd/cassandra/data/iss/feedback_history-d32bc7e048c011e49b989bc3e8a5a440/iss-feedback_history-tmplink-ka-328671-Index.db
  (deleted)
 7f63a8842000-7f63a8a62000 r--s  08:21 9442763
 /ssd/cassandra/data/iss/feedback_history-d32bc7e048c011e49b989bc3e8a5a440/iss-feedback_history-tmplink-ka-328671-Index.db
  (deleted)
 

[jira] [Commented] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2014-11-06 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14200604#comment-14200604
 ] 

Shawn Kumar commented on CASSANDRA-7217:


I'll be continuing testing on a more CPU-performant instance, but thought I would 
briefly try cstar_perf on bdplab. 
[Here|http://cstar.datastax.com/graph?stats=dd73c4a6-65d9-11e4-9413-bc764e04482c&metric=op_rate&operation=1_write&smoothing=1&show_aggregates=true&xmin=0&xmax=279.07&ymin=0&ymax=120665.6] 
are the results - I increased the thread count from 500 to 1500 in increments 
of 250 from the first operation to the last, and there seems to be a 
noticeable drop.
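The sweep described above can be scripted against cassandra-stress roughly like this (a sketch: the operation count and log file names are placeholders, and `-rate threads=` / `-log file=` are the 2.1-era stress options):

```shell
#!/bin/sh
# Run one write workload per client thread count, 500 to 1500 in steps of
# 250, so each stress run corresponds to one thread level in the graph.
for threads in 500 750 1000 1250 1500; do
    tools/bin/cassandra-stress write n=50000000 \
        -rate threads="$threads" \
        -log file="write_${threads}_threads.log"
done
```

Comparing op rates across the per-thread-count logs makes the drop-off point easy to locate.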

 Native transport performance (with cassandra-stress) drops precipitously past 
 around 1000 threads
 -

 Key: CASSANDRA-7217
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7217
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Shawn Kumar
  Labels: performance, triaged
 Fix For: 2.1.2


 This is obviously bad. Let's figure out why it's happening and put a stop to 
 it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2014-11-06 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14200604#comment-14200604
 ] 

Shawn Kumar edited comment on CASSANDRA-7217 at 11/6/14 6:32 PM:
-

I'll be continuing testing on a more CPU-performant instance, but thought I would 
briefly try cstar_perf on bdplab. 
[Here|http://cstar.datastax.com/graph?stats=dd73c4a6-65d9-11e4-9413-bc764e04482c&metric=op_rate&operation=1_write&smoothing=1&show_aggregates=true&xmin=0&xmax=279.07&ymin=0&ymax=120665.6] 
are the results - I increased the thread count from 500 to 1500 in increments 
of 250 from the first operation to the last (i.e. 1_write to 5_write), and 
there seems to be a noticeable drop in performance, especially around 1000 
threads.


was (Author: shawn.kumar):
I'll be continuing testing on a more CPU-performant instance, but thought I would 
briefly try cstar_perf on bdplab. 
[Here|http://cstar.datastax.com/graph?stats=dd73c4a6-65d9-11e4-9413-bc764e04482c&metric=op_rate&operation=1_write&smoothing=1&show_aggregates=true&xmin=0&xmax=279.07&ymin=0&ymax=120665.6] 
are the results - I increased the thread count from 500 to 1500 in increments 
of 250 from the first operation to the last, and there seems to be a 
noticeable drop.

 Native transport performance (with cassandra-stress) drops precipitously past 
 around 1000 threads
 -

 Key: CASSANDRA-7217
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7217
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Shawn Kumar
  Labels: performance, triaged
 Fix For: 2.1.2


 This is obviously bad. Let's figure out why it's happening and put a stop to 
 it.





[jira] [Comment Edited] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2014-11-06 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14200604#comment-14200604
 ] 

Shawn Kumar edited comment on CASSANDRA-7217 at 11/6/14 6:46 PM:
-

I'll be continuing testing on a more CPU-performant instance, but thought I would 
briefly try cstar_perf on bdplab. 
[Here|http://cstar.datastax.com/graph?stats=dd73c4a6-65d9-11e4-9413-bc764e04482c&metric=op_rate&operation=1_write&smoothing=1&show_aggregates=true&xmin=0&xmax=279.07&ymin=0&ymax=120665.6] 
are the results - I increased the thread count from 500 to 1500 in increments 
of 250 from the first operation to the last (i.e. 1_write to 5_write), and 
there seems to be a noticeable drop in performance, especially around 1250 
threads.


was (Author: shawn.kumar):
I'll be continuing testing on a more CPU-performant instance, but thought I would 
briefly try cstar_perf on bdplab. 
[Here|http://cstar.datastax.com/graph?stats=dd73c4a6-65d9-11e4-9413-bc764e04482c&metric=op_rate&operation=1_write&smoothing=1&show_aggregates=true&xmin=0&xmax=279.07&ymin=0&ymax=120665.6] 
are the results - I increased the thread count from 500 to 1500 in increments 
of 250 from the first operation to the last (i.e. 1_write to 5_write), and 
there seems to be a noticeable drop in performance, especially around 1000 
threads.

 Native transport performance (with cassandra-stress) drops precipitously past 
 around 1000 threads
 -

 Key: CASSANDRA-7217
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7217
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Shawn Kumar
  Labels: performance, triaged
 Fix For: 2.1.2


 This is obviously bad. Let's figure out why it's happening and put a stop to 
 it.





[jira] [Commented] (CASSANDRA-8061) tmplink files are not removed

2014-11-04 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196340#comment-14196340
 ] 

Shawn Kumar commented on CASSANDRA-8061:


Hi Catalin, thanks for commenting. I've been trying to reproduce this issue, to 
no avail, and would greatly appreciate it if you could share any details of your 
setup that could be helpful - especially the column family details (as above). 
Thanks!

 tmplink files are not removed
 -

 Key: CASSANDRA-8061
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8061
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Linux
Reporter: Gianluca Borello
Assignee: Shawn Kumar

 After installing 2.1.0, I'm experiencing a bunch of tmplink files that are 
 filling my disk. I found https://issues.apache.org/jira/browse/CASSANDRA-7803, 
 which is very similar, and I confirm this happens both on 2.1.0 and on the 
 latest commit on the cassandra-2.1 branch 
 (https://github.com/apache/cassandra/commit/aca80da38c3d86a40cc63d9a122f7d45258e4685).
 Even starting with a clean keyspace, after a few hours I get:
 $ sudo find /raid0 | grep tmplink | xargs du -hs
 2.7G  
 /raid0/cassandra/data/draios/protobuf1-ccc6dce04beb11e4abf997b38fbf920b/draios-protobuf1-tmplink-ka-4515-Data.db
 13M   
 /raid0/cassandra/data/draios/protobuf1-ccc6dce04beb11e4abf997b38fbf920b/draios-protobuf1-tmplink-ka-4515-Index.db
 1.8G  
 /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-1788-Data.db
 12M   
 /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-1788-Index.db
 5.2M  
 /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-2678-Index.db
 822M  
 /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-2678-Data.db
 7.3M  
 /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-3283-Index.db
 1.2G  
 /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-3283-Data.db
 6.7M  
 /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-3951-Index.db
 1.1G  
 /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-3951-Data.db
 11M   
 /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-4799-Index.db
 1.7G  
 /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-4799-Data.db
 812K  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-234-Index.db
 122M  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-208-Data.db
 744K  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-739-Index.db
 660K  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-193-Index.db
 796K  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-230-Index.db
 137M  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-230-Data.db
 161M  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-269-Data.db
 139M  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-234-Data.db
 940K  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-786-Index.db
 936K  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-269-Index.db
 161M  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-786-Data.db
 672K  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-197-Index.db
 113M  
 /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-193-Data.db
 116M  
 

[jira] [Resolved] (CASSANDRA-8129) Increase max heap for sstablesplit

2014-11-04 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar resolved CASSANDRA-8129.

Resolution: Cannot Reproduce

 Increase max heap for sstablesplit
 --

 Key: CASSANDRA-8129
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8129
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Matt Stump
Assignee: Shawn Kumar
Priority: Minor

 The max heap for sstablesplit is 256m. For large files that's too small and 
 it will OOM. We should increase the max heap to something like 2-4G, with the 
 understanding that sstablesplit will most likely only be invoked to split 
 large files.





[jira] [Commented] (CASSANDRA-8008) Timed out waiting for timer thread on large stress command

2014-10-31 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192027#comment-14192027
 ] 

Shawn Kumar commented on CASSANDRA-8008:


As you recommended, increasing Xmx fixed the issue for me. I think this ticket 
can be resolved, though I'm not sure whether you want to increase the default.

 Timed out waiting for timer thread on large stress command
 

 Key: CASSANDRA-8008
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8008
 Project: Cassandra
  Issue Type: Bug
  Components: Core, Tools
Reporter: Shawn Kumar
Assignee: T Jake Luciani
Priority: Minor
 Attachments: file.log, node1-2.log, node1.log, node2-2.log, 
 node2.log, perftest.log


 I've been using cstar_perf to test a performance scenario and was able to 
 reproduce this error on a two node cluster with stock 2.1.0 while carrying 
 out large stress writes (50M keys):
 {noformat}
 java.lang.RuntimeException: Timed out waiting for a timer thread - seems one 
 got stuck
 at org.apache.cassandra.stress.util.Timing.snap(Timing.java:83)
 at org.apache.cassandra.stress.util.Timing.snap(Timing.java:118)
 at 
 org.apache.cassandra.stress.StressMetrics.update(StressMetrics.java:156)
 at 
 org.apache.cassandra.stress.StressMetrics.access$300(StressMetrics.java:42)
 at 
 org.apache.cassandra.stress.StressMetrics$2.run(StressMetrics.java:104)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
 It looks like a similar error to that found in CASSANDRA-6943. I've also 
 attached the test log and thread dumps. 





[jira] [Updated] (CASSANDRA-8008) Timed out waiting for timer thread on large stress command

2014-09-29 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-8008:
---
Attachment: (was: perftest.log)

 Timed out waiting for timer thread on large stress command
 

 Key: CASSANDRA-8008
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8008
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Shawn Kumar
 Attachments: node1-2.log, node1.log, node2-2.log, node2.log, 
 perftest.log


 I've been using cstar_perf to test a performance scenario and was able to 
 reproduce this error on a two node cluster with stock 2.1.0 while carrying 
 out large stress writes (50M keys):
 {noformat}
 java.lang.RuntimeException: Timed out waiting for a timer thread - seems one 
 got stuck
 at org.apache.cassandra.stress.util.Timing.snap(Timing.java:83)
 at org.apache.cassandra.stress.util.Timing.snap(Timing.java:118)
 at 
 org.apache.cassandra.stress.StressMetrics.update(StressMetrics.java:156)
 at 
 org.apache.cassandra.stress.StressMetrics.access$300(StressMetrics.java:42)
 at 
 org.apache.cassandra.stress.StressMetrics$2.run(StressMetrics.java:104)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
 It looks like a similar error to that found in CASSANDRA-6943. I've also 
 attached the test log and thread dumps. 





[jira] [Updated] (CASSANDRA-8008) Timed out waiting for timer thread on large stress command

2014-09-29 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-8008:
---
Attachment: perftest.log

 Timed out waiting for timer thread on large stress command
 

 Key: CASSANDRA-8008
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8008
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Shawn Kumar
 Attachments: node1-2.log, node1.log, node2-2.log, node2.log, 
 perftest.log


 I've been using cstar_perf to test a performance scenario and was able to 
 reproduce this error on a two node cluster with stock 2.1.0 while carrying 
 out large stress writes (50M keys):
 {noformat}
 java.lang.RuntimeException: Timed out waiting for a timer thread - seems one 
 got stuck
 at org.apache.cassandra.stress.util.Timing.snap(Timing.java:83)
 at org.apache.cassandra.stress.util.Timing.snap(Timing.java:118)
 at 
 org.apache.cassandra.stress.StressMetrics.update(StressMetrics.java:156)
 at 
 org.apache.cassandra.stress.StressMetrics.access$300(StressMetrics.java:42)
 at 
 org.apache.cassandra.stress.StressMetrics$2.run(StressMetrics.java:104)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
 It looks like a similar error to that found in CASSANDRA-6943. I've also 
 attached the test log and thread dumps. 





[jira] [Updated] (CASSANDRA-7766) Secondary index not working after a while

2014-09-29 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-7766:
---
Description: 
Since 2.1.0-rc2, it appears that the secondary indexes are not always working. 
Immediately after the INSERT of a row, the index seems to be there. But after a 
while (I do not know when or why), SELECT statements based on any secondary 
index do not return the corresponding row(s) anymore. I noticed that a restart 
of C* may have an impact (the data inserted before the restart may be seen 
through the index, even if it was not returned before the restart).

Here is a use-case example (in order to clarify my request) :
{code}
CREATE TABLE IF NOT EXISTS ks.cf ( k int PRIMARY KEY, ind ascii, value text);
CREATE INDEX IF NOT EXISTS ks_cf_index ON ks.cf(ind);
INSERT INTO ks.cf (k, ind, value) VALUES (1, 'toto', 'Hello');
SELECT * FROM ks.cf WHERE ind = 'toto'; // Returns no result after a while
{code}

The last SELECT statement may or may not return a row depending on the instant 
of the request. I experienced that with 2.1.0-rc5 through CQLSH with clusters 
of one and two nodes. Since it depends on the instant of the request, I am not 
able to deliver any way to reproduce that systematically (It appears to be 
linked with some scheduled job inside C*).


  was:
Since 2.1.0-rc2, it appears that the secondary indexes are not always working. 
Immediately after the INSERT of a row, the index seems to be there. But after a 
while (I do not know when or why), SELECT statements based on any secondary 
index do not return the corresponding row(s) anymore. I noticed that a restart 
of C* may have an impact (the data inserted before the restart may be seen 
through the index, even if it was not returned before the restart).

Here is a use-case example (in order to clarify my request) :
{code}
CREATE TABLE IF NOT EXISTS ks.cf ( k int PRIMARY KEY, ind ascii, value text);
CREATE INDEX IF NOT EXISTS ks_cf_index ON ks.cf(ind);
INSERT INTO ks.cf (k, ind, value) VALUES (1, 'toto', 'Hello');
SELECT * FROM ks.cf WHERE ind = 'toto'; // Returns no result after a while
{code}

The last SELECT statement may or may not return a row depending on the instant 
of the request. I experienced that with 2.1.0-rc5 through CQLSH with clusters 
of one and two nodes. Since it depends on the instant of the request, I am not 
able to deliver any way to reproduce that systematically (It appears to be 
linked with some scheduled job inside C*).



 Secondary index not working after a while
 -

 Key: CASSANDRA-7766
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7766
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.0-rc5 with small clusters (one or two nodes)
Reporter: Fabrice Larcher
 Attachments: result-failure.txt, result-success.txt


 Since 2.1.0-rc2, it appears that the secondary indexes are not always 
 working. Immediately after the INSERT of a row, the index seems to be there. 
 But after a while (I do not know when or why), SELECT statements based on any 
 secondary index do not return the corresponding row(s) anymore. I noticed 
 that a restart of C* may have an impact (the data inserted before the restart 
 may be seen through the index, even if it was not returned before the 
 restart).
 Here is a use-case example (in order to clarify my request) :
 {code}
 CREATE TABLE IF NOT EXISTS ks.cf ( k int PRIMARY KEY, ind ascii, value text);
 CREATE INDEX IF NOT EXISTS ks_cf_index ON ks.cf(ind);
 INSERT INTO ks.cf (k, ind, value) VALUES (1, 'toto', 'Hello');
 SELECT * FROM ks.cf WHERE ind = 'toto'; // Returns no result after a while
 {code}
 The last SELECT statement may or may not return a row depending on the 
 instant of the request. I experienced that with 2.1.0-rc5 through CQLSH with 
 clusters of one and two nodes. Since it depends on the instant of the 
 request, I am not able to deliver any way to reproduce that systematically 
 (It appears to be linked with some scheduled job inside C*).





[jira] [Commented] (CASSANDRA-7766) Secondary index not working after a while

2014-09-29 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151761#comment-14151761
 ] 

Shawn Kumar commented on CASSANDRA-7766:


I was unable to reproduce this over a period of a few days. Please feel free to 
reopen the ticket if you come across any further information that could help us 
reproduce it.

 Secondary index not working after a while
 -

 Key: CASSANDRA-7766
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7766
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.0-rc5 with small clusters (one or two nodes)
Reporter: Fabrice Larcher
 Attachments: result-failure.txt, result-success.txt


 Since 2.1.0-rc2, it appears that the secondary indexes are not always 
 working. Immediately after the INSERT of a row, the index seems to be there. 
 But after a while (I do not know when or why), SELECT statements based on any 
 secondary index do not return the corresponding row(s) anymore. I noticed 
 that a restart of C* may have an impact (the data inserted before the restart 
 may be seen through the index, even if it was not returned before the 
 restart).
 Here is a use-case example (in order to clarify my request) :
 {code}
 CREATE TABLE IF NOT EXISTS ks.cf ( k int PRIMARY KEY, ind ascii, value text);
 CREATE INDEX IF NOT EXISTS ks_cf_index ON ks.cf(ind);
 INSERT INTO ks.cf (k, ind, value) VALUES (1, 'toto', 'Hello');
 SELECT * FROM ks.cf WHERE ind = 'toto'; // Returns no result after a while
 {code}
 The last SELECT statement may or may not return a row depending on the 
 instant of the request. I experienced that with 2.1.0-rc5 through CQLSH with 
 clusters of one and two nodes. Since it depends on the instant of the 
 request, I am not able to deliver any way to reproduce that systematically 
 (It appears to be linked with some scheduled job inside C*).





[jira] [Resolved] (CASSANDRA-7766) Secondary index not working after a while

2014-09-29 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar resolved CASSANDRA-7766.

Resolution: Cannot Reproduce

 Secondary index not working after a while
 -

 Key: CASSANDRA-7766
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7766
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.0-rc5 with small clusters (one or two nodes)
Reporter: Fabrice Larcher
 Attachments: result-failure.txt, result-success.txt


 Since 2.1.0-rc2, it appears that the secondary indexes are not always 
 working. Immediately after the INSERT of a row, the index seems to be there. 
 But after a while (I do not know when or why), SELECT statements based on any 
 secondary index do not return the corresponding row(s) anymore. I noticed 
 that a restart of C* may have an impact (the data inserted before the restart 
 may be seen through the index, even if it was not returned before the 
 restart).
 Here is a use-case example (in order to clarify my request) :
 {code}
 CREATE TABLE IF NOT EXISTS ks.cf ( k int PRIMARY KEY, ind ascii, value text);
 CREATE INDEX IF NOT EXISTS ks_cf_index ON ks.cf(ind);
 INSERT INTO ks.cf (k, ind, value) VALUES (1, 'toto', 'Hello');
 SELECT * FROM ks.cf WHERE ind = 'toto'; // Returns no result after a while
 {code}
 The last SELECT statement may or may not return a row depending on the 
 instant of the request. I experienced that with 2.1.0-rc5 through CQLSH with 
 clusters of one and two nodes. Since it depends on the instant of the 
 request, I am not able to deliver any way to reproduce that systematically 
 (It appears to be linked with some scheduled job inside C*).





[jira] [Resolved] (CASSANDRA-6898) describing a table with compression should expand the compression options

2014-09-29 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar resolved CASSANDRA-6898.

Resolution: Fixed

 describing a table with compression should expand the compression options
 -

 Key: CASSANDRA-6898
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6898
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Brandon Williams
Priority: Minor
 Fix For: 2.0.11


 {noformat}
 cqlsh:foo> CREATE TABLE baz ( foo text, bar text, primary KEY (foo)) WITH 
 compression = {};
 cqlsh:foo> DESCRIBE TABLE baz;
 CREATE TABLE baz (
   foo text,
   bar text,
   PRIMARY KEY (foo)
 ) WITH
   bloom_filter_fp_chance=0.01 AND
   caching='KEYS_ONLY' AND
   comment='' AND
   dclocal_read_repair_chance=0.00 AND
   gc_grace_seconds=864000 AND
   index_interval=128 AND
   read_repair_chance=0.10 AND
   replicate_on_write='true' AND
   populate_io_cache_on_flush='false' AND
   default_time_to_live=0 AND
   speculative_retry='99.0PERCENTILE' AND
   memtable_flush_period_in_ms=0 AND
   compaction={'class': 'SizeTieredCompactionStrategy'} AND
   compression={};
 cqlsh:foo> 
 {noformat}
 From this, you can't tell that LZ4 compression is enabled, even though it is. 
  It would be more friendly to expand the option to show the defaults.





[jira] [Commented] (CASSANDRA-8008) Timed out waiting for timer thread on large stress command

2014-09-29 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14151959#comment-14151959
 ] 

Shawn Kumar commented on CASSANDRA-8008:


Was able to reproduce this locally on a single node running stock 2.1.0 as 
well; I've attached an additional thread dump.

 Timed out waiting for timer thread on large stress command
 

 Key: CASSANDRA-8008
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8008
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Shawn Kumar
 Attachments: node1-2.log, node1.log, node2-2.log, node2.log, 
 perftest.log


 I've been using cstar_perf to test a performance scenario and was able to 
 reproduce this error on a two node cluster with stock 2.1.0 while carrying 
 out large stress writes (50M keys):
 {noformat}
 java.lang.RuntimeException: Timed out waiting for a timer thread - seems one got stuck
 at org.apache.cassandra.stress.util.Timing.snap(Timing.java:83)
 at org.apache.cassandra.stress.util.Timing.snap(Timing.java:118)
 at org.apache.cassandra.stress.StressMetrics.update(StressMetrics.java:156)
 at org.apache.cassandra.stress.StressMetrics.access$300(StressMetrics.java:42)
 at org.apache.cassandra.stress.StressMetrics$2.run(StressMetrics.java:104)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
 It looks like a similar error to that found in CASSANDRA-6943. I've also 
 attached the test log and thread dumps. 
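 The failure mode described above — a reporting thread that waits for every timer thread to hand in its interval and gives up after a timeout — can be modelled with a toy Python sketch (illustrative only; the class and method names are assumptions, not the real `org.apache.cassandra.stress.util.Timing`):

```python
import threading

class ToyTiming:
    """Toy model of stress's reporting loop: a reporter waits for every
    worker 'timer' thread to check in before taking a snapshot, and gives
    up after a timeout."""

    def __init__(self, num_timers, timeout):
        self.timeout = timeout
        # The barrier fills when all timers plus the reporter have arrived.
        self.barrier = threading.Barrier(num_timers + 1)

    def report(self):
        try:
            self.barrier.wait()
        except threading.BrokenBarrierError:
            pass  # the reporter already gave up on this interval

    def snap(self):
        try:
            self.barrier.wait(timeout=self.timeout)
        except threading.BrokenBarrierError:
            raise RuntimeError(
                "Timed out waiting for a timer thread - seems one got stuck")

def snap_times_out(one_timer_stuck):
    timing = ToyTiming(num_timers=2, timeout=1.0)
    for i in range(2):
        if one_timer_stuck and i == 1:
            continue  # simulate a timer thread that never reports
        threading.Thread(target=timing.report, daemon=True).start()
    try:
        timing.snap()
        return False
    except RuntimeError:
        return True
```

 With one timer never reporting, the barrier never fills, the reporter's wait times out, and it surfaces the same "Timed out waiting for a timer thread" error seen in the stress runs above.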





[jira] [Updated] (CASSANDRA-8008) Timed out waiting for timer thread on large stress command

2014-09-29 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-8008:
---
Attachment: file.log

Locally on single node of 2.1.0

 Timed out waiting for timer thread on large stress command
 

 Key: CASSANDRA-8008
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8008
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Shawn Kumar
 Attachments: file.log, node1-2.log, node1.log, node2-2.log, 
 node2.log, perftest.log


 I've been using cstar_perf to test a performance scenario and was able to 
 reproduce this error on a two node cluster with stock 2.1.0 while carrying 
 out large stress writes (50M keys):
 {noformat}
 java.lang.RuntimeException: Timed out waiting for a timer thread - seems one got stuck
 at org.apache.cassandra.stress.util.Timing.snap(Timing.java:83)
 at org.apache.cassandra.stress.util.Timing.snap(Timing.java:118)
 at org.apache.cassandra.stress.StressMetrics.update(StressMetrics.java:156)
 at org.apache.cassandra.stress.StressMetrics.access$300(StressMetrics.java:42)
 at org.apache.cassandra.stress.StressMetrics$2.run(StressMetrics.java:104)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
 It looks like a similar error to that found in CASSANDRA-6943. I've also 
 attached the test log and thread dumps. 





[jira] [Issue Comment Deleted] (CASSANDRA-8008) Timed out waiting for timer thread on large stress command

2014-09-29 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-8008:
---
Comment: was deleted

(was: Locally on single node of 2.1.0)

 Timed out waiting for timer thread on large stress command
 

 Key: CASSANDRA-8008
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8008
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Shawn Kumar
 Attachments: file.log, node1-2.log, node1.log, node2-2.log, 
 node2.log, perftest.log


 I've been using cstar_perf to test a performance scenario and was able to 
 reproduce this error on a two node cluster with stock 2.1.0 while carrying 
 out large stress writes (50M keys):
 {noformat}
 java.lang.RuntimeException: Timed out waiting for a timer thread - seems one got stuck
 at org.apache.cassandra.stress.util.Timing.snap(Timing.java:83)
 at org.apache.cassandra.stress.util.Timing.snap(Timing.java:118)
 at org.apache.cassandra.stress.StressMetrics.update(StressMetrics.java:156)
 at org.apache.cassandra.stress.StressMetrics.access$300(StressMetrics.java:42)
 at org.apache.cassandra.stress.StressMetrics$2.run(StressMetrics.java:104)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
 It looks like a similar error to that found in CASSANDRA-6943. I've also 
 attached the test log and thread dumps. 





[jira] [Created] (CASSANDRA-8008) Timed out waiting for timer thread on large stress command

2014-09-26 Thread Shawn Kumar (JIRA)
Shawn Kumar created CASSANDRA-8008:
--

 Summary: Timed out waiting for timer thread on large stress 
command
 Key: CASSANDRA-8008
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8008
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Shawn Kumar
 Attachments: node1.log, node2.log

I've been using cstar_perf to test cassandra with different gc's and came 
across this error on one run which effectively stopped the test:

java.lang.RuntimeException: Timed out waiting for a timer thread - seems one got stuck at org.apache.cassandra.stress.util.Timing.snap(Timing.java:83)

It looks similar to CASSANDRA-6943, but that should have fixed it, and I 
haven't been able to consistently replicate this with other runs. This 
particular run was stress writing/reading about 300M keys, and is an early 
attempt at carrying out a test of this size so perhaps it only manifests with 
larger tests. 

The modifications from stock 2.1.0 were changes to heap size and usage of g1gc, 
as well as using offheap_objects. I have attached thread dumps from the nodes 
in question, hopefully they capture the broken state. I am continuing to test 
this, and will see if I can reproduce this again.





[jira] [Updated] (CASSANDRA-8008) Timed out waiting for timer thread on large stress command

2014-09-26 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-8008:
---
Description: 
I've been using cstar_perf to test a performance scenario and was able to 
reproduce this error on stock 2.1.0 while carrying out large stress writes (50M 
keys):
{noformat}
java.lang.RuntimeException: Timed out waiting for a timer thread - seems one got stuck
at org.apache.cassandra.stress.util.Timing.snap(Timing.java:83)
at org.apache.cassandra.stress.util.Timing.snap(Timing.java:118)
at org.apache.cassandra.stress.StressMetrics.update(StressMetrics.java:156)
at org.apache.cassandra.stress.StressMetrics.access$300(StressMetrics.java:42)
at org.apache.cassandra.stress.StressMetrics$2.run(StressMetrics.java:104)
at java.lang.Thread.run(Thread.java:745)

{noformat}
It looks similar to CASSANDRA-6943, but that should have fixed it, and I 
haven't been able to consistently replicate this with other runs. This 
particular run was stress writing/reading about 300M keys, and is an early 
attempt at carrying out a test of this size so perhaps it only manifests with 
larger tests. 

  was:
I've been using cstar_perf to test cassandra with different gc's and came 
across this error on one run which effectively stopped the test:

java.lang.RuntimeException: Timed out waiting for a timer thread - seems one got stuck at org.apache.cassandra.stress.util.Timing.snap(Timing.java:83)

It looks similar to CASSANDRA-6943, but that should have fixed it, and I 
haven't been able to consistently replicate this with other runs. This 
particular run was stress writing/reading about 300M keys, and is an early 
attempt at carrying out a test of this size so perhaps it only manifests with 
larger tests. 

The modifications from stock 2.1.0 were changes to heap size and usage of g1gc, 
as well as using offheap_objects. I have attached thread dumps from the nodes 
in question, hopefully they capture the broken state. I am continuing to test 
this, and will see if I can reproduce this again.


 Timed out waiting for timer thread on large stress command
 

 Key: CASSANDRA-8008
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8008
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Shawn Kumar
 Attachments: node1.log, node2.log


 I've been using cstar_perf to test a performance scenario and was able to 
 reproduce this error on stock 2.1.0 while carrying out large stress writes 
 (50M keys):
 {noformat}
 java.lang.RuntimeException: Timed out waiting for a timer thread - seems one got stuck
 at org.apache.cassandra.stress.util.Timing.snap(Timing.java:83)
 at org.apache.cassandra.stress.util.Timing.snap(Timing.java:118)
 at org.apache.cassandra.stress.StressMetrics.update(StressMetrics.java:156)
 at org.apache.cassandra.stress.StressMetrics.access$300(StressMetrics.java:42)
 at org.apache.cassandra.stress.StressMetrics$2.run(StressMetrics.java:104)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
 It looks similar to CASSANDRA-6943, but that should have fixed it, and I 
 haven't been able to consistently replicate this with other runs. This 
 particular run was stress writing/reading about 300M keys, and is an early 
 attempt at carrying out a test of this size so perhaps it only manifests with 
 larger tests. 





[jira] [Updated] (CASSANDRA-8008) Timed out waiting for timer thread on large stress command

2014-09-26 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-8008:
---
Description: 
I've been using cstar_perf to test a performance scenario and was able to 
reproduce this error on a two node cluster with stock 2.1.0 while carrying out 
large stress writes (50M keys):
{noformat}
java.lang.RuntimeException: Timed out waiting for a timer thread - seems one got stuck
at org.apache.cassandra.stress.util.Timing.snap(Timing.java:83)
at org.apache.cassandra.stress.util.Timing.snap(Timing.java:118)
at org.apache.cassandra.stress.StressMetrics.update(StressMetrics.java:156)
at org.apache.cassandra.stress.StressMetrics.access$300(StressMetrics.java:42)
at org.apache.cassandra.stress.StressMetrics$2.run(StressMetrics.java:104)
at java.lang.Thread.run(Thread.java:745)
{noformat}
It looks like a similar error to that found in CASSANDRA-6943. I've also 
attached the test log and thread dumps. 

  was:
I've been using cstar_perf to test a performance scenario and was able to 
reproduce this error on stock 2.1.0 while carrying out large stress writes (50M 
keys):
{noformat}
java.lang.RuntimeException: Timed out waiting for a timer thread - seems one got stuck
at org.apache.cassandra.stress.util.Timing.snap(Timing.java:83)
at org.apache.cassandra.stress.util.Timing.snap(Timing.java:118)
at org.apache.cassandra.stress.StressMetrics.update(StressMetrics.java:156)
at org.apache.cassandra.stress.StressMetrics.access$300(StressMetrics.java:42)
at org.apache.cassandra.stress.StressMetrics$2.run(StressMetrics.java:104)
at java.lang.Thread.run(Thread.java:745)

{noformat}
It looks similar to CASSANDRA-6943, but that should have fixed it, and I 
haven't been able to consistently replicate this with other runs. This 
particular run was stress writing/reading about 300M keys, and is an early 
attempt at carrying out a test of this size so perhaps it only manifests with 
larger tests. 


 Timed out waiting for timer thread on large stress command
 

 Key: CASSANDRA-8008
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8008
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Shawn Kumar
 Attachments: node1.log, node2.log


 I've been using cstar_perf to test a performance scenario and was able to 
 reproduce this error on a two node cluster with stock 2.1.0 while carrying 
 out large stress writes (50M keys):
 {noformat}
 java.lang.RuntimeException: Timed out waiting for a timer thread - seems one got stuck
 at org.apache.cassandra.stress.util.Timing.snap(Timing.java:83)
 at org.apache.cassandra.stress.util.Timing.snap(Timing.java:118)
 at org.apache.cassandra.stress.StressMetrics.update(StressMetrics.java:156)
 at org.apache.cassandra.stress.StressMetrics.access$300(StressMetrics.java:42)
 at org.apache.cassandra.stress.StressMetrics$2.run(StressMetrics.java:104)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
 It looks like a similar error to that found in CASSANDRA-6943. I've also 
 attached the test log and thread dumps. 





[jira] [Updated] (CASSANDRA-8008) Timed out waiting for timer thread on large stress command

2014-09-26 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-8008:
---
Attachment: perftest.log

 Timed out waiting for timer thread on large stress command
 

 Key: CASSANDRA-8008
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8008
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Shawn Kumar
 Attachments: node1.log, node2.log, perftest.log


 I've been using cstar_perf to test a performance scenario and was able to 
 reproduce this error on a two node cluster with stock 2.1.0 while carrying 
 out large stress writes (50M keys):
 {noformat}
 java.lang.RuntimeException: Timed out waiting for a timer thread - seems one got stuck
 at org.apache.cassandra.stress.util.Timing.snap(Timing.java:83)
 at org.apache.cassandra.stress.util.Timing.snap(Timing.java:118)
 at org.apache.cassandra.stress.StressMetrics.update(StressMetrics.java:156)
 at org.apache.cassandra.stress.StressMetrics.access$300(StressMetrics.java:42)
 at org.apache.cassandra.stress.StressMetrics$2.run(StressMetrics.java:104)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
 It looks like a similar error to that found in CASSANDRA-6943. I've also 
 attached the test log and thread dumps. 





[jira] [Updated] (CASSANDRA-8008) Timed out waiting for timer thread on large stress command

2014-09-26 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-8008:
---
Attachment: node2-2.log
node1-2.log

 Timed out waiting for timer thread on large stress command
 

 Key: CASSANDRA-8008
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8008
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Shawn Kumar
 Attachments: node1-2.log, node1.log, node2-2.log, node2.log, 
 perftest.log


 I've been using cstar_perf to test a performance scenario and was able to 
 reproduce this error on a two node cluster with stock 2.1.0 while carrying 
 out large stress writes (50M keys):
 {noformat}
 java.lang.RuntimeException: Timed out waiting for a timer thread - seems one got stuck
 at org.apache.cassandra.stress.util.Timing.snap(Timing.java:83)
 at org.apache.cassandra.stress.util.Timing.snap(Timing.java:118)
 at org.apache.cassandra.stress.StressMetrics.update(StressMetrics.java:156)
 at org.apache.cassandra.stress.StressMetrics.access$300(StressMetrics.java:42)
 at org.apache.cassandra.stress.StressMetrics$2.run(StressMetrics.java:104)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
 It looks like a similar error to that found in CASSANDRA-6943. I've also 
 attached the test log and thread dumps. 





[jira] [Updated] (CASSANDRA-7406) Reset version when closing incoming socket in IncomingTcpConnection should be done atomically

2014-09-24 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-7406:
---
Reproduced In: 2.0.6

 Reset version when closing incoming socket in IncomingTcpConnection should be 
 done atomically
 -

 Key: CASSANDRA-7406
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7406
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: CentOS release 5.5 (Tikanga)
Reporter: Ray Chen

 When closing incoming socket, the close() method will call 
 MessagingService.resetVersion(), this behavior may clear version which is set 
 by another thread.  
 This could cause MessagingService.knowsVersion(endpoint) test results as 
 false (expect true here).
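 One way to avoid the race described here is a compare-and-delete: a closing connection clears the endpoint's version only if it is still the value that connection recorded, which is what Java's `ConcurrentMap.remove(key, value)` provides. A minimal Python model of that idea (class and method names are hypothetical, not the actual MessagingService API):

```python
import threading

class VersionTracker:
    """Toy model: clear an endpoint's version only if it still equals the
    value this connection recorded, so a concurrent setter is not clobbered."""

    def __init__(self):
        self._lock = threading.Lock()
        self._versions = {}

    def set_version(self, endpoint, version):
        with self._lock:
            prev = self._versions.get(endpoint)
            self._versions[endpoint] = version
            return prev

    def reset_version_if_equals(self, endpoint, expected):
        # Equivalent in spirit to Java's ConcurrentMap.remove(key, value).
        with self._lock:
            if self._versions.get(endpoint) == expected:
                del self._versions[endpoint]
                return True
            return False

    def knows_version(self, endpoint):
        with self._lock:
            return endpoint in self._versions
```

 A stale close() then becomes a no-op instead of wiping out a version that a newer connection has set.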





[jira] [Resolved] (CASSANDRA-7303) OutOfMemoryError during prolonged batch processing

2014-09-24 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar resolved CASSANDRA-7303.

Resolution: Won't Fix

 OutOfMemoryError during prolonged batch processing
 --

 Key: CASSANDRA-7303
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7303
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Server: RedHat 6, 64-bit, Oracle JDK 7, Cassandra 2.0.6
 Client: Java 7, Astyanax
Reporter: Jacek Furmankiewicz
  Labels: crash, outofmemory, qa-resolved

 We have a prolonged batch processing job. 
 It writes a lot of records; every batch mutation creates on average 
 300-500 columns per row key (with many disparate row keys).
 It works fine, but within a few hours we get an error like this:
 {noformat}
 ERROR [Thrift:15] 2014-05-24 14:16:20,192 CassandraDaemon.java (line 196) Exception in thread Thread[Thrift:15,5,main]
 java.lang.OutOfMemoryError: Requested array size exceeds VM limit
 at java.util.Arrays.copyOf(Arrays.java:2271)
 at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
 at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
 at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
 at org.apache.thrift.transport.TFramedTransport.write(TFramedTransport.java:146)
 at org.apache.thrift.protocol.TBinaryProtocol.writeBinary(TBinaryProtocol.java:183)
 at org.apache.cassandra.thrift.Column$ColumnStandardScheme.write(Column.java:678)
 at org.apache.cassandra.thrift.Column$ColumnStandardScheme.write(Column.java:611)
 at org.apache.cassandra.thrift.Column.write(Column.java:538)
 at org.apache.cassandra.thrift.ColumnOrSuperColumn$ColumnOrSuperColumnStandardScheme.write(ColumnOrSuperColumn.java:673)
 at org.apache.cassandra.thrift.ColumnOrSuperColumn$ColumnOrSuperColumnStandardScheme.write(ColumnOrSuperColumn.java:607)
 at org.apache.cassandra.thrift.ColumnOrSuperColumn.write(ColumnOrSuperColumn.java:517)
 at org.apache.cassandra.thrift.Cassandra$get_slice_result$get_slice_resultStandardScheme.write(Cassandra.java:11682)
 at org.apache.cassandra.thrift.Cassandra$get_slice_result$get_slice_resultStandardScheme.write(Cassandra.java:11603)
 at org.apache.cassandra.thrift.Cassandra
 {noformat}
 The server already has a 16 GB heap, which we hear is the maximum Cassandra can 
 run with. The writes are heavily multi-threaded from a single server.
 The gist of the issue is that Cassandra should not crash with an OOM when under 
 heavy load. It is OK to slow down, even to start throwing operation timeout 
 exceptions, etc.
 But simply crashing in the middle of processing should not be allowed.
 Is there any internal monitoring of heap usage in Cassandra where it could 
 detect that it is getting close to the heap limit and start throttling the 
 incoming requests to avoid this type of error?
 Thanks
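 The throttling the reporter asks about amounts to back-pressure keyed off heap occupancy: once projected usage crosses a threshold, reject or delay new work instead of letting allocation run to an OOM. A toy sketch of that policy (the threshold, byte accounting, and class names are invented for illustration; nothing like this is claimed to exist in Cassandra 2.0):

```python
class OverloadedException(Exception):
    """Signals the server is shedding load instead of risking an OOM."""

def should_throttle(projected_bytes, max_bytes, threshold=0.85):
    # Back-pressure kicks in once projected heap use crosses the threshold.
    return projected_bytes >= threshold * max_bytes

class ToyServer:
    def __init__(self, max_heap_bytes):
        self.max_heap_bytes = max_heap_bytes
        self.used_bytes = 0

    def handle(self, request_cost_bytes):
        projected = self.used_bytes + request_cost_bytes
        if should_throttle(projected, self.max_heap_bytes):
            raise OverloadedException("near heap limit, rejecting request")
        self.used_bytes = projected

def demo():
    # 50 and 30 fit under the 85% threshold of a 100-byte 'heap';
    # the third request would push usage to 120 and is rejected.
    srv = ToyServer(max_heap_bytes=100)
    accepted = 0
    for cost in (50, 30, 40):
        try:
            srv.handle(cost)
            accepted += 1
        except OverloadedException:
            break
    return accepted
```

 The client then sees a timeout or overload error it can retry, rather than the whole daemon dying mid-batch.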





[jira] [Updated] (CASSANDRA-7303) OutOfMemoryError during prolonged batch processing

2014-09-24 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-7303:
---
Labels: crash outofmemory qa-resolved  (was: crash outofmemory)

 OutOfMemoryError during prolonged batch processing
 --

 Key: CASSANDRA-7303
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7303
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Server: RedHat 6, 64-bit, Oracle JDK 7, Cassandra 2.0.6
 Client: Java 7, Astyanax
Reporter: Jacek Furmankiewicz
  Labels: crash, outofmemory, qa-resolved

 We have a prolonged batch processing job. 
 It writes a lot of records; every batch mutation creates on average 
 300-500 columns per row key (with many disparate row keys).
 It works fine, but within a few hours we get an error like this:
 {noformat}
 ERROR [Thrift:15] 2014-05-24 14:16:20,192 CassandraDaemon.java (line 196) Exception in thread Thread[Thrift:15,5,main]
 java.lang.OutOfMemoryError: Requested array size exceeds VM limit
 at java.util.Arrays.copyOf(Arrays.java:2271)
 at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
 at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
 at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
 at org.apache.thrift.transport.TFramedTransport.write(TFramedTransport.java:146)
 at org.apache.thrift.protocol.TBinaryProtocol.writeBinary(TBinaryProtocol.java:183)
 at org.apache.cassandra.thrift.Column$ColumnStandardScheme.write(Column.java:678)
 at org.apache.cassandra.thrift.Column$ColumnStandardScheme.write(Column.java:611)
 at org.apache.cassandra.thrift.Column.write(Column.java:538)
 at org.apache.cassandra.thrift.ColumnOrSuperColumn$ColumnOrSuperColumnStandardScheme.write(ColumnOrSuperColumn.java:673)
 at org.apache.cassandra.thrift.ColumnOrSuperColumn$ColumnOrSuperColumnStandardScheme.write(ColumnOrSuperColumn.java:607)
 at org.apache.cassandra.thrift.ColumnOrSuperColumn.write(ColumnOrSuperColumn.java:517)
 at org.apache.cassandra.thrift.Cassandra$get_slice_result$get_slice_resultStandardScheme.write(Cassandra.java:11682)
 at org.apache.cassandra.thrift.Cassandra$get_slice_result$get_slice_resultStandardScheme.write(Cassandra.java:11603)
 at org.apache.cassandra.thrift.Cassandra
 {noformat}
 The server already has a 16 GB heap, which we hear is the maximum Cassandra can 
 run with. The writes are heavily multi-threaded from a single server.
 The gist of the issue is that Cassandra should not crash with an OOM when under 
 heavy load. It is OK to slow down, even to start throwing operation timeout 
 exceptions, etc.
 But simply crashing in the middle of processing should not be allowed.
 Is there any internal monitoring of heap usage in Cassandra where it could 
 detect that it is getting close to the heap limit and start throttling the 
 incoming requests to avoid this type of error?
 Thanks





[jira] [Updated] (CASSANDRA-7861) Node is not able to gossip

2014-09-24 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-7861:
---
Labels: qa-resolved  (was: )

 Node is not able to gossip
 --

 Key: CASSANDRA-7861
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7861
 Project: Cassandra
  Issue Type: Bug
Reporter: Ananthkumar K S
  Labels: qa-resolved
 Fix For: 2.0.3


 The node is running on xxx.xxx.xxx.xxx. All of a sudden, it was not able to 
 gossip and find the other nodes between data centres. We had two nodes 
 indicated as down in DC1 but those two nodes were up and running in DC2. When 
 we check those two nodes' status in DC2, all the nodes in DC1 are denoted as DN 
 and the other node in DC2 is denoted as down. 
 There seems to be a disconnect between the nodes. I have attached the thread 
 dump of the node that was down. 





[jira] [Resolved] (CASSANDRA-7861) Node is not able to gossip

2014-09-24 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar resolved CASSANDRA-7861.

Resolution: Cannot Reproduce

 Node is not able to gossip
 --

 Key: CASSANDRA-7861
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7861
 Project: Cassandra
  Issue Type: Bug
Reporter: Ananthkumar K S
  Labels: qa-resolved
 Fix For: 2.0.3


 The node is running on xxx.xxx.xxx.xxx. All of a sudden, it was not able to 
 gossip and find the other nodes between data centres. We had two nodes 
 indicated as down in DC1 but those two nodes were up and running in DC2. When 
 we check those two nodes' status in DC2, all the nodes in DC1 are denoted as DN 
 and the other node in DC2 is denoted as down. 
 There seems to be a disconnect between the nodes. I have attached the thread 
 dump of the node that was down. 





[jira] [Updated] (CASSANDRA-7766) Secondary index not working after a while

2014-09-22 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-7766:
---
Description: 
Since 2.1.0-rc2, it appears that the secondary indexes are not always working. 
Immediately after the INSERT of a row, the index seems to be there. But after a 
while (I do not know when or why), SELECT statements based on any secondary 
index do not return the corresponding row(s) anymore. I noticed that a restart 
of C* may have an impact (the data inserted before the restart may be seen 
through the index, even if it was not returned before the restart).

Here is a use-case example (in order to clarify my request):
{code}
CREATE TABLE IF NOT EXISTS ks.cf ( k int PRIMARY KEY, ind ascii, value text);
CREATE INDEX IF NOT EXISTS ks_cf_index ON ks.cf(ind);
INSERT INTO ks.cf (k, ind, value) VALUES (1, 'toto', 'Hello');
SELECT * FROM ks.cf WHERE ind = 'toto'; // Returns no result after a while
{code}

The last SELECT statement may or may not return a row depending on the instant 
of the request. I experienced that with 2.1.0-rc5 through CQLSH with clusters 
of one and two nodes. Since it depends on the instant of the request, I am not 
able to deliver any way to reproduce that systematically (It appears to be 
linked with some scheduled job inside C*).


  was:
Since 2.1.0-rc2, it appears that the secondary indexes are not always working. 
Immediately after the INSERT of a row, the index seems to be there. But after a 
while (I do not know when or why), SELECT statements based on any secondary 
index do not return the corresponding row(s) anymore. I noticed that a restart 
of C* may have an impact (the data inserted before the restart may be seen 
through the index, even if it was not returned before the restart).

Here is a use-case example (in order to clarify my request):
{code}
CREATE TABLE IF NOT EXISTS ks.cf ( k int PRIMARY KEY, ind ascii, value text);
CREATE INDEX IF NOT EXISTS ks_cf_index ON ks.cf(ind);
INSERT INTO ks.cf (k, ind, value) VALUES (1, 'toto', 'Hello');
SELECT * FROM ks.cf WHERE ind = 'toto'; // Returns no result after a while
{code}

The last SELECT statement may or may not return a row depending on the instant 
of the request. I experienced that with 2.1.0-rc5 through CQLSH with clusters 
of one and two nodes. Since it depends on the instant of the request, I am not 
able to deliver any way to reproduce that systematically (It appears to be 
linked with some scheduled job inside C*).



 Secondary index not working after a while
 -

 Key: CASSANDRA-7766
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7766
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.0-rc5 with small clusters (one or two nodes)
Reporter: Fabrice Larcher
 Attachments: result-failure.txt, result-success.txt


 Since 2.1.0-rc2, it appears that the secondary indexes are not always 
 working. Immediately after the INSERT of a row, the index seems to be there. 
 But after a while (I do not know when or why), SELECT statements based on any 
 secondary index do not return the corresponding row(s) anymore. I noticed 
 that a restart of C* may have an impact (the data inserted before the restart 
 may be seen through the index, even if it was not returned before the 
 restart).
 Here is a use-case example (in order to clarify my request):
 {code}
 CREATE TABLE IF NOT EXISTS ks.cf ( k int PRIMARY KEY, ind ascii, value text);
 CREATE INDEX IF NOT EXISTS ks_cf_index ON ks.cf(ind);
 INSERT INTO ks.cf (k, ind, value) VALUES (1, 'toto', 'Hello');
 SELECT * FROM ks.cf WHERE ind = 'toto'; // Returns no result after a while
 {code}
 The last SELECT statement may or may not return a row depending on the 
 instant of the request. I experienced that with 2.1.0-rc5 through CQLSH with 
 clusters of one and two nodes. Since it depends on the instant of the 
 request, I am not able to deliver any way to reproduce that systematically 
 (It appears to be linked with some scheduled job inside C*).





[jira] [Commented] (CASSANDRA-7406) Reset version when closing incoming socket in IncomingTcpConnection should be done atomically

2014-09-22 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14143972#comment-14143972
 ] 

Shawn Kumar commented on CASSANDRA-7406:


Looks like this particular issue was also brought up and is being looked at in 
7734. It would be greatly appreciated if you could note the version you noticed 
this in. 

 Reset version when closing incoming socket in IncomingTcpConnection should be 
 done atomically
 -

 Key: CASSANDRA-7406
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7406
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: CentOS release 5.5 (Tikanga)
Reporter: Ray Chen

 When closing incoming socket, the close() method will call 
 MessagingService.resetVersion(), this behavior may clear version which is set 
 by another thread.  
 This could cause MessagingService.knowsVersion(endpoint) test results as 
 false (expect true here).





[jira] [Updated] (CASSANDRA-7599) Dtest on low cardinality secondary indexes failing in 2.1

2014-08-18 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-7599:
---

Labels: qa-resolved  (was: )

 Dtest on low cardinality secondary indexes failing in 2.1
 -

 Key: CASSANDRA-7599
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7599
 Project: Cassandra
  Issue Type: Bug
  Components: Core, Tests
Reporter: Shawn Kumar
Assignee: Tyler Hobbs
  Labels: qa-resolved
 Fix For: 2.1.0

 Attachments: 7599-follow-up.txt, 7599-followup-bikeshed.txt, 7599.txt


 test_low_cardinality_indexes in secondary_indexes_test.py is failing when 
 tested on the cassandra-2.1 branch. This test has been failing on cassci for 
 a while (at least the last 10 builds) and can easily be reproduced locally as 
 well. It appears to still work on 2.0.
 {code}
 ==
 FAIL: test_low_cardinality_indexes 
 (secondary_indexes_test.TestSecondaryIndexes)
 --
 Traceback (most recent call last):
   File "/home/shawn/git/cstar5/cassandra-dtest/tools.py", line 213, in wrapped
 f(obj)
   File "/home/shawn/git/cstar5/cassandra-dtest/secondary_indexes_test.py", 
 line 89, in test_low_cardinality_indexes
 check_request_order()
   File "/home/shawn/git/cstar5/cassandra-dtest/secondary_indexes_test.py", 
 line 84, in check_request_order
 self.assertTrue('Executing indexed scan' in relevant_events[-1][0], 
 str(relevant_events[-1]))
 AssertionError: (u'Enqueuing request to /127.0.0.2', '127.0.0.1')
 {code}
 The test checks that a series of messages are found in the trace after a 
 select query against an index is carried out. It fails to find an 'Executing 
 indexed scan' from node 1 (which takes the query; note that both node2 and 
 node3 produced this message). Brief investigation seemed to show that 
 whichever node you create the patient_cql_connection on will not produce 
 this message, indicating that perhaps it does not carry out the scan. I 
 should also note that changing 'numrows' (rows initially added) or 'b' (the 
 value on the index column we query for) does not appear to make a difference.





[jira] [Updated] (CASSANDRA-7486) Compare CMS and G1 pause times

2014-08-18 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-7486:
---

Description: 
See 
http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-gc-migration-to-expectations-and-advanced-tuning
 and https://twitter.com/rbranson/status/482113561431265281

May want to default 2.1 to G1.

2.1 is a different animal from 2.0 after moving most of memtables off heap.  
Suspect this will help G1 even more than CMS.  (NB this is off by default but 
needs to be part of the test.)

  was:
See 
http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-gc-migration-to-expectations-and-advanced-tuning
 and https://twitter.com/rbranson/status/482113561431265281

May want to default 2.1 to G1.

2.1 is a different animal from 2.0 after moving most of memtables off heap.  
Suspect this will help G1 even more than CMS.  (NB this is off by default but 
needs to be part of the test.)


 Compare CMS and G1 pause times
 --

 Key: CASSANDRA-7486
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7486
 Project: Cassandra
  Issue Type: Test
  Components: Config
Reporter: Jonathan Ellis
Assignee: Shawn Kumar
 Fix For: 2.1.1


 See 
 http://www.slideshare.net/MonicaBeckwith/garbage-first-garbage-collector-g1-gc-migration-to-expectations-and-advanced-tuning
  and https://twitter.com/rbranson/status/482113561431265281
 May want to default 2.1 to G1.
 2.1 is a different animal from 2.0 after moving most of memtables off heap.  
 Suspect this will help G1 even more than CMS.  (NB this is off by default but 
 needs to be part of the test.)
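To actually compare pause times, one option is to summarize safepoint pauses out of each JVM's GC log. A toy helper, under the assumption that the logs were produced with -XX:+PrintGCApplicationStoppedTime (this helper is illustrative and not part of the ticket):

```python
import re

# Matches the tail of lines like:
# "Total time for which application threads were stopped: 0.0123 seconds"
STOP_RE = re.compile(r'stopped: ([0-9.]+) seconds')

def pause_stats(gc_log_lines):
    """Summarize safepoint pause durations found in GC log lines."""
    pauses = []
    for line in gc_log_lines:
        m = STOP_RE.search(line)
        if m:
            pauses.append(float(m.group(1)))
    if not pauses:
        return None
    return {'count': len(pauses),
            'max': max(pauses),
            'mean': sum(pauses) / len(pauses)}
```

Running this over a CMS log and a G1 log captured under the same stress workload yields directly comparable count/max/mean pause figures.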





[jira] [Updated] (CASSANDRA-5202) CFs should have globally and temporally unique CF IDs to prevent reusing data from earlier incarnation of same CF name

2014-07-31 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-5202:
---

Labels: qa-resolved test  (was: test)

 CFs should have globally and temporally unique CF IDs to prevent reusing 
 data from earlier incarnation of same CF name
 

 Key: CASSANDRA-5202
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5202
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.9
 Environment: OS: Windows 7, 
 Server: Cassandra 1.1.9 release drop
 Client: astyanax 1.56.21, 
 JVM: Sun/Oracle JVM 64 bit (jdk1.6.0_27)
Reporter: Marat Bedretdinov
Assignee: Yuki Morishita
  Labels: qa-resolved, test
 Fix For: 2.1 beta1

 Attachments: 0001-make-2i-CFMetaData-have-parent-s-CF-ID.patch, 
 0002-Don-t-scrub-2i-CF-if-index-type-is-CUSTOM.patch, 
 0003-Fix-user-defined-compaction.patch, 0004-Fix-serialization-test.patch, 
 0005-Create-system_auth-tables-with-fixed-CFID.patch, 0005-auth-v2.txt, 
 5202.txt, astyanax-stress-driver.zip


 Attached is a driver that sequentially:
 1. Drops keyspace
 2. Creates keyspace
 3. Creates 2 column families
 4. Seeds 1M rows with 100 columns
 5. Queries these 2 column families
 The above steps are repeated 1000 times.
 The following exception is observed at random (race - SEDA?):
 ERROR [ReadStage:55] 2013-01-29 19:24:52,676 AbstractCassandraDaemon.java 
 (line 135) Exception in thread Thread[ReadStage:55,5,main]
 java.lang.AssertionError: DecoratedKey(-1, ) != 
 DecoratedKey(62819832764241410631599989027761269388, 313a31) in 
 C:\var\lib\cassandra\data\user_role_reverse_index\business_entity_role\user_role_reverse_index-business_entity_role-hf-1-Data.db
   at 
 org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:60)
   at 
 org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:67)
   at 
 org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:79)
   at 
 org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:256)
   at 
 org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1367)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1229)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1164)
   at org.apache.cassandra.db.Table.getRow(Table.java:378)
   at 
 org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:69)
   at 
 org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:822)
   at 
 org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1271)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 This exception appears in the server at the time the client submits a query 
 request (row slice), not at the time the data is seeded. The client times 
 out, and this data can no longer be queried, as the same exception always 
 occurs from there on.
 Also, on iteration 201, it appears that dropping the column families failed, 
 and as a result their recreation failed with a unique column family name 
 violation (see exception below). Note that the data files are actually gone, 
 so it appears that the server runtime responsible for creating column 
 families was out of sync with the piece that dropped them:
 Starting dropping column families
 Dropped column families
 Starting dropping keyspace
 Dropped keyspace
 Starting creating column families
 Created column families
 Starting seeding data
 Total rows inserted: 100 in 5105 ms
 Iteration: 200; Total running time for 1000 queries is 232; Average running 
 time of 1000 queries is 0 ms
 Starting dropping column families
 Dropped column families
 Starting dropping keyspace
 Dropped keyspace
 Starting creating column families
 Created column families
 Starting seeding data
 Total rows inserted: 100 in 5361 ms
 Iteration: 201; Total running time for 1000 queries is 222; Average running 
 time of 1000 queries is 0 ms
 Starting dropping column families
 Starting creating column families
 Exception in thread "main" 
 com.netflix.astyanax.connectionpool.exceptions.BadRequestException: 
 BadRequestException: [host=127.0.0.1(127.0.0.1):9160, latency=2468(2469), 
 

[jira] [Resolved] (CASSANDRA-7501) Test prepared marker for collections inside UDT

2014-07-31 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar resolved CASSANDRA-7501.


Resolution: Implemented

 Test prepared marker for collections inside UDT
 ---

 Key: CASSANDRA-7501
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7501
 Project: Cassandra
  Issue Type: Test
  Components: API
Reporter: Jonathan Ellis
Assignee: Shawn Kumar
Priority: Minor
 Fix For: 2.1.1


 Test for CASSANDRA-7472.





[jira] [Updated] (CASSANDRA-7501) Test prepared marker for collections inside UDT

2014-07-31 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-7501:
---

Labels: qa-resolved  (was: )

 Test prepared marker for collections inside UDT
 ---

 Key: CASSANDRA-7501
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7501
 Project: Cassandra
  Issue Type: Test
  Components: API
Reporter: Jonathan Ellis
Assignee: Shawn Kumar
Priority: Minor
  Labels: qa-resolved
 Fix For: 2.1.1


 Test for CASSANDRA-7472.





[jira] [Updated] (CASSANDRA-7568) Replacing a dead node using replace_address fails

2014-07-31 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-7568:
---

Labels: qa-resolved  (was: )

 Replacing a dead node using replace_address fails
 -

 Key: CASSANDRA-7568
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7568
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Ala' Alkhaldi
Assignee: Shawn Kumar
Priority: Minor
  Labels: qa-resolved

 Failed assertion
 {code}
 ERROR [main] 2014-07-17 10:24:21,171 CassandraDaemon.java:474 - Exception 
 encountered during startup
 java.lang.AssertionError: Expected 1 endpoint but found 0
 at 
 org.apache.cassandra.dht.RangeStreamer.getAllRangesWithStrictSourcesFor(RangeStreamer.java:222)
  ~[main/:na]
 at 
 org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:131) 
 ~[main/:na]
 at 
 org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:72) 
 ~[main/:na]
 at 
 org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1049)
  ~[main/:na]
 at 
 org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:811)
  ~[main/:na]
 at 
 org.apache.cassandra.service.StorageService.initServer(StorageService.java:626)
  ~[main/:na]
 at 
 org.apache.cassandra.service.StorageService.initServer(StorageService.java:511)
  ~[main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:338) 
 [main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:457)
  [main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:546) 
 [main/:na]
 {code}
 To replicate the bug run the replace_address_test.replace_stopped_node_test 
 dtest





[jira] [Resolved] (CASSANDRA-7568) Replacing a dead node using replace_address fails

2014-07-31 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar resolved CASSANDRA-7568.


Resolution: Fixed

Ended up having to pass in -Dconsistent.rangemovement=false to make the test 
work.

 Replacing a dead node using replace_address fails
 -

 Key: CASSANDRA-7568
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7568
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Ala' Alkhaldi
Assignee: Shawn Kumar
Priority: Minor
  Labels: qa-resolved

 Failed assertion
 {code}
 ERROR [main] 2014-07-17 10:24:21,171 CassandraDaemon.java:474 - Exception 
 encountered during startup
 java.lang.AssertionError: Expected 1 endpoint but found 0
 at 
 org.apache.cassandra.dht.RangeStreamer.getAllRangesWithStrictSourcesFor(RangeStreamer.java:222)
  ~[main/:na]
 at 
 org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:131) 
 ~[main/:na]
 at 
 org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:72) 
 ~[main/:na]
 at 
 org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1049)
  ~[main/:na]
 at 
 org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:811)
  ~[main/:na]
 at 
 org.apache.cassandra.service.StorageService.initServer(StorageService.java:626)
  ~[main/:na]
 at 
 org.apache.cassandra.service.StorageService.initServer(StorageService.java:511)
  ~[main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:338) 
 [main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:457)
  [main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:546) 
 [main/:na]
 {code}
 To replicate the bug run the replace_address_test.replace_stopped_node_test 
 dtest





[jira] [Created] (CASSANDRA-7599) Dtest on low cardinality secondary indexes failing in 2.1

2014-07-23 Thread Shawn Kumar (JIRA)
Shawn Kumar created CASSANDRA-7599:
--

 Summary: Dtest on low cardinality secondary indexes failing in 2.1
 Key: CASSANDRA-7599
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7599
 Project: Cassandra
  Issue Type: Bug
  Components: Tests
Reporter: Shawn Kumar
 Fix For: 2.1.0


test_low_cardinality_indexes in secondary_indexes_test.py is failing when 
tested on the cassandra-2.1 branch. This test has been failing on cassci for a 
while (at least the last 10 builds) and can easily be reproduced locally as 
well. It appears to still work on 2.0.

{code}
==
FAIL: test_low_cardinality_indexes (secondary_indexes_test.TestSecondaryIndexes)
--
Traceback (most recent call last):
  File "/home/shawn/git/cstar5/cassandra-dtest/tools.py", line 213, in wrapped
f(obj)
  File "/home/shawn/git/cstar5/cassandra-dtest/secondary_indexes_test.py", line 
89, in test_low_cardinality_indexes
check_request_order()
  File "/home/shawn/git/cstar5/cassandra-dtest/secondary_indexes_test.py", line 
84, in check_request_order
self.assertTrue('Executing indexed scan' in relevant_events[-1][0], 
str(relevant_events[-1]))
AssertionError: (u'Enqueuing request to /127.0.0.2', '127.0.0.1')
{code}

The test checks that a series of messages are found in the trace after a select 
query against an index is carried out. It fails to find an 'Executing indexed 
scan' from node 1 (which takes the query; note that both node2 and node3 
produced this message). Brief investigation seemed to show that whichever node 
you create the patient_cql_connection on will not produce this message, 
indicating that perhaps it does not carry out the scan. I should also note that 
changing 'numrows' (rows initially added) or 'b' (the value on the index column 
we query for) does not appear to make a difference.
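For reference, the failing check can be approximated like this (a simplified sketch; the real check_request_order lives in secondary_indexes_test.py and its exact shape may differ, but it inspects (description, source) pairs pulled from the query trace):

```python
def check_request_order(trace_events):
    """Sketch of the dtest's trace check over (description, source) pairs."""
    # Keep only the trace rows the test cares about.
    relevant = [(desc, src) for desc, src in trace_events
                if 'Executing indexed scan' in desc
                or 'Enqueuing request' in desc]
    assert relevant, 'no relevant trace events found'
    # The test expects the final relevant event to be the coordinator's own
    # indexed scan; the failure above shows an 'Enqueuing request' event
    # there instead.
    desc, _src = relevant[-1]
    return 'Executing indexed scan' in desc
```

On the failing run, node 1 (the coordinator) never emits the scan message, so the last relevant event is the 'Enqueuing request' row shown in the AssertionError.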





[jira] [Updated] (CASSANDRA-7599) Dtest on low cardinality secondary indexes failing in 2.1

2014-07-23 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-7599:
---

Component/s: Core

 Dtest on low cardinality secondary indexes failing in 2.1
 -

 Key: CASSANDRA-7599
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7599
 Project: Cassandra
  Issue Type: Bug
  Components: Core, Tests
Reporter: Shawn Kumar
 Fix For: 2.1.0


 test_low_cardinality_indexes in secondary_indexes_test.py is failing when 
 tested on the cassandra-2.1 branch. This test has been failing on cassci for 
 a while (at least the last 10 builds) and can easily be reproduced locally as 
 well. It appears to still work on 2.0.
 {code}
 ==
 FAIL: test_low_cardinality_indexes 
 (secondary_indexes_test.TestSecondaryIndexes)
 --
 Traceback (most recent call last):
   File "/home/shawn/git/cstar5/cassandra-dtest/tools.py", line 213, in wrapped
 f(obj)
   File "/home/shawn/git/cstar5/cassandra-dtest/secondary_indexes_test.py", 
 line 89, in test_low_cardinality_indexes
 check_request_order()
   File "/home/shawn/git/cstar5/cassandra-dtest/secondary_indexes_test.py", 
 line 84, in check_request_order
 self.assertTrue('Executing indexed scan' in relevant_events[-1][0], 
 str(relevant_events[-1]))
 AssertionError: (u'Enqueuing request to /127.0.0.2', '127.0.0.1')
 {code}
 The test checks that a series of messages are found in the trace after a 
 select query against an index is carried out. It fails to find an 'Executing 
 indexed scan' from node 1 (which takes the query; note that both node2 and 
 node3 produced this message). Brief investigation seemed to show that 
 whichever node you create the patient_cql_connection on will not produce 
 this message, indicating that perhaps it does not carry out the scan. I 
 should also note that changing 'numrows' (rows initially added) or 'b' (the 
 value on the index column we query for) does not appear to make a difference.





[jira] [Commented] (CASSANDRA-7140) dtest triggers java.lang.reflect.UndeclaredThrowableException on mixed upgrade from 1.2 to 2.0

2014-06-24 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14042661#comment-14042661
 ] 

Shawn Kumar commented on CASSANDRA-7140:


Unable to reproduce; the test passes on 2.0.7 through 2.1.

 dtest triggers java.lang.reflect.UndeclaredThrowableException on mixed 
 upgrade from 1.2 to 2.0
 --

 Key: CASSANDRA-7140
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7140
 Project: Cassandra
  Issue Type: Bug
Reporter: Russ Hatch
Assignee: Shawn Kumar

 This can be triggered by running the dtest with:
 {noformat}
 nosetests -vs 
 upgrade_through_versions_test:TestUpgrade_from_cassandra_1_2_latest_tag_to_cassandra_2_0_HEAD.upgrade_test_mixed}
 {noformat}
 The dtest upgrade test code is a bit more obtuse now, so it takes some more 
 work to see what's happening. It's entirely possible that the dtest is doing 
 the upgrade improperly, triggering the exception in Cassandra.
 Here's the complete (and very long) stacktrace:
 {noformat}
 upgrade_test_mixed 
 (upgrade_through_versions_test.TestUpgrade_from_cassandra_1_2_latest_tag_to_cassandra_2_0_HEAD)
  ... Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
   at com.sun.proxy.$Proxy0.forceKeyspaceFlush(Unknown Source)
   at 
 org.apache.cassandra.tools.NodeProbe.forceKeyspaceFlush(NodeProbe.java:210)
   at 
 org.apache.cassandra.tools.NodeCmd.optionalKSandCFs(NodeCmd.java:1673)
   at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:1365)
 Caused by: javax.management.ReflectionException: No such operation: 
 forceKeyspaceFlush
   at 
 com.sun.jmx.mbeanserver.PerInterface.noSuchMethod(PerInterface.java:170)
   at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:112)
   at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
   at 
 com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
   at 
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
   at sun.rmi.transport.Transport$1.run(Transport.java:177)
   at sun.rmi.transport.Transport$1.run(Transport.java:174)
   at java.security.AccessController.doPrivileged(Native Method)
   at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
   at 
 sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
   at 
 sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:275)
   at 
 sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:252)
   at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161)
   at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
   at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown 
 Source)
   at 
 javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1029)
   at 
 javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:292)
   ... 4 more
 Caused by: java.lang.NoSuchMethodException: 
 forceKeyspaceFlush(java.lang.String, [Ljava.lang.String;)
   at 
 com.sun.jmx.mbeanserver.PerInterface.noSuchMethod(PerInterface.java:168)
   at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:112)
   at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
   at 
 

[jira] [Resolved] (CASSANDRA-7140) dtest triggers java.lang.reflect.UndeclaredThrowableException on mixed upgrade from 1.2 to 2.0

2014-06-24 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar resolved CASSANDRA-7140.


Resolution: Cannot Reproduce

 dtest triggers java.lang.reflect.UndeclaredThrowableException on mixed 
 upgrade from 1.2 to 2.0
 --

 Key: CASSANDRA-7140
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7140
 Project: Cassandra
  Issue Type: Bug
Reporter: Russ Hatch
Assignee: Shawn Kumar
  Labels: qa-resolved

 This can be triggered by running the dtest with:
 {noformat}
 nosetests -vs 
 upgrade_through_versions_test:TestUpgrade_from_cassandra_1_2_latest_tag_to_cassandra_2_0_HEAD.upgrade_test_mixed}
 {noformat}
 The dtest upgrade test code is a bit more obtuse now, so it takes some more 
 work to see what's happening. It's entirely possible that the dtest is doing 
 the upgrade improperly, triggering the exception in Cassandra.
 Here's the complete (and very long) stacktrace:
 {noformat}
 upgrade_test_mixed 
 (upgrade_through_versions_test.TestUpgrade_from_cassandra_1_2_latest_tag_to_cassandra_2_0_HEAD)
  ... Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
   at com.sun.proxy.$Proxy0.forceKeyspaceFlush(Unknown Source)
   at 
 org.apache.cassandra.tools.NodeProbe.forceKeyspaceFlush(NodeProbe.java:210)
   at 
 org.apache.cassandra.tools.NodeCmd.optionalKSandCFs(NodeCmd.java:1673)
   at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:1365)
 Caused by: javax.management.ReflectionException: No such operation: 
 forceKeyspaceFlush
   at 
 com.sun.jmx.mbeanserver.PerInterface.noSuchMethod(PerInterface.java:170)
   at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:112)
   at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
   at 
 com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
   at 
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
   at sun.rmi.transport.Transport$1.run(Transport.java:177)
   at sun.rmi.transport.Transport$1.run(Transport.java:174)
   at java.security.AccessController.doPrivileged(Native Method)
   at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
   at 
 sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
   at 
 sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:275)
   at 
 sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:252)
   at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161)
   at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
   at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown 
 Source)
   at 
 javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1029)
   at 
 javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:292)
   ... 4 more
 Caused by: java.lang.NoSuchMethodException: 
 forceKeyspaceFlush(java.lang.String, [Ljava.lang.String;)
   at 
 com.sun.jmx.mbeanserver.PerInterface.noSuchMethod(PerInterface.java:168)
   at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:112)
   at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
   at 

[jira] [Updated] (CASSANDRA-7140) dtest triggers java.lang.reflect.UndeclaredThrowableException on mixed upgrade from 1.2 to 2.0

2014-06-24 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-7140:
---

Labels: qa-resolved  (was: )

 dtest triggers java.lang.reflect.UndeclaredThrowableException on mixed 
 upgrade from 1.2 to 2.0
 --

 Key: CASSANDRA-7140
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7140
 Project: Cassandra
  Issue Type: Bug
Reporter: Russ Hatch
Assignee: Shawn Kumar
  Labels: qa-resolved

 This can be triggered by running the dtest with:
 {noformat}
 nosetests -vs 
 upgrade_through_versions_test:TestUpgrade_from_cassandra_1_2_latest_tag_to_cassandra_2_0_HEAD.upgrade_test_mixed}
 {noformat}
 The dtest upgrade test code is a bit more obtuse now, so it takes some more 
 work to see what's happening. It's entirely possible that the dtest is doing 
 the upgrade improperly, triggering the exception in Cassandra.
 Here's the complete (and very long) stacktrace:
 {noformat}
 upgrade_test_mixed 
 (upgrade_through_versions_test.TestUpgrade_from_cassandra_1_2_latest_tag_to_cassandra_2_0_HEAD)
  ... Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
   at com.sun.proxy.$Proxy0.forceKeyspaceFlush(Unknown Source)
   at 
 org.apache.cassandra.tools.NodeProbe.forceKeyspaceFlush(NodeProbe.java:210)
   at 
 org.apache.cassandra.tools.NodeCmd.optionalKSandCFs(NodeCmd.java:1673)
   at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:1365)
 Caused by: javax.management.ReflectionException: No such operation: 
 forceKeyspaceFlush
   at 
 com.sun.jmx.mbeanserver.PerInterface.noSuchMethod(PerInterface.java:170)
   at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:112)
   at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
   at 
 com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
   at 
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
   at 
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
   at sun.rmi.transport.Transport$1.run(Transport.java:177)
   at sun.rmi.transport.Transport$1.run(Transport.java:174)
   at java.security.AccessController.doPrivileged(Native Method)
   at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
   at 
 sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
   at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
   at 
 sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:275)
   at 
 sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:252)
   at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161)
   at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
   at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown 
 Source)
   at 
 javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1029)
   at 
 javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:292)
   ... 4 more
 Caused by: java.lang.NoSuchMethodException: 
 forceKeyspaceFlush(java.lang.String, [Ljava.lang.String;)
   at 
 com.sun.jmx.mbeanserver.PerInterface.noSuchMethod(PerInterface.java:168)
   at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:112)
   at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
   at 
 

[jira] [Commented] (CASSANDRA-7350) Decommissioning nodes borks the seed node - can't add additional nodes

2014-06-19 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038001#comment-14038001
 ] 

Shawn Kumar commented on CASSANDRA-7350:


Was unable to reproduce this, looks like it's been fixed. 

 Decommissioning nodes borks the seed node - can't add additional nodes
 --

 Key: CASSANDRA-7350
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7350
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Ubuntu using the auto-clustering AMI
Reporter: Steven Lowenthal
Assignee: Shawn Kumar
Priority: Minor
 Fix For: 2.0.9


 1) Launch a 4 node cluster - I used the auto-clustering AMI (you get nodes 0-3)
 2) Decommission the last 2 nodes, leaving a 2 node cluster
 3) Wipe the data directories from node 2
 4) Bootstrap node 2 - it won't join; it is unable to gossip with any seeds.
 If you bootstrap the node a second time, it will join. However, if you try to bootstrap node 3, it will also fail.
 I discovered that bouncing the seed node fixes the problem. I think it cropped up in 2.0.7.
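 For anyone retracing the report locally, the steps above can be approximated with ccm instead of the AMI. This is a sketch only; the version, node names, and data path are assumptions and were not verified against this ticket:

```
# 4-node local cluster standing in for the auto-clustering AMI
ccm create repro -v 2.0.7 -n 4 -s
# decommission the last two nodes, leaving a 2-node cluster
ccm node4 decommission
ccm node3 decommission
# wipe one decommissioned node and try to bootstrap it again
ccm node3 stop
rm -rf ~/.ccm/repro/node3/data/*
ccm node3 start   # expected to fail with "Unable to gossip with any seeds"
```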
 Error:
 ERROR [main] 2014-06-03 21:52:46,649 CassandraDaemon.java (line 497) Exception encountered during startup
 java.lang.RuntimeException: Unable to gossip with any seeds
   at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1193)
   at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:447)
   at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:656)
   at org.apache.cassandra.service.StorageService.initServer(StorageService.java:612)
   at org.apache.cassandra.service.StorageService.initServer(StorageService.java:505)
   at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:362)
   at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:480)
   at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:569)
 ERROR [StorageServiceShutdownHook] 2014-06-03 21:52:46,741 CassandraDaemon.java (line 198) Exception in thread Thread[StorageServi



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-7350) Decommissioning nodes borks the seed node - can't add additional nodes

2014-06-19 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-7350:
---

Labels: qa-resolved  (was: )

 Decommissioning nodes borks the seed node - can't add additional nodes
 --

 Key: CASSANDRA-7350
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7350
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Ubuntu using the auto-clustering AMI
Reporter: Steven Lowenthal
Assignee: Shawn Kumar
Priority: Minor
  Labels: qa-resolved
 Fix For: 2.0.9


 1) Launch a 4 node cluster - I used the auto-clustering AMI (you get nodes 0-3)
 2) Decommission the last 2 nodes, leaving a 2 node cluster
 3) Wipe the data directories from node 2
 4) Bootstrap node 2 - it won't join; it is unable to gossip with any seeds.
 If you bootstrap the node a second time, it will join. However, if you try to bootstrap node 3, it will also fail.
 I discovered that bouncing the seed node fixes the problem. I think it cropped up in 2.0.7.
 Error:
 ERROR [main] 2014-06-03 21:52:46,649 CassandraDaemon.java (line 497) Exception encountered during startup
 java.lang.RuntimeException: Unable to gossip with any seeds
   at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1193)
   at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:447)
   at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:656)
   at org.apache.cassandra.service.StorageService.initServer(StorageService.java:612)
   at org.apache.cassandra.service.StorageService.initServer(StorageService.java:505)
   at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:362)
   at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:480)
   at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:569)
 ERROR [StorageServiceShutdownHook] 2014-06-03 21:52:46,741 CassandraDaemon.java (line 198) Exception in thread Thread[StorageServi





[jira] [Updated] (CASSANDRA-5202) CFs should have globally and temporally unique CF IDs to prevent reusing data from earlier incarnation of same CF name

2014-06-05 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-5202:
---

Tester: Shawn Kumar

 CFs should have globally and temporally unique CF IDs to prevent reusing 
 data from earlier incarnation of same CF name
 

 Key: CASSANDRA-5202
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5202
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.9
 Environment: OS: Windows 7, 
 Server: Cassandra 1.1.9 release drop
 Client: astyanax 1.56.21, 
 JVM: Sun/Oracle JVM 64 bit (jdk1.6.0_27)
Reporter: Marat Bedretdinov
Assignee: Yuki Morishita
  Labels: test
 Fix For: 2.1 beta1

 Attachments: 0001-make-2i-CFMetaData-have-parent-s-CF-ID.patch, 
 0002-Don-t-scrub-2i-CF-if-index-type-is-CUSTOM.patch, 
 0003-Fix-user-defined-compaction.patch, 0004-Fix-serialization-test.patch, 
 0005-Create-system_auth-tables-with-fixed-CFID.patch, 0005-auth-v2.txt, 
 5202.txt, astyanax-stress-driver.zip


 Attached is a driver that sequentially:
 1. Drops the keyspace
 2. Creates the keyspace
 3. Creates 2 column families
 4. Seeds 1M rows with 100 columns
 5. Queries these 2 column families
 The above steps are repeated 1000 times.
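 The loop described above is easy to restate as the sequence of statements a single iteration would issue. This is a hypothetical reconstruction for illustration only: the real driver is the attached astyanax program, and the keyspace/column family names here are invented:

```python
def iteration_statements(ks="stress_ks", cfs=("cf1", "cf2"), rows=3):
    """Build one iteration of the drop/create/seed/query loop as CQL strings."""
    stmts = [
        f"DROP KEYSPACE IF EXISTS {ks}",
        f"CREATE KEYSPACE {ks} WITH replication = "
        "{'class': 'SimpleStrategy', 'replication_factor': 1}",
    ]
    # create the column families
    for cf in cfs:
        stmts.append(f"CREATE TABLE {ks}.{cf} (k int PRIMARY KEY, v text)")
    # seed data (the real driver seeds 1M rows; trimmed here)
    for cf in cfs:
        for i in range(rows):
            stmts.append(f"INSERT INTO {ks}.{cf} (k, v) VALUES ({i}, 'x')")
    # query both column families
    for cf in cfs:
        stmts.append(f"SELECT * FROM {ks}.{cf} LIMIT 10")
    return stmts

stmts = iteration_statements()
```

 Repeating this drop/create/seed/query cycle many times is what exposed the server-side race: a recreated column family could pick up data files from an earlier incarnation with the same name.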
 The following exception is observed at random (race - SEDA?):
 ERROR [ReadStage:55] 2013-01-29 19:24:52,676 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[ReadStage:55,5,main]
 java.lang.AssertionError: DecoratedKey(-1, ) != DecoratedKey(62819832764241410631599989027761269388, 313a31) in C:\var\lib\cassandra\data\user_role_reverse_index\business_entity_role\user_role_reverse_index-business_entity_role-hf-1-Data.db
   at org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:60)
   at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:67)
   at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:79)
   at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:256)
   at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64)
   at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1367)
   at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1229)
   at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1164)
   at org.apache.cassandra.db.Table.getRow(Table.java:378)
   at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:69)
   at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:822)
   at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1271)
   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 This exception appears on the server at the time the client submits a query request (row slice), not at the time the data is seeded. The client times out, and the data can no longer be queried, as the same exception occurs on every attempt from then on.
 Also, on iteration 201, it appears that dropping the column families failed, and as a result their recreation failed with a unique column family name violation (see exception below). Note that the data files are actually gone, so it appears that the part of the server responsible for creating column families was out of sync with the part that dropped them:
 Starting dropping column families
 Dropped column families
 Starting dropping keyspace
 Dropped keyspace
 Starting creating column families
 Created column families
 Starting seeding data
 Total rows inserted: 100 in 5105 ms
 Iteration: 200; Total running time for 1000 queries is 232; Average running 
 time of 1000 queries is 0 ms
 Starting dropping column families
 Dropped column families
 Starting dropping keyspace
 Dropped keyspace
 Starting creating column families
 Created column families
 Starting seeding data
 Total rows inserted: 100 in 5361 ms
 Iteration: 201; Total running time for 1000 queries is 222; Average running 
 time of 1000 queries is 0 ms
 Starting dropping column families
 Starting creating column families
 Exception in thread "main" com.netflix.astyanax.connectionpool.exceptions.BadRequestException: BadRequestException: [host=127.0.0.1(127.0.0.1):9160, latency=2468(2469), attempts=1]InvalidRequestException(why:Keyspace names must be 
 

[jira] [Updated] (CASSANDRA-5351) Avoid repairing already-repaired data by default

2014-06-03 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-5351:
---

Labels: qa-resolved repair  (was: repair)

 Avoid repairing already-repaired data by default
 

 Key: CASSANDRA-5351
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5351
 Project: Cassandra
  Issue Type: Task
  Components: Core
Reporter: Jonathan Ellis
Assignee: Lyuben Todorov
  Labels: qa-resolved, repair
 Fix For: 2.1 beta1

 Attachments: 0001-Incremental-repair-wip.patch, 
 0001-keep-repairedAt-time-when-scrubbing-and-no-bad-rows-.patch, 
 5351_node1.log, 5351_node2.log, 5351_node3.log, 5351_nodetool.log


 Repair has always built its merkle tree from all the data in a columnfamily, 
 which is guaranteed to work but is inefficient.
 We can improve this by remembering which sstables have already been 
 successfully repaired, and only repairing sstables new since the last repair. 
  (This automatically makes CASSANDRA-3362 much less of a problem too.)
 The tricky part is, compaction will (if not taught otherwise) mix repaired 
 data together with non-repaired.  So we should segregate unrepaired sstables 
 from the repaired ones.
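 The bookkeeping sketched above is small enough to model directly: each sstable carries a repaired-at marker, a repair pass only feeds unrepaired sstables into the merkle tree, and a successful repair stamps them. This is a toy model of the idea, not Cassandra's actual classes; all names and the sentinel value are illustrative:

```python
from dataclasses import dataclass

UNREPAIRED = 0  # sentinel: this sstable has never been through a successful repair


@dataclass
class SSTable:
    name: str
    repaired_at: int = UNREPAIRED  # epoch seconds of the last successful repair


def sstables_to_repair(sstables):
    # Incremental repair: only unrepaired sstables contribute to the merkle tree.
    return [s for s in sstables if s.repaired_at == UNREPAIRED]


def mark_repaired(sstables, when):
    # On success, stamp the repaired set so the next run skips it.
    for s in sstables:
        s.repaired_at = when


tables = [SSTable("a"), SSTable("b", repaired_at=1_400_000_000), SSTable("c")]
pending = sstables_to_repair(tables)   # "a" and "c"; "b" was already repaired
mark_repaired(pending, when=1_400_100_000)
```

 Compaction then has to keep the stamped and unstamped pools apart, which is the "segregate unrepaired sstables" step in the description above.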





[jira] [Updated] (CASSANDRA-7141) Expand secondary_indexes_test for secondary indexes on sets and maps

2014-05-27 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-7141:
---

Labels: qa-resolved  (was: )

 Expand secondary_indexes_test for secondary indexes on sets and maps
 

 Key: CASSANDRA-7141
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7141
 Project: Cassandra
  Issue Type: Test
Reporter: Shawn Kumar
Assignee: Shawn Kumar
Priority: Minor
  Labels: qa-resolved
 Fix For: 2.1 rc1


 Secondary_indexes_test.py currently only checks the functionality of secondary indexes on lists. This should be expanded to all collections, including maps and sets.
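 For reference, the 2.1-era CQL such an expanded test would exercise looks roughly like this (table and column names are invented for illustration):

```sql
CREATE TABLE users (id int PRIMARY KEY, emails set<text>, attrs map<text, text>);

CREATE INDEX ON users (emails);       -- indexes the set's elements
CREATE INDEX ON users (KEYS(attrs));  -- indexes the map's keys

SELECT * FROM users WHERE emails CONTAINS 'a@example.com';
SELECT * FROM users WHERE attrs CONTAINS KEY 'timezone';
```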





[jira] [Resolved] (CASSANDRA-7141) Expand secondary_indexes_test for secondary indexes on sets and maps

2014-05-27 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar resolved CASSANDRA-7141.


Resolution: Fixed

 Expand secondary_indexes_test for secondary indexes on sets and maps
 

 Key: CASSANDRA-7141
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7141
 Project: Cassandra
  Issue Type: Test
Reporter: Shawn Kumar
Assignee: Shawn Kumar
Priority: Minor
  Labels: qa-resolved
 Fix For: 2.1 rc1


 Secondary_indexes_test.py currently only checks the functionality of secondary indexes on lists. This should be expanded to all collections, including maps and sets.





[jira] [Updated] (CASSANDRA-5351) Avoid repairing already-repaired data by default

2014-05-05 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar updated CASSANDRA-5351:
---

Tester: Shawn Kumar

 Avoid repairing already-repaired data by default
 

 Key: CASSANDRA-5351
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5351
 Project: Cassandra
  Issue Type: Task
  Components: Core
Reporter: Jonathan Ellis
Assignee: Lyuben Todorov
  Labels: repair
 Fix For: 2.1 beta1

 Attachments: 0001-Incremental-repair-wip.patch, 
 0001-keep-repairedAt-time-when-scrubbing-and-no-bad-rows-.patch, 
 5351_node1.log, 5351_node2.log, 5351_node3.log, 5351_nodetool.log


 Repair has always built its merkle tree from all the data in a columnfamily, 
 which is guaranteed to work but is inefficient.
 We can improve this by remembering which sstables have already been 
 successfully repaired, and only repairing sstables new since the last repair. 
  (This automatically makes CASSANDRA-3362 much less of a problem too.)
 The tricky part is, compaction will (if not taught otherwise) mix repaired 
 data together with non-repaired.  So we should segregate unrepaired sstables 
 from the repaired ones.





[jira] [Resolved] (CASSANDRA-7109) Create replace_address dtest

2014-05-02 Thread Shawn Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Kumar resolved CASSANDRA-7109.


Resolution: Fixed

 Create replace_address dtest
 

 Key: CASSANDRA-7109
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7109
 Project: Cassandra
  Issue Type: Test
Reporter: Ryan McGuire
Assignee: Shawn Kumar
 Fix For: 2.1 beta2


 {noformat}
 16:03 <driftx> well, this just bothers me because either it's been broken for almost ever, or something broke in cassandra.
 16:43 <thobbs> driftx: I'm testing your patch on #6622, but I'm seeing a bit of a weird error:
 16:43 <CassBotJr> https://issues.apache.org/jira/browse/CASSANDRA-6622 (Unresolved; 1.2.16, 2.0.6): Streaming session failures during node replace of same address
 16:43 <thobbs> java.lang.UnsupportedOperationException: Cannot replace token -1017822742317066613 which does not exist!
 16:44 <thobbs> this is on 2.0 with the patch applied
 16:44 <driftx> O_o
 16:44 <thobbs> I'm just stopping a ccm node, clearing it, then starting with replace_address (auto_bootstrap = true, not a seed, initial_tokens is null)
 16:45 <driftx> oh, I'm stupid, hang on
 16:47 <rcoli> is the sum of that that replace_* is still broken in 2.0 ?
 16:47 <rcoli> err, 1.2?
 16:48 <driftx> thobbs: updated the patch
 16:48 <thobbs> rcoli: only for replacing the same address
 16:48 <rcoli> is there another case?
 16:49 <driftx> replacing with a different address.
 16:49 <rcoli> oh, right, _address_
 16:49 <rcoli> I'm still modeling this as replace _token_ in my brain
 16:49 <driftx> same address never broke for me though, so you can probably just retry
 16:55 <thobbs> can we add a dtest for replace_address coverage?  It's kind of annoying to test manually and we've managed to break it a few times
 16:56 <thobbs> I have a PR against ccm open to add replace_address support: https://github.com/pcmanus/ccm/pull/85
 16:57 <driftx> I could have sworn we had one
 16:58 <driftx> we do but it's using replace_token so probably not even running now
 16:58 <thobbs> yeah
 16:58 <thobbs> it would be nice to cover replacing the same address, another address, and expected failures like replacing a still-live node
 16:59 <driftx> +1
 {noformat}
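 For context, the manual procedure the proposed dtest would automate hinges on a single JVM flag set on the (wiped) replacement node before startup; the address below is an example:

```
# conf/cassandra-env.sh on the replacement node
JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=10.0.0.3"
```

 A same-address replace points the flag at the node's own former IP; a different-address replace points it at the dead node's IP.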





[jira] [Created] (CASSANDRA-7141) Expand secondary_indexes_test for secondary indexes on sets and maps

2014-05-02 Thread Shawn Kumar (JIRA)
Shawn Kumar created CASSANDRA-7141:
--

 Summary: Expand secondary_indexes_test for secondary indexes on 
sets and maps
 Key: CASSANDRA-7141
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7141
 Project: Cassandra
  Issue Type: Test
Reporter: Shawn Kumar
Assignee: Shawn Kumar
Priority: Minor
 Fix For: 2.1 rc1


Secondary_indexes_test.py currently only checks the functionality of secondary indexes on lists. This should be expanded to all collections, including maps and sets.


