[jira] [Commented] (CASSANDRA-16120) Add ability for jvm-dtest to grep instance logs

2020-10-06 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209300#comment-17209300
 ] 

Yifan Cai commented on CASSANDRA-16120:
---

Realized that the {{FileLogAction.java}} file is missing the license text 
header in all branches including trunk... 
Otherwise, LGTM and +1 on the backports. 

> Add ability for jvm-dtest to grep instance logs
> ---
>
> Key: CASSANDRA-16120
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16120
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/java
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
> Fix For: NA
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> One of the main gaps between python dtest and jvm dtest is python dtest 
> supports the ability to grep the logs of an instance; we need this capability 
> as some tests require validating logs were triggered.
> Pydocs for common log methods 
> {code}
> |  grep_log(self, expr, filename='system.log', from_mark=None)
> |  Returns a list of lines matching the regular expression in parameter
> |  in the Cassandra log of this node
> |
> |  grep_log_for_errors(self, filename='system.log')
> |  Returns a list of errors with stack traces
> |  in the Cassandra log of this node
> |
> |  grep_log_for_errors_from(self, filename='system.log', seek_start=0)
> {code}
> {code}
> |  watch_log_for(self, exprs, from_mark=None, timeout=600, process=None, 
> verbose=False, filename='system.log')
> |  Watch the log until one or more (regular) expressions are found.
> |  This method returns when all the expressions have been found or the method
> |  times out (a TimeoutError is then raised). On successful completion,
> |  a list of pairs (line matched, match object) is returned.
> {code}
> Below is a POC showing a way to do such logic
> {code}
> package org.apache.cassandra.distributed.test;
>
> import java.io.BufferedReader;
> import java.io.FileInputStream;
> import java.io.IOException;
> import java.io.InputStreamReader;
> import java.io.UncheckedIOException;
> import java.nio.charset.StandardCharsets;
> import java.util.Iterator;
> import java.util.Spliterator;
> import java.util.Spliterators;
> import java.util.regex.Matcher;
> import java.util.regex.Pattern;
> import java.util.stream.Stream;
> import java.util.stream.StreamSupport;
>
> import com.google.common.io.Closeables;
> import org.junit.Test;
>
> import org.apache.cassandra.distributed.Cluster;
> import org.apache.cassandra.utils.AbstractIterator;
>
> public class AllTheLogs extends TestBaseImpl
> {
>     @Test
>     public void test() throws IOException
>     {
>         try (final Cluster cluster = init(Cluster.build(1).start()))
>         {
>             // The log location is derived from the system properties set by the test runner
>             String tag = System.getProperty("cassandra.testtag", "cassandra.testtag_IS_UNDEFINED");
>             String suite = System.getProperty("suitename", "suitename_IS_UNDEFINED");
>             String log = String.format("build/test/logs/%s/TEST-%s.log", tag, suite);
>             grep(log, "Enqueuing flush of tables").forEach(l -> System.out.println("I found the thing: " + l));
>         }
>     }
>
>     private static Stream<String> grep(String file, String regex) throws IOException
>     {
>         return grep(file, Pattern.compile(regex));
>     }
>
>     private static Stream<String> grep(String file, Pattern regex) throws IOException
>     {
>         BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8));
>         // Lazily yields each line that contains a match; the reader is closed at end of file or on error
>         Iterator<String> it = new AbstractIterator<String>()
>         {
>             protected String computeNext()
>             {
>                 try
>                 {
>                     String s;
>                     while ((s = reader.readLine()) != null)
>                     {
>                         Matcher m = regex.matcher(s);
>                         if (m.find())
>                             return s;
>                     }
>                     reader.close();
>                     return endOfData();
>                 }
>                 catch (IOException e)
>                 {
>                     Closeables.closeQuietly(reader);
>                     throw new UncheckedIOException(e);
>                 }
>             }
>         };
>         return StreamSupport.stream(Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false);
>     }
> }
> {code}
> And
> {code}
> @Test
>public void test() throws IOException
>{
>try (final Cluster cluster = init(Cluster.build(1).start()))
>{
>String tag = System.getProperty("cassandra.testtag", 
> "cassandra.testtag_IS_UNDEFINED");
>String suite = System.getProperty("suitename", 
> "suitename_IS_UNDEFINED");
> 

[jira] [Commented] (CASSANDRA-15234) Standardise config and JVM parameters

2020-10-06 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209288#comment-17209288
 ] 

Caleb Rackliffe commented on CASSANDRA-15234:
-

bq. In the spirit of expediting 4.0RC release I propose we postpone this to 
4.X, and resume this with high priority earlier in the next release cycle.

I'm more or less in agreement that we push this to 4.x along with 
CASSANDRA-16038, unless some unrelated need for the latter arises.

> Standardise config and JVM parameters
> -
>
> Key: CASSANDRA-15234
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15234
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Config
>Reporter: Benedict Elliott Smith
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0-alpha, 4.0-triage
>
> Attachments: CASSANDRA-15234-3-DTests-JAVA8.txt
>
>
> We have a bunch of inconsistent names and config patterns in the codebase, 
> both from the yaml and JVM properties.  It would be nice to standardise the 
> naming (such as otc_ vs internode_) as well as the provision of values with 
> units - while maintaining perpetual backwards compatibility with the old 
> parameter names, of course.
> For temporal units, I would propose parsing strings with suffixes of:
> {code}
> u|micros(econds?)?
> ms|millis(econds?)?
> s(econds?)?
> m(inutes?)?
> h(ours?)?
> d(ays?)?
> mo(nths?)?
> {code}
> For rate units, I would propose parsing any of the standard {{B/s, KiB/s, 
> MiB/s, GiB/s, TiB/s}}.
> Perhaps for avoiding ambiguity we could not accept bauds {{bs, Mbps}} or 
> powers of 1000 such as {{KB/s}}, given these are regularly used for either 
> their old or new definition e.g. {{KiB/s}}, or we could support them and 
> simply log the value in bytes/s.
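
The suffix grammar proposed above can be sketched as a small parser. This is only an illustration under stated assumptions, not the patch for this ticket: the class name DurationSuffixSketch and the method toMicros are hypothetical, the result is normalised to microseconds purely for convenience, and the months suffix is left out because a month has no fixed length.

```java
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Cassandra: parses "<digits><suffix>" using
// the temporal suffixes listed in the ticket (months deliberately omitted).
public final class DurationSuffixSketch
{
    private static final Pattern DURATION = Pattern.compile(
        "(\\d+)\\s*(u|micros(?:econds?)?|ms|millis(?:econds?)?|s(?:econds?)?|m(?:inutes?)?|h(?:ours?)?|d(?:ays?)?)");

    private DurationSuffixSketch() {}

    /** Returns the duration in microseconds, or throws if the value does not match the grammar. */
    public static long toMicros(String value)
    {
        Matcher m = DURATION.matcher(value.trim());
        if (!m.matches())
            throw new IllegalArgumentException("Unrecognised duration: " + value);

        long amount = Long.parseLong(m.group(1));
        String unit = m.group(2);

        // Order matters: the micro and milli forms must be tested before the
        // single-letter prefixes "s" and "m".
        if (unit.equals("u") || unit.startsWith("micros"))
            return amount;
        if (unit.equals("ms") || unit.startsWith("millis"))
            return TimeUnit.MILLISECONDS.toMicros(amount);
        if (unit.startsWith("s"))
            return TimeUnit.SECONDS.toMicros(amount);
        if (unit.startsWith("m"))
            return TimeUnit.MINUTES.toMicros(amount);
        if (unit.startsWith("h"))
            return TimeUnit.HOURS.toMicros(amount);
        return TimeUnit.DAYS.toMicros(amount);
    }
}
```

For example, DurationSuffixSketch.toMicros("10ms") and DurationSuffixSketch.toMicros("2 seconds") both parse, while a bare number or an unknown suffix is rejected rather than silently assigned a default unit.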



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16195) Fix flaky test test_expiration_overflow_policy_cap - ttl_test.TestTTL

2020-10-06 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209281#comment-17209281
 ] 

Brandon Williams commented on CASSANDRA-16195:
--

Dupe of CASSANDRA-15996?

> Fix flaky test test_expiration_overflow_policy_cap - ttl_test.TestTTL
> -
>
> Key: CASSANDRA-16195
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16195
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: David Capwell
>Priority: Normal
> Fix For: 4.0-beta
>
>
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/622/workflows/adcd463c-156a-43c7-a9bc-7f3e4938dbe8/jobs/3514
> {code}
> >   assert warning, 'Log message should be print for CAP and 
> > CAP_NOWARN policy'
> E   AssertionError: Log message should be print for CAP and 
> CAP_NOWARN policy
> E   assert []
> ttl_test.py:410: AssertionError
> {code}






[jira] [Updated] (CASSANDRA-14157) [DTEST] [TRUNK] test_tracing_does_not_interfere_with_digest_calculation - cql_tracing_test.TestCqlTracing failed once : AssertionError: assert 0 == 1

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-14157:
-
Fix Version/s: (was: 4.0-triage)

> [DTEST] [TRUNK] test_tracing_does_not_interfere_with_digest_calculation - 
> cql_tracing_test.TestCqlTracing failed once : AssertionError: assert 0 == 1
> -
>
> Key: CASSANDRA-14157
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14157
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Michael Kjellman
>Assignee: Adam Holmberg
>Priority: Normal
>  Labels: dtest
> Fix For: 4.0-beta3
>
>
> test_tracing_does_not_interfere_with_digest_calculation - 
> cql_tracing_test.TestCqlTracing failed its assertion once today in a 
> circleci run. The dtests were running against trunk.
> Although it has failed once so far, a quick read of the comments in the test 
> seems to indicate that the assertion failing this way might mean that 
> CASSANDRA-13964 didn't fully fix the issue.
> {code:python}
> if jmx.has_mbean(rr_count):
> # expect 0 digest mismatches
> >   assert 0 == jmx.read_attribute(rr_count, 'Count')
> E   AssertionError: assert 0 == 1
> E+  where 1 =   0x7f62d4156898>>('org.apache.cassandra.metrics:type=ReadRepair,name=RepairedBlocking',
>  'Count')
> E+where  > = 
> .read_attribute
> {code}






[jira] [Updated] (CASSANDRA-14030) disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: Missing: ['127.0.0.5.* now UP']:

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-14030:
-
Fix Version/s: (was: 4.0-triage)

> disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: 
> Missing: ['127.0.0.5.* now UP']:
> ---
>
> Key: CASSANDRA-14030
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14030
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Testing
>Reporter: Michael Kjellman
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: 
> Missing: ['127.0.0.5.* now UP']:
> {code}
> 15 Nov 2017 11:28:03 [node4] Missing: ['127.0.0.5.* now UP']:
> .
> See system.log for remainder
>  >> begin captured logging << 
> dtest: DEBUG: cluster ccm directory: /tmp/dtest-NZzhNb
> dtest: DEBUG: Done setting configuration options:
> {   'initial_token': None,
> 'num_tokens': '32',
> 'phi_convict_threshold': 5,
> 'range_request_timeout_in_ms': 1,
> 'read_request_timeout_in_ms': 1,
> 'request_timeout_in_ms': 1,
> 'truncate_request_timeout_in_ms': 1,
> 'write_request_timeout_in_ms': 1}
> - >> end captured logging << -
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/cassandra/cassandra-dtest/disk_balance_test.py", line 44, in 
> disk_balance_bootstrap_test
> node5.start(wait_for_binary_proto=True)
>   File 
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 706, in start
> node.watch_log_for_alive(self, from_mark=mark)
>   File 
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 520, in watch_log_for_alive
> self.watch_log_for(tofind, from_mark=from_mark, timeout=timeout, 
> filename=filename)
>   File 
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 488, in watch_log_for
> raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " 
> [" + self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + 
> reads[:50] + ".\nSee {} for remainder".format(filename))
> "15 Nov 2017 11:28:03 [node4] Missing: ['127.0.0.5.* now UP']:\n.\nSee 
> system.log for remainder\n >> begin captured logging << 
> \ndtest: DEBUG: cluster ccm directory: 
> /tmp/dtest-NZzhNb\ndtest: DEBUG: Done setting configuration options:\n{   
> 'initial_token': None,\n'num_tokens': '32',\n'phi_convict_threshold': 
> 5,\n'range_request_timeout_in_ms': 1,\n
> 'read_request_timeout_in_ms': 1,\n'request_timeout_in_ms': 1,\n   
>  'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': 
> 1}\n- >> end captured logging << 
> -"
> {code}






[jira] [Updated] (CASSANDRA-14688) Update protocol spec and class level doc with protocol checksumming details

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-14688:
-
Fix Version/s: (was: 4.0-triage)

> Update protocol spec and class level doc with protocol checksumming details
> ---
>
> Key: CASSANDRA-14688
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14688
> Project: Cassandra
>  Issue Type: Task
>  Components: Legacy/Documentation and Website
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Normal
>  Labels: protocolv5
> Fix For: 4.0-rc
>
>
> CASSANDRA-13304 provides an option to add checksumming to the frame body of 
> native protocol messages. The native protocol spec needs to be updated to 
> reflect this ASAP. We should also verify that the javadoc comments describing 
> the on-wire format in 
> {{o.a.c.transport.frame.checksum.ChecksummingTransformer}} are up to date.






[jira] [Updated] (CASSANDRA-14793) Improve system table handling when losing a disk when using JBOD

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-14793:
-
Fix Version/s: (was: 4.0-triage)

> Improve system table handling when losing a disk when using JBOD
> 
>
> Key: CASSANDRA-14793
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14793
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Marcus Eriksson
>Assignee: Benjamin Lerer
>Priority: Normal
> Fix For: 4.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We should improve the way we handle disk failures when losing a disk in a 
> JBOD setup.
> One way could be to pin the system tables to a special data directory.






[jira] [Updated] (CASSANDRA-14753) Document incremental repair session timeouts and repair_admin usage

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-14753:
-
Fix Version/s: (was: 4.0-triage)

> Document incremental repair session timeouts and repair_admin usage
> ---
>
> Key: CASSANDRA-14753
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14753
> Project: Cassandra
>  Issue Type: Task
>  Components: Legacy/Documentation and Website
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Low
> Fix For: 4.0
>
>
> As seen in CASSANDRA-14685, the behavior of incremental repair sessions with 
> failed streams is not obvious and appears to be a bug (although it's working 
> as expected). The incremental repair documentation should be updated to 
> describe what happens if an incremental repair session fails mid-stream, the 
> session timeouts, and how and when to use nodetool repair_admin. The sstable 
> acquisition error should also be updated to mention this.






[jira] [Updated] (CASSANDRA-15158) Wait for schema agreement rather than in flight schema requests when bootstrapping

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-15158:
-
Fix Version/s: (was: 4.0-triage)

> Wait for schema agreement rather than in flight schema requests when 
> bootstrapping
> --
>
> Key: CASSANDRA-15158
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15158
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip, Cluster/Schema
>Reporter: Vincent White
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0-beta
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently when a node is bootstrapping we use a set of latches 
> (org.apache.cassandra.service.MigrationTask#inflightTasks) to keep track of 
> in-flight schema pull requests, and we don't proceed with 
> bootstrapping/streaming until all the latches are released (or we time out 
> waiting for each one). One issue with this is that if we have a large schema, 
> or the retrieval of the schema from the other nodes was unexpectedly slow 
> then we have no explicit check in place to ensure we have actually received a 
> schema before we proceed.
> While it's possible to increase "migration_task_wait_in_seconds" to force the 
> node to wait on each latch longer, there are cases where this doesn't help 
> because the callbacks for the schema pull requests have expired off the 
> messaging service's callback map 
> (org.apache.cassandra.net.MessagingService#callbacks) after 
> request_timeout_in_ms (default 10 seconds) before the other nodes were able 
> to respond to the new node.
> This patch checks for schema agreement between the bootstrapping node and the 
> rest of the live nodes before proceeding with bootstrapping. It also adds a 
> check to prevent the new node from flooding existing nodes with simultaneous 
> schema pull requests as can happen in large clusters.
> Removing the latch system should also prevent new nodes in large clusters 
> getting stuck for extended amounts of time as they wait 
> `migration_task_wait_in_seconds` on each of the latches left orphaned by the 
> timed out callbacks.
>  
> ||3.11||
> |[PoC|https://github.com/apache/cassandra/compare/cassandra-3.11...vincewhite:check_for_schema]|
> |[dtest|https://github.com/apache/cassandra-dtest/compare/master...vincewhite:wait_for_schema_agreement]|
>  
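
The agreement check described above can be pictured as a bounded polling loop. The sketch below is an assumption-laden illustration, not the code from the linked PoC: SchemaAgreementSketch, waitForAgreement and both suppliers are hypothetical names, and how the local and peer schema versions are actually obtained (gossip state, for instance) is left to the caller.

```java
import java.util.Collection;
import java.util.UUID;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper, not part of Cassandra: polls until every live peer
// reports the same schema version as the local node, or the timeout elapses.
public final class SchemaAgreementSketch
{
    private SchemaAgreementSketch() {}

    public static boolean waitForAgreement(Supplier<UUID> localVersion,
                                           Supplier<Collection<UUID>> liveNodeVersions,
                                           long timeoutMillis,
                                           long pollMillis)
    {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true)
        {
            UUID mine = localVersion.get();
            // A null local version means no schema has been received yet.
            // Note that an empty peer collection counts as agreement here.
            if (mine != null && liveNodeVersions.get().stream().allMatch(mine::equals))
                return true;

            if (System.nanoTime() - deadline >= 0)
                return false;

            try
            {
                Thread.sleep(pollMillis);
            }
            catch (InterruptedException e)
            {
                // Preserve the interrupt for the caller and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

Unlike waiting on one latch per in-flight request, the loop has a single overall deadline, so an expired callback cannot leave the caller waiting on an orphaned latch.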






[jira] [Updated] (CASSANDRA-14834) Avoid keeping StreamingTombstoneHistogramBuilder.Spool in memory during the whole compaction

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-14834:
-
Fix Version/s: (was: 4.0-triage)

> Avoid keeping StreamingTombstoneHistogramBuilder.Spool in memory during the 
> whole compaction
> 
>
> Key: CASSANDRA-14834
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14834
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Low
> Fix For: 4.0, 4.0-beta
>
>
> Since CASSANDRA-13444 {{StreamingTombstoneHistogramBuilder.Spool}} is 
> allocated to keep around an array with 131072 * 2 * 2 integers *per written 
> sstable* during the whole compaction. With LCS at times creating 1000s of 
> sstables during a compaction it kills the node.
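
To put the figure above in perspective, here is a back-of-the-envelope calculation. It is a sketch only: SpoolFootprintSketch is a hypothetical name, it assumes plain 4-byte Java ints, and it ignores array headers and any other per-object overhead.

```java
// Hypothetical helper, not part of Cassandra: estimates the payload size of
// the spool arrays from the numbers quoted in the ticket.
public final class SpoolFootprintSketch
{
    // 131072 * 2 * 2 integers per written sstable, as stated in the ticket.
    private static final long INTS_PER_SSTABLE = 131072L * 2 * 2;

    private SpoolFootprintSketch() {}

    /** Payload bytes held per written sstable (array headers ignored). */
    public static long bytesPerSSTable()
    {
        return INTS_PER_SSTABLE * Integer.BYTES;
    }

    /** Payload bytes held for the given number of sstables written by one compaction. */
    public static long totalBytes(int sstablesWritten)
    {
        return bytesPerSSTable() * sstablesWritten;
    }
}
```

That works out to 2 MiB per sstable, so a compaction that writes 1000 sstables would retain roughly 2000 MiB for the spools alone.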






[jira] [Updated] (CASSANDRA-15369) Fake row deletions and range tombstones, causing digest mismatch and sstable growth

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-15369:
-
Fix Version/s: (was: 4.0-triage)

> Fake row deletions and range tombstones, causing digest mismatch and sstable 
> growth
> ---
>
> Key: CASSANDRA-15369
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15369
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination, Local/Memtable, Local/SSTable
>Reporter: Benedict Elliott Smith
>Assignee: Zhao Yang
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0-beta
>
>
> As assessed in CASSANDRA-15363, we generate fake row deletions and fake 
> tombstone markers under various circumstances:
>  * If we perform a clustering key query (or select a compact column):
>  * Serving from a {{Memtable}}, we will generate fake row deletions
>  * Serving from an sstable, we will generate fake row tombstone markers
>  * If we perform a slice query, we will generate only fake row tombstone 
> markers for any range tombstone that begins or ends outside of the limit of 
> the requested slice
>  * If we perform a multi-slice or IN query, this will occur for each 
> slice/clustering
> Unfortunately, these different behaviours can lead to very different data 
> stored in sstables until a full repair is run.  When we read-repair, we only 
> send these fake deletions or range tombstones.  A fake row deletion, 
> clustering RT and slice RT, each produces a different digest.  So for each 
> single point lookup we can produce a digest mismatch twice, and until a full 
> repair is run we can encounter an unlimited number of digest mismatches 
> across different overlapping queries.
> Relatedly, this seems a more problematic variant of our atomicity failures 
> caused by our monotonic reads, since RTs can have an atomic effect across (up 
> to) the entire partition, whereas the propagation may happen on an 
> arbitrarily small portion.  If the RT exists on only one node, this could 
> plausibly lead to a fairly problematic scenario if that node fails before the 
> range can be repaired. 
> At the very least, this behaviour can lead to an almost unlimited amount of 
> extraneous data being stored until the range is repaired and compaction 
> happens to overwrite the sub-range RTs and row deletions.






[jira] [Updated] (CASSANDRA-15313) Fix flaky - ChecksummingTransformerTest - org.apache.cassandra.transport.frame.checksum.ChecksummingTransformerTest

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-15313:
-
Fix Version/s: (was: 4.0-triage)

> Fix flaky - ChecksummingTransformerTest - 
> org.apache.cassandra.transport.frame.checksum.ChecksummingTransformerTest
> ---
>
> Key: CASSANDRA-15313
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15313
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Vinay Chella
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: CASSANDRA-15313-hack.patch
>
>
> During the recent runs, this test appears to be flaky.
> Example failure: 
> [https://circleci.com/gh/vinaykumarchella/cassandra/459#tests/containers/94]
> corruptionCausesFailure-compression - 
> org.apache.cassandra.transport.frame.checksum.ChecksummingTransformerTest
> {code:java}
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>   at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
>   at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
>   at org.quicktheories.impl.Precursor.<init>(Precursor.java:17)
>   at 
> org.quicktheories.impl.ConcreteDetachedSource.<init>(ConcreteDetachedSource.java:8)
>   at 
> org.quicktheories.impl.ConcreteDetachedSource.detach(ConcreteDetachedSource.java:23)
>   at org.quicktheories.generators.Retry.generate(CodePoints.java:51)
>   at 
> org.quicktheories.generators.Generate.lambda$intArrays$10(Generate.java:190)
>   at 
> org.quicktheories.generators.Generate$$Lambda$17/1847008471.generate(Unknown 
> Source)
>   at org.quicktheories.core.DescribingGenerator.generate(Gen.java:255)
>   at org.quicktheories.core.Gen.lambda$map$0(Gen.java:36)
>   at org.quicktheories.core.Gen$$Lambda$20/71399214.generate(Unknown 
> Source)
>   at org.quicktheories.core.Gen.lambda$map$0(Gen.java:36)
>   at org.quicktheories.core.Gen$$Lambda$20/71399214.generate(Unknown 
> Source)
>   at org.quicktheories.core.Gen.lambda$mix$10(Gen.java:184)
>   at org.quicktheories.core.Gen$$Lambda$45/802243390.generate(Unknown 
> Source)
>   at org.quicktheories.core.Gen.lambda$flatMap$5(Gen.java:93)
>   at org.quicktheories.core.Gen$$Lambda$48/363509958.generate(Unknown 
> Source)
>   at 
> org.quicktheories.dsl.TheoryBuilder4.lambda$prgnToTuple$12(TheoryBuilder4.java:188)
>   at 
> org.quicktheories.dsl.TheoryBuilder4$$Lambda$40/2003496028.generate(Unknown 
> Source)
>   at org.quicktheories.core.DescribingGenerator.generate(Gen.java:255)
>   at org.quicktheories.core.FilteredGenerator.generate(Gen.java:225)
>   at org.quicktheories.core.Gen.lambda$map$0(Gen.java:36)
>   at org.quicktheories.core.Gen$$Lambda$20/71399214.generate(Unknown 
> Source)
>   at org.quicktheories.impl.Core.generate(Core.java:150)
>   at org.quicktheories.impl.Core.shrink(Core.java:103)
>   at org.quicktheories.impl.Core.run(Core.java:39)
>   at org.quicktheories.impl.TheoryRunner.check(TheoryRunner.java:35)
>   at org.quicktheories.dsl.TheoryBuilder4.check(TheoryBuilder4.java:150)
>   at 
> org.quicktheories.dsl.TheoryBuilder4.checkAssert(TheoryBuilder4.java:162)
>   at 
> org.apache.cassandra.transport.frame.checksum.ChecksummingTransformerTest.corruptionCausesFailure(ChecksummingTransformerTest.java:87)
> {code}






[jira] [Updated] (CASSANDRA-15299) CASSANDRA-13304 follow-up: improve checksumming and compression in protocol v5-beta

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-15299:
-
Fix Version/s: (was: 4.0-triage)

> CASSANDRA-13304 follow-up: improve checksumming and compression in protocol 
> v5-beta
> ---
>
> Key: CASSANDRA-15299
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15299
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Client
>Reporter: Aleksey Yeschenko
>Assignee: Alex Petrov
>Priority: Normal
>  Labels: protocolv5
> Fix For: 4.0-alpha
>
>
> CASSANDRA-13304 made an important improvement to our native protocol: it 
> introduced checksumming/CRC32 to request and response bodies. It’s an 
> important step forward, but it doesn’t cover the entire stream. In 
> particular, the message header is not covered by a checksum or a crc, which 
> poses a correctness issue if, for example, {{streamId}} gets corrupted.
> Additionally, we aren’t quite using CRC32 correctly, in two ways:
> 1. We are calculating the CRC32 of the *decompressed* value instead of 
> computing the CRC32 on the bytes written on the wire - losing the properties 
> of the CRC32. In some cases, due to this sequencing, attempting to decompress 
> a corrupt stream can cause a segfault by LZ4.
> 2. When using CRC32, the CRC32 value is written in the incorrect byte order, 
> also losing some of the protections.
> See https://users.ece.cmu.edu/~koopman/pubs/KoopmanCRCWebinar9May2012.pdf for 
> an explanation of the two points above.
> Separately, there are some long-standing issues with the protocol - since 
> *way* before CASSANDRA-13304. Importantly, both checksumming and compression 
> operate on individual message bodies rather than frames of multiple complete 
> messages. In reality, this has several important additional downsides. To 
> name a couple:
> # For compression, we are getting poor compression ratios for smaller 
> messages - when operating on tiny sequences of bytes. In reality, for most 
> small requests and responses we are discarding the compressed value as it’d 
> be smaller than the uncompressed one - incurring both redundant allocations 
> and compressions.
> # For checksumming and CRC32 we pay a high overhead price for small messages. 
> 4 bytes extra is *a lot* for an empty write response, for example.
> To address the correctness issue of {{streamId}} not being covered by the 
> checksum/CRC32 and the inefficiency in compression and checksumming/CRC32, we 
> should switch to a framing protocol with multiple messages in a single frame.
> I suggest we reuse the framing protocol recently implemented for internode 
> messaging in CASSANDRA-15066 to the extent that its logic can be borrowed, 
> and that we do it before native protocol v5 graduates from beta. See 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/FrameDecoderCrc.java
>  and 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/FrameDecoderLZ4.java.
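
The first CRC32 point above (checksum the bytes that go on the wire, and verify them before decompressing) can be sketched as follows. This is a minimal illustration and not the protocol's actual framing: WireChecksumSketch and its methods are hypothetical, and java.util.zip's Deflater/Inflater stand in for LZ4 only so that the example needs nothing beyond the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.CRC32;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not part of Cassandra. Deflate is used in place of LZ4
// purely to keep the sketch self-contained.
public final class WireChecksumSketch
{
    private WireChecksumSketch() {}

    public static byte[] compress(byte[] input)
    {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        while (!deflater.finished())
            out.write(buffer, 0, deflater.deflate(buffer));
        deflater.end();
        return out.toByteArray();
    }

    public static byte[] decompress(byte[] input)
    {
        Inflater inflater = new Inflater();
        try
        {
            inflater.setInput(input);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[256];
            while (!inflater.finished())
            {
                int n = inflater.inflate(buffer);
                // No progress and nothing left to read means the stream is truncated.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary()))
                    throw new IllegalArgumentException("Truncated compressed data");
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
        catch (DataFormatException e)
        {
            throw new IllegalArgumentException("Corrupt compressed data", e);
        }
        finally
        {
            inflater.end();
        }
    }

    /** CRC32 of exactly the bytes that would be written to the wire. */
    public static long crc(byte[] wireBytes)
    {
        CRC32 crc = new CRC32();
        crc.update(wireBytes, 0, wireBytes.length);
        return crc.getValue();
    }

    /** Verifies the checksum over the wire bytes first, and only then decompresses. */
    public static byte[] receive(byte[] wireBytes, long expectedCrc)
    {
        if (crc(wireBytes) != expectedCrc)
            throw new IllegalStateException("CRC mismatch: refusing to decompress corrupt bytes");
        return decompress(wireBytes);
    }
}
```

A sender would call compress, then crc on the compressed bytes, and transmit both. Because receive rejects a mismatching checksum before the decompressor ever sees the data, corrupt input never reaches the decompression step.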






[jira] [Updated] (CASSANDRA-15536) 4.0 Quality: Components and Test Plans

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-15536:
-
Fix Version/s: (was: 4.0-triage)

> 4.0 Quality: Components and Test Plans
> --
>
> Key: CASSANDRA-15536
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15536
> Project: Cassandra
>  Issue Type: Epic
>  Components: Test/benchmark, Test/dtest/python, Test/fuzz, Test/unit
>Reporter: Josh McKenzie
>Assignee: Josh McKenzie
>Priority: High
> Fix For: 4.0
>
>
> [Source doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#].
> Jira migrated from 
> [cwiki|https://cwiki.apache.org/confluence/display/CASSANDRA/4.0+Quality:+Components+and+Test+Plans]
>  The overarching goal of the 4.0 release is that Cassandra 4.0 should be at a 
> state where major users would run it in production when it is cut. To gain 
> this confidence there are various ongoing testing efforts involving 
> correctness, performance, and ease of use. In this page we try to coordinate 
> and identify blockers for subsystems before we can release 4.0.
> For each component we strive to have shepherds and contributors involved. 
> Shepherds should be committers or knowledgeable component owners and are 
> responsible for driving their blocking tickets to completion and ensuring 
> quality in their claimed area, while contributors have signed up to help 
> verify that subsystem by running tests or contributing fixes. Shepherds also 
> ideally help set testing standards and ensure that we meet a high standard of 
> quality in their claimed area.
> If you are interested in contributing to testing 4.0, please add your name as 
> assignee if you want to drive things, or as reviewer if you just want to 
> participate and review, and get involved in the tracking ticket and the dev 
> list/IRC discussions involving that component.
> h3. Targeted Components / Subsystems
> We've tried to collect some of the major components or subsystems that we 
> want to ensure work properly towards having a great 4.0 release. If you think 
> something is missing please add it. Better yet volunteer to contribute to 
> testing it!
> h4. Internode Messaging
> In 4.0 we're getting a new Netty based inter-node communication system 
> (CASSANDRA-8457). As internode messaging is vital to the correctness and 
> performance of the database we should make sure that all forms (TLS, 
> compressed, low latency, high latency, etc ...) of internode messaging 
> function correctly.
> h4. Test Infrastructure / Automation: Diff Testing
> Diff testing is a form of model-based testing in which two clusters are 
> exhaustively compared to assert identity. To support Apache Cassandra 4.0 
> validation, contributors have developed cassandra-diff. This is a Spark 
> application that distributes the token range over a configurable number of 
> Spark executors, then parallelizes randomized forward and reverse reads with 
> varying paging sizes to read and compare every row present in the cluster, 
> persisting a record of mismatches for investigation. This methodology has 
> been instrumental to identifying data loss, data corruption, and incorrect 
> response issues introduced in early Cassandra 3.0 releases.
> cassandra-diff and associated documentation can be found at: 
> [https://github.com/apache/cassandra-diff]. Contributors are encouraged to 
> run diff tests against clusters they manage and report issues to ensure 
> workload diversity across the project.
> h4. System Tables and Internal Schema
> This task covers a review of and minor bug fixes to local and distributed 
> system keyspaces. Planned work in this area is now complete.
> h4. Source Audit and Performance Testing: Streaming
> This task covers an audit of the Streaming implementation in Apache Cassandra 
> 4.0. In this release, contributors have implemented full-SSTable streaming to 
> improve performance and reduce memory pressure. Internode messaging changes 
> implemented in CASSANDRA-15066 adjacent to streaming suggested that review of 
> the streaming implementation itself may be desirable. Prior work also covered 
> performance testing of full-SSTable streaming.
> h4. Test Infrastructure / Automation: "Harry"
> CASSANDRA-15348 - Harry: generator library and extensible framework for fuzz 
> testing Apache Cassandra TRIAGE NEEDED
> Harry is a component for fuzz testing and verification of Apache 
> Cassandra clusters at scale. Harry allows running tests that can 
> validate the state of both dense nodes (to test the local read-write path) and 
> large clusters (to test the distributed read-write path), and do so efficiently. Harry 
> defines a model that holds the state of the database, generators that produce 
> reproducible, 

[jira] [Updated] (CASSANDRA-15865) Flaky dtest hintedhandoff_test.py::TestHintedHandoffConfig::test_hintedhandoff_setmaxwindow

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-15865:
-
Fix Version/s: (was: 4.0-triage)

> Flaky dtest 
> hintedhandoff_test.py::TestHintedHandoffConfig::test_hintedhandoff_setmaxwindow
> ---
>
> Key: CASSANDRA-15865
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15865
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Sam Tunnicliffe
>Assignee: Charles Attwood Thomas
>Priority: Normal
> Fix For: 4.0-beta
>
>
> I've seen this fail a couple of times under JDK11, when it doesn't appear to 
> be related to the changes under test.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15962) Digest for some queries is different depending whether the data are retrieved from sstable or memtable

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-15962:
-
Fix Version/s: (was: 4.0-triage)

> Digest for some queries is different depending whether the data are retrieved 
> from sstable or memtable
> --
>
> Key: CASSANDRA-15962
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15962
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination
>Reporter: Jacek Lewandowski
>Assignee: Jacek Lewandowski
>Priority: Normal
> Fix For: 4.0, 3.11.x
>
> Attachments: DigestTest.java
>
>
> Not sure which category I should assign this ticket to.
>  
> Basically, when reading with certain column filters, the digest differs 
> depending on whether we read from an sstable or a memtable. This happens on the 
> {{trunk}} and {{cassandra-3.11}} branches. However, it works properly on the 
> {{cassandra-3.0}} branch.
>  
> I'm attaching a simple test for trunk to demonstrate what I mean. 
>  
> Please verify my test and my conclusions
>  






[jira] [Updated] (CASSANDRA-15903) Doc update: stream-entire-sstable supports all compaction strategies and internode encryption

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-15903:
-
Fix Version/s: (was: 4.0-triage)

> Doc update: stream-entire-sstable supports all compaction strategies and 
> internode encryption
> -
>
> Key: CASSANDRA-15903
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15903
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation/Website
>Reporter: Zhao Yang
>Priority: Normal
> Fix For: 4.0
>
>
> As [~mck] pointed out, the docs need to be updated for CASSANDRA-15657 and 
> CASSANDRA-15740.






[jira] [Updated] (CASSANDRA-15997) TestBootstrap::test_cleanup failing on unexpected number of SSTables

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-15997:
-
Fix Version/s: (was: 4.0-triage)

> TestBootstrap::test_cleanup failing on unexpected number of SSTables
> 
>
> Key: CASSANDRA-15997
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15997
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Caleb Rackliffe
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 4.0-beta
>
>
> This failure has now occurred in a number of places on trunk (4.0), including 
> both Java 8 and 11 dtest runs. Nominally, there appear to be more SSTables 
> after cleanup than the test is expecting.
> {noformat}
> if len(sstables) > basecount + jobs:
>     logger.debug("Current count is {}, basecount was {}".format(len(sstables), basecount))
>     failed.set()
> {noformat}
> Examples:
> https://app.circleci.com/pipelines/github/maedhroz/cassandra/92/workflows/c59be4f8-329e-4d76-9c59-d49c38e58dd2/jobs/448
> https://app.circleci.com/pipelines/github/jolynch/cassandra/20/workflows/9d6c3b86-6207-4ead-aa4b-79022fc84182/jobs/893






[jira] [Updated] (CASSANDRA-15996) Fix flaky python dtest test_expiration_overflow_policy_capnowarn - ttl_test.TestTTL

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-15996:
-
Fix Version/s: (was: 4.0-triage)

> Fix flaky python dtest test_expiration_overflow_policy_capnowarn - 
> ttl_test.TestTTL
> ---
>
> Key: CASSANDRA-15996
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15996
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: David Capwell
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/361/workflows/3a42fa45-1f60-4c95-86a4-15a6773e384e/jobs/1860
> {code}
> >   assert warning, 'Log message should be print for CAP and CAP_NOWARN policy'
> E   AssertionError: Log message should be print for CAP and CAP_NOWARN policy
> E   assert []
> {code}






[jira] [Updated] (CASSANDRA-16048) Safely Ignore Compact Storage Tables Where Users Have Defined Clustering and Value Columns

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-16048:
-
Fix Version/s: (was: 4.0-triage)

> Safely Ignore Compact Storage Tables Where Users Have Defined Clustering and 
> Value Columns
> --
>
> Key: CASSANDRA-16048
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16048
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/CQL
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Some compact storage tables, specifically those where the user has defined 
> both at least one clustering column and the value column, can be safely handled 
> in 4.0 because, besides the DENSE flag, they are not materially different post-3.0 
> and there is no visible change to the user-facing schema after dropping 
> compact storage. We can detect this case and allow these tables to silently 
> drop the DENSE flag while still throwing a start-up error for COMPACT STORAGE 
> tables that don’t meet the criteria. 
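
The rule described in the ticket can be sketched roughly as below. This is only an illustration under stated assumptions: the class CompactStorageCheckSketch, the TableInfo holder, and both method names are hypothetical and are not Cassandra's schema API. The sketch only encodes the stated criteria: at least one clustering column plus a user-defined value column means the DENSE flag can be dropped silently, and any other COMPACT STORAGE table fails at start-up.

```java
public class CompactStorageCheckSketch {
    /** Hypothetical, simplified view of the schema facts the ticket talks about. */
    public static final class TableInfo {
        final String name;
        final boolean compactStorage;
        final int clusteringColumns;
        final boolean userDefinedValueColumn;

        public TableInfo(String name, boolean compactStorage,
                         int clusteringColumns, boolean userDefinedValueColumn) {
            this.name = name;
            this.compactStorage = compactStorage;
            this.clusteringColumns = clusteringColumns;
            this.userDefinedValueColumn = userDefinedValueColumn;
        }
    }

    /** True when a compact storage table meets the criteria for silently dropping DENSE. */
    public static boolean canSilentlyDropDense(TableInfo t) {
        return t.compactStorage && t.clusteringColumns >= 1 && t.userDefinedValueColumn;
    }

    /** Start-up validation: only compact storage tables outside the criteria are rejected. */
    public static void validateOnStartup(TableInfo t) {
        if (t.compactStorage && !canSilentlyDropDense(t)) {
            throw new IllegalStateException(
                "Table " + t.name + " uses COMPACT STORAGE and cannot be handled automatically");
        }
    }
}
```

Tables that do not use compact storage pass validation untouched in this sketch, since the ticket only concerns COMPACT STORAGE tables.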






[jira] [Updated] (CASSANDRA-16061) transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-16061:
-
Fix Version/s: (was: 4.0-triage)

> transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_and_cleanup
> 
>
> Key: CASSANDRA-16061
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16061
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Failing here, also locally:
> [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/312/workflows/da4ce69c-e778-467e-b9f3-27ab166a8321/jobs/1945]






[jira] [Updated] (CASSANDRA-16074) Add metric for client concurrent byte throttle

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-16074:
-
Fix Version/s: (was: 4.0-triage)

> Add metric for client concurrent byte throttle
> --
>
> Key: CASSANDRA-16074
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16074
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Messaging/Client, Observability/Metrics
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Add a metric to expose the current bytes, and the bytes per IP, used by the 
> existing throttle, so that it is possible to determine what to set it to.
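
As a rough sketch of the kind of counters such a metric could read from, assuming a throttle that accounts for bytes when a request is admitted and when it completes: the class InflightBytesTracker and its method names are hypothetical, and this is not the code or the metric naming used in Cassandra.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class InflightBytesTracker {
    // Bytes currently in flight across all client connections.
    private final AtomicLong globalBytes = new AtomicLong();
    // Bytes currently in flight per client address.
    private final Map<String, AtomicLong> bytesPerIp = new ConcurrentHashMap<>();

    /** Called when a request of the given size is admitted by the throttle. */
    public void acquire(String ip, long bytes) {
        globalBytes.addAndGet(bytes);
        bytesPerIp.computeIfAbsent(ip, k -> new AtomicLong()).addAndGet(bytes);
    }

    /** Called when the request completes and its bytes are released. */
    public void release(String ip, long bytes) {
        globalBytes.addAndGet(-bytes);
        AtomicLong perIp = bytesPerIp.get(ip);
        if (perIp != null) {
            perIp.addAndGet(-bytes);
        }
    }

    /** Gauge value: bytes in flight across all clients. */
    public long currentBytes() {
        return globalBytes.get();
    }

    /** Gauge value: bytes in flight for one client address (0 if never seen). */
    public long currentBytesForIp(String ip) {
        AtomicLong perIp = bytesPerIp.get(ip);
        return perIp == null ? 0L : perIp.get();
    }
}
```

Watching these two values under real load is what would let an operator judge how close clients come to the configured limits before changing them.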






[jira] [Updated] (CASSANDRA-16113) Consolidate dead nodes check in force repair

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-16113:
-
Fix Version/s: (was: 4.0-triage)

> Consolidate dead nodes check in force repair
> 
>
> Key: CASSANDRA-16113
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16113
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 4.0-beta
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> The check for dead nodes during force repair is duplicated in the normal and 
> incremental repair paths. We could consolidate those two checks to make the 
> code more DRY. 
> The check should throw a more meaningful error message to indicate that all 
> neighbor nodes are down, instead of "java.lang.IllegalArgumentException: 
> Endpoints can not be empty"
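
One possible shape for the consolidated check, shown only as a sketch: the class ForceRepairSketch, the method filterLiveNeighbors, the use of plain strings for endpoints, and the wording of the error message are all assumptions made for illustration, not the change made in the ticket. The idea is that both the normal and the incremental repair paths call a single helper, which fails early with a descriptive message when no live neighbor remains.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Predicate;

public class ForceRepairSketch {
    /**
     * Drops dead endpoints from the neighbor set. Throws with a descriptive
     * message when no live neighbor is left, instead of letting an empty set
     * reach later code that rejects it with a generic error.
     */
    public static Set<String> filterLiveNeighbors(Set<String> neighbors, Predicate<String> isAlive) {
        Set<String> live = new LinkedHashSet<>();
        for (String endpoint : neighbors) {
            if (isAlive.test(endpoint)) {
                live.add(endpoint);
            }
        }
        if (live.isEmpty()) {
            throw new IllegalArgumentException(
                "Cannot run force repair: all neighbor nodes are down: " + neighbors);
        }
        return live;
    }
}
```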






[jira] [Updated] (CASSANDRA-16135) Separate in-JVM test into smaller packages

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-16135:
-
Fix Version/s: (was: 4.0-triage)

> Separate in-JVM test into smaller packages
> --
>
> Key: CASSANDRA-16135
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16135
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/java
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: High
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.0-beta
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Introduce a structure similar to how tags are organised in Cassandra Jira for 
> corresponding in-jvm dtests to help people find the right place for their tests.






[jira] [Updated] (CASSANDRA-16143) Streaming fails when s SSTable writer finish() exceeds internode_tcp_user_timeout

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-16143:
-
Fix Version/s: (was: 4.0-triage)

> Streaming fails when s SSTable writer finish() exceeds 
> internode_tcp_user_timeout
> -
>
> Key: CASSANDRA-16143
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16143
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0-beta
>
>
> tl;dr The internode TCP user timeout, which provides more responsive detection 
> of dead nodes for internode messaging, will cause streaming to fail if system 
> calls to fsync/fdatasync exceed the timeout (default 30s).
> To work around this, explicitly set internode_tcp_user_timeout to a value longer 
> than the fsync/fdatasync calls take, or to zero to revert to the operating system default.
> Details:
> While bootstrapping a replacement 4.0beta3 node in an existing cluster, 
> bootstrap streaming repeatedly failed with the streaming follower logging
> {code:java}
> ERROR 2020-09-10T14:29:34,711 [NettyStreaming-Outbound-1.1.1.1.7000:1] 
> org.apache.cassandra.streaming.StreamSession:693 - [Stream 
> #7cb67c00-f3ac-11ea-b940-f7836f164528] Streaming error occurred on session 
> with peer 1.1.1.1:7000
> org.apache.cassandra.net.AsyncChannelOutputPlus$FlushException: The channel 
> this output stream was writing to has been closed
>at 
> org.apache.cassandra.net.AsyncChannelOutputPlus.propagateFailedFlush(AsyncChannelOutputPlus.java:200)
>at 
> org.apache.cassandra.net.AsyncChannelOutputPlus.waitUntilFlushed(AsyncChannelOutputPlus.java:158)
>at 
> org.apache.cassandra.net.AsyncChannelOutputPlus.waitForSpace(AsyncChannelOutputPlus.java:140)
>at 
> org.apache.cassandra.net.AsyncChannelOutputPlus.beginFlush(AsyncChannelOutputPlus.java:97)
>at 
> org.apache.cassandra.net.AsyncStreamingOutputPlus.lambda$writeToChannel$0(AsyncStreamingOutputPlus.java:142)
>at 
> org.apache.cassandra.db.streaming.CassandraCompressedStreamWriter.lambda$write$0(CassandraCompressedStreamWriter.java:90)
>at 
> org.apache.cassandra.net.AsyncStreamingOutputPlus.writeToChannel(AsyncStreamingOutputPlus.java:138)
>at 
> org.apache.cassandra.db.streaming.CassandraCompressedStreamWriter.write(CassandraCompressedStreamWriter.java:89)
>at 
> org.apache.cassandra.db.streaming.CassandraOutgoingFile.write(CassandraOutgoingFile.java:180)
>at 
> org.apache.cassandra.streaming.messages.OutgoingStreamMessage.serialize(OutgoingStreamMessage.java:87)
>at 
> org.apache.cassandra.streaming.messages.OutgoingStreamMessage$1.serialize(OutgoingStreamMessage.java:45)
>at 
> org.apache.cassandra.streaming.messages.OutgoingStreamMessage$1.serialize(OutgoingStreamMessage.java:34)
>at 
> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:40)
>at 
> org.apache.cassandra.streaming.async.NettyStreamingMessageSender$FileStreamTask.run(NettyStreamingMessageSender.java:347)
>at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
>at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
>at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  [?:?]
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  [?:?]
>at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at java.lang.Thread.run(Thread.java:834) [?:?]
>Suppressed: java.nio.channels.ClosedChannelException
>at 
> org.apache.cassandra.net.AsyncStreamingOutputPlus.doFlush(AsyncStreamingOutputPlus.java:78)
>at 
> org.apache.cassandra.net.AsyncChannelOutputPlus.flush(AsyncChannelOutputPlus.java:229)
>at 
> org.apache.cassandra.net.AsyncChannelOutputPlus.close(AsyncChannelOutputPlus.java:248)
>at 
> org.apache.cassandra.streaming.async.NettyStreamingMessageSender$FileStreamTask.run(NettyStreamingMessageSender.java:348)
>at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
>at java.util.concurrent.FutureTask.run(FutureTask.java:264) 
> [?:?]
>at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  [?:?]
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  [?:?]
>at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)

[jira] [Updated] (CASSANDRA-16144) TLS connections to the storage port on a node without server encryption configured causes java.io.IOException accessing missing keystore

2020-10-06 Thread C. Scott Andreas (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-16144:
-
Fix Version/s: (was: 4.0-triage)

> TLS connections to the storage port on a node without server encryption 
> configured causes java.io.IOException accessing missing keystore
> 
>
> Key: CASSANDRA-16144
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16144
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0-beta
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> If a TLS connection is requested against a node that has all encryption 
> disabled by configuration, for example a node configured with
> {code}
> server_encryption_options: {optional:false, internode_encryption: none}
> {code}
> it logs the following error if no keystore exists for the node.
> {code}
> INFO  [Messaging-EventLoop-3-3] 2020-09-15T14:30:02,952 : - 
> 127.0.0.1:7000->127.0.1.1:7000-URGENT_MESSAGES-[no-channel] failed to connect
> io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection 
> refused: local1-i1/127.0.1.1:7000
> Caused by: java.net.ConnectException: Connection refused
>at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
>at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779) ~[?:?]
>at 
> io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702) 
> ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576) 
> ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at java.lang.Thread.run(Thread.java:834) [?:?]
> WARN  [Messaging-EventLoop-3-9] 2020-09-15T14:30:06,375 : - Failed to 
> initialize a channel. Closing: [id: 0x0746c157, L:/127.0.0.1:7000 - 
> R:/127.0.0.1:59623]
> java.io.IOException: failed to build trust manager store for secure 
> connections
>at 
> org.apache.cassandra.security.SSLFactory.buildKeyManagerFactory(SSLFactory.java:232)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.security.SSLFactory.createNettySslContext(SSLFactory.java:300)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.security.SSLFactory.getOrCreateSslContext(SSLFactory.java:276)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.security.SSLFactory.getOrCreateSslContext(SSLFactory.java:257)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.net.InboundConnectionInitiator$Initializer.initChannel(InboundConnectionInitiator.java:107)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.net.InboundConnectionInitiator$Initializer.initChannel(InboundConnectionInitiator.java:71)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> io.netty.channel.ChannelInitializer.initChannel(ChannelInitializer.java:129) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.ChannelInitializer.handlerAdded(ChannelInitializer.java:112) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.AbstractChannelHandlerContext.callHandlerAdded(AbstractChannelHandlerContext.java:938)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.DefaultChannelPipeline.callHandlerAdded0(DefaultChannelPipeline.java:609)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> 

[jira] [Comment Edited] (CASSANDRA-15249) Add documentation on release lifecycle

2020-10-06 Thread Sumanth Pasupuleti (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209267#comment-17209267
 ] 

Sumanth Pasupuleti edited comment on CASSANDRA-15249 at 10/7/20, 3:10 AM:
--

I do not think there is anything left to do here, given that the content has 
been moved to the wiki


was (Author: sumanth.pasupuleti):
I do not think there is anything left to here, given that the content has been 
moved to the wiki

> Add documentation on release lifecycle
> --
>
> Key: CASSANDRA-15249
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15249
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation/Website
>Reporter: Sumanth Pasupuleti
>Assignee: Sumanth Pasupuleti
>Priority: Normal
> Fix For: 4.0, 4.0-triage
>
> Attachments: release_lifecycle.patch
>
>
> Relevant dev list mail thread: 
> https://lists.apache.org/thread.html/1a768d057d1af5a0f373c4c399a23e65cb04c61bbfff612634b9437c@%3Cdev.cassandra.apache.org%3E
> Cassandra wiki on release lifecycle - 
> https://cwiki.apache.org/confluence/display/CASSANDRA/Release+Lifecycle
> [Legacy - this google doc content is now moved to above cwiki; keeping google 
> doc link here for preserving comments history] Google doc with community 
> collaboration on documenting release lifecycle 
> https://docs.google.com/document/d/1bS6sr-HSrHFjZb0welife6Qx7u3ZDgRiAoENMLYlfz8/edit.






[jira] [Commented] (CASSANDRA-15249) Add documentation on release lifecycle

2020-10-06 Thread Sumanth Pasupuleti (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209267#comment-17209267
 ] 

Sumanth Pasupuleti commented on CASSANDRA-15249:


I do not think there is anything left to here, given that the content has been 
moved to the wiki

> Add documentation on release lifecycle
> --
>
> Key: CASSANDRA-15249
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15249
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation/Website
>Reporter: Sumanth Pasupuleti
>Assignee: Sumanth Pasupuleti
>Priority: Normal
> Fix For: 4.0, 4.0-triage
>
> Attachments: release_lifecycle.patch
>
>
> Relevant dev list mail thread: 
> https://lists.apache.org/thread.html/1a768d057d1af5a0f373c4c399a23e65cb04c61bbfff612634b9437c@%3Cdev.cassandra.apache.org%3E
> Cassandra wiki on release lifecycle - 
> https://cwiki.apache.org/confluence/display/CASSANDRA/Release+Lifecycle
> [Legacy - this google doc content is now moved to above cwiki; keeping google 
> doc link here for preserving comments history] Google doc with community 
> collaboration on documenting release lifecycle 
> https://docs.google.com/document/d/1bS6sr-HSrHFjZb0welife6Qx7u3ZDgRiAoENMLYlfz8/edit.






[jira] [Updated] (CASSANDRA-15472) Read failure due to exception from metrics-core dependency

2020-10-06 Thread Sumanth Pasupuleti (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sumanth Pasupuleti updated CASSANDRA-15472:
---
Fix Version/s: (was: 4.0-triage)

> Read failure due to exception from metrics-core dependency
> --
>
> Key: CASSANDRA-15472
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15472
> Project: Cassandra
>  Issue Type: Bug
>  Components: Dependencies
>Reporter: Sumanth Pasupuleti
>Assignee: Sumanth Pasupuleti
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0-beta
>
>
> Stacktrace
> {code:java}
> Uncaught exception on thread Thread[SharedPool-Worker-27,5,main]: {}
> java.util.NoSuchElementException: null
>   at java.util.concurrent.ConcurrentSkipListMap.firstKey(ConcurrentSkipListMap.java:2053) ~[na:1.8.0_222]
>   at com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:102) ~[metrics-core-2.2.0.jar:na]
>   at com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:81) ~[metrics-core-2.2.0.jar:na]
>   at com.yammer.metrics.core.Histogram.update(Histogram.java:110) ~[metrics-core-2.2.0.jar:na]
>   at com.yammer.metrics.core.Timer.update(Timer.java:198) ~[metrics-core-2.2.0.jar:na]
>   at com.yammer.metrics.core.Timer.update(Timer.java:76) ~[metrics-core-2.2.0.jar:na]
>   at org.apache.cassandra.metrics.LatencyMetrics.addNano(LatencyMetrics.java:108) ~[nf-cassandra-2.1.19.10.jar:2.1.19.10]
>   at org.apache.cassandra.metrics.LatencyMetrics.addNano(LatencyMetrics.java:114) ~[nf-cassandra-2.1.19.10.jar:2.1.19.10]
>   at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1897) ~[nf-cassandra-2.1.19.10.jar:2.1.19.10]
>   at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:353) ~[nf-cassandra-2.1.19.10.jar:2.1.19.10]
>   at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:85) ~[nf-cassandra-2.1.19.10.jar:2.1.19.10]
>   at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47) ~[nf-cassandra-2.1.19.10.jar:2.1.19.10]
>   at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) ~[nf-cassandra-2.1.19.10.jar:2.1.19.10]
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_222]
>   at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) ~[nf-cassandra-2.1.19.10.jar:2.1.19.10]
>   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [nf-cassandra-2.1.19.10.jar:2.1.19.10]
>   at java.lang.Thread.run(Thread.java:748) [na:1.8.0_222]
> {code}
> This [issue|https://github.com/dropwizard/metrics/issues/1278] has been 
> [fixed|https://github.com/dropwizard/metrics/pull/1436] in 
> [v4.0.6|https://github.com/dropwizard/metrics/releases/tag/v4.0.6].
> This is observed on a 2.1.19 cluster, but this would impact pretty much any 
> version of C* since we depend on lower versions of metrics-core that do not 
> have the fix.






[jira] [Commented] (CASSANDRA-16120) Add ability for jvm-dtest to grep instance logs

2020-10-06 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209261#comment-17209261
 ] 

David Capwell commented on CASSANDRA-16120:
---

Backports:

3.11: https://github.com/dcapwell/cassandra/tree/backport/CASSANDRA-16120
3.0: https://github.com/dcapwell/cassandra/tree/backport/CASSANDRA-16120-3.0
2.2: https://github.com/dcapwell/cassandra/tree/backport/CASSANDRA-16120-2.2

> Add ability for jvm-dtest to grep instance logs
> ---
>
> Key: CASSANDRA-16120
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16120
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/java
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
> Fix For: NA
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> One of the main gaps between python dtest and jvm dtest is python dtest 
> supports the ability to grep the logs of an instance; we need this capability 
> as some tests require validating logs were triggered.
> Pydocs for common log methods 
> {code}
> |  grep_log(self, expr, filename='system.log', from_mark=None)
> |  Returns a list of lines matching the regular expression in parameter
> |  in the Cassandra log of this node
> |
> |  grep_log_for_errors(self, filename='system.log')
> |  Returns a list of errors with stack traces
> |  in the Cassandra log of this node
> |
> |  grep_log_for_errors_from(self, filename='system.log', seek_start=0)
> {code}
> {code}
> |  watch_log_for(self, exprs, from_mark=None, timeout=600, process=None, 
> verbose=False, filename='system.log')
> |  Watch the log until one or more (regular) expressions are found.
> |  This method returns when all the expressions have been found or when it
> |  times out (a TimeoutError is then raised). On successful completion,
> |  a list of pairs (line matched, match object) is returned.
> {code}
> Below is a POC showing a way to do such logic
> {code}
> package org.apache.cassandra.distributed.test;
> import java.io.BufferedReader;
> import java.io.FileInputStream;
> import java.io.IOException;
> import java.io.InputStreamReader;
> import java.io.UncheckedIOException;
> import java.nio.charset.StandardCharsets;
> import java.util.Iterator;
> import java.util.Spliterator;
> import java.util.Spliterators;
> import java.util.regex.Matcher;
> import java.util.regex.Pattern;
> import java.util.stream.Stream;
> import java.util.stream.StreamSupport;
> import com.google.common.io.Closeables;
> import org.junit.Test;
> import org.apache.cassandra.distributed.Cluster;
> import org.apache.cassandra.utils.AbstractIterator;
> public class AllTheLogs extends TestBaseImpl
> {
>@Test
>public void test() throws IOException
>{
>try (final Cluster cluster = init(Cluster.build(1).start()))
>{
>String tag = System.getProperty("cassandra.testtag", 
> "cassandra.testtag_IS_UNDEFINED");
>String suite = System.getProperty("suitename", 
> "suitename_IS_UNDEFINED");
>String log = String.format("build/test/logs/%s/TEST-%s.log", tag, 
> suite);
>grep(log, "Enqueuing flush of tables").forEach(l -> 
> System.out.println("I found the thing: " + l));
>}
>}
>private static Stream<String> grep(String file, String regex) throws 
> IOException
>{
>return grep(file, Pattern.compile(regex));
>}
>private static Stream<String> grep(String file, Pattern regex) throws 
> IOException
>{
>BufferedReader reader = new BufferedReader(new InputStreamReader(new 
> FileInputStream(file), StandardCharsets.UTF_8));
>Iterator<String> it = new AbstractIterator<String>()
>{
>protected String computeNext()
>{
>try
>{
>String s;
>while ((s = reader.readLine()) != null)
>{
>Matcher m = regex.matcher(s);
>if (m.find())
>return s;
>}
>reader.close();
>return endOfData();
>}
>catch (IOException e)
>{
>Closeables.closeQuietly(reader);
>throw new UncheckedIOException(e);
>}
>}
>};
>return StreamSupport.stream(Spliterators.spliteratorUnknownSize(it, 
> Spliterator.ORDERED), false);
>}
> }
> {code}
> And
> {code}
> @Test
>public void test() throws IOException
>{
>try (final Cluster cluster = init(Cluster.build(1).start()))
>{
>String tag = System.getProperty("cassandra.testtag", 
> "cassandra.testtag_IS_UNDEFINED");
>   

[jira] [Commented] (CASSANDRA-16120) Add ability for jvm-dtest to grep instance logs

2020-10-06 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209258#comment-17209258
 ] 

David Capwell commented on CASSANDRA-16120:
---

I forgot to backport this, so will do that now...

> Add ability for jvm-dtest to grep instance logs
> ---
>
> Key: CASSANDRA-16120
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16120
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/java
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
> Fix For: NA
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> One of the main gaps between python dtest and jvm dtest is python dtest 
> supports the ability to grep the logs of an instance; we need this capability 
> as some tests require validating logs were triggered.
> Pydocs for common log methods 
> {code}
> |  grep_log(self, expr, filename='system.log', from_mark=None)
> |  Returns a list of lines matching the regular expression in parameter
> |  in the Cassandra log of this node
> |
> |  grep_log_for_errors(self, filename='system.log')
> |  Returns a list of errors with stack traces
> |  in the Cassandra log of this node
> |
> |  grep_log_for_errors_from(self, filename='system.log', seek_start=0)
> {code}
> {code}
> |  watch_log_for(self, exprs, from_mark=None, timeout=600, process=None, 
> verbose=False, filename='system.log')
> |  Watch the log until one or more (regular) expressions are found.
> |  This method returns when all the expressions have been found or when it
> |  times out (a TimeoutError is then raised). On successful completion,
> |  a list of pairs (line matched, match object) is returned.
> {code}
> Below is a POC showing a way to do such logic
> {code}
> package org.apache.cassandra.distributed.test;
> import java.io.BufferedReader;
> import java.io.FileInputStream;
> import java.io.IOException;
> import java.io.InputStreamReader;
> import java.io.UncheckedIOException;
> import java.nio.charset.StandardCharsets;
> import java.util.Iterator;
> import java.util.Spliterator;
> import java.util.Spliterators;
> import java.util.regex.Matcher;
> import java.util.regex.Pattern;
> import java.util.stream.Stream;
> import java.util.stream.StreamSupport;
> import com.google.common.io.Closeables;
> import org.junit.Test;
> import org.apache.cassandra.distributed.Cluster;
> import org.apache.cassandra.utils.AbstractIterator;
> public class AllTheLogs extends TestBaseImpl
> {
>@Test
>public void test() throws IOException
>{
>try (final Cluster cluster = init(Cluster.build(1).start()))
>{
>String tag = System.getProperty("cassandra.testtag", 
> "cassandra.testtag_IS_UNDEFINED");
>String suite = System.getProperty("suitename", 
> "suitename_IS_UNDEFINED");
>String log = String.format("build/test/logs/%s/TEST-%s.log", tag, 
> suite);
>grep(log, "Enqueuing flush of tables").forEach(l -> 
> System.out.println("I found the thing: " + l));
>}
>}
>private static Stream<String> grep(String file, String regex) throws 
> IOException
>{
>return grep(file, Pattern.compile(regex));
>}
>private static Stream<String> grep(String file, Pattern regex) throws 
> IOException
>{
>BufferedReader reader = new BufferedReader(new InputStreamReader(new 
> FileInputStream(file), StandardCharsets.UTF_8));
>Iterator<String> it = new AbstractIterator<String>()
>{
>protected String computeNext()
>{
>try
>{
>String s;
>while ((s = reader.readLine()) != null)
>{
>Matcher m = regex.matcher(s);
>if (m.find())
>return s;
>}
>reader.close();
>return endOfData();
>}
>catch (IOException e)
>{
>Closeables.closeQuietly(reader);
>throw new UncheckedIOException(e);
>}
>}
>};
>return StreamSupport.stream(Spliterators.spliteratorUnknownSize(it, 
> Spliterator.ORDERED), false);
>}
> }
> {code}
> And
> {code}
> @Test
>public void test() throws IOException
>{
>try (final Cluster cluster = init(Cluster.build(1).start()))
>{
>String tag = System.getProperty("cassandra.testtag", 
> "cassandra.testtag_IS_UNDEFINED");
>String suite = System.getProperty("suitename", 
> "suitename_IS_UNDEFINED");
>//TODO missing way to get node id
> //cluster.get(1);
>String log 

[jira] [Updated] (CASSANDRA-16120) Add ability for jvm-dtest to grep instance logs

2020-10-06 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16120:
--
Status: Open  (was: Resolved)

> Add ability for jvm-dtest to grep instance logs
> ---
>
> Key: CASSANDRA-16120
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16120
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/java
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
> Fix For: NA
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> One of the main gaps between python dtest and jvm dtest is python dtest 
> supports the ability to grep the logs of an instance; we need this capability 
> as some tests require validating logs were triggered.
> Pydocs for common log methods 
> {code}
> |  grep_log(self, expr, filename='system.log', from_mark=None)
> |  Returns a list of lines matching the regular expression in parameter
> |  in the Cassandra log of this node
> |
> |  grep_log_for_errors(self, filename='system.log')
> |  Returns a list of errors with stack traces
> |  in the Cassandra log of this node
> |
> |  grep_log_for_errors_from(self, filename='system.log', seek_start=0)
> {code}
> {code}
> |  watch_log_for(self, exprs, from_mark=None, timeout=600, process=None, 
> verbose=False, filename='system.log')
> |  Watch the log until one or more (regular) expressions are found.
> |  This method returns when all the expressions have been found or when it
> |  times out (a TimeoutError is then raised). On successful completion,
> |  a list of pairs (line matched, match object) is returned.
> {code}
> Below is a POC showing a way to do such logic
> {code}
> package org.apache.cassandra.distributed.test;
> import java.io.BufferedReader;
> import java.io.FileInputStream;
> import java.io.IOException;
> import java.io.InputStreamReader;
> import java.io.UncheckedIOException;
> import java.nio.charset.StandardCharsets;
> import java.util.Iterator;
> import java.util.Spliterator;
> import java.util.Spliterators;
> import java.util.regex.Matcher;
> import java.util.regex.Pattern;
> import java.util.stream.Stream;
> import java.util.stream.StreamSupport;
> import com.google.common.io.Closeables;
> import org.junit.Test;
> import org.apache.cassandra.distributed.Cluster;
> import org.apache.cassandra.utils.AbstractIterator;
> public class AllTheLogs extends TestBaseImpl
> {
>@Test
>public void test() throws IOException
>{
>try (final Cluster cluster = init(Cluster.build(1).start()))
>{
>String tag = System.getProperty("cassandra.testtag", 
> "cassandra.testtag_IS_UNDEFINED");
>String suite = System.getProperty("suitename", 
> "suitename_IS_UNDEFINED");
>String log = String.format("build/test/logs/%s/TEST-%s.log", tag, 
> suite);
>grep(log, "Enqueuing flush of tables").forEach(l -> 
> System.out.println("I found the thing: " + l));
>}
>}
>private static Stream<String> grep(String file, String regex) throws 
> IOException
>{
>return grep(file, Pattern.compile(regex));
>}
>private static Stream<String> grep(String file, Pattern regex) throws 
> IOException
>{
>BufferedReader reader = new BufferedReader(new InputStreamReader(new 
> FileInputStream(file), StandardCharsets.UTF_8));
>Iterator<String> it = new AbstractIterator<String>()
>{
>protected String computeNext()
>{
>try
>{
>String s;
>while ((s = reader.readLine()) != null)
>{
>Matcher m = regex.matcher(s);
>if (m.find())
>return s;
>}
>reader.close();
>return endOfData();
>}
>catch (IOException e)
>{
>Closeables.closeQuietly(reader);
>throw new UncheckedIOException(e);
>}
>}
>};
>return StreamSupport.stream(Spliterators.spliteratorUnknownSize(it, 
> Spliterator.ORDERED), false);
>}
> }
> {code}
> And
> {code}
> @Test
>public void test() throws IOException
>{
>try (final Cluster cluster = init(Cluster.build(1).start()))
>{
>String tag = System.getProperty("cassandra.testtag", 
> "cassandra.testtag_IS_UNDEFINED");
>String suite = System.getProperty("suitename", 
> "suitename_IS_UNDEFINED");
>//TODO missing way to get node id
> //cluster.get(1);
>String log = 
> 

[jira] [Updated] (CASSANDRA-16012) sstablesplit unit test hardening

2020-10-06 Thread Yifan Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-16012:
--
Status: Changes Suggested  (was: Review In Progress)

> sstablesplit unit test hardening
> 
>
> Key: CASSANDRA-16012
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16012
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/sstable
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
>  Labels: low-hanging-fruit
> Fix For: 4.0-beta
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
>  
> During CASSANDRA-15883 / CASSANDRA-15991 it was found that unit test coverage 
> for this tool is minimal. There is an existing unit test to build on under 
> {{test/unit/org/apache/cassandra/tools}}.






[jira] [Updated] (CASSANDRA-16012) sstablesplit unit test hardening

2020-10-06 Thread Yifan Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-16012:
--
Reviewers: Yifan Cai, Yifan Cai  (was: Yifan Cai)
   Yifan Cai, Yifan Cai
   Status: Review In Progress  (was: Patch Available)

> sstablesplit unit test hardening
> 
>
> Key: CASSANDRA-16012
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16012
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/sstable
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
>  Labels: low-hanging-fruit
> Fix For: 4.0-beta
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
>  
> During CASSANDRA-15883 / CASSANDRA-15991 it was found that unit test coverage 
> for this tool is minimal. There is an existing unit test to build on under 
> {{test/unit/org/apache/cassandra/tools}}.






[jira] [Updated] (CASSANDRA-15392) Pool Merge Iterators

2020-10-06 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15392:

Resolution: Won't Fix
Status: Resolved  (was: Open)

Closing as won't fix due to effort involved in fixing/quantifying 
MergeIterator.get performance regression relative to gc improvement for 4.0

> Pool Merge Iterators
> 
>
> Key: CASSANDRA-15392
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15392
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0, 4.0-triage
>
>
> By pooling merge iterators, instead of creating new ones each time we need 
> them, we can reduce garbage on the compaction and read paths under relevant 
> workloads by ~4% in many cases.






[jira] [Updated] (CASSANDRA-15387) Reduce compaction & local read path garbage

2020-10-06 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15387:

Resolution: Fixed
Status: Resolved  (was: Open)

> Reduce compaction & local read path garbage
> ---
>
> Key: CASSANDRA-15387
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15387
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0, 4.0-triage
>
>
> There are several opportunities to significantly reduce the amount of garbage 
> generated by compaction and the local read path. This will serve as a top 
> level jira for related changes.






[jira] [Updated] (CASSANDRA-15391) Reduce heap footprint of commonly allocated objects

2020-10-06 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15391:

Resolution: Won't Fix
Status: Resolved  (was: Open)

Closing as won't fix due to effort involved in fixing/quantifying megamorphic 
concerns relative to improvement for 4.0

> Reduce heap footprint of commonly allocated objects
> ---
>
> Key: CASSANDRA-15391
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15391
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0, 4.0-triage
>
>
> BufferCell, BTreeRow, and Clustering make up a significant amount of 
> allocations during reads/compactions, and many of the fields of these classes 
> are often unused. For example, the CellPath reference in BufferCell is only 
> ever used for collection columns. Since we know which fields will and won’t 
> be used during cell creation, we can define specialized classes that only 
> take up heap space for the data they’ll be using. This reduces compaction 
> garbage by up to 4.5%.






[jira] [Updated] (CASSANDRA-15392) Pool Merge Iterators

2020-10-06 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15392:

Status: In Progress  (was: Changes Suggested)

> Pool Merge Iterators
> 
>
> Key: CASSANDRA-15392
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15392
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0, 4.0-triage
>
>
> By pooling merge iterators, instead of creating new ones each time we need 
> them, we can reduce garbage on the compaction and read paths under relevant 
> workloads by ~4% in many cases.






[jira] [Updated] (CASSANDRA-15391) Reduce heap footprint of commonly allocated objects

2020-10-06 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15391:

Status: Open  (was: Patch Available)

> Reduce heap footprint of commonly allocated objects
> ---
>
> Key: CASSANDRA-15391
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15391
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0, 4.0-triage
>
>
> BufferCell, BTreeRow, and Clustering make up a significant amount of 
> allocations during reads/compactions, and many of the fields of these classes 
> are often unused. For example, the CellPath reference in BufferCell is only 
> ever used for collection columns. Since we know which fields will and won’t 
> be used during cell creation, we can define specialized classes that only 
> take up heap space for the data they’ll be using. This reduces compaction 
> garbage by up to 4.5%.






[jira] [Created] (CASSANDRA-16196) Fix flaky test test_disk_balance_after_boundary_change_lcs - disk_balance_test.TestDiskBalance

2020-10-06 Thread David Capwell (Jira)
David Capwell created CASSANDRA-16196:
-

 Summary: Fix flaky test 
test_disk_balance_after_boundary_change_lcs - disk_balance_test.TestDiskBalance
 Key: CASSANDRA-16196
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16196
 Project: Cassandra
  Issue Type: Bug
  Components: Test/dtest/python
Reporter: David Capwell


https://app.circleci.com/pipelines/github/dcapwell/cassandra/622/workflows/adcd463c-156a-43c7-a9bc-7f3e4938dbe8/jobs/3514

{code}
error_message = '' if 'error_message' not in kwargs else 
kwargs['error_message']
assert vmin > vmax * (1.0 - error) or vmin == vmax, \
>   "values not within {:.2f}% of the max: {} ({})".format(error * 100, 
> args, error_message)
E   AssertionError: values not within 10.00% of the max: (8022760, 9192165, 
4575645, 9235566, 9091014) (node2)

tools/assertions.py:206: AssertionError
{code}

Marking as distinct issue after chat in CASSANDRA-14030
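
For reference, the arithmetic behind the failure can be reproduced in a few lines. This is only a sketch: it assumes vmin and vmax are the minimum and maximum of the compared values and that error is 0.10 (as the "10.00%" in the message suggests). The helper name within_error_of_max is made up for illustration and is not the function in tools/assertions.py.

```python
def within_error_of_max(*values, error=0.10):
    """Apply the same comparison as the quoted assert line.

    Hypothetical helper: assumes vmin/vmax are min/max of the inputs.
    Returns (ok, vmin, threshold) so the numbers can be inspected.
    """
    vmax = max(values)
    vmin = min(values)
    threshold = vmax * (1.0 - error)  # smallest value must exceed this
    ok = vmin > threshold or vmin == vmax
    return ok, vmin, threshold


# Values reported for node2 in the AssertionError above.
sizes = (8022760, 9192165, 4575645, 9235566, 9091014)
ok, vmin, threshold = within_error_of_max(*sizes, error=0.10)
print("balanced:", ok, "min:", vmin, "required >", threshold)
```

The smallest value (4575645) is roughly half of the largest (9235566), far below the 90% threshold of about 8312009, so the check fails on that one outlier while the other four values are within range.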






[jira] [Updated] (CASSANDRA-16196) Fix flaky test test_disk_balance_after_boundary_change_lcs - disk_balance_test.TestDiskBalance

2020-10-06 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16196:
--
 Bug Category: Parent values: Correctness(12982)Level 1 values: Test 
Failure(12990)
   Complexity: Normal
Discovered By: Unit Test
Fix Version/s: 4.0-beta
 Severity: Normal
   Status: Open  (was: Triage Needed)

> Fix flaky test test_disk_balance_after_boundary_change_lcs - 
> disk_balance_test.TestDiskBalance
> --
>
> Key: CASSANDRA-16196
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16196
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: David Capwell
>Priority: Normal
> Fix For: 4.0-beta
>
>
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/622/workflows/adcd463c-156a-43c7-a9bc-7f3e4938dbe8/jobs/3514
> {code}
> error_message = '' if 'error_message' not in kwargs else 
> kwargs['error_message']
> assert vmin > vmax * (1.0 - error) or vmin == vmax, \
> >   "values not within {:.2f}% of the max: {} ({})".format(error * 
> > 100, args, error_message)
> E   AssertionError: values not within 10.00% of the max: (8022760, 
> 9192165, 4575645, 9235566, 9091014) (node2)
> tools/assertions.py:206: AssertionError
> {code}
> Marking as distinct issue after chat in CASSANDRA-14030






[jira] [Commented] (CASSANDRA-14030) disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: Missing: ['127.0.0.5.* now UP']:

2020-10-06 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209199#comment-17209199
 ] 

David Capwell commented on CASSANDRA-14030:
---

cool, will open new issue then.

> disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: 
> Missing: ['127.0.0.5.* now UP']:
> ---
>
> Key: CASSANDRA-14030
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14030
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Testing
>Reporter: Michael Kjellman
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta, 4.0-triage
>
>
> disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: 
> Missing: ['127.0.0.5.* now UP']:
> {code}
> 15 Nov 2017 11:28:03 [node4] Missing: ['127.0.0.5.* now UP']:
> .
> See system.log for remainder
>  >> begin captured logging << 
> dtest: DEBUG: cluster ccm directory: /tmp/dtest-NZzhNb
> dtest: DEBUG: Done setting configuration options:
> {   'initial_token': None,
> 'num_tokens': '32',
> 'phi_convict_threshold': 5,
> 'range_request_timeout_in_ms': 1,
> 'read_request_timeout_in_ms': 1,
> 'request_timeout_in_ms': 1,
> 'truncate_request_timeout_in_ms': 1,
> 'write_request_timeout_in_ms': 1}
> - >> end captured logging << -
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/cassandra/cassandra-dtest/disk_balance_test.py", line 44, in 
> disk_balance_bootstrap_test
> node5.start(wait_for_binary_proto=True)
>   File 
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 706, in start
> node.watch_log_for_alive(self, from_mark=mark)
>   File 
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 520, in watch_log_for_alive
> self.watch_log_for(tofind, from_mark=from_mark, timeout=timeout, 
> filename=filename)
>   File 
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 488, in watch_log_for
> raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " 
> [" + self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + 
> reads[:50] + ".\nSee {} for remainder".format(filename))
> "15 Nov 2017 11:28:03 [node4] Missing: ['127.0.0.5.* now UP']:\n.\nSee 
> system.log for remainder\n >> begin captured logging << 
> \ndtest: DEBUG: cluster ccm directory: 
> /tmp/dtest-NZzhNb\ndtest: DEBUG: Done setting configuration options:\n{   
> 'initial_token': None,\n'num_tokens': '32',\n'phi_convict_threshold': 
> 5,\n'range_request_timeout_in_ms': 1,\n
> 'read_request_timeout_in_ms': 1,\n'request_timeout_in_ms': 1,\n   
>  'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': 
> 1}\n- >> end captured logging << 
> -"
> {code}
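
The traceback above ends with watch_log_for raising TimeoutError because the pattern '127.0.0.5.* now UP' never appeared in node4's log. As a rough illustration of that watch-until-found-or-timeout pattern, here is a minimal sketch. It is not ccmlib's implementation; the function name watch_file_for, its parameters, and the polling strategy are all assumptions made for the example, and it expects the log file to exist already.

```python
import re
import time


def watch_file_for(path, exprs, timeout=600.0, poll_interval=0.1):
    """Poll `path` until every regex in `exprs` has matched some line.

    Hypothetical helper, not part of ccmlib. Returns a list of
    (line, match object) pairs; raises TimeoutError if any expression
    is still unmatched when the timeout expires.
    """
    pending = [re.compile(e) for e in exprs]
    found = []
    deadline = time.monotonic() + timeout
    offset = 0  # byte offset of the first line not yet examined

    while True:
        with open(path, "rb") as f:
            f.seek(offset)
            while pending:
                raw = f.readline()
                if not raw.endswith(b"\n"):
                    break  # EOF or a half-written line; retry on next poll
                offset = f.tell()
                line = raw.decode("utf-8", errors="replace").rstrip("\n")
                for pattern in list(pending):
                    m = pattern.search(line)
                    if m:
                        found.append((line, m))
                        pending.remove(pattern)
        if not pending:
            return found
        if time.monotonic() >= deadline:
            missing = [p.pattern for p in pending]
            raise TimeoutError("Missing: " + str(missing))
        time.sleep(poll_interval)
```

A call such as watch_file_for("system.log", [r"127\.0\.0\.5.* now UP"], timeout=60) would block until a matching line is appended or a minute passes. Tracking a byte offset means each poll only reads lines added since the previous one, which loosely mirrors the from_mark idea in the pydocs quoted elsewhere in this thread.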






[jira] [Updated] (CASSANDRA-16194) Add GPG key for jw...@apache.org

2020-10-06 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-16194:
---
Status: Ready to Commit  (was: Changes Suggested)

+1

> Add GPG key for jw...@apache.org
> 
>
> Key: CASSANDRA-16194
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16194
> Project: Cassandra
>  Issue Type: Bug
>  Components: Build
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
> Attachments: jwest-gpg-key.patch
>
>
> I am working on releasing a new version of in-jvm dtest API and need to add 
> my GPG key to the KEYS file.






[jira] [Commented] (CASSANDRA-16012) sstablesplit unit test hardening

2020-10-06 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209189#comment-17209189
 ] 

Yifan Cai commented on CASSANDRA-16012:
---

Left comments inline inside the GH PR. 

> sstablesplit unit test hardening
> 
>
> Key: CASSANDRA-16012
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16012
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/sstable
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
>  Labels: low-hanging-fruit
> Fix For: 4.0-beta
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
>  
> During CASSANDRA-15883 / CASSANDRA-15991 it was found that unit test coverage 
> for this tool is minimal. There is an existing unit test to build on under 
> {{test/unit/org/apache/cassandra/tools}}.






[jira] [Commented] (CASSANDRA-16194) Add GPG key for jw...@apache.org

2020-10-06 Thread Jordan West (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209187#comment-17209187
 ] 

Jordan West commented on CASSANDRA-16194:
-

Updated

> Add GPG key for jw...@apache.org
> 
>
> Key: CASSANDRA-16194
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16194
> Project: Cassandra
>  Issue Type: Bug
>  Components: Build
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
> Attachments: jwest-gpg-key.patch
>
>
> I am working on releasing a new version of in-jvm dtest API and need to add 
> my GPG key to the KEYS file.






[jira] [Updated] (CASSANDRA-16194) Add GPG key for jw...@apache.org

2020-10-06 Thread Jordan West (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan West updated CASSANDRA-16194:

Attachment: jwest-gpg-key.patch

> Add GPG key for jw...@apache.org
> 
>
> Key: CASSANDRA-16194
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16194
> Project: Cassandra
>  Issue Type: Bug
>  Components: Build
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
> Attachments: jwest-gpg-key.patch
>
>
> I am working on releasing a new version of in-jvm dtest API and need to add 
> my GPG key to the KEYS file.






[jira] [Updated] (CASSANDRA-16194) Add GPG key for jw...@apache.org

2020-10-06 Thread Jordan West (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan West updated CASSANDRA-16194:

Attachment: (was: jwest-gpg-key.patch)

> Add GPG key for jw...@apache.org
> 
>
> Key: CASSANDRA-16194
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16194
> Project: Cassandra
>  Issue Type: Bug
>  Components: Build
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
> Attachments: jwest-gpg-key.patch
>
>
> I am working on releasing a new version of in-jvm dtest API and need to add 
> my GPG key to the KEYS file.






[jira] [Commented] (CASSANDRA-15996) Fix flaky python dtest test_expiration_overflow_policy_capnowarn - ttl_test.TestTTL

2020-10-06 Thread Adam Holmberg (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209178#comment-17209178
 ] 

Adam Holmberg commented on CASSANDRA-15996:
---

David encountered a failure in a variant of this test on trunk:
https://app.circleci.com/pipelines/github/dcapwell/cassandra/622/workflows/adcd463c-156a-43c7-a9bc-7f3e4938dbe8/jobs/3514
 

Adding 4.0 fixver back on.

> Fix flaky python dtest test_expiration_overflow_policy_capnowarn - 
> ttl_test.TestTTL
> ---
>
> Key: CASSANDRA-15996
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15996
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: David Capwell
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta, 4.0-triage
>
>
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/361/workflows/3a42fa45-1f60-4c95-86a4-15a6773e384e/jobs/1860
> {code}
> >       assert warning, 'Log message should be print for CAP and CAP_NOWARN policy'
> E       AssertionError: Log message should be print for CAP and CAP_NOWARN policy
> E   assert []
> {code}






[jira] [Updated] (CASSANDRA-15996) Fix flaky python dtest test_expiration_overflow_policy_capnowarn - ttl_test.TestTTL

2020-10-06 Thread Adam Holmberg (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Holmberg updated CASSANDRA-15996:
--
Fix Version/s: 4.0-triage
   4.0-beta







[jira] [Commented] (CASSANDRA-14030) disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: Missing: ['127.0.0.5.* now UP']:

2020-10-06 Thread Adam Holmberg (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209177#comment-17209177
 ] 

Adam Holmberg commented on CASSANDRA-14030:
---

Are you referring to the failure that's similar to CASSANDRA-16089?
If so, we should open a new ticket.

> disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: 
> Missing: ['127.0.0.5.* now UP']:
> ---
>
> Key: CASSANDRA-14030
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14030
> Project: Cassandra
> Issue Type: Bug
> Components: Legacy/Testing
> Reporter: Michael Kjellman
> Assignee: Adam Holmberg
> Priority: Normal
> Fix For: 4.0-beta, 4.0-triage
>
>
> disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: 
> Missing: ['127.0.0.5.* now UP']:
> {code}
> 15 Nov 2017 11:28:03 [node4] Missing: ['127.0.0.5.* now UP']:
> .
> See system.log for remainder
>  >> begin captured logging << 
> dtest: DEBUG: cluster ccm directory: /tmp/dtest-NZzhNb
> dtest: DEBUG: Done setting configuration options:
> {   'initial_token': None,
> 'num_tokens': '32',
> 'phi_convict_threshold': 5,
> 'range_request_timeout_in_ms': 1,
> 'read_request_timeout_in_ms': 1,
> 'request_timeout_in_ms': 1,
> 'truncate_request_timeout_in_ms': 1,
> 'write_request_timeout_in_ms': 1}
> - >> end captured logging << -
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
>     testMethod()
>   File "/home/cassandra/cassandra-dtest/disk_balance_test.py", line 44, in disk_balance_bootstrap_test
>     node5.start(wait_for_binary_proto=True)
>   File "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line 706, in start
>     node.watch_log_for_alive(self, from_mark=mark)
>   File "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line 520, in watch_log_for_alive
>     self.watch_log_for(tofind, from_mark=from_mark, timeout=timeout, filename=filename)
>   File "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line 488, in watch_log_for
>     raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " [" + self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + reads[:50] + ".\nSee {} for remainder".format(filename))
> "15 Nov 2017 11:28:03 [node4] Missing: ['127.0.0.5.* now UP']:\n.\nSee 
> system.log for remainder\n >> begin captured logging << 
> \ndtest: DEBUG: cluster ccm directory: 
> /tmp/dtest-NZzhNb\ndtest: DEBUG: Done setting configuration options:\n{   
> 'initial_token': None,\n'num_tokens': '32',\n'phi_convict_threshold': 
> 5,\n'range_request_timeout_in_ms': 1,\n
> 'read_request_timeout_in_ms': 1,\n'request_timeout_in_ms': 1,\n   
>  'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': 
> 1}\n- >> end captured logging << 
> -"
> {code}
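
The traceback above ends in ccmlib's watch_log_for, which raises TimeoutError when an expected pattern never appears in the log. As a rough illustration only (this is not ccmlib's code; watch_lines_for, its parameters, and the read_lines callable are hypothetical names), the polling-with-timeout pattern might look like this:

```python
import re
import time


def watch_lines_for(read_lines, patterns, timeout=2.0, poll_interval=0.05):
    """Poll read_lines() until every regex in patterns has matched some line.

    Hypothetical helper, not ccmlib's API. read_lines is any callable that
    returns the log lines seen so far. Raises TimeoutError naming the
    patterns that are still missing once the timeout elapses.
    """
    pending = [re.compile(p) for p in patterns]
    matches = []
    deadline = time.monotonic() + timeout
    while True:
        lines = read_lines()
        # Iterate over a copy so matched patterns can be removed safely.
        for regex in list(pending):
            for line in lines:
                m = regex.search(line)
                if m:
                    matches.append((line, m))
                    pending.remove(regex)
                    break
        if not pending:
            return matches
        if time.monotonic() >= deadline:
            missing = [r.pattern for r in pending]
            raise TimeoutError("Missing: " + str(missing))
        time.sleep(poll_interval)
```

With a pattern such as '127.0.0.5.* now UP' and a log that never contains a matching line, a helper of this shape gives up after the timeout and reports the missing pattern, which is the kind of failure the report shows.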






[jira] [Comment Edited] (CASSANDRA-16182) A replacement node, although completed bootstrap and joined ring according to itself, stuck in Joining state as per the peers

2020-10-06 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209174#comment-17209174
 ] 

Paulo Motta edited comment on CASSANDRA-16182 at 10/6/20, 9:27 PM:
---

bq. This doesn't really work if any other node has already replaced C with C'.

C' would not complete replacement if it detects C presence before then.

This wouldn't prevent C from reappearing to other nodes *after* replacement is 
completed, but would at least prevent the reported scenario gracefully.


was (Author: pauloricardomg):
bq. This doesn't really work if any other node has already replaced C with C'.

C' would not complete replacement if it detects C presence before then.

This wouldn't prevent C from reappearing to other nodes *after* replacement is 
completed, but would at least prevent the reported scenario.

> A replacement node, although completed bootstrap and joined ring according to 
> itself, stuck in Joining state as per the peers
> -
>
> Key: CASSANDRA-16182
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16182
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Gossip
> Reporter: Sumanth Pasupuleti
> Assignee: Sumanth Pasupuleti
> Priority: Normal
> Fix For: 3.0.x
>
>
> This issue occurred in a production 3.0.21 cluster.
> Here is what happened:
> # We had, say, a three-node Cassandra cluster with nodes A, B and C.
> # C got "terminated by cloud provider" due to a health check failure, and a 
> replacement node C' was launched.
> # C' started bootstrapping data from its neighbors.
> # Network flaw: nodes A and B were still able to communicate with the 
> terminated node C and consequently still saw C as alive.
> # The replacement node C' learnt about C through gossip but was unable to 
> communicate with C and marked C as DOWN.
> # C' completed bootstrapping successfully, and both it and its peers logged 
> the statement "Node C' will complete replacement of C for tokens 
> [-7686143363672898397]"
> # C' logged the statement "Nodes C' and C have the same token 
> -7686143363672898397. C' is the new owner"
> # C' started listening for thrift and cql clients.
> # Peer nodes A and B logged "Node C' cannot complete replacement of alive 
> node C "
> # A few seconds later, A and B marked C as DOWN.
> C' continued to log the lines below endlessly:
> {code:java}
> Node C is now part of the cluster
> Nodes () and C' have the same token C.  Ignoring -7686143363672898397 (Needs 
> a log statement fix)
> FatClient C has been silent for 3ms, removing from gossip
> {code}
> My reasoning of what happened: 
> By the time the replacement node (C') finished bootstrapping and announced 
> its state as Normal, A and B were still able to communicate with the replaced 
> node C (while C' was not), and hence rejected C' replacing C. 
> C' does not know this and does not attempt to re-announce its "Normal" 
> state to the rest of the cluster. (Worth noting that A and B marked C as 
> down soon after.)
> Gossip keeps telling C' to add C to its metadata, and C' keeps eventually 
> kicking C out based on FailureDetector. 
> Proposed fix:
> When C' is notified through gossip about C, and given that both own the same 
> token and that C' has finished bootstrapping, C' can emit its Normal state 
> again, which should fix this in my opinion (so long as A and B have marked C 
> as DOWN, which they eventually did).
> I ended up manually fixing this by restarting Cassandra on C', which forced 
> it to announce its "Normal" state via
> StorageService.initServer --> joinTokenRing() --> finishJoiningRing() --> 
> setTokens() --> setGossipTokens()
> Alternatively, I could possibly have achieved the same behavior by disabling 
> and re-enabling gossip via jmx/nodetool.
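
The sequence in the quoted description, and the fix it proposes, can be walked through with a toy model. This is only a sketch under loud assumptions: the Peer and ReplacementNode classes and all their methods are invented for illustration and do not correspond to Cassandra's gossip or StorageService code. It shows peers rejecting the replacement while they still see C as alive, then accepting it once C is marked DOWN and C' re-announces its Normal state.

```python
class Peer:
    """Toy peer (A or B): tracks which nodes it sees as alive and who owns the token."""

    def __init__(self, name):
        self.name = name
        self.alive = {"C"}        # the peer can still reach the old node C
        self.token_owner = "C"

    def on_normal_state(self, replacement, replaced):
        # Models the "cannot complete replacement of alive node" rejection.
        if replaced in self.alive:
            return False
        self.token_owner = replacement
        return True

    def mark_down(self, node):
        self.alive.discard(node)


class ReplacementNode:
    """Toy C': has finished bootstrapping and believes it owns the token."""

    def __init__(self, name, replaced, peers):
        self.name = name
        self.replaced = replaced
        self.peers = peers
        self.bootstrapped = True

    def announce_normal(self):
        # Returns, per peer, whether the announcement was accepted.
        return [p.on_normal_state(self.name, self.replaced) for p in self.peers]

    def on_gossip_about(self, node):
        # Proposed fix (as modelled here): hearing about the replaced node
        # after bootstrap triggers a re-announcement of the Normal state.
        if node == self.replaced and self.bootstrapped:
            return self.announce_normal()
        return []


a, b = Peer("A"), Peer("B")
c_prime = ReplacementNode("C'", "C", [a, b])

first = c_prime.announce_normal()       # rejected: A and B still see C as alive
a.mark_down("C")
b.mark_down("C")
second = c_prime.on_gossip_about("C")   # accepted after C is marked DOWN
```

In this model the first announcement is rejected by both peers and the re-announcement is accepted, but only because A and B have marked C as DOWN in between, which matches the caveat in the description.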






[jira] [Commented] (CASSANDRA-16182) A replacement node, although completed bootstrap and joined ring according to itself, stuck in Joining state as per the peers

2020-10-06 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209174#comment-17209174
 ] 

Paulo Motta commented on CASSANDRA-16182:
-

> This doesn't really work if any other node has already replaced C with C'.

C' would not complete replacement if it detects C presence before then.

This wouldn't prevent C from reappearing to other nodes *after* replacement is 
completed, but would at least prevent the reported scenario.







[jira] [Comment Edited] (CASSANDRA-16182) A replacement node, although completed bootstrap and joined ring according to itself, stuck in Joining state as per the peers

2020-10-06 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209174#comment-17209174
 ] 

Paulo Motta edited comment on CASSANDRA-16182 at 10/6/20, 9:26 PM:
---

bq. This doesn't really work if any other node has already replaced C with C'.

C' would not complete replacement if it detects C presence before then.

This wouldn't prevent C from reappearing to other nodes *after* replacement is 
completed, but would at least prevent the reported scenario.


was (Author: pauloricardomg):
> This doesn't really work if any other node has already replaced C with C'.

C' would not complete replacement if it detects C presence before then.

This wouldn't prevent C from reappearing to other nodes *after* replacement is 
completed, but would at least prevent the reported scenario.







[jira] [Commented] (CASSANDRA-16182) A replacement node, although completed bootstrap and joined ring according to itself, stuck in Joining state as per the peers

2020-10-06 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209167#comment-17209167
 ] 

Benedict Elliott Smith commented on CASSANDRA-16182:


{quote}I think the safest thing to prevent this edge case is to make C' abort 
replacement if it hears about C via gossip.
{quote}
This doesn't really work if any other node has already replaced C with C'.
{quote}this is truly edge case and bad timing
{quote}
I agree it is a rare scenario, and an operator should be able to rectify it - 
even if it is a potentially serious event. My personal preference is to shelve 
this until we overhaul cluster membership, hopefully for 5.0.







[jira] [Updated] (CASSANDRA-16195) Fix flaky test test_expiration_overflow_policy_cap - ttl_test.TestTTL

2020-10-06 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16195:
--
Bug Category: Parent values: Correctness(12982) Level 1 values: Test Failure(12990)
Complexity: Normal
Discovered By: Unit Test
Fix Version/s: 4.0-beta
Severity: Normal
Status: Open  (was: Triage Needed)

> Fix flaky test test_expiration_overflow_policy_cap - ttl_test.TestTTL
> -
>
> Key: CASSANDRA-16195
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16195
> Project: Cassandra
> Issue Type: Bug
> Components: Test/dtest/python
> Reporter: David Capwell
> Priority: Normal
> Fix For: 4.0-beta
>
>
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/622/workflows/adcd463c-156a-43c7-a9bc-7f3e4938dbe8/jobs/3514
> {code}
> >       assert warning, 'Log message should be print for CAP and CAP_NOWARN policy'
> E       AssertionError: Log message should be print for CAP and CAP_NOWARN policy
> E   assert []
> ttl_test.py:410: AssertionError
> {code}






[jira] [Created] (CASSANDRA-16195) Fix flaky test test_expiration_overflow_policy_cap - ttl_test.TestTTL

2020-10-06 Thread David Capwell (Jira)
David Capwell created CASSANDRA-16195:
-

Summary: Fix flaky test test_expiration_overflow_policy_cap - ttl_test.TestTTL
Key: CASSANDRA-16195
URL: https://issues.apache.org/jira/browse/CASSANDRA-16195
Project: Cassandra
Issue Type: Bug
Components: Test/dtest/python
Reporter: David Capwell


https://app.circleci.com/pipelines/github/dcapwell/cassandra/622/workflows/adcd463c-156a-43c7-a9bc-7f3e4938dbe8/jobs/3514

{code}
>       assert warning, 'Log message should be print for CAP and CAP_NOWARN policy'
E       AssertionError: Log message should be print for CAP and CAP_NOWARN policy
E   assert []

ttl_test.py:410: AssertionError
{code}
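
The "assert []" line in the output above is pytest reporting that the value of warning was an empty list. An empty list is falsy in Python, so the assertion fails whenever the log search finds nothing. A minimal sketch of that mechanism, assuming a hypothetical grep_lines helper and made-up log lines (neither is taken from the dtest code or from real Cassandra logs):

```python
import re


def grep_lines(lines, expr):
    """Return (line, match) pairs for the lines that match expr.

    Hypothetical stand-in for a grep_log-style helper.
    """
    regex = re.compile(expr)
    return [(line, m) for line in lines for m in [regex.search(line)] if m]


# Made-up log contents, for illustration only.
log_with_warning = ["WARN  example: TTL capped for this request"]
log_without_warning = ["INFO  example: nothing to report"]

warning = grep_lines(log_with_warning, r"TTL capped")
assert warning, "Log message should be printed"

missing = grep_lines(log_without_warning, r"TTL capped")
# An empty list is falsy, so `assert missing, "..."` would raise
# AssertionError here, which is the shape of the failure in the report.
assert missing == []
```

If the warning is written to the log only after the test has already searched it, a check of this kind would see an empty list, which is one plausible way for such a test to be flaky; the report itself does not state the cause.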






[jira] [Updated] (CASSANDRA-16057) Should update in-jvm dtest to expose stdout and stderr for nodetool

2020-10-06 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16057:
--
Fix Version/s: NA
Source Control Link: https://github.com/apache/cassandra/commit/83e1e9e45193322f18f57aa7cc4ad31d9d5a152d  (was: https://github.com/apache/cassandra/pull/749)
Resolution: Fixed
Status: Resolved  (was: Ready to Commit)

> Should update in-jvm dtest to expose stdout and stderr for nodetool
> ---
>
> Key: CASSANDRA-16057
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16057
> Project: Cassandra
> Issue Type: Improvement
> Components: Test/dtest/java
> Reporter: David Capwell
> Assignee: Yifan Cai
> Priority: Normal
> Fix For: NA
>
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> Many nodetool commands output to stdout or stderr so running nodetool using 
> in-jvm dtest should expose that to tests.
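
One common way to meet a request like the one above is to have commands write to injected streams rather than the process-wide stdout and stderr, so that a test harness can capture what was printed. The following is a small Python analogy of that idea only; the Output class, version_command, and run_command are illustrative assumptions and do not reproduce the Java classes in Cassandra.

```python
import io
import sys


class Output:
    """Pair of writable streams a command should print to (illustrative)."""

    def __init__(self, out=None, err=None):
        self.out = out if out is not None else sys.stdout
        self.err = err if err is not None else sys.stderr


def version_command(output):
    """Example command: writes to the injected streams instead of calling print()."""
    output.out.write("ReleaseVersion: example\n")
    return 0


def run_command(command):
    """Run a command with in-memory streams and return (rc, stdout, stderr)."""
    out, err = io.StringIO(), io.StringIO()
    rc = command(Output(out, err))
    return rc, out.getvalue(), err.getvalue()


rc, stdout, stderr = run_command(version_command)
```

A test can then assert on stdout and stderr directly, while the same command run with a default Output() still prints to the console.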






[jira] [Comment Edited] (CASSANDRA-16057) Should update in-jvm dtest to expose stdout and stderr for nodetool

2020-10-06 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17208863#comment-17208863
 ] 

David Capwell edited comment on CASSANDRA-16057 at 10/6/20, 8:49 PM:
-

Committed: Yellow, known broken or flaky tests.

CI results

Circle: 
https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16057-trunk-EBBE5F61-8D2B-48B6-BD18-8D842BCC9E57
Jenkins: https://ci-cassandra.apache.org/job/Cassandra-devbranch/70/


was (Author: dcapwell):
Starting commit (pending):

CI results

Circle: 
https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16057-trunk-EBBE5F61-8D2B-48B6-BD18-8D842BCC9E57
Jenkins: https://ci-cassandra.apache.org/job/Cassandra-devbranch/70/







[cassandra] branch trunk updated: in-jvm dtest now exposes stdout and stderr for nodetool

2020-10-06 Thread dcapwell
This is an automated email from the ASF dual-hosted git repository.

dcapwell pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new 83e1e9e  in-jvm dtest now exposes stdout and stderr for nodetool
83e1e9e is described below

commit 83e1e9e45193322f18f57aa7cc4ad31d9d5a152d
Author: Yifan Cai 
AuthorDate: Tue Oct 6 08:54:17 2020 -0700

in-jvm dtest now exposes stdout and stderr for nodetool

patch by Yifan Cai; reviewed by Alex Petrov, David Capwell for 
CASSANDRA-16057
---
 src/java/org/apache/cassandra/tools/NodeProbe.java | 16 +++-
 src/java/org/apache/cassandra/tools/NodeTool.java  | 51 +++-
 .../tools/{nodetool/Version.java => Output.java}   | 22 +++---
 .../cassandra/tools/nodetool/BootstrapResume.java  |  4 +-
 .../apache/cassandra/tools/nodetool/Cleanup.java   |  2 +-
 .../cassandra/tools/nodetool/ClearSnapshot.java|  4 +-
 .../cassandra/tools/nodetool/ClientStats.java  | 24 +++---
 .../tools/nodetool/CompactionHistory.java  |  2 +-
 .../cassandra/tools/nodetool/CompactionStats.java  | 18 +++--
 .../cassandra/tools/nodetool/DescribeCluster.java  | 50 ++--
 .../cassandra/tools/nodetool/DescribeRing.java |  8 +-
 .../tools/nodetool/FailureDetectorInfo.java|  4 +-
 .../cassandra/tools/nodetool/GarbageCollect.java   |  2 +-
 .../apache/cassandra/tools/nodetool/GcStats.java   |  6 +-
 .../tools/nodetool/GetBatchlogReplayTrottle.java   |  4 +-
 .../tools/nodetool/GetCompactionThreshold.java |  8 +-
 .../tools/nodetool/GetCompactionThroughput.java|  4 +-
 .../cassandra/tools/nodetool/GetConcurrency.java   | 10 +--
 .../tools/nodetool/GetConcurrentCompactors.java|  4 +-
 .../tools/nodetool/GetConcurrentViewBuilders.java  |  4 +-
 .../cassandra/tools/nodetool/GetEndpoints.java |  4 +-
 .../cassandra/tools/nodetool/GetFullQueryLog.java  |  2 +-
 .../tools/nodetool/GetInterDCStreamThroughput.java |  2 +-
 .../cassandra/tools/nodetool/GetLoggingLevels.java |  6 +-
 .../cassandra/tools/nodetool/GetMaxHintWindow.java |  4 +-
 .../cassandra/tools/nodetool/GetSSTables.java  |  4 +-
 .../apache/cassandra/tools/nodetool/GetSeeds.java  |  6 +-
 .../tools/nodetool/GetStreamThroughput.java|  4 +-
 .../cassandra/tools/nodetool/GetTimeout.java   |  2 +-
 .../tools/nodetool/GetTraceProbability.java|  2 +-
 .../cassandra/tools/nodetool/GossipInfo.java   |  2 +-
 .../apache/cassandra/tools/nodetool/Import.java| 11 ++-
 .../org/apache/cassandra/tools/nodetool/Info.java  | 42 +-
 .../cassandra/tools/nodetool/ListSnapshots.java| 10 ++-
 .../apache/cassandra/tools/nodetool/NetStats.java  | 92 +++---
 .../cassandra/tools/nodetool/ProfileLoad.java  | 17 ++--
 .../cassandra/tools/nodetool/ProxyHistograms.java  | 13 +--
 .../cassandra/tools/nodetool/RangeKeySample.java   |  6 +-
 .../apache/cassandra/tools/nodetool/Refresh.java   |  4 +-
 .../cassandra/tools/nodetool/ReloadSeeds.java  |  8 +-
 .../cassandra/tools/nodetool/RemoveNode.java   |  4 +-
 .../apache/cassandra/tools/nodetool/Repair.java|  2 +-
 .../cassandra/tools/nodetool/RepairAdmin.java  | 30 ---
 .../org/apache/cassandra/tools/nodetool/Ring.java  | 52 ++--
 .../org/apache/cassandra/tools/nodetool/Scrub.java |  2 +-
 .../cassandra/tools/nodetool/SetConcurrency.java   |  2 +-
 .../org/apache/cassandra/tools/nodetool/Sjk.java   | 83 +++
 .../apache/cassandra/tools/nodetool/Snapshot.java  | 12 +--
 .../apache/cassandra/tools/nodetool/Status.java| 22 +++---
 .../tools/nodetool/StatusAutoCompaction.java   |  8 +-
 .../cassandra/tools/nodetool/StatusBackup.java |  4 +-
 .../cassandra/tools/nodetool/StatusBinary.java |  4 +-
 .../cassandra/tools/nodetool/StatusGossip.java |  4 +-
 .../cassandra/tools/nodetool/StatusHandoff.java|  6 +-
 .../cassandra/tools/nodetool/TableHistograms.java  | 18 +++--
 .../cassandra/tools/nodetool/TableStats.java   |  2 +-
 .../apache/cassandra/tools/nodetool/TpStats.java   |  4 +-
 .../cassandra/tools/nodetool/UpgradeSSTable.java   |  4 +-
 .../apache/cassandra/tools/nodetool/Verify.java|  8 +-
 .../apache/cassandra/tools/nodetool/Version.java   |  4 +-
 .../cassandra/tools/nodetool/ViewBuildStatus.java  | 10 ++-
 .../tools/nodetool/formatter/TableBuilder.java |  2 +-
 .../tools/nodetool/stats/TpStatsPrinter.java   |  6 +-
 .../cassandra/distributed/impl/Instance.java   | 60 --
 .../shared/NodeToolResultWithOutput.java   |  2 +
 .../cassandra/distributed/test/NodeToolTest.java   | 15 +++-
 .../cassandra/distributed/util/NodetoolUtils.java  |  7 +-
 .../apache/cassandra/tools/nodetool/SjkTest.java   | 17 ++--
 .../apache/cassandra/stress/CompactionStress.java  |  2 +-
 69 files changed, 512 insertions(+), 372 deletions(-)

diff --git a/src/java/org/apache/cassandra/tools/NodeProbe.java 

[jira] [Commented] (CASSANDRA-14030) disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: Missing: ['127.0.0.5.* now UP']:

2020-10-06 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209155#comment-17209155
 ] 

David Capwell commented on CASSANDRA-14030:
---

Not 100% sure if 
https://app.circleci.com/pipelines/github/dcapwell/cassandra/622/workflows/adcd463c-156a-43c7-a9bc-7f3e4938dbe8/jobs/3514
 is the same issue or not; should I open a different JIRA for this?

> disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: 
> Missing: ['127.0.0.5.* now UP']:
> ---
>
> Key: CASSANDRA-14030
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14030
> Project: Cassandra
> Issue Type: Bug
> Components: Legacy/Testing
> Reporter: Michael Kjellman
> Assignee: Adam Holmberg
> Priority: Normal
> Fix For: 4.0-beta, 4.0-triage
>
>
> disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: 
> Missing: ['127.0.0.5.* now UP']:
> {code}
> 15 Nov 2017 11:28:03 [node4] Missing: ['127.0.0.5.* now UP']:
> .
> See system.log for remainder
>  >> begin captured logging << 
> dtest: DEBUG: cluster ccm directory: /tmp/dtest-NZzhNb
> dtest: DEBUG: Done setting configuration options:
> {   'initial_token': None,
> 'num_tokens': '32',
> 'phi_convict_threshold': 5,
> 'range_request_timeout_in_ms': 1,
> 'read_request_timeout_in_ms': 1,
> 'request_timeout_in_ms': 1,
> 'truncate_request_timeout_in_ms': 1,
> 'write_request_timeout_in_ms': 1}
> - >> end captured logging << -
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/cassandra/cassandra-dtest/disk_balance_test.py", line 44, in 
> disk_balance_bootstrap_test
> node5.start(wait_for_binary_proto=True)
>   File 
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 706, in start
> node.watch_log_for_alive(self, from_mark=mark)
>   File 
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 520, in watch_log_for_alive
> self.watch_log_for(tofind, from_mark=from_mark, timeout=timeout, 
> filename=filename)
>   File 
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 488, in watch_log_for
> raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " 
> [" + self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + 
> reads[:50] + ".\nSee {} for remainder".format(filename))
> "15 Nov 2017 11:28:03 [node4] Missing: ['127.0.0.5.* now UP']:\n.\nSee 
> system.log for remainder\n >> begin captured logging << 
> \ndtest: DEBUG: cluster ccm directory: 
> /tmp/dtest-NZzhNb\ndtest: DEBUG: Done setting configuration options:\n{   
> 'initial_token': None,\n'num_tokens': '32',\n'phi_convict_threshold': 
> 5,\n'range_request_timeout_in_ms': 1,\n
> 'read_request_timeout_in_ms': 1,\n'request_timeout_in_ms': 1,\n   
>  'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': 
> 1}\n- >> end captured logging << 
> -"
> {code}
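
For readers unfamiliar with the failing call: the traceback above ends in
ccm's watch_log_for, which polls a node's log until every expected pattern
has appeared or a timeout elapses. Below is a minimal sketch of that polling
idea only. It is not ccm's actual implementation; the function name
watch_log_for_sketch, its parameters, and the poll interval are assumptions
made for illustration.

```python
import re
import time


def watch_log_for_sketch(path, exprs, timeout=5.0, poll_interval=0.1):
    """Poll the log at `path` until every regex in `exprs` has matched a line.

    Returns a list of (line, match) pairs in the order matches were found.
    Raises TimeoutError naming the still-missing patterns, loosely mirroring
    the "Missing: [...]" message in the traceback above.
    """
    pending = [re.compile(e) for e in exprs]
    matched = []
    deadline = time.monotonic() + timeout
    offset = 0  # byte offset of the first unread log data
    while True:
        with open(path, "rb") as log:
            log.seek(offset)
            for raw in log:
                if not raw.endswith(b"\n"):
                    break  # partial line still being written; retry later
                offset += len(raw)
                line = raw.decode("utf-8", errors="replace")
                for pattern in list(pending):
                    m = pattern.search(line)
                    if m:
                        matched.append((line.rstrip("\n"), m))
                        pending.remove(pattern)
        if not pending:
            return matched
        if time.monotonic() >= deadline:
            missing = [p.pattern for p in pending]
            raise TimeoutError("Missing: " + str(missing))
        time.sleep(poll_interval)
```

In this sketch a pattern such as '127.0.0.5.* now UP' that never shows up in
the log ends in a TimeoutError listing it as missing, which is the shape of
the failure reported in this ticket.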



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14030) disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: Missing: ['127.0.0.5.* now UP']:

2020-10-06 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-14030:
--
Description: 
disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: Missing: 
['127.0.0.5.* now UP']:

{code}
15 Nov 2017 11:28:03 [node4] Missing: ['127.0.0.5.* now UP']:
.
See system.log for remainder
 >> begin captured logging << 
dtest: DEBUG: cluster ccm directory: /tmp/dtest-NZzhNb
dtest: DEBUG: Done setting configuration options:
{   'initial_token': None,
'num_tokens': '32',
'phi_convict_threshold': 5,
'range_request_timeout_in_ms': 1,
'read_request_timeout_in_ms': 1,
'request_timeout_in_ms': 1,
'truncate_request_timeout_in_ms': 1,
'write_request_timeout_in_ms': 1}
- >> end captured logging << -
  File "/usr/lib/python2.7/unittest/case.py", line 329, in run
testMethod()
  File "/home/cassandra/cassandra-dtest/disk_balance_test.py", line 44, in 
disk_balance_bootstrap_test
node5.start(wait_for_binary_proto=True)
  File "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", 
line 706, in start
node.watch_log_for_alive(self, from_mark=mark)
  File "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", 
line 520, in watch_log_for_alive
self.watch_log_for(tofind, from_mark=from_mark, timeout=timeout, 
filename=filename)
  File "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", 
line 488, in watch_log_for
raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " [" 
+ self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + 
reads[:50] + ".\nSee {} for remainder".format(filename))
"15 Nov 2017 11:28:03 [node4] Missing: ['127.0.0.5.* now UP']:\n.\nSee 
system.log for remainder\n >> begin captured logging << 
\ndtest: DEBUG: cluster ccm directory: 
/tmp/dtest-NZzhNb\ndtest: DEBUG: Done setting configuration options:\n{   
'initial_token': None,\n'num_tokens': '32',\n'phi_convict_threshold': 
5,\n'range_request_timeout_in_ms': 1,\n
'read_request_timeout_in_ms': 1,\n'request_timeout_in_ms': 1,\n
'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': 
1}\n- >> end captured logging << -"
{code}

  was:
disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: Missing: 
['127.0.0.5.* now UP']:

15 Nov 2017 11:28:03 [node4] Missing: ['127.0.0.5.* now UP']:
.
See system.log for remainder
 >> begin captured logging << 
dtest: DEBUG: cluster ccm directory: /tmp/dtest-NZzhNb
dtest: DEBUG: Done setting configuration options:
{   'initial_token': None,
'num_tokens': '32',
'phi_convict_threshold': 5,
'range_request_timeout_in_ms': 1,
'read_request_timeout_in_ms': 1,
'request_timeout_in_ms': 1,
'truncate_request_timeout_in_ms': 1,
'write_request_timeout_in_ms': 1}
- >> end captured logging << -
  File "/usr/lib/python2.7/unittest/case.py", line 329, in run
testMethod()
  File "/home/cassandra/cassandra-dtest/disk_balance_test.py", line 44, in 
disk_balance_bootstrap_test
node5.start(wait_for_binary_proto=True)
  File "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", 
line 706, in start
node.watch_log_for_alive(self, from_mark=mark)
  File "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", 
line 520, in watch_log_for_alive
self.watch_log_for(tofind, from_mark=from_mark, timeout=timeout, 
filename=filename)
  File "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", 
line 488, in watch_log_for
raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " [" 
+ self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + 
reads[:50] + ".\nSee {} for remainder".format(filename))
"15 Nov 2017 11:28:03 [node4] Missing: ['127.0.0.5.* now UP']:\n.\nSee 
system.log for remainder\n >> begin captured logging << 
\ndtest: DEBUG: cluster ccm directory: 
/tmp/dtest-NZzhNb\ndtest: DEBUG: Done setting configuration options:\n{   
'initial_token': None,\n'num_tokens': '32',\n'phi_convict_threshold': 
5,\n'range_request_timeout_in_ms': 1,\n
'read_request_timeout_in_ms': 1,\n'request_timeout_in_ms': 1,\n
'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': 
1}\n- >> end captured logging << -"


> disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: 
> Missing: ['127.0.0.5.* now UP']:
> 

[jira] [Updated] (CASSANDRA-15892) JAVA 8/11: test_resumable_rebuild - rebuild_test.TestRebuild

2020-10-06 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-15892:
--
Summary: JAVA 8/11: test_resumable_rebuild - rebuild_test.TestRebuild  
(was: JAVA 11: test_resumable_rebuild - rebuild_test.TestRebuild)

> JAVA 8/11: test_resumable_rebuild - rebuild_test.TestRebuild
> 
>
> Key: CASSANDRA-15892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15892
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Ekaterina Dimitrova
>Assignee: Gianluca Righetto
>Priority: Normal
> Fix For: 4.0-rc
>
>
> JAVA 11:
> test_resumable_rebuild - rebuild_test.TestRebuild
> Fails locally and in  
> [CircleCI | 
> [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/222/workflows/11202c7e-6c94-4d4e-bbbf-9e2fa9791ad0/jobs/1338]]






[jira] [Updated] (CASSANDRA-15892) JAVA 11: test_resumable_rebuild - rebuild_test.TestRebuild

2020-10-06 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-15892:
--
Fix Version/s: (was: 4.0-triage)

> JAVA 11: test_resumable_rebuild - rebuild_test.TestRebuild
> --
>
> Key: CASSANDRA-15892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15892
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Ekaterina Dimitrova
>Assignee: Gianluca Righetto
>Priority: Normal
> Fix For: 4.0-rc
>
>
> JAVA 11:
> test_resumable_rebuild - rebuild_test.TestRebuild
> Fails locally and in  
> [CircleCI | 
> [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/222/workflows/11202c7e-6c94-4d4e-bbbf-9e2fa9791ad0/jobs/1338]]






[jira] [Commented] (CASSANDRA-15892) JAVA 11: test_resumable_rebuild - rebuild_test.TestRebuild

2020-10-06 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209154#comment-17209154
 ] 

David Capwell commented on CASSANDRA-15892:
---

saw this today 
https://app.circleci.com/pipelines/github/dcapwell/cassandra/622/workflows/adcd463c-156a-43c7-a9bc-7f3e4938dbe8/jobs/3514

> JAVA 11: test_resumable_rebuild - rebuild_test.TestRebuild
> --
>
> Key: CASSANDRA-15892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15892
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Ekaterina Dimitrova
>Assignee: Gianluca Righetto
>Priority: Normal
> Fix For: 4.0-rc, 4.0-triage
>
>
> JAVA 11:
> test_resumable_rebuild - rebuild_test.TestRebuild
> Fails locally and in  
> [CircleCI | 
> [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/222/workflows/11202c7e-6c94-4d4e-bbbf-9e2fa9791ad0/jobs/1338]]






[jira] [Updated] (CASSANDRA-11063) Unable to compute ceiling for max when histogram overflowed

2020-10-06 Thread Paulo Motta (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-11063:

Resolution: Fixed
Status: Resolved  (was: Open)

Closing as duplicate of CASSANDRA-7.

> Unable to compute ceiling for max when histogram overflowed
> ---
>
> Key: CASSANDRA-11063
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11063
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
> Environment: Cassandra 2.1.9 on RHEL
>Reporter: Navjyot Nishant
>Priority: Normal
>  Labels: Compaction, thread
>
> Issue https://issues.apache.org/jira/browse/CASSANDRA-8028 seems related to 
> the error we are getting, but we see it on Cassandra 2.1.9. While 
> autocompaction is running it keeps throwing the following errors. We are 
> unsure whether this is a bug or something that can be resolved; please advise.
> {code}
> WARN  [CompactionExecutor:3] 2016-01-23 13:30:40,907 SSTableWriter.java:240 - 
> Compacting large partition gccatlgsvcks/category_name_dedup:66611300 
> (138152195 bytes)
> ERROR [CompactionExecutor:1] 2016-01-23 13:30:50,267 CassandraDaemon.java:223 
> - Exception in thread Thread[CompactionExecutor:1,1,main]
> java.lang.IllegalStateException: Unable to compute ceiling for max when 
> histogram overflowed
> at 
> org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:203)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
> at 
> org.apache.cassandra.io.sstable.metadata.StatsMetadata.getEstimatedDroppableTombstoneRatio(StatsMetadata.java:98)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
> at 
> org.apache.cassandra.io.sstable.SSTableReader.getEstimatedDroppableTombstoneRatio(SSTableReader.java:1987)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:370)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
> at 
> org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundSSTables(SizeTieredCompactionStrategy.java:96)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
> at 
> org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundTask(SizeTieredCompactionStrategy.java:179)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
> at 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getNextBackgroundTask(WrappingCompactionStrategy.java:84)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
> at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:230)
>  ~[apache-cassandra-2.1.9.jar:2.1.9]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_51]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[na:1.7.0_51]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_51]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_51]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
> {code}
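
The exception above is raised from EstimatedHistogram.mean(). As a rough
illustration of why a mean cannot be computed once such a histogram has
overflowed, here is a toy bucketed histogram: values larger than the last
bucket boundary land in an overflow bucket that has no upper bound, so any
mean would be a guess. The class name, bucket boundaries, and method names
are assumptions for illustration and are not Cassandra's implementation.

```python
import bisect


class ToyEstimatedHistogram:
    """Bucketed histogram with one extra overflow bucket (illustrative only)."""

    def __init__(self, offsets):
        self.offsets = sorted(offsets)  # upper bound of each regular bucket
        self.buckets = [0] * (len(self.offsets) + 1)  # last slot = overflow

    def add(self, value):
        # bisect_left finds the first bucket whose upper bound is >= value;
        # values above every bound fall through to the overflow slot.
        self.buckets[bisect.bisect_left(self.offsets, value)] += 1

    def is_overflowed(self):
        return self.buckets[-1] > 0

    def mean(self):
        if self.is_overflowed():
            # The overflow bucket has no upper bound, so no ceiling exists.
            raise RuntimeError("Unable to compute when histogram overflowed")
        count = sum(self.buckets)
        if count == 0:
            return 0
        # zip stops at the last regular bucket, so the overflow slot is skipped.
        total = sum(n * bound for n, bound in zip(self.buckets, self.offsets))
        # Round up, estimating each value by its bucket's upper bound.
        return -(-total // count)
```

With boundaries [10, 100, 1000], adding 5 and 50 gives a mean estimate of 55;
adding a single value of 5000 afterwards makes mean() raise.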
> h3. Additional info:
> *cfstats is running fine for that table...*
> {code}
> ~ $ nodetool cfstats gccatlgsvcks.category_name_dedup
> Keyspace: gccatlgsvcks
> Read Count: 0
> Read Latency: NaN ms.
> Write Count: 0
> Write Latency: NaN ms.
> Pending Flushes: 0
> Table: category_name_dedup
> SSTable count: 6
> Space used (live): 836314727
> Space used (total): 836314727
> Space used by snapshots (total): 3621519
> Off heap memory used (total): 6930368
> SSTable Compression Ratio: 0.03725358753117693
> Number of keys (estimate): 3004
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 0
> Local read count: 0
> Local read latency: NaN ms
> Local write count: 0
> Local write latency: NaN ms
> Pending flushes: 0
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.0
> Bloom filter space used: 5240
> Bloom filter off heap memory used: 5192
> Index summary off heap memory used: 1200
> Compression metadata off heap memory used: 6923976
> Compacted partition minimum bytes: 125
> Compacted partition maximum bytes: 30753941057
> Compacted partition mean bytes: 8352388
> Average 

[jira] [Commented] (CASSANDRA-16127) NullPointerException when calling nodetool enablethrift

2020-10-06 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209139#comment-17209139
 ] 

David Capwell commented on CASSANDRA-16127:
---

Sorry, neglected this patch for a bit; picking up review comments today.

> NullPointerException when calling nodetool enablethrift
> ---
>
> Key: CASSANDRA-16127
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16127
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Thrift
>Reporter: Tibor Repasi
>Assignee: David Capwell
>Priority: Normal
> Fix For: 2.2.x, 3.0.x, 3.11.x
>
>
> Once thrift has been disabled, it is impossible to enable it again without 
> restarting the node:
> {code}
> $ nodetool statusthrift
> not running
> $ nodetool enablethrift
> error: null
> -- StackTrace --
> java.lang.NullPointerException
>   at 
> org.apache.cassandra.service.StorageService.startRPCServer(StorageService.java:392)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
>   at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
>   at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
>   at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
>   at 
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
>   at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
>   at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
>   at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
>   at 
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1468)
>   at 
> javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
>   at 
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
>   at 
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1401)
>   at 
> javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829)
>   at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
>   at sun.rmi.transport.Transport$1.run(Transport.java:200)
>   at sun.rmi.transport.Transport$1.run(Transport.java:197)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
>   at 
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573)
>   at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834)
>   at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}






[jira] [Commented] (CASSANDRA-16182) A replacement node, although completed bootstrap and joined ring according to itself, stuck in Joining state as per the peers

2020-10-06 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209136#comment-17209136
 ] 

Brandon Williams commented on CASSANDRA-16182:
--

That sounds reasonable (if it hears about it and it changes its liveness).  I 
will just point out that there was 30s of this before the replacement started, 
so this is truly an edge case and bad timing.

> A replacement node, although completed bootstrap and joined ring according to 
> itself, stuck in Joining state as per the peers
> -
>
> Key: CASSANDRA-16182
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16182
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Sumanth Pasupuleti
>Assignee: Sumanth Pasupuleti
>Priority: Normal
> Fix For: 3.0.x
>
>
> This issue occurred in a production 3.0.21 cluster.
> Here is what happened
> # We had, say, a three node Cassandra cluster with nodes A, B and C
> # C got "terminated by cloud provider" due to health check failure and a 
> replacement node C' got launched.
> # C' started bootstrapping data from its neighbors
> # Network flaw: Nodes A and B were still able to communicate with terminated 
> node C and consequently still considered C alive.
> # The replacement node C' learnt about C through gossip but was unable to 
> communicate with C and marked C as DOWN.
> # C' completed bootstrapping successfully, and both C' and its peers logged 
> the statement "Node C' will complete replacement of C for tokens 
> [-7686143363672898397]"
> # C' logged the statement "Nodes C' and C have the same token 
> -7686143363672898397. C' is the new owner"
> # C' started listening for thrift and cql clients
> # Peer nodes A and B logged "Node C' cannot complete replacement of alive 
> node C "
> # A few seconds later, A and B marked C as DOWN
> C' continued to log the lines below endlessly:
> {code:java}
> Node C is now part of the cluster
> Nodes () and C' have the same token C.  Ignoring -7686143363672898397 (Needs 
> a log statement fix)
> FatClient C has been silent for 3ms, removing from gossip
> {code}
> My reasoning of what happened: 
> By the time the replacement node (C') finished bootstrapping and announced its 
> state as Normal, A and B were still able to communicate with the replaced 
> node C (while C' was not), and hence rejected C' replacing C. 
> C' does not know this and does not attempt to re-announce its "Normal" 
> state to the rest of the cluster. (Worth noting that A and B marked C as DOWN 
> soon after.)
> Gossip keeps telling C' to add C to its metadata, and C' keeps evicting C 
> based on the FailureDetector. 
> Proposed fix:
> When C' is notified through gossip about C, and given both own the same token 
> and given C' has finished bootstrapping, C' can emit its Normal state again 
> which should fix this in my opinion (so long as A and B have marked C as 
> DOWN, which they did eventually)
> I ended up manually fixing this by restarting Cassandra on C', which forced 
> it to announce its "Normal" state via
> StorageService.initServer --> joinTokenRing() --> finishJoiningRing() --> 
> setTokens() --> setGossipTokens()
> Alternately, I could have possibly achieved the same behavior if I disabled 
> and enabled gossip via jmx/nodetool.
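
The sequence of events and the proposed fix can be sketched with a toy model.
This is not Cassandra code: the Peer class, its fields, and the method
on_normal_announcement are hypothetical names chosen for illustration, and
the model only captures the rule described above (a peer rejects the
replacement while it still believes the old node is alive).

```python
class Peer:
    """A peer's view of token ownership (toy model, hypothetical names)."""

    def __init__(self, name):
        self.name = name
        self.alive = {"C": True}        # this peer's liveness view of node C
        self.token_owner = {"t1": "C"}  # token -> node believed to own it
        self.joining = set()            # nodes this peer still sees as joining

    def on_normal_announcement(self, new_node, old_node, token):
        # Reject the replacement while the old owner still looks alive.
        if self.alive.get(old_node, False):
            self.joining.add(new_node)
            return False
        self.token_owner[token] = new_node
        self.joining.discard(new_node)
        return True


def announce(peers, new_node, old_node, token):
    """Broadcast the Normal state; return True if every peer accepted it."""
    results = [p.on_normal_announcement(new_node, old_node, token) for p in peers]
    return all(results)


peers = [Peer("A"), Peer("B")]

# Steps 4-9: A and B still see C as alive, so the first announcement is rejected.
first = announce(peers, "C'", "C", "t1")

# Step 10: a few seconds later A and B mark C as DOWN.
for p in peers:
    p.alive["C"] = False

# Proposed fix: C' emits its Normal state again, which is now accepted.
second = announce(peers, "C'", "C", "t1")
print(first, second)  # prints "False True"
```

In the model, as in the report, nothing changes on A and B until C' announces
its state a second time; a restart of C' is one way to trigger that second
announcement.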






[jira] [Updated] (CASSANDRA-16016) sstablemetadata unit test, docs and params parsing hardening

2020-10-06 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-16016:
-
  Since Version: 4.0-alpha1
Source Control Link: 
https://github.com/apache/cassandra/commit/e8d3743b1aa25a23f04726903d0cbf61f9824fe0
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

Committed.

> sstablemetadata unit test, docs and params parsing hardening
> 
>
> Key: CASSANDRA-16016
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16016
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/sstable
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> During CASSANDRA-15883 / CASSANDRA-15991 it was found that unit test coverage 
> for this tool is minimal. There is an existing unit test to build upon under 
> {{test/unit/org/apache/cassandra/tools}}. Also, the docs are missing some 
> options and argument parsing is brittle.






[jira] [Updated] (CASSANDRA-16016) sstablemetadata unit test, docs and params parsing hardening

2020-10-06 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-16016:
-
Fix Version/s: (was: 4.0-beta)
   4.0-beta3

> sstablemetadata unit test, docs and params parsing hardening
> 
>
> Key: CASSANDRA-16016
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16016
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/sstable
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta3
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> During CASSANDRA-15883 / CASSANDRA-15991 it was found that unit test coverage 
> for this tool is minimal. There is an existing unit test to build upon under 
> {{test/unit/org/apache/cassandra/tools}}. Also, the docs are missing some 
> options and argument parsing is brittle.






[jira] [Updated] (CASSANDRA-16016) sstablemetadata unit test, docs and params parsing hardening

2020-10-06 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-16016:
-
Status: Ready to Commit  (was: Review In Progress)

> sstablemetadata unit test, docs and params parsing hardening
> 
>
> Key: CASSANDRA-16016
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16016
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/sstable
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> During CASSANDRA-15883 / CASSANDRA-15991 it was found that unit test coverage 
> for this tool is minimal. There is an existing unit test to build upon under 
> {{test/unit/org/apache/cassandra/tools}}. Also, the docs are missing some 
> options and argument parsing is brittle.






[jira] [Updated] (CASSANDRA-16194) Add GPG key for jw...@apache.org

2020-10-06 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-16194:
---
Status: Changes Suggested  (was: Review In Progress)

[~jwest],
 the patch only contains your signature list. It needs the exported public key 
too.

{code}
gpg --list-sigs  && gpg --armor --export 
{code}

> Add GPG key for jw...@apache.org
> 
>
> Key: CASSANDRA-16194
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16194
> Project: Cassandra
>  Issue Type: Bug
>  Components: Build
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
> Attachments: jwest-gpg-key.patch
>
>
> I am working on releasing a new version of in-jvm dtest API and need to add 
> my GPG key to the KEYS file.






[cassandra] branch trunk updated: sstablemetadata unit test, docs and params parsing hardening

2020-10-06 Thread brandonwilliams
This is an automated email from the ASF dual-hosted git repository.

brandonwilliams pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new e8d3743  sstablemetadata unit test, docs and params parsing hardening
e8d3743 is described below

commit e8d3743b1aa25a23f04726903d0cbf61f9824fe0
Author: Bereng 
AuthorDate: Tue Oct 6 05:40:01 2020 +0200

sstablemetadata unit test, docs and params parsing hardening

Patch by Berenguer Blasi, reviewed by brandonwilliams for CASSANDRA-16016
---
 doc/source/tools/sstable/sstablemetadata.rst   |  8 +-
 .../cassandra/tools/SSTableLevelResetterTest.java  |  2 +-
 .../cassandra/tools/SSTableMetadataViewerTest.java | 94 --
 .../cassandra/tools/StandaloneSSTableUtilTest.java |  2 +-
 .../cassandra/tools/StandaloneScrubberTest.java|  2 +-
 .../cassandra/tools/StandaloneUpgraderTest.java|  4 +-
 .../cassandra/tools/StandaloneVerifierTest.java|  2 +-
 7 files changed, 83 insertions(+), 31 deletions(-)

diff --git a/doc/source/tools/sstable/sstablemetadata.rst 
b/doc/source/tools/sstable/sstablemetadata.rst
index 0a7a422..48a1de5 100644
--- a/doc/source/tools/sstable/sstablemetadata.rst
+++ b/doc/source/tools/sstable/sstablemetadata.rst
@@ -29,7 +29,11 @@ Usage
 sstablemetadata  
 
 =

---gc_grace_seconds  The gc_grace_seconds to use when calculating 
droppable tombstones
+-c,--colors  Use ANSI color sequences
+-g,--gc_grace_seconds   Time to use when calculating droppable 
tombstones
+-s,--scanFull sstable scan for additional details. 
Only available in 3.0+ sstables. Defaults: false
+-t,--timestamp_unit Time unit that cell timestamps are written 
with
+-u,--unicode Use unicode to draw histograms and progress 
bars
 =

 
 Print all the metadata
@@ -252,7 +256,7 @@ Example::
 sstablemetadata --gc_grace_seconds 4700 
/var/lib/cassandra/data/keyspace1/standard1-41b52700b4ed11e896476d2c86545d91/mc-12-big-Data.db
 | grep "Estimated droppable tombstones"
 Estimated droppable tombstones: 9.61E-6
 
-# if gc_grace_seconds was configured at 100, none of the tombstones would 
be currently droppable 
+# if gc_grace_seconds was configured at 5000, none of the tombstones would 
be currently droppable 
 sstablemetadata --gc_grace_seconds 5000 
/var/lib/cassandra/data/keyspace1/standard1-41b52700b4ed11e896476d2c86545d91/mc-12-big-Data.db
 | grep "Estimated droppable tombstones"
 Estimated droppable tombstones: 0.0
 
diff --git a/test/unit/org/apache/cassandra/tools/SSTableLevelResetterTest.java 
b/test/unit/org/apache/cassandra/tools/SSTableLevelResetterTest.java
index e413b14..3f1c892 100644
--- a/test/unit/org/apache/cassandra/tools/SSTableLevelResetterTest.java
+++ b/test/unit/org/apache/cassandra/tools/SSTableLevelResetterTest.java
@@ -73,7 +73,7 @@ public class SSTableLevelResetterTest extends OfflineToolUtils
 ToolResult tool = ToolRunner.invokeClass(SSTableLevelResetter.class, 
"--really-reset", "system_schema", "tables");
 assertThat(tool.getStdout(), 
CoreMatchers.containsStringIgnoringCase("Found no sstables,"));
 Assertions.assertThat(tool.getCleanedStderr()).isEmpty();
-assertEquals(0,tool.getExitCode());
+assertEquals(0, tool.getExitCode());
 assertCorrectEnvPostTest();
 }
 
diff --git 
a/test/unit/org/apache/cassandra/tools/SSTableMetadataViewerTest.java 
b/test/unit/org/apache/cassandra/tools/SSTableMetadataViewerTest.java
index db0c958..7fd1353 100644
--- a/test/unit/org/apache/cassandra/tools/SSTableMetadataViewerTest.java
+++ b/test/unit/org/apache/cassandra/tools/SSTableMetadataViewerTest.java
@@ -18,11 +18,14 @@
 
 package org.apache.cassandra.tools;
 
+import java.io.IOException;
 import java.util.Arrays;
 
-import org.apache.commons.codec.digest.DigestUtils;
+import com.google.common.base.CharMatcher;
+
 import org.apache.commons.lang3.tuple.Pair;
 
+import org.junit.BeforeClass;
 import org.junit.Test;
 import org.junit.runner.RunWith;
 
@@ -34,11 +37,18 @@ import org.hamcrest.CoreMatchers;
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.assertThat;
 import static org.junit.Assert.assertTrue;
-import static org.junit.Assert.fail;
 
 @RunWith(OrderedJUnit4ClassRunner.class)
 public class SSTableMetadataViewerTest extends OfflineToolUtils
 {
+private static String sstable;
+
+@BeforeClass
+public static void setupTest() throws IOException
+{
+sstable = findOneSSTable("legacy_sstables", "legacy_ma_simple");
+}
+
 @Test
 public void testNoArgsPrintsHelp()

[jira] [Updated] (CASSANDRA-16194) Add GPG key for jw...@apache.org

2020-10-06 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-16194:
---
Reviewers: Michael Semb Wever, Michael Semb Wever  (was: Michael Semb Wever)
   Michael Semb Wever, Michael Semb Wever
   Status: Review In Progress  (was: Patch Available)

> Add GPG key for jw...@apache.org
> 
>
> Key: CASSANDRA-16194
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16194
> Project: Cassandra
>  Issue Type: Bug
>  Components: Build
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
> Attachments: jwest-gpg-key.patch
>
>
> I am working on releasing a new version of in-jvm dtest API and need to add 
> my GPG key to the KEYS file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16182) A replacement node, although completed bootstrap and joined ring according to itself, stuck in Joining state as per the peers

2020-10-06 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209120#comment-17209120
 ] 

Paulo Motta commented on CASSANDRA-16182:
-

I think the safest thing to prevent this edge case is to make C' abort 
replacement if it hears about C via gossip. Likewise, if node C learns about 
C' via gossip, it should probably halt execution to prevent potential 
consistency violations.
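
A minimal sketch of the guard suggested in this comment, assuming a simplified model in which each node already knows its role in the replacement. The names `ReplacementGuard`, `Role`, `Action` and `onGossipAbout` are hypothetical and are not Cassandra APIs; a real change would have to live in the gossip and storage service code paths.

```java
// Hypothetical illustration only; none of these names exist in Cassandra.
final class ReplacementGuard {
    enum Role { REPLACEMENT, REPLACED, BYSTANDER }
    enum Action { CONTINUE, ABORT_REPLACEMENT, HALT }

    /**
     * Decide what the local node should do when gossip tells it about a peer.
     *
     * @param self              role of the local node in the replacement
     * @param peerIsCounterpart true if the peer is the other half of the
     *                          replacement pair (C for C', or C' for C)
     */
    static Action onGossipAbout(Role self, boolean peerIsCounterpart) {
        if (!peerIsCounterpart) {
            return Action.CONTINUE;
        }
        switch (self) {
            case REPLACEMENT:
                // C' hears about C: give up the replacement.
                return Action.ABORT_REPLACEMENT;
            case REPLACED:
                // C hears about C': stop serving to avoid consistency violations.
                return Action.HALT;
            default:
                return Action.CONTINUE;
        }
    }

    public static void main(String[] args) {
        System.out.println(onGossipAbout(Role.REPLACEMENT, true));
        System.out.println(onGossipAbout(Role.REPLACED, true));
        System.out.println(onGossipAbout(Role.BYSTANDER, true));
    }
}
```

The sketch deliberately ignores whether the counterpart is seen as alive or DOWN; the comment does not say which condition should trigger the abort.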

> A replacement node, although completed bootstrap and joined ring according to 
> itself, stuck in Joining state as per the peers
> -
>
> Key: CASSANDRA-16182
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16182
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Sumanth Pasupuleti
>Assignee: Sumanth Pasupuleti
>Priority: Normal
> Fix For: 3.0.x
>
>
> This issue occurred in a production 3.0.21 cluster.
> Here is what happened
> # We had, say, a three node Cassandra cluster with nodes A, B and C
> # C got "terminated by cloud provider" due to health check failure and a 
> replacement node C' got launched.
> # C' started bootstrapping data from its neighbors
> # Network flaw: Nodes A and B were still able to communicate with terminated node 
> C and consequently still considered C alive.
> # The replacement node C' learnt about C through gossip but was unable to 
> communicate with C and marked C as DOWN.
> # C' completed bootstrapping successfully, and both it and its peers logged 
> this statement "Node C' will complete replacement of C for tokens 
> [-7686143363672898397]"
> # C' logged the statement "Nodes C' and C have the same token 
> -7686143363672898397. C' is the new owner"
> # C' started listening for thrift and cql clients
> # Peer nodes A and B logged "Node C' cannot complete replacement of alive 
> node C "
> # A few seconds later, A and B marked C as DOWN
> C' continued to log the lines below endlessly:
> {code:java}
> Node C is now part of the cluster
> Nodes () and C' have the same token C.  Ignoring -7686143363672898397 (Needs 
> a log statement fix)
> FatClient C has been silent for 3ms, removing from gossip
> {code}
> My reasoning of what happened: 
> By the time the replacement node (C') finished bootstrapping and announced its 
> state as Normal, A and B were still able to communicate with the replaced 
> node C (while C' was not), and hence rejected C' replacing C. 
> C' does not know this and does not attempt to re-announce its "Normal" 
> state to the rest of the cluster. (Worth noting that A and B marked C as DOWN 
> soon after.)
> Gossip keeps telling C' to add C to its metadata, and C' keeps kicking C out 
> eventually based on FailureDetector. 
> Proposed fix:
> When C' is notified through gossip about C, and given both own the same token 
> and given C' has finished bootstrapping, C' can emit its Normal state again 
> which should fix this in my opinion (so long as A and B have marked C as 
> DOWN, which they did eventually)
> I ended up manually fixing this by restarting Cassandra on C', which forced 
> it to announce its "Normal" state via
> StorageService.initServer --> joinTokenRing() --> finishJoiningRing() --> 
> setTokens() --> setGossipTokens()
> Alternatively, I could possibly have achieved the same result by disabling 
> and re-enabling gossip via JMX/nodetool.
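
The condition in the proposed fix above can be sketched as a single predicate. This is only an illustration under stated assumptions: `NormalStateReannounce` and `shouldReannounceNormal` are hypothetical names, tokens are modelled as plain `long` values, and nothing here is taken from the Cassandra code base.

```java
// Hypothetical illustration only; not Cassandra code.
final class NormalStateReannounce {
    /**
     * Returns true when the replacement node should emit its Normal state
     * again after gossip notifies it about the node it replaced.
     *
     * @param localToken             token owned by the replacement node (C')
     * @param gossipedToken          token advertised for the gossiped node (C)
     * @param localBootstrapFinished whether C' has finished bootstrapping
     */
    static boolean shouldReannounceNormal(long localToken,
                                          long gossipedToken,
                                          boolean localBootstrapFinished) {
        return localBootstrapFinished && localToken == gossipedToken;
    }

    public static void main(String[] args) {
        long token = -7686143363672898397L; // token quoted in the report
        System.out.println(shouldReannounceNormal(token, token, true));
        System.out.println(shouldReannounceNormal(token, token, false));
    }
}
```

As the report notes, re-announcing only helps once A and B have marked C as DOWN, so a real implementation would probably need to repeat the announcement rather than send it once.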






[jira] [Comment Edited] (CASSANDRA-16182) A replacement node, although completed bootstrap and joined ring according to itself, stuck in Joining state as per the peers

2020-10-06 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209120#comment-17209120
 ] 

Paulo Motta edited comment on CASSANDRA-16182 at 10/6/20, 8:10 PM:
---

I think the safest thing to prevent this edge case is to make C' abort 
replacement if it hears about C via gossip. Likewise if node C learns about C' 
via gossip it should probably halt execution to prevent potential consistency 
violations.


was (Author: pauloricardomg):
I think the safest thing to prevent this edge case is to make C' abort 
replacement if it hears about the C via gossip. Likewise if node C learns about 
C' via gossip it should probably halt execution to prevent potential 
consistency violations.

> A replacement node, although completed bootstrap and joined ring according to 
> itself, stuck in Joining state as per the peers
> -
>
> Key: CASSANDRA-16182
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16182
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Sumanth Pasupuleti
>Assignee: Sumanth Pasupuleti
>Priority: Normal
> Fix For: 3.0.x
>
>
> This issue occurred in a production 3.0.21 cluster.
> Here is what happened
> # We had, say, a three node Cassandra cluster with nodes A, B and C
> # C got "terminated by cloud provider" due to health check failure and a 
> replacement node C' got launched.
> # C' started bootstrapping data from its neighbors
> # Network flaw: Nodes A and B were still able to communicate with terminated node 
> C and consequently still considered C alive.
> # The replacement node C' learnt about C through gossip but was unable to 
> communicate with C and marked C as DOWN.
> # C' completed bootstrapping successfully, and both it and its peers logged 
> this statement "Node C' will complete replacement of C for tokens 
> [-7686143363672898397]"
> # C' logged the statement "Nodes C' and C have the same token 
> -7686143363672898397. C' is the new owner"
> # C' started listening for thrift and cql clients
> # Peer nodes A and B logged "Node C' cannot complete replacement of alive 
> node C "
> # A few seconds later, A and B marked C as DOWN
> C' continued to log the lines below endlessly:
> {code:java}
> Node C is now part of the cluster
> Nodes () and C' have the same token C.  Ignoring -7686143363672898397 (Needs 
> a log statement fix)
> FatClient C has been silent for 3ms, removing from gossip
> {code}
> My reasoning of what happened: 
> By the time the replacement node (C') finished bootstrapping and announced its 
> state as Normal, A and B were still able to communicate with the replaced 
> node C (while C' was not), and hence rejected C' replacing C. 
> C' does not know this and does not attempt to re-announce its "Normal" 
> state to the rest of the cluster. (Worth noting that A and B marked C as DOWN 
> soon after.)
> Gossip keeps telling C' to add C to its metadata, and C' keeps kicking C out 
> eventually based on FailureDetector. 
> Proposed fix:
> When C' is notified through gossip about C, and given both own the same token 
> and given C' has finished bootstrapping, C' can emit its Normal state again 
> which should fix this in my opinion (so long as A and B have marked C as 
> DOWN, which they did eventually)
> I ended up manually fixing this by restarting Cassandra on C', which forced 
> it to announce its "Normal" state via
> StorageService.initServer --> joinTokenRing() --> finishJoiningRing() --> 
> setTokens() --> setGossipTokens()
> Alternatively, I could possibly have achieved the same result by disabling 
> and re-enabling gossip via JMX/nodetool.






[jira] [Commented] (CASSANDRA-16194) Add GPG key for jw...@apache.org

2020-10-06 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209118#comment-17209118
 ] 

Brandon Williams commented on CASSANDRA-16194:
--

+1

> Add GPG key for jw...@apache.org
> 
>
> Key: CASSANDRA-16194
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16194
> Project: Cassandra
>  Issue Type: Bug
>  Components: Build
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
> Attachments: jwest-gpg-key.patch
>
>
> I am working on releasing a new version of in-jvm dtest API and need to add 
> my GPG key to the KEYS file.






[jira] [Updated] (CASSANDRA-15584) 4.0 quality testing: Tooling - External Ecosystem

2020-10-06 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-15584:
--
Description: 
Reference [doc from NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#] for context.

*Shepherd: Benjamin Lerer*

Many users of Apache Cassandra employ open source tooling to automate Cassandra 
configuration, runtime management, and repair scheduling. Prior to release, we 
need to confirm that popular third-party tools function properly. 

Current list of tools:
|| Name || Status || Contact ||
| [Priam|http://netflix.github.io/Priam/] |{color:#00875A} *DONE WITH ALPHA*{color} (need to be tested with beta) | [~sumanth.pasupuleti]|
| [sstabletools|https://github.com/instaclustr/cassandra-sstable-tools] | *NOT STARTED* | [~stefan.miklosovic]|
| [cassandra-exporter|https://github.com/instaclustr/cassandra-exporter]| *NOT STARTED* | [~stefan.miklosovic]|
| [Instaclustr Cassandra operator|https://github.com/instaclustr/cassandra-operator]| {color:#00875A}*DONE*{color} | [~stefan.miklosovic]|
| [Instaclustr Cassandra Backup Restore | https://github.com/instaclustr/cassandra-backup]|{color:#00875A}*DONE*{color} | [~stefan.miklosovic]|
| [Instaclustr Cassandra Sidecar | https://github.com/instaclustr/cassandra-sidecar]|{color:#00875A}*DONE*{color} | [~stefan.miklosovic]|
| [Cassandra SSTable generator | https://github.com/instaclustr/cassandra-sstable-generator]|{color:#00875A}*DONE*{color}| [~stefan.miklosovic]|
| [Cassandra TTL Remover | https://github.com/instaclustr/TTLRemover] | {color:#00875A}*DONE*{color} | [~stefan.miklosovic]|
| [Cassandra Everywhere Strategy | https://github.com/instaclustr/cassandra-everywhere-strategy] | {color:#00875A}*DONE*{color} | [~stefan.miklosovic]|
| [Reaper|http://cassandra-reaper.io/]| {color:#00875A}*AUTOMATIC*{color} | [~adejanovski]|
| [Medusa|https://github.com/thelastpickle/cassandra-medusa]| *NOT STARTED*| [~adejanovski]|
| [Casskop|https://orange-opensource.github.io/casskop/]| *NOT STARTED*| Franck Dehay|
| [spark-cassandra-connector|https://github.com/datastax/spark-cassandra-connector]| {color:#00875A}*DONE*{color}| [~jtgrabowski]|
| [cass operator|https://github.com/datastax/cass-operator]| {color:#00875A}*DONE*{color}| [~jimdickinson]|
| [metric collector|https://github.com/datastax/metric-collector-for-apache-cassandra]| {color:#00875A}*DONE*{color}| [~tjake]|
| [management API|https://github.com/datastax/management-api-for-apache-cassandra]| {color:#00875A}*DONE*{color}| [~tjake]|

Column descriptions:
* *Name*: Name of the tool and link to its official page
* *Status*: {{NOT STARTED}}, {{IN PROGRESS}}, {{BLOCKED}} if you hit any issue and have to wait for it to be solved, {{DONE}}, {{AUTOMATIC}} if testing 4.0 is part of your CI process.
* *Contact*: The person acting as the contact point for that tool. 

  was:
Reference [doc from NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#] for context.

*Shepherd: Benjamin Lerer*

Many users of Apache Cassandra employ open source tooling to automate Cassandra 
configuration, runtime management, and repair scheduling. Prior to release, we 
need to confirm that popular third-party tools function properly. 

Current list of tools:
|| Name || Status || Contact ||
| [Priam|http://netflix.github.io/Priam/] |{color:#00875A} *DONE WITH ALPHA*{color} (need to be tested with beta) | [~sumanth.pasupuleti]|
| [sstabletools|https://github.com/instaclustr/cassandra-sstable-tools] | *NOT STARTED* | [~stefan.miklosovic]|
| [cassandra-exporter|https://github.com/instaclustr/cassandra-exporter]| *NOT STARTED* | [~stefan.miklosovic]|
| [Instaclustr Cassandra operator|https://github.com/instaclustr/cassandra-operator]| {color:#00875A}*DONE*{color} | [~stefan.miklosovic]|
| [Instaclustr Cassandra Backup Restore | https://github.com/instaclustr/cassandra-backup]|{color:#00875A}*DONE*{color} | [~stefan.miklosovic]|
| [Instaclustr Cassandra Sidecar | https://github.com/instaclustr/cassandra-sidecar]|{color:#00875A}*DONE*{color} | [~stefan.miklosovic]|
| [Cassandra SSTable generator | https://github.com/instaclustr/cassandra-sstable-generator]|{color:#00875A}*DONE*{color}| [~stefan.miklosovic]|
| [Cassandra TTL Remover | https://github.com/instaclustr/TTLRemover] | {color:#00875A}*DONE*{color} | [~stefan.miklosovic]|
| [Reaper|http://cassandra-reaper.io/]| {color:#00875A}*AUTOMATIC*{color} | [~adejanovski]|
| [Medusa|https://github.com/thelastpickle/cassandra-medusa]| *NOT STARTED*| [~adejanovski]|
| [Casskop|https://orange-opensource.github.io/casskop/]| *NOT STARTED*| Franck Dehay|
| [spark-cassandra-connector|https://github.com/datastax/spark-cassandra-connector]| {color:#00875A}*DONE*{color}| [~jtgrabowski]|
| [cass operator|https://github.com/datastax/cass-operator]| 

[jira] [Updated] (CASSANDRA-16194) Add GPG key for jw...@apache.org

2020-10-06 Thread Jordan West (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan West updated CASSANDRA-16194:

Test and Documentation Plan: N/A
 Status: Patch Available  (was: Open)

> Add GPG key for jw...@apache.org
> 
>
> Key: CASSANDRA-16194
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16194
> Project: Cassandra
>  Issue Type: Bug
>  Components: Build
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
> Attachments: jwest-gpg-key.patch
>
>
> I am working on releasing a new version of in-jvm dtest API and need to add 
> my GPG key to the KEYS file.






[jira] [Updated] (CASSANDRA-16194) Add GPG key for jw...@apache.org

2020-10-06 Thread Jordan West (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan West updated CASSANDRA-16194:

Attachment: jwest-gpg-key.patch

> Add GPG key for jw...@apache.org
> 
>
> Key: CASSANDRA-16194
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16194
> Project: Cassandra
>  Issue Type: Bug
>  Components: Build
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
> Attachments: jwest-gpg-key.patch
>
>
> I am working on releasing a new version of in-jvm dtest API and need to add 
> my GPG key to the KEYS file.






[jira] [Updated] (CASSANDRA-16194) Add GPG key for jw...@apache.org

2020-10-06 Thread Jordan West (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan West updated CASSANDRA-16194:

 Bug Category: Parent values: Documentation(13562)
   Complexity: Normal
  Component/s: Build
Discovered By: User Report
 Severity: Normal
   Status: Open  (was: Triage Needed)

> Add GPG key for jw...@apache.org
> 
>
> Key: CASSANDRA-16194
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16194
> Project: Cassandra
>  Issue Type: Bug
>  Components: Build
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
>
> I am working on releasing a new version of in-jvm dtest API and need to add 
> my GPG key to the KEYS file.






[jira] [Created] (CASSANDRA-16194) Add GPG key for jw...@apache.org

2020-10-06 Thread Jordan West (Jira)
Jordan West created CASSANDRA-16194:
---

 Summary: Add GPG key for jw...@apache.org
 Key: CASSANDRA-16194
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16194
 Project: Cassandra
  Issue Type: Bug
Reporter: Jordan West
Assignee: Jordan West


I am working on releasing a new version of in-jvm dtest API and need to add my 
GPG key to the KEYS file.






[jira] [Updated] (CASSANDRA-16117) Improve docs about frozen types and invert UDT/Tuple order

2020-10-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fábio Takeo Ueno updated CASSANDRA-16117:
-
Fix Version/s: (was: 4.0-triage)

> Improve docs about frozen types and invert UDT/Tuple order
> --
>
> Key: CASSANDRA-16117
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16117
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation/Website
>Reporter: Fábio Takeo Ueno
>Assignee: Fábio Takeo Ueno
>Priority: Low
> Fix For: 4.0, 4.0-beta
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Currently, there's no documentation regarding frozen/non-frozen types.
> Also, a tuple is mentioned after the definition of UDT: "tuples can be though 
> as anonymous UDT with anonymous fields". Since in the code base a UDT is an 
> extension of a tuple, it would be nice to invert these in the docs.
> This issue addresses both topics.
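
The relationship described above, a UDT being an extension of a tuple, can be pictured with a toy model. This is a sketch under stated assumptions: `TupleUdtSketch`, `Tuple` and `Udt` are hypothetical names, field types are represented as plain strings, and the classes are not the ones in the Cassandra code base.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model; not the Cassandra type classes.
final class TupleUdtSketch {
    /** A tuple is an ordered list of field types; its fields have no names. */
    static class Tuple {
        final List<String> fieldTypes;

        Tuple(List<String> fieldTypes) {
            this.fieldTypes = fieldTypes;
        }

        int size() {
            return fieldTypes.size();
        }
    }

    /** A UDT is a tuple that additionally carries a type name and field names. */
    static final class Udt extends Tuple {
        final String name;
        final List<String> fieldNames;

        Udt(String name, List<String> fieldNames, List<String> fieldTypes) {
            super(fieldTypes);
            if (fieldNames.size() != fieldTypes.size()) {
                throw new IllegalArgumentException("one name per field type is required");
            }
            this.name = name;
            this.fieldNames = fieldNames;
        }

        /** Look up a field's type by name, which a plain tuple cannot do. */
        String typeOf(String fieldName) {
            int index = fieldNames.indexOf(fieldName);
            if (index < 0) {
                throw new IllegalArgumentException("unknown field: " + fieldName);
            }
            return fieldTypes.get(index);
        }
    }

    public static void main(String[] args) {
        Udt address = new Udt("address",
                              Arrays.asList("street", "zip"),
                              Arrays.asList("text", "int"));
        System.out.println(address.size());
        System.out.println(address.typeOf("zip"));
    }
}
```

Read this way, documenting tuples first and then UDTs as "tuples with names" follows the same order as the inheritance.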






[jira] [Commented] (CASSANDRA-16115) New Cassandra website design, content and layout to work with Antora

2020-10-06 Thread Melissa Logan (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209048#comment-17209048
 ] 

Melissa Logan commented on CASSANDRA-16115:
---

Ack, thanks [~pauloricardomg]!

> New Cassandra website design, content and layout to work with Antora
> 
>
> Key: CASSANDRA-16115
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16115
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation/Website
>Reporter: Melissa Logan
>Assignee: Melissa Logan
>Priority: Normal
> Fix For: 4.0-rc, 4.0-triage
>
> Attachments: Screen Shot 2020-09-03 at 09.48.53.png
>
>
> This task is related to CASSANDRA-16066 (Update and rework the 
> cassandra-website material to work with Antora). The goal is to update the 
> front-end of the C* website (design, IA and content) to work with Antora to 
> help modernize the website as discussed on the [mailing 
> list|https://www.mail-archive.com/dev@cassandra.apache.org/msg15537.html].
> *Design Concepts:* A minimum of two homepage design concepts will be created 
> and shared for input, which will help standardize a brand palette for C* and 
> a design language for the site. This may include custom iconography and 
> graphics. The chosen design language will be used to develop the remaining 
> templates. 
> *Template Design*: It's estimated that 7 template designs will be needed 
> including the creation of several new pages: 
>  * Homepage template
>  * Toplevel template - e.g. Community.
>  * General template - Mostly textual with some images, e.g. Intro, Quickstart 
>  * “Library” template - A library of assets (links, downloads, logos, etc.) 
> that are sortable by metadata (e.g. Resources, or Kafka's Powered By page).
>  * Blog landing template 
>  * Blog single template
>  * Docs template 
> *Website Content:* Along with the new design comes a need for new or updated 
> content to fit the new page layouts. The intention is to use as much as 
> possible from existing content, and augment with new content where needed.
> *Template Development:* This includes the frontend development, such as any 
> HTML markup to achieve designs. HTML would be crafted so as to preserve any 
> backend/API calls, such that content is pulled in as designed. The majority 
> of the frontend work would come in the form of crafting CSS to bring the 
> designs to life, plus any minor JavaScript to add subtle delights to key 
> pages.
> *Style Guide*: Once all is complete, a Style Guide will be added to GitHub for 
> contributors.
> The [cassandra-website|https://github.com/apache/cassandra-website] 
> repository would need to be modified. Specific changes to be determined. 
>  






[jira] [Updated] (CASSANDRA-16165) Rename master branch to trunk in cassandra-website

2020-10-06 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-16165:
---
Change Category: Semantic
 Complexity: Low Hanging Fruit
Component/s: Build
 Status: Open  (was: Triage Needed)

> Rename master branch to trunk in cassandra-website
> --
>
> Key: CASSANDRA-16165
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16165
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Build
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
>







[jira] [Updated] (CASSANDRA-16173) Update "Getting Started" document for Windows users

2020-10-06 Thread Paulo Motta (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-16173:

 Complexity: Low Hanging Fruit
Component/s: Documentation/Website

> Update "Getting Started" document for Windows users
> ---
>
> Key: CASSANDRA-16173
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16173
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation/Website
>Reporter: Yuki Morishita
>Priority: Normal
>
> This is a documentation follow up to CASSANDRA-16171.
> Since we are removing support for Windows, we should update the ["Getting 
> Started" 
> guide|https://cassandra.apache.org/doc/latest/getting_started/index.html] to 
> include how-tos for Windows users setting up Cassandra for a dev 
> environment.






[jira] [Updated] (CASSANDRA-16161) Validation Compactions causing Java GC pressure

2020-10-06 Thread Paulo Motta (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-16161:

Reviewers: Chris Lohfink

> Validation Compactions causing Java GC pressure
> ---
>
> Key: CASSANDRA-16161
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16161
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction, Local/Config, Tool/nodetool
>Reporter: Cameron Zemek
>Assignee: Cameron Zemek
>Priority: Normal
> Fix For: 3.11.x, 3.11.8
>
> Attachments: 16161.patch
>
>
> Validation Compactions are not rate limited which can cause Java GC pressure 
> and result in spikes in latency.






[jira] [Updated] (CASSANDRA-16161) Validation Compactions causing Java GC pressure

2020-10-06 Thread Paulo Motta (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-16161:

Test and Documentation Plan: Need to document new validation compaction 
throttle option.
 Status: Patch Available  (was: Open)

> Validation Compactions causing Java GC pressure
> ---
>
> Key: CASSANDRA-16161
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16161
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction, Local/Config, Tool/nodetool
>Reporter: Cameron Zemek
>Assignee: Cameron Zemek
>Priority: Normal
> Fix For: 3.11.x, 3.11.8
>
> Attachments: 16161.patch
>
>
> Validation Compactions are not rate limited which can cause Java GC pressure 
> and result in spikes in latency.
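
To illustrate what rate limiting a validation scan could mean, here is a minimal throughput-throttle sketch. It is not the attached patch: `ValidationThrottleSketch` and `pauseNanos` are hypothetical names, and the approach (compare bytes processed against a bytes-per-second budget and sleep for the difference) is an assumption made for the example.

```java
// Hypothetical illustration only; not taken from 16161.patch.
final class ValidationThrottleSketch {
    private static final double NANOS_PER_SECOND = 1_000_000_000.0;

    /**
     * How long the caller should pause so that its average throughput does
     * not exceed the limit.
     *
     * @param bytesSoFar     bytes processed since the scan started
     * @param elapsedNanos   time spent since the scan started
     * @param bytesPerSecond configured limit; non-positive means unthrottled
     * @return nanoseconds to sleep, or 0 if the scan is at or below the limit
     */
    static long pauseNanos(long bytesSoFar, long elapsedNanos, long bytesPerSecond) {
        if (bytesPerSecond <= 0) {
            return 0L;
        }
        // Time the work "should" have taken at exactly the allowed rate.
        long scheduledNanos = (long) (bytesSoFar * NANOS_PER_SECOND / bytesPerSecond);
        return Math.max(0L, scheduledNanos - elapsedNanos);
    }

    public static void main(String[] args) {
        long limit = 16L * 1024 * 1024; // 16 MiB/s, an arbitrary example value
        // 16 MiB read in half a second is twice the allowed rate.
        System.out.println(pauseNanos(limit, 500_000_000L, limit));
    }
}
```

Pausing between chunks spreads allocation over time, which is the effect the ticket is after when it links unthrottled validation to GC pressure.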






[cassandra-website] branch trunk created (now 06395ed)

2020-10-06 Thread mck
This is an automated email from the ASF dual-hosted git repository.

mck pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git.


  at 06395ed  ninja-fix to 'Blog Cassandra Usage Report 2020' after staging 
check: fixing blockquotes and images, from Melissa Logan

No new revisions were added by this update.





[jira] [Updated] (CASSANDRA-16153) Cassandra 4b2 - JVM options from *.options not read/set

2020-10-06 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-16153:
-
Resolution: Not A Problem
Status: Resolved  (was: Triage Needed)

No problem, thanks for letting us know.

> Cassandra 4b2 - JVM options from *.options not read/set
> ---
>
> Key: CASSANDRA-16153
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16153
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Scripts
>Reporter: Thomas Steinmaurer
>Priority: Normal
> Attachments: debug.log.2020-10-01.0.zip, system.log.2020-10-01.0.zip
>
>
> Trying out Cassandra 4 beta 2 with Java 8 (AdoptOpenJDK) in AWS.
> {noformat}
> NAME="Amazon Linux AMI"
> VERSION="2018.03"
> ID="amzn"
> ID_LIKE="rhel fedora"
> VERSION_ID="2018.03"
> PRETTY_NAME="Amazon Linux AMI 2018.03"
> ANSI_COLOR="0;33"
> CPE_NAME="cpe:/o:amazon:linux:2018.03:ga"
> HOME_URL="http://aws.amazon.com/amazon-linux-ami/"
> {noformat}
> It seems the Cassandra JVM ends up using Parallel GC.
> {noformat}
> INFO  [Service Thread] 2020-10-01 00:00:56,233 GCInspector.java:299 - PS 
> Scavenge GC in 541ms.  PS Old Gen: 5152844776 -> 5726724752;
> WARN  [Service Thread] 2020-10-01 00:00:56,234 GCInspector.java:297 - PS 
> MarkSweep GC in 1969ms.  PS Eden Space: 2111307776 -> 0; PS Old Gen: 
> 5726724752 -> 2581334376; PS Survivor Space: 363850224 -> 0
> {noformat}
> Although {{jvm8-server.options}} is using CMS.
> {noformat}
> #
> #  GC SETTINGS  #
> #
> ### CMS Settings
> -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC
> -XX:+CMSParallelRemarkEnabled
> -XX:SurvivorRatio=8
> -XX:MaxTenuringThreshold=1
> -XX:CMSInitiatingOccupancyFraction=75
> -XX:+UseCMSInitiatingOccupancyOnly
> -XX:CMSWaitDuration=1
> -XX:+CMSParallelInitialMarkEnabled
> -XX:+CMSEdenChunksRecordAlways
> ## some JVMs will fill up their heap when accessed via JMX, see CASSANDRA-6541
> -XX:+CMSClassUnloadingEnabled
> ...
> {noformat}
> In Cassandra 3, the default was CMS.
> So, is something possibly going wrong when reading/processing 
> {{jvm8-server.options}}?
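
One way to confirm which collector a JVM actually selected, independent of the options files, is to ask the JVM itself. The sketch below uses the standard `java.lang.management` API; the class name `ActiveGcCheck` and the `family` helper are hypothetical, and the name-to-collector mapping assumes HotSpot's bean names (the `PS Scavenge` and `PS MarkSweep` names in the log above are the ones HotSpot reports for the Parallel collector).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Diagnostic sketch; the name mapping below assumes a HotSpot JVM.
final class ActiveGcCheck {
    /** Names of the garbage collectors registered in the running JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    /** Map a HotSpot collector bean name to the GC family it belongs to. */
    static String family(String collectorName) {
        if (collectorName.startsWith("PS ")) {
            return "Parallel";
        }
        if (collectorName.equals("ParNew") || collectorName.equals("ConcurrentMarkSweep")) {
            return "CMS";
        }
        if (collectorName.startsWith("G1 ")) {
            return "G1";
        }
        return "other";
    }

    public static void main(String[] args) {
        for (String name : collectorNames()) {
            System.out.println(name + " -> " + family(name));
        }
    }
}
```

Running this with the same command line used to start Cassandra shows whether the `-XX:+UseConcMarkSweepGC` flag from the options file reached the JVM at all.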






[jira] [Commented] (CASSANDRA-16153) Cassandra 4b2 - JVM options from *.options not read/set

2020-10-06 Thread Thomas Steinmaurer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17208986#comment-17208986
 ] 

Thomas Steinmaurer commented on CASSANDRA-16153:


[~brandon.williams], sorry for wasting your time. I have discovered that this 
is an issue on our side, in how we start Cassandra. Feel free to close.

> Cassandra 4b2 - JVM options from *.options not read/set
> ---
>
> Key: CASSANDRA-16153
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16153
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Scripts
>Reporter: Thomas Steinmaurer
>Priority: Normal
> Attachments: debug.log.2020-10-01.0.zip, system.log.2020-10-01.0.zip
>
>
> Trying out Cassandra 4 beta 2 with Java 8 (AdoptOpenJDK) in AWS.
> {noformat}
> NAME="Amazon Linux AMI"
> VERSION="2018.03"
> ID="amzn"
> ID_LIKE="rhel fedora"
> VERSION_ID="2018.03"
> PRETTY_NAME="Amazon Linux AMI 2018.03"
> ANSI_COLOR="0;33"
> CPE_NAME="cpe:/o:amazon:linux:2018.03:ga"
> HOME_URL="http://aws.amazon.com/amazon-linux-ami/"
> {noformat}
> It seems the Cassandra JVM ends up using Parallel GC.
> {noformat}
> INFO  [Service Thread] 2020-10-01 00:00:56,233 GCInspector.java:299 - PS 
> Scavenge GC in 541ms.  PS Old Gen: 5152844776 -> 5726724752;
> WARN  [Service Thread] 2020-10-01 00:00:56,234 GCInspector.java:297 - PS 
> MarkSweep GC in 1969ms.  PS Eden Space: 2111307776 -> 0; PS Old Gen: 
> 5726724752 -> 2581334376; PS Survivor Space: 363850224 -> 0
> {noformat}
> Although {{jvm8-server.options}} is using CMS.
> {noformat}
> #
> #  GC SETTINGS  #
> #
> ### CMS Settings
> -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC
> -XX:+CMSParallelRemarkEnabled
> -XX:SurvivorRatio=8
> -XX:MaxTenuringThreshold=1
> -XX:CMSInitiatingOccupancyFraction=75
> -XX:+UseCMSInitiatingOccupancyOnly
> -XX:CMSWaitDuration=1
> -XX:+CMSParallelInitialMarkEnabled
> -XX:+CMSEdenChunksRecordAlways
> ## some JVMs will fill up their heap when accessed via JMX, see CASSANDRA-6541
> -XX:+CMSClassUnloadingEnabled
> ...
> {noformat}
> In Cassandra 3, the default has been CMS.
> So, possibly there is something wrong in reading/processing 
> {{jvm8-server.options}}?
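
One way to confirm which collectors a running JVM actually selected, independent of what an options file says, is to ask the JVM itself. The following is a minimal sketch and not part of the original report; `GcCheck` is a hypothetical helper name. It uses the standard `java.lang.management` API to list collector names. Parallel GC reports names such as `PS Scavenge` and `PS MarkSweep` (as in the log lines quoted above), whereas CMS reports `ParNew` and `ConcurrentMarkSweep`.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not from the ticket): reports the collectors this JVM is using.
final class GcCheck
{
    // Returns the names of the garbage collectors registered in this JVM.
    static List<String> collectorNames()
    {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans())
            names.add(gc.getName());
        return names;
    }

    // True if any collector name starts with "PS ", the prefix Parallel GC uses.
    static boolean looksLikeParallelGc(List<String> names)
    {
        for (String name : names)
            if (name.startsWith("PS "))
                return true;
        return false;
    }

    public static void main(String[] args)
    {
        List<String> names = collectorNames();
        System.out.println("Collectors: " + names);
        System.out.println("Parallel GC in use: " + looksLikeParallelGc(names));
    }
}
```

Running this with the same flags the start script hands to Cassandra would show whether the CMS settings reached the JVM at all.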



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16120) Add ability for jvm-dtest to grep instance logs

2020-10-06 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16120:
--
  Fix Version/s: (was: 4.0-beta)
 NA
Source Control Link: 
https://github.com/apache/cassandra/commit/63b172e137e0306aefd84f373963d8014c5a5efa
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Add ability for jvm-dtest to grep instance logs
> ---
>
> Key: CASSANDRA-16120
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16120
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/java
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
> Fix For: NA
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> One of the main gaps between python dtest and jvm dtest is python dtest 
> supports the ability to grep the logs of an instance; we need this capability 
> as some tests require validating logs were triggered.
> Pydocs for common log methods 
> {code}
> |  grep_log(self, expr, filename='system.log', from_mark=None)
> |  Returns a list of lines matching the regular expression in parameter
> |  in the Cassandra log of this node
> |
> |  grep_log_for_errors(self, filename='system.log')
> |  Returns a list of errors with stack traces
> |  in the Cassandra log of this node
> |
> |  grep_log_for_errors_from(self, filename='system.log', seek_start=0)
> {code}
> {code}
> |  watch_log_for(self, exprs, from_mark=None, timeout=600, process=None, 
> verbose=False, filename='system.log')
> |  Watch the log until one or more (regular) expressions are found.
> |  This method returns when all the expressions have been found or the method
> |  times out (a TimeoutError is then raised). On successful completion,
> |  a list of pairs (line matched, match object) is returned.
> {code}
> Below is a POC showing a way to do such logic
> {code}
> package org.apache.cassandra.distributed.test;
>
> import java.io.BufferedReader;
> import java.io.FileInputStream;
> import java.io.IOException;
> import java.io.InputStreamReader;
> import java.io.UncheckedIOException;
> import java.nio.charset.StandardCharsets;
> import java.util.Iterator;
> import java.util.Spliterator;
> import java.util.Spliterators;
> import java.util.regex.Matcher;
> import java.util.regex.Pattern;
> import java.util.stream.Stream;
> import java.util.stream.StreamSupport;
>
> import com.google.common.io.Closeables;
> import org.junit.Test;
>
> import org.apache.cassandra.distributed.Cluster;
> import org.apache.cassandra.utils.AbstractIterator;
>
> public class AllTheLogs extends TestBaseImpl
> {
>     @Test
>     public void test() throws IOException
>     {
>         try (final Cluster cluster = init(Cluster.build(1).start()))
>         {
>             // Build the log path from the cassandra.testtag and suitename system properties.
>             String tag = System.getProperty("cassandra.testtag", "cassandra.testtag_IS_UNDEFINED");
>             String suite = System.getProperty("suitename", "suitename_IS_UNDEFINED");
>             String log = String.format("build/test/logs/%s/TEST-%s.log", tag, suite);
>             grep(log, "Enqueuing flush of tables").forEach(l -> System.out.println("I found the thing: " + l));
>         }
>     }
>
>     private static Stream<String> grep(String file, String regex) throws IOException
>     {
>         return grep(file, Pattern.compile(regex));
>     }
>
>     // Lazily streams the lines of the file that contain a match for the regex.
>     private static Stream<String> grep(String file, Pattern regex) throws IOException
>     {
>         BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8));
>         Iterator<String> it = new AbstractIterator<String>()
>         {
>             protected String computeNext()
>             {
>                 try
>                 {
>                     String s;
>                     while ((s = reader.readLine()) != null)
>                     {
>                         Matcher m = regex.matcher(s);
>                         if (m.find())
>                             return s;
>                     }
>                     // End of file: close the reader before signalling completion.
>                     reader.close();
>                     return endOfData();
>                 }
>                 catch (IOException e)
>                 {
>                     Closeables.closeQuietly(reader);
>                     throw new UncheckedIOException(e);
>                 }
>             }
>         };
>         return StreamSupport.stream(Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false);
>     }
> }
> {code}
> And
> {code}
> @Test
>public void test() throws IOException
>{
>try (final Cluster cluster = init(Cluster.build(1).start()))
>{
>String tag = System.getProperty("cassandra.testtag", 
> "cassandra.testtag_IS_UNDEFINED");
>

[jira] [Updated] (CASSANDRA-16101) Make sure we don't throw any uncaught exceptions during in-jvm dtests

2020-10-06 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16101:
--
Status: Ready to Commit  (was: Review In Progress)

> Make sure we don't throw any uncaught exceptions during in-jvm dtests
> -
>
> Key: CASSANDRA-16101
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16101
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/java
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
>  Labels: pull-request-available
>
> We should assert that we don't throw any uncaught exceptions when running 
> in-jvm dtests
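
A minimal sketch of how such an assertion could be approached, assuming a plain JVM test rather than the actual in-jvm dtest framework: install a default uncaught exception handler that records failures, then check that the record is empty at the end of the test. `UncaughtTracker` and its methods are hypothetical names, not the API this ticket adds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical helper: records exceptions that escape any thread.
final class UncaughtTracker
{
    private static final List<Throwable> UNCAUGHT = new CopyOnWriteArrayList<>();

    // Clears earlier records and installs a JVM-wide handler that collects uncaught exceptions.
    static void install()
    {
        UNCAUGHT.clear();
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> UNCAUGHT.add(error));
    }

    // Returns a snapshot of the exceptions recorded so far.
    static List<Throwable> uncaught()
    {
        return new ArrayList<>(UNCAUGHT);
    }

    // Fails if any thread died with an uncaught exception since install().
    static void assertNone()
    {
        if (!UNCAUGHT.isEmpty())
            throw new AssertionError("Uncaught exceptions: " + UNCAUGHT);
    }

    public static void main(String[] args) throws InterruptedException
    {
        install();
        Thread worker = new Thread(() -> { throw new IllegalStateException("boom"); });
        worker.start();
        worker.join();
        System.out.println("Recorded: " + uncaught().size());
    }
}
```

A test would call `install()` during setup and `assertNone()` during teardown. The handler is process-wide, so this only illustrates the idea for tests that run one at a time.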






[jira] [Updated] (CASSANDRA-16109) Don't adjust nodeCount when setting node id topology in in-jvm dtests

2020-10-06 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16109:
--
  Fix Version/s: NA
Source Control Link: 
https://github.com/apache/cassandra/commit/b3013a4ac5ee816cafe7492775126d1fa72ced75
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Don't adjust nodeCount when setting node id topology in in-jvm dtests
> -
>
> Key: CASSANDRA-16109
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16109
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/java
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Low
>  Labels: pull-request-available
> Fix For: NA
>
>
> We update the node count when setting the node id topology in in-jvm dtests;
> this should only happen if the node count is smaller than the node id topology,
> otherwise bootstrap tests error out.
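
The intended rule can be sketched as follows. This is an illustration under assumptions, not the committed patch: `TopologyConfig`, `withNodes`, `withNodeIdTopology`, and the map-based topology are hypothetical stand-ins for the builder in the in-jvm dtest framework. The node count is raised only when it is smaller than the topology being set, so a larger, explicitly configured node count is left alone.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a cluster builder; not the real dtest API.
final class TopologyConfig
{
    private int nodeCount;
    private Map<Integer, String> nodeIdTopology = new HashMap<>();

    TopologyConfig withNodes(int count)
    {
        this.nodeCount = count;
        return this;
    }

    // Stores the node id -> location mapping and grows nodeCount only if it is too small.
    TopologyConfig withNodeIdTopology(Map<Integer, String> topology)
    {
        this.nodeIdTopology = new HashMap<>(topology);
        this.nodeCount = Math.max(this.nodeCount, topology.size());
        return this;
    }

    int nodeCount()
    {
        return nodeCount;
    }

    public static void main(String[] args)
    {
        Map<Integer, String> topology = new HashMap<>();
        topology.put(1, "dc1");
        topology.put(2, "dc1");
        // Three nodes were requested; a two-node topology must not shrink that.
        System.out.println(new TopologyConfig().withNodes(3).withNodeIdTopology(topology).nodeCount());
    }
}
```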






[jira] [Updated] (CASSANDRA-16101) Make sure we don't throw any uncaught exceptions during in-jvm dtests

2020-10-06 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16101:
--
Reviewers: Alex Petrov, David Capwell, David Capwell  (was: Alex Petrov, 
David Capwell)
   Alex Petrov, David Capwell, David Capwell  (was: Alex Petrov, 
David Capwell)
   Status: Review In Progress  (was: Patch Available)

> Make sure we don't throw any uncaught exceptions during in-jvm dtests
> -
>
> Key: CASSANDRA-16101
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16101
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/java
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
>  Labels: pull-request-available
>
> We should assert that we don't throw any uncaught exceptions when running 
> in-jvm dtests






[jira] [Updated] (CASSANDRA-16101) Make sure we don't throw any uncaught exceptions during in-jvm dtests

2020-10-06 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16101:
--
  Fix Version/s: NA
Source Control Link: 
https://github.com/apache/cassandra/commit/b3013a4ac5ee816cafe7492775126d1fa72ced75
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Make sure we don't throw any uncaught exceptions during in-jvm dtests
> -
>
> Key: CASSANDRA-16101
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16101
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/java
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
>  Labels: pull-request-available
> Fix For: NA
>
>
> We should assert that we don't throw any uncaught exceptions when running 
> in-jvm dtests






[jira] [Updated] (CASSANDRA-15899) Dropping a column can break queries until the schema is fully propagated

2020-10-06 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15899:

  Fix Version/s: (was: 3.0.x)
 4.0-beta3
 3.11.9
 3.0.23
  Since Version: 3.0.0
Source Control Link: 
https://github.com/apache/cassandra/commit/31b9078a691a6f93b104cc6b3f72fe2fbf6557f6
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Dropping a column can break queries until the schema is fully propagated
> 
>
> Key: CASSANDRA-15899
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15899
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Schema
>Reporter: Marcus Eriksson
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 3.0.23, 3.11.9, 4.0-beta3
>
>
> With a table like:
> {code}
> CREATE TABLE ks.tbl (id int primary key, v1 int, v2 int, v3 int)
> {code}
> and we drop {{v2}}, we get this exception on the replicas which haven't seen 
> the schema change:
> {code}
> ERROR [SharedPool-Worker-1] node2 2020-06-24 09:49:08,107 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[SharedPool-Worker-1,5,node2]
> java.lang.IllegalStateException: [ColumnDefinition{name=v1, 
> type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, position=-1}, 
> ColumnDefinition{name=v2, type=org.apache.cassandra.db.marshal.Int32Type, 
> kind=REGULAR, position=-1}, ColumnDefinition{name=v3, 
> type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, position=-1}] 
> is not a subset of [v1 v3]
>   at 
> org.apache.cassandra.db.Columns$Serializer.encodeBitmap(Columns.java:546) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.Columns$Serializer.serializeSubset(Columns.java:478) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:184)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:114)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:102)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:132)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:87)
>  ~[main/:na]
> ...
> {code}
> Note that it doesn't matter if we {{SELECT *}} or {{SELECT id, v1}}
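
The failure mode can be illustrated with a simplified model. This is a sketch, not Cassandra's actual `Columns.Serializer`: `SubsetEncoder` is a hypothetical class that encodes a set of columns as a bitmap of positions within a superset and rejects any column the superset does not contain. That mirrors the message in the trace, where columns `v1, v2, v3` are checked against a superset `[v1 v3]` that no longer contains `v2`.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of subset encoding; assumes at most 64 superset columns.
final class SubsetEncoder
{
    // Encodes `columns` as a bitmap over `superset`: bit i is set when superset.get(i) is present.
    static long encodeBitmap(List<String> columns, List<String> superset)
    {
        long bitmap = 0L;
        for (String column : columns)
        {
            int index = superset.indexOf(column);
            if (index < 0)
                throw new IllegalStateException(columns + " is not a subset of " + superset);
            bitmap |= 1L << index;
        }
        return bitmap;
    }

    public static void main(String[] args)
    {
        List<String> oldSchema = Arrays.asList("v1", "v2", "v3");
        List<String> newSchema = Arrays.asList("v1", "v3");
        // Encoding the new columns against the old superset works.
        System.out.println(Long.toBinaryString(encodeBitmap(newSchema, oldSchema)));
        // Encoding the old columns against the new superset fails, as in the ticket.
        try
        {
            encodeBitmap(oldSchema, newSchema);
        }
        catch (IllegalStateException e)
        {
            System.out.println(e.getMessage());
        }
    }
}
```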






[jira] [Updated] (CASSANDRA-16016) sstablemetadata unit test, docs and params parsing hardening

2020-10-06 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-16016:
-
Reviewers: Brandon Williams, Brandon Williams  (was: Brandon Williams)
   Brandon Williams, Brandon Williams
   Status: Review In Progress  (was: Patch Available)

> sstablemetadata unit test, docs and params parsing hardening
> 
>
> Key: CASSANDRA-16016
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16016
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/sstable
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> During CASSANDRA-15883 / CASSANDRA-15991 it was found that unit test coverage
> for this tool is minimal. There is a unit test to build upon under
> {{test/unit/org/apache/cassandra/tools}}. Also, the docs are missing some options
> and args parsing is brittle.






[jira] [Updated] (CASSANDRA-16016) sstablemetadata unit test, docs and params parsing hardening

2020-10-06 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-16016:
-
Status: Patch Available  (was: In Progress)

> sstablemetadata unit test, docs and params parsing hardening
> 
>
> Key: CASSANDRA-16016
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16016
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/sstable
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> During CASSANDRA-15883 / CASSANDRA-15991 it was found that unit test coverage
> for this tool is minimal. There is a unit test to build upon under
> {{test/unit/org/apache/cassandra/tools}}. Also, the docs are missing some options
> and args parsing is brittle.






[jira] [Updated] (CASSANDRA-15986) Repair tests flakiness

2020-10-06 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-15986:
-
  Fix Version/s: (was: 4.0-beta)
 4.0-beta3
Source Control Link: 
https://github.com/apache/cassandra-dtest/commit/79c0120fbd659a7a5169ed945279dfed688705d8
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Repair tests flakiness
> --
>
> Key: CASSANDRA-15986
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15986
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta3
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Repair tests come up in test failure reports every now and then. I have tried 
> to repro the 
> [latest|https://ci-cassandra.apache.org/job/Cassandra-trunk/241/testReport/junit/dtest-novnode.repair_tests.repair_test/TestRepair/test_simple_sequential_repair/]
>  locally 100 times with no luck.
> _dtest-novnode.repair_tests.repair_test/TestRepair/test_simple_sequential_repair_
> Still, from experience fixing other flaky tests, I have some intuition about
> where the problems may lie. The proposed fix should do no harm if merged. We
> can reopen the ticket if repair tests keep failing.






[jira] [Commented] (CASSANDRA-16057) Should update in-jvm dtest to expose stdout and stderr for nodetool

2020-10-06 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17208889#comment-17208889
 ] 

Yifan Cai commented on CASSANDRA-16057:
---

Rebased onto trunk. Fixed the added test to account for the dtest API change.

> Should update in-jvm dtest to expose stdout and stderr for nodetool
> ---
>
> Key: CASSANDRA-16057
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16057
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/java
>Reporter: David Capwell
>Assignee: Yifan Cai
>Priority: Normal
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Many nodetool commands output to stdout or stderr so running nodetool using 
> in-jvm dtest should expose that to tests.





