[ https://issues.apache.org/jira/browse/CASSANDRA-15614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056456#comment-17056456 ]
Ekaterina Dimitrova commented on CASSANDRA-15614:
-------------------------------------------------

Thanks, [~jrwest].

As we discussed on Slack, we will commit this one for now and keep the ticket open to explore a better in-JVM version later. Thanks!

[~brandon.williams], can you please commit this patch?

> Flakey test - TestBootstrap.test_resumable_bootstrap
> -----------------------------------------------------
>
>                 Key: CASSANDRA-15614
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15614
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/dtest
>            Reporter: Ekaterina Dimitrova
>            Assignee: Ekaterina Dimitrova
>            Priority: Normal
>             Fix For: 4.0-alpha
>
>         Attachments: test_resumable_bootstrap-CASSANDRA-15614.txt
>
>
> Fails 2 out of 10 times on Mac/Java 8
> {noformat}
> ================================ FAILURES ================================
> _________________ TestBootstrap.test_resumable_bootstrap _________________
>
> self = <bootstrap_test.TestBootstrap object at 0x10fd488d0>
>
>     @since('2.2')
>     def test_resumable_bootstrap(self):
>         """
>         Test resuming bootstrap after data streaming failure
>         """
>         cluster = self.cluster
>         cluster.populate(2)
>
>         node1 = cluster.nodes['node1']
>         # set up byteman
>         node1.byteman_port = '8100'
>         node1.import_config_files()
>
>         cluster.start(wait_other_notice=True)
>         # kill stream to node3 in the middle of streaming to let it fail
>         if cluster.version() < '4.0':
>             node1.byteman_submit([self.byteman_submit_path_pre_4_0])
>         else:
>             node1.byteman_submit([self.byteman_submit_path_4_0])
>         node1.stress(['write', 'n=1K', 'no-warmup', 'cl=TWO', '-schema', 'replication(factor=2)', '-rate', 'threads=50'])
>         cluster.flush()
>
>         # start bootstrapping node3 and wait for streaming
>         node3 = new_node(cluster)
>         node3.start(wait_other_notice=False)
>
>         # let streaming fail as we expect
> >       node3.watch_log_for('Some data streaming failed')
>
> bootstrap_test.py:365:
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>
> self = <ccmlib.node.Node object at 0x10fdf1810>, exprs = 'Some data streaming failed', from_mark = None, timeout = 600, process = None, verbose = False, filename = 'system.log'
>
>     def watch_log_for(self, exprs, from_mark=None, timeout=600, process=None, verbose=False, filename='system.log'):
>         """
>         Watch the log until one or more (regular) expression are found.
>         This methods when all the expressions have been found or the method
>         timeouts (a TimeoutError is then raised). On successful completion,
>         a list of pair (line matched, match object) is returned.
>         """
>         start = time.time()
>         tofind = [exprs] if isinstance(exprs, string_types) else exprs
>         tofind = [re.compile(e) for e in tofind]
>         matchings = []
>         reads = ""
>         if len(tofind) == 0:
>             return None
>
>         log_file = os.path.join(self.get_path(), 'logs', filename)
>         output_read = False
>         while not os.path.exists(log_file):
>             time.sleep(.5)
>             if start + timeout < time.time():
>                 raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " [" + self.name + "] Timed out waiting for {} to be created.".format(log_file))
>             if process and not output_read:
>                 process.poll()
>                 if process.returncode is not None:
>                     self.print_process_output(self.name, process, verbose)
>                     output_read = True
>                     if process.returncode != 0:
>                         raise RuntimeError()  # Shouldn't reuse RuntimeError but I'm lazy
>
>         with open(log_file) as f:
>             if from_mark:
>                 f.seek(from_mark)
>
>             while True:
>                 # First, if we have a process to check, then check it.
>                 # Skip on Windows - stdout/stderr is cassandra.bat
>                 if not common.is_win() and not output_read:
>                     if process:
>                         process.poll()
>                         if process.returncode is not None:
>                             self.print_process_output(self.name, process, verbose)
>                             output_read = True
>                             if process.returncode != 0:
>                                 raise RuntimeError()  # Shouldn't reuse RuntimeError but I'm lazy
>
>                 line = f.readline()
>                 if line:
>                     reads = reads + line
>                     for e in tofind:
>                         m = e.search(line)
>                         if m:
>                             matchings.append((line, m))
>                             tofind.remove(e)
>                             if len(tofind) == 0:
>                                 return matchings[0] if isinstance(exprs, string_types) else matchings
>                 else:
>                     # yep, it's ugly
>                     time.sleep(1)
>                     if start + timeout < time.time():
> >                       raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " [" + self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + reads[:50] + ".....\nSee {} for remainder".format(filename))
> E                       ccmlib.node.TimeoutError: 02 Mar 2020 17:16:34 [node3] Missing: ['Some data streaming failed']:
> E                       INFO [main] 2020-03-02 12:06:34,963 YamlConfigura.....
> E                       See system.log for remainder
>
> ../../dtest-new2/src/ccm/ccmlib/node.py:536: TimeoutError
> ------------------------- Captured stdout setup --------------------------
> 11:55:37,343 ccm DEBUG Log-watching thread starting.
> --------------------------- Captured log setup ---------------------------
> 11:55:37,239 conftest INFO Starting execution of test_resumable_bootstrap at 2020-03-02 11:55:37.239467
> 11:55:37,240 dtest_setup INFO cluster ccm directory: /var/folders/ql/nvcz74bd67d3vhw7227mpjm40000gp/T/dtest-fzxwwt9c
> -------------------------- Captured stdout call --------------------------
> install rule inject stream failure
>
> ------------------------ Captured stdout teardown ------------------------
> 12:06:04,870 ccm DEBUG Log-watching thread exiting.
> ------------------------- Captured stdout setup --------------------------
> 12:06:06,20 ccm DEBUG Log-watching thread starting.
> --------------------------- Captured log setup ---------------------------
> 12:06:05,887 conftest INFO Starting execution of test_resumable_bootstrap at 2020-03-02 12:06:05.887592
> 12:06:05,888 dtest_setup INFO cluster ccm directory: /var/folders/ql/nvcz74bd67d3vhw7227mpjm40000gp/T/dtest-yntcltkx
> -------------------------- Captured stdout call --------------------------
> install rule inject stream failure
>
> ------------------------ Captured stdout teardown ------------------------
> 12:06:04,870 ccm DEBUG Log-watching thread exiting.
> ------------------------ Captured stdout teardown ------------------------
> 12:06:04,870 ccm DEBUG Log-watching thread exiting.
> ------------------------ Captured stdout teardown ------------------------
> 12:16:34,818 ccm DEBUG Log-watching thread exiting.
> ===Flaky Test Report===
>
> test_resumable_bootstrap failed (1 runs remaining out of 2).
> <class 'ccmlib.node.TimeoutError'>
> 02 Mar 2020 17:06:04 [node3] Missing: ['Some data streaming failed']:
> INFO [main] 2020-03-02 11:56:05,081 YamlConfigura.....
> See system.log for remainder
> [<TracebackEntry /Users/ekaterina.dimitri/IdeaProjects/cassandra-dtest-d/bootstrap_test.py:365>, <TracebackEntry /Users/ekaterina.dimitri/dtest-new2/src/ccm/ccmlib/node.py:536>]
> test_resumable_bootstrap failed; it passed 0 out of the required 1 times.
> <class 'ccmlib.node.TimeoutError'>
> 02 Mar 2020 17:16:34 [node3] Missing: ['Some data streaming failed']:
> INFO [main] 2020-03-02 12:06:34,963 YamlConfigura.....
> See system.log for remainder
> [<TracebackEntry /Users/ekaterina.dimitri/IdeaProjects/cassandra-dtest-d/bootstrap_test.py:365>, <TracebackEntry /Users/ekaterina.dimitri/dtest-new2/src/ccm/ccmlib/node.py:536>]
>
> ===End Flaky Test Report===
> ====================== 1 failed in 1258.32 seconds =======================
> {noformat}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org