Todd Lipcon created KUDU-1501:
---------------------------------

             Summary: RaftConsensusITest.TestMasterReplacesEvictedFollowers 
flaky with bootstrap replay error
                 Key: KUDU-1501
                 URL: https://issues.apache.org/jira/browse/KUDU-1501
             Project: Kudu
          Issue Type: Bug
          Components: consensus, tablet
    Affects Versions: 0.9.0
            Reporter: Todd Lipcon
            Assignee: Todd Lipcon
            Priority: Critical


I looped this test a couple hundred times in TSAN and caught this failure, 
which looks like it might be a serious bug:

W0624 01:56:54.399389 30339 consensus_peers.cc:326] T 
fbfe7e1538d442fb9d4e0958c6ed3b7a P 9ee39d9aaf9840b9a502817a4cfe0a68 -> Peer 
e9f2622cd2f64f95ad5ebc9706524bab (127.112.159.2:58860): Couldn't send request 
to peer e9f2622cd2f64f95ad5ebc9706524bab for tablet 
fbfe7e1538d442fb9d4e0958c6ed3b7a. Error code: TABLET_NOT_RUNNING (12). Status: 
Illegal state: Tablet not RUNNING: FAILED: Corruption: Failed log replay. 
Reason: Debug Info: Error playing entry 3 of segment 6 of tablet 
fbfe7e1538d442fb9d4e0958c6ed3b7a. Segment path: 
/tmp/kudutest-1000/raft_consensus-itest.RaftConsensusITest.TestMasterReplacesEvictedFollowers.1466733357563157-28831/raft_consensus-itest-cluster/ts-2/wals/fbfe7e1538d442fb9d4e0958c6ed3b7a.recovery/wal-000000006.
 Entry: type: COMMIT commit { op_type: WRITE_OP commited_op_id { term: 1 index: 
36 } result { ops { mutated_stores { mrs_id: 3 } } } }: CommitMsg was orphaned 
but it referred to stores which need replay. Commit: op_type: WRITE_OP 
commited_op_id { term: 1 index: 36 } result { ops { mutated_stores { mrs_id: 3 
} } }. TabletMetadata: table_id: "9fb52e694c1d46e4991b49b78a3b8acf" tablet_id: 
"fbfe7e1538d442fb9d4e0958c6ed3b7a" last_durable_mrs_id: 2 rowsets { id: 3 
last_durable_dms_id: -1 columns { block { id: 1836738791030108424 } column_id: 
10 } columns { block { id: 3331501504918373718 } column_id: 11 } columns { 
block { id: 4024765891195703834 } column_id: 12 } undo_deltas { block { id: 
3564657040239809453 } } bloom_block { id: 3499726779858777197 } } table_name: 
"TestTable" schema { columns { id: 10 name: "key" type: INT32 is_key: true 
is_nullable: false encoding: AUTO_ENCODING compression: DEFAULT_COMPRESSION 
cfile_block_size: 0 } columns { id: 11 name: "int_val" type: INT32 is_key: 
false is_nullable: false encoding: AUTO_ENCODING compression: 
DEFAULT_COMPRESSION cfile_block_size: 0 } columns { id: 12 name: "string_val" 
type: STRING is_key: false is_nullable: true encoding: AUTO_ENCODING 
compression: DEFAULT_COMPRESSION cfile_block_size: 0 } } schema_version: 0 
tablet_data_state: TABLET_DATA_READY partition { partition_key_start: "" 
partition_key_end: "" } partition_schema { range_schema { columns { id: 10 } } 
}. Retrying in the next heartbeat period. Already tried 13 times.
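
For readers decoding the log above: the commit for op 1.36 reports mutated_stores { mrs_id: 3 }, while the tablet metadata reports last_durable_mrs_id: 2. The sketch below illustrates one plausible reading of the "orphaned but it referred to stores which need replay" condition. It assumes that a MemRowSet counts as flushed only when its id is at or below last_durable_mrs_id. The class and function names are hypothetical and are not taken from the Kudu source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OrphanedCommit:
    """A COMMIT entry with no matching REPLICATE entry in the replayed log.

    Hypothetical model for illustration; not a Kudu type.
    """
    term: int
    index: int
    mutated_mrs_ids: List[int]


def stores_needing_replay(commit: OrphanedCommit, last_durable_mrs_id: int) -> List[int]:
    """Return the MemRowSet ids the commit touched that are not yet durable.

    Assumption: an MRS is durable (flushed) iff its id <= last_durable_mrs_id.
    """
    return [m for m in commit.mutated_mrs_ids if m > last_durable_mrs_id]


# Values copied from the log above: op 1.36 mutated mrs_id 3,
# and the tablet metadata reports last_durable_mrs_id 2.
commit = OrphanedCommit(term=1, index=36, mutated_mrs_ids=[3])
unflushed = stores_needing_replay(commit, last_durable_mrs_id=2)

if unflushed:
    # Under the stated assumption, this is the case the bootstrap rejects:
    # the commit cannot be skipped, but there is no REPLICATE to replay.
    print(f"orphaned commit {commit.term}.{commit.index} "
          f"refers to unflushed stores: {unflushed}")
```

Under that assumption the orphaned commit cannot be safely ignored, which is consistent with the Corruption status in the log. The same metadata also lists a rowset with id: 3, which may be worth checking when diagnosing the failure.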




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
