Hey Dan (+CC dev in case anyone else knows about this too),

I'm debugging some flakiness in alter_table-randomized-test, and it seems to be failing because the verification scan returns some out-of-order rows, despite using "SetFaultTolerant()". Granted, fault-tolerant scans aren't publicly guaranteed to return rows in order, but I was under the impression that, with range-partitioned tablets, they always would.
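To make the ordering expectation concrete, here is a minimal sketch of the kind of check involved. It is not the test's actual verification code, and `FirstOutOfOrder` is a hypothetical helper name used only for illustration; it assumes the scanned keys have already been collected into a vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Kudu or the test): returns the index of
// the first key that is smaller than the key before it, or -1 if the keys
// are in non-decreasing order.
std::ptrdiff_t FirstOutOfOrder(const std::vector<int32_t>& keys) {
  for (std::size_t i = 1; i < keys.size(); ++i) {
    if (keys[i] < keys[i - 1]) {
      return static_cast<std::ptrdiff_t>(i);
    }
  }
  return -1;
}
```

For the keys 537424064, 552025439, 539314778 this returns 2: the check flags 539314778, the first key smaller than its predecessor.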
The scan result I'm seeing has the following sequence within it:

(int32 key=537424064, int32 c945=NULL, int32 c79=234639860, int32 c990=NULL)
>>>> OUT OF ORDER ROW
(int32 key=552025439, int32 c945=NULL, int32 c79=234639860, int32 c990=NULL)
>>>> BACK TO NORMAL ORDER
(int32 key=539314778, int32 c945=1708089980, int32 c79=-878787336, int32 c990=829302644)
(int32 key=541817227, int32 c945=2064952224, int32 c79=2064952224, int32 c990=NULL)
(int32 key=546056206, int32 c945=26527696, int32 c79=26527696, int32 c990=26527696)
(int32 key=601960253, int32 c945=NULL, int32 c79=1088757503, int32 c990=NULL)
(int32 key=677154987, int32 c945=823764490, int32 c79=823764490, int32 c990=823764490)

The prior alter was:

I1004 05:17:48.192611 28113 alter_table-randomized-test.cc:481] Dropping range partition: [805306356, 872415219) resulting partitions: (134217726, 201326589], (268435452, 335544315], (335544315, 402653178], (402653178, 469762041], (536870904, 603979767], (671088630, 738197493], (738197493, 805306356], (939524082, 1006632945], (1006632945, 1073741808], (1275068397, 1342177260], (1342177260, 1409286123], (1409286123, 1476394986], (1610612712, 1677721575], (1879048164, 1946157027], (2013265890, 2080374753], (2080374753, 2147483616)
I1004 05:17:48.193013 28113 alter_table-randomized-test.cc:406] Committing Alterations

The whole log is available here:
https://gist.githubusercontent.com/toddlipcon/466976caf973f496885da9efc2f7246c/raw/f9baf418dad4ad07f33961b131c86e84803815a8/alter_table-randomized-test.txt

Any ideas what might be causing this out-of-order result? Is the test making some incorrect assumptions or might we have a bug?

-Todd

--
Todd Lipcon
Software Engineer, Cloudera