Hey Dan (+CC dev in case anyone else knows about this too)

I'm debugging some flakiness in alter_table-randomized-test, and it seems
like it's failing because the verification scan is returning some
out-of-order rows, despite using SetFaultTolerant(). Granted, fault
tolerance isn't publicly guaranteed to return rows in order, but I was under
the impression that, with range-partitioned tablets, it would always do so.

The scan result I'm seeing has the following sequence within it:

(int32 key=537424064, int32 c945=NULL, int32 c79=234639860, int32 c990=NULL)
>>>> OUT OF ORDER ROW
(int32 key=552025439, int32 c945=NULL, int32 c79=234639860, int32 c990=NULL)
>>>> BACK TO NORMAL ORDER
(int32 key=539314778, int32 c945=1708089980, int32 c79=-878787336, int32 c990=829302644)
(int32 key=541817227, int32 c945=2064952224, int32 c79=2064952224, int32 c990=NULL)
(int32 key=546056206, int32 c945=26527696, int32 c79=26527696, int32 c990=26527696)
(int32 key=601960253, int32 c945=NULL, int32 c79=1088757503, int32 c990=NULL)
(int32 key=677154987, int32 c945=823764490, int32 c79=823764490, int32 c990=823764490)

The prior alter was:
I1004 05:17:48.192611 28113 alter_table-randomized-test.cc:481] Dropping
range partition: [805306356, 872415219) resulting partitions: (134217726,
201326589], (268435452, 335544315], (335544315, 402653178], (402653178,
469762041], (536870904, 603979767], (671088630, 738197493], (738197493,
805306356], (939524082, 1006632945], (1006632945, 1073741808], (1275068397,
1342177260], (1342177260, 1409286123], (1409286123, 1476394986],
(1610612712, 1677721575], (1879048164, 1946157027], (2013265890,
2080374753], (2080374753, 2147483616)
I1004 05:17:48.193013 28113 alter_table-randomized-test.cc:406] Committing
Alterations
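
As a quick sanity check, the sketch below maps the keys from the excerpt onto the partition list printed in that log line. It is only an illustration: find_partition is a hypothetical helper, and it assumes each range is (lower, upper] as printed, although the log mixes bracket styles. None of the keys sits on a boundary, so the choice of convention doesn't change the result.

```python
# Range partitions copied from the "resulting partitions" log line above.
# Assumption: each pair is treated as (lower exclusive, upper inclusive].
PARTITIONS = [
    (134217726, 201326589), (268435452, 335544315), (335544315, 402653178),
    (402653178, 469762041), (536870904, 603979767), (671088630, 738197493),
    (738197493, 805306356), (939524082, 1006632945), (1006632945, 1073741808),
    (1275068397, 1342177260), (1342177260, 1409286123), (1409286123, 1476394986),
    (1610612712, 1677721575), (1879048164, 1946157027), (2013265890, 2080374753),
    (2080374753, 2147483616),
]

# Keys from the scan excerpt, in the order they were returned.
SCANNED_KEYS = [537424064, 552025439, 539314778, 541817227,
                546056206, 601960253, 677154987]


def find_partition(key, partitions):
    """Return the (lower, upper] range containing key, or None if uncovered."""
    for lower, upper in partitions:
        if lower < key <= upper:
            return (lower, upper)
    return None


for key in SCANNED_KEYS:
    print(key, "->", find_partition(key, PARTITIONS))
```

Under that assumption, both the out-of-order key and the keys around it land in the same range, (536870904, 603979767], so the misordering in this excerpt is within a single range partition rather than across a partition boundary.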

The whole log is available here:
https://gist.githubusercontent.com/toddlipcon/466976caf973f496885da9efc2f7246c/raw/f9baf418dad4ad07f33961b131c86e84803815a8/alter_table-randomized-test.txt

Any ideas what might be causing this out-of-order result? Is the test
making some incorrect assumptions or might we have a bug?

-Todd






-- 
Todd Lipcon
Software Engineer, Cloudera
