Bug#823133: subversion FTBFS on Alpha: misaligned strings in the test suite
On Fri, Mar 02, 2018 at 11:32:44PM -0500, James McCoy wrote: > Thanks (again) for the very thorough analysis. Sorry it's taken so long > for me to circle back to this. > > If I understand correctly, the easiest fix would be to memcpy() > tc->source to an appropriately sized apr_uint16_t/apr_int32_t array. > That could then be used for the svn_utf__utf{16,32}_to_utf8() calls. > > If so, could you try the attached patch? Yes, with that patch subversion builds successfully on Alpha. Cheers Michael.
Bug#823133: subversion FTBFS on Alpha: misaligned strings in the test suite
On Mon, May 02, 2016 at 08:47:03PM +1200, Michael Cree wrote: > On Mon, May 02, 2016 at 12:01:18AM -0400, James McCoy wrote: > > Thanks for the analysis and working on a patch. I'll note that > > _Alignas, if it had worked, would likely have been too new a compiler > > feature to use. > > > > It looks like Debian's ia64 porterbox is still running, so I can try > > reproducing the problem there (using prctl to force SIGBUS) and then > > fiddling around with the code. > > Here goes further analysis. Running utf-test under gdb I set a break > point at line 807 of utf-test.c and after a number of iterations of > the enclosing loop I get: > > Breakpoint 1, test_utf_conversions (pool=0x20001018028) > at > /extra/mjc/debian/subversion/subversion-1.9.4/subversion/tests/libsvn_subr/utf-test.c:807 > 807 SVN_ERR_ASSERT(0 == strcmp(result->data, tc->result)); > (gdb) print tc->source > $20 = 0x120003117 "\330\064\335\036" > (gdb) print result->data > $21 = 0x20001018218 "\360\220\204\236" > (gdb) print tc->result > $22 = 0x12000373c "\360\235\204\236" > (gdb) cont > Continuing. > svn_tests: E235000: In file > '/extra/mjc/debian/subversion/subversion-1.9.4/subversion/tests/libsvn_subr/utf-test.c' > line 807: assertion failed (0 == strcmp(result->data, tc->result)) > FAIL: lt-utf-test 9: test svn_utf__utf{16,32}_to_utf8 > [Inferior 1 (process 24827) exited with code 01] > > > > On this iteration I believe it has just read line 769 of utf-test.c, > i.e., this line: > > { UTF_16_BE, "\xD8\x34" "\xDD\x1E" "\0\0", "\xf0\x9d\x84\x9e" }, /* > U+01D11E */ > > into variable tc, and it passes tc->source (which is at address > 0x120003117, i.e., misaligned for apr_uint16_t type) to > svn_utf__utf16_to_utf8() and the result is incorrect and results in > the failed assert. Thanks (again) for the very thorough analysis. Sorry it's taken so long for me to circle back to this. If I understand correctly, the easiest fix would be to memcpy() tc->source to an appropriately sized apr_uint16_t/apr_int32_t array. That could then be used for the svn_utf__utf{16,32}_to_utf8() calls. If so, could you try the attached patch? Cheers, -- James GPG Key: 4096R/91BF BF4D 6956 BD5D F7B7 2D23 DFE6 91AE 331B A3DB Index: trunk/subversion/tests/libsvn_subr/utf-test.c === --- trunk/subversion/tests/libsvn_subr/utf-test.c (revision 1825748) +++ trunk/subversion/tests/libsvn_subr/utf-test.c (working copy) @@ -752,8 +752,10 @@ { svn_boolean_t sixteenbit; svn_boolean_t bigendian; +apr_size_t sourcelen; const char *source; const char *result; +svn_boolean_t counted; } tests[] = { #define UTF_32_LE FALSE, FALSE @@ -762,33 +764,37 @@ #define UTF_16_BE TRUE, TRUE /* Normal character conversion */ -{ UTF_32_LE, "t\0\0\0" "e\0\0\0" "s\0\0\0" "t\0\0\0" "\0\0\0\0", "test" }, -{ UTF_32_BE, "\0\0\0t" "\0\0\0e" "\0\0\0s" "\0\0\0t" "\0\0\0\0", "test" }, -{ UTF_16_LE, "t\0" "e\0" "s\0" "t\0" "\0\0", "test" }, -{ UTF_16_BE, "\0t" "\0e" "\0s" "\0t" "\0\0", "test" }, +{ UTF_32_LE, 4, "t\0\0\0" "e\0\0\0" "s\0\0\0" "t\0\0\0" "\0\0\0\0", "test", FALSE }, +{ UTF_32_BE, 4, "\0\0\0t" "\0\0\0e" "\0\0\0s" "\0\0\0t" "\0\0\0\0", "test", FALSE }, +{ UTF_16_LE, 4, "t\0" "e\0" "s\0" "t\0" "\0\0", "test", FALSE }, +{ UTF_16_BE, 4, "\0t" "\0e" "\0s" "\0t" "\0\0", "test", FALSE }, /* Valid surrogate pairs */ -{ UTF_16_LE, "\x00\xD8" "\x00\xDC" "\0\0", "\xf0\x90\x80\x80" }, /* U+01 */ -{ UTF_16_LE, "\x34\xD8" "\x1E\xDD" "\0\0", "\xf0\x9d\x84\x9e" }, /* U+01D11E */ -{ UTF_16_LE, "\xFF\xDB" "\xFD\xDF" "\0\0", "\xf4\x8f\xbf\xbd" }, /* U+10FFFD */ +{ UTF_16_LE, 2, "\x00\xD8" "\x00\xDC" "\0\0", "\xf0\x90\x80\x80", FALSE }, /* U+01 */ +{ UTF_16_LE, 2, "\x34\xD8" "\x1E\xDD" "\0\0", "\xf0\x9d\x84\x9e", FALSE }, /* U+01D11E */ +{ UTF_16_LE, 2, "\xFF\xDB" "\xFD\xDF" "\0\0", "\xf4\x8f\xbf\xbd", FALSE }, /* U+10FFFD */ -{ UTF_16_BE, "\xD8\x00" "\xDC\x00" "\0\0", "\xf0\x90\x80\x80" }, /* U+01 */ -{ UTF_16_BE, "\xD8\x34" "\xDD\x1E" "\0\0", "\xf0\x9d\x84\x9e" }, /* U+01D11E */ -{ UTF_16_BE, "\xDB\xFF" "\xDF\xFD" "\0\0", "\xf4\x8f\xbf\xbd" }, /* U+10FFFD */ +{ UTF_16_BE, 2, "\xD8\x00" "\xDC\x00" "\0\0", "\xf0\x90\x80\x80", FALSE }, /* U+01 */ +{ UTF_16_BE, 2, "\xD8\x34" "\xDD\x1E" "\0\0", "\xf0\x9d\x84\x9e", FALSE }, /* U+01D11E */ +{ UTF_16_BE, 2, "\xDB\xFF" "\xDF\xFD" "\0\0", "\xf4\x8f\xbf\xbd", FALSE }, /* U+10FFFD */ /* Swapped, single and trailing surrogate pairs */ -{ UTF_16_LE, "*\0" "\x00\xDC" "\x00\xD8" "*\0\0\0", "*\xed\xb0\x80" "\xed\xa0\x80*" }, -{ UTF_16_LE, "*\0" "\x1E\xDD" "*\0\0\0", "*\xed\xb4\x9e*" }, -{ UTF_16_LE, "*\0" "\xFF\xDB" "*\0\0\0", "*\xed\xaf\xbf*" }, -{ UTF_16_LE, "\x1E\xDD" "\0\0", "\xed\xb4\x9e" }, -{ UTF_16_LE, "\xFF\xDB" "\0\0", "\xed\xaf\xbf" }, +{ UTF_16_LE, 4, "*\0" "\x00\xDC" "\x00\xD8"
Bug#823133: subversion FTBFS on Alpha: misaligned strings in the test suite
On Mon, May 02, 2016 at 12:01:18AM -0400, James McCoy wrote: > Thanks for the analysis and working on a patch. I'll note that > _Alignas, if it had worked, would likely have been too new a compiler > feature to use. > > It looks like Debian's ia64 porterbox is still running, so I can try > reproducing the problem there (using prctl to force SIGBUS) and then > fiddling around with the code. Here goes further analysis. Running utf-test under gdb I set a break point at line 807 of utf-test.c and after a number of iterations of the enclosing loop I get: Breakpoint 1, test_utf_conversions (pool=0x20001018028) at /extra/mjc/debian/subversion/subversion-1.9.4/subversion/tests/libsvn_subr/utf-test.c:807 807 SVN_ERR_ASSERT(0 == strcmp(result->data, tc->result)); (gdb) print tc->source $20 = 0x120003117 "\330\064\335\036" (gdb) print result->data $21 = 0x20001018218 "\360\220\204\236" (gdb) print tc->result $22 = 0x12000373c "\360\235\204\236" (gdb) cont Continuing. svn_tests: E235000: In file '/extra/mjc/debian/subversion/subversion-1.9.4/subversion/tests/libsvn_subr/utf-test.c' line 807: assertion failed (0 == strcmp(result->data, tc->result)) FAIL: lt-utf-test 9: test svn_utf__utf{16,32}_to_utf8 [Inferior 1 (process 24827) exited with code 01] On this iteration I believe it has just read line 769 of utf-test.c, i.e., this line: { UTF_16_BE, "\xD8\x34" "\xDD\x1E" "\0\0", "\xf0\x9d\x84\x9e" }, /* U+01D11E */ into variable tc, and it passes tc->source (which is at address 0x120003117, i.e., misaligned for apr_uint16_t type) to svn_utf__utf16_to_utf8() and the result is incorrect and results in the failed assert. On previous iterations of the test loop it happened to pass the test even with misaligned tc->source strings, and it is instructive to disassemble svn_utf__utf16_to_utf8() to see why this can happen: 1040svn_error_t * 1041svn_utf__utf16_to_utf8(const svn_string_t **result, 1042 const apr_uint16_t *utf16str, 1043 apr_size_t utf16len, 1044 svn_boolean_t big_endian, 1045 apr_pool_t *result_pool, 1046 apr_pool_t *scratch_pool) 1047{ 0x020ed990 <+0>: ldahgp,12(t12) 0x020ed994 <+4>: lda gp,10976(gp) 0x020ed998 <+8>: lda sp,-160(sp) 0x020ed9a0 <+16>:stq s0,8(sp) 0x020ed9a4 <+20>:mov a1,s0 0x020ed9a8 <+24>:stq s1,16(sp) 0x020ed9ac <+28>:mov a2,s1 0x020ed9b0 <+32>:stq ra,0(sp) 0x020ed9b4 <+36>:stq s2,24(sp) 0x020ed9b8 <+40>:stq s3,32(sp) 0x020ed9bc <+44>:stq s4,40(sp) 0x020ed9c0 <+48>:stq s5,48(sp) 0x020ed9c4 <+52>:stq fp,56(sp) 0x020ed9c8 <+56>:stq a0,152(sp) 0x020ed9cc <+60>:stq a3,128(sp) 0x020ed9d0 <+64>:stq a4,144(sp) 1048 static const apr_uint16_t endiancheck = 0xa55a; 1049 const svn_boolean_t arch_big_endian = 1050(((const char*))[sizeof(endiancheck) - 1] == '\x5a'); 1051 const svn_boolean_t swap_order = (!big_endian != !arch_big_endian); 1052 1053 apr_uint16_t lead_surrogate; 1054 apr_size_t length; 1055 apr_size_t offset; 1056 svn_membuf_t ucs4buf; 1057 svn_membuf_t resultbuf; 1058 svn_string_t *res; 1059 1060 if (utf16len == SVN_UTF__UNKNOWN_LENGTH) 0x020ed99c <+12>:lda t0,1(a2) 0x020ed9d4 <+68>:beq t0,0x20edbd01061{ 1062 const apr_uint16_t *endp = utf16str; 1063 while (*endp++) 0x020edbd0 <+576>: mov a1,s1 0x020edbd4 <+580>: unop 0x020edbe0 <+592>: lda s1,2(s1) 0x020edbe4 <+596>: lda t1,-2(s1) 0x020edbe8 <+600>: ldq_u t0,-2(s1) 0x020edbec <+604>: extwl t0,t1,t0 0x020edbf0 <+608>: bne t0,0x20edbe0 1064; In the loop immediately above (lines 1061--1064 of the source code) we see the code is walking through the utf16 string (register a1 holds argument 1, i.e., utf16str, which is first copied to register s1 and used as the pointer incremented in the loop). The load instructions to get the utf16 character are the two instructions: 0x020edbe8 <+600>: ldq_u t0,-2(s1) 0x020edbec <+604>: extwl t0,t1,t0 The ldq_u zeroes the lowest three bits of the address and loads the quadword (64-bit word) from the resultant address, and the extwl extracts the 16-bit word identified by the lowest three bits of the address (which are in register t1) from the quadword in register t0, however this is only correct if the lowest three bits are not 0x7. If the source string address has the lowest three bits
Bug#823133: subversion FTBFS on Alpha: misaligned strings in the test suite
On Sun, May 01, 2016 at 09:37:50PM +1200, Michael Cree wrote: > subversion FTBFS on Alpha due to a test suite failure [1]: Yeah, I had noticed the failure but didn't have access to an alpha to investigate. > START: utf-test > PASS: lt-utf-test 1: test is_valid/last_valid > PASS: lt-utf-test 2: test last_valid/last_valid2 > PASS: lt-utf-test 3: test svn_utf_cstring_to_utf8_ex2 > PASS: lt-utf-test 4: test svn_utf_cstring_from_utf8_ex2 > PASS: lt-utf-test 5: test svn_utf__normcmp > PASS: lt-utf-test 6: test svn_utf__glob > PASS: lt-utf-test 7: test svn_utf__fuzzy_escape > PASS: lt-utf-test 8: test svn_utf__is_normalized > svn_tests: E235000: In file > '/«PKGBUILDDIR»/subversion/tests/libsvn_subr/utf-test.c' line 807: assertion > failed (0 == strcmp(result->data, tc->result)) > FAIL: lt-utf-test 9: test svn_utf__utf{16,32}_to_utf8 > END: utf-test > ELAPSED: utf-test 0:00:00.603655 > > This failure occurs because the test suite allocates char aligned > strings in function test_utf_conversions() of file > subversion/tests/libsvn_subr/utf-test.c and passes those strings > to functions svn_utf__utf16_to_utf8() and svn_utf__utf32_to_utf8() > which are defined (in file subversion/libsvn_subr/utf.c) to take > apr_uint32_t * and apr_uint16_t * args respectively. Thanks for the analysis and working on a patch. I'll note that _Alignas, if it had worked, would likely have been too new a compiler feature to use. It looks like Debian's ia64 porterbox is still running, so I can try reproducing the problem there (using prctl to force SIGBUS) and then fiddling around with the code. Cheers, -- James GPG Key: 4096R/331BA3DB 2011-12-05 James McCoy
Bug#823133: subversion FTBFS on Alpha: misaligned strings in the test suite
Source: subversion Version: 1.9.4-1 Severity: important Justification: fails to build from source (but built in the past) Tags: patch User: debian-al...@lists.debian.org Usertags: alpha subversion FTBFS on Alpha due to a test suite failure [1]: START: utf-test PASS: lt-utf-test 1: test is_valid/last_valid PASS: lt-utf-test 2: test last_valid/last_valid2 PASS: lt-utf-test 3: test svn_utf_cstring_to_utf8_ex2 PASS: lt-utf-test 4: test svn_utf_cstring_from_utf8_ex2 PASS: lt-utf-test 5: test svn_utf__normcmp PASS: lt-utf-test 6: test svn_utf__glob PASS: lt-utf-test 7: test svn_utf__fuzzy_escape PASS: lt-utf-test 8: test svn_utf__is_normalized svn_tests: E235000: In file '/«PKGBUILDDIR»/subversion/tests/libsvn_subr/utf-test.c' line 807: assertion failed (0 == strcmp(result->data, tc->result)) FAIL: lt-utf-test 9: test svn_utf__utf{16,32}_to_utf8 END: utf-test ELAPSED: utf-test 0:00:00.603655 This failure occurs because the test suite allocates char aligned strings in function test_utf_conversions() of file subversion/tests/libsvn_subr/utf-test.c and passes those strings to functions svn_utf__utf16_to_utf8() and svn_utf__utf32_to_utf8() which are defined (in file subversion/libsvn_subr/utf.c) to take apr_uint32_t * and apr_uint16_t * args respectively. On architectures with strict alignment requirements this is undefined behaviour. On Alpha, gcc likes to add bugs..., I mean, _optimise_ code, when the args are declared to be aligned by using a particular CPU load instruction that neither traps nor gives correct results if the args are not actually aligned. I attach a patch that aligns the test strings in utf-test.c to be aligned to apr_uint32_t (i.e. 32-bit alignment). With that patch subversion builds successfully to completion on Alpha. Cheers Michael. [1] https://buildd.debian.org/status/fetch.php?pkg=subversion=alpha=1.9.4-1=1461928473 Index: subversion-1.9.4/subversion/tests/libsvn_subr/utf-test.c === --- subversion-1.9.4.orig/subversion/tests/libsvn_subr/utf-test.c 2016-05-01 21:05:58.763536825 +1200 +++ subversion-1.9.4/subversion/tests/libsvn_subr/utf-test.c2016-05-01 21:05:58.755722518 +1200 @@ -745,7 +745,7 @@ { svn_boolean_t sixteenbit; svn_boolean_t bigendian; -const char *source; +const _Alignas(apr_int32_t) char *source; const char *result; } tests[] = {