Bug#823133: subversion FTBFS on Alpha: misaligned strings in the test suite

2018-03-09 Thread Michael Cree
On Fri, Mar 02, 2018 at 11:32:44PM -0500, James McCoy wrote:
> Thanks (again) for the very thorough analysis.  Sorry it's taken so long
> for me to circle back to this.
> 
> If I understand correctly, the easiest fix would be to memcpy()
> tc->source to an appropriately sized apr_uint16_t/apr_int32_t array.
> That could then be used for the svn_utf__utf{16,32}_to_utf8() calls.
> 
> If so, could you try the attached patch?

Yes, with that patch subversion builds successfully on Alpha.

Cheers
Michael.



Bug#823133: subversion FTBFS on Alpha: misaligned strings in the test suite

2018-03-02 Thread James McCoy
On Mon, May 02, 2016 at 08:47:03PM +1200, Michael Cree wrote:
> On Mon, May 02, 2016 at 12:01:18AM -0400, James McCoy wrote:
> > Thanks for the analysis and working on a patch.  I'll note that
> > _Alignas, if it had worked, would likely have been too new a compiler
> > feature to use.
> > 
> > It looks like Debian's ia64 porterbox is still running, so I can try
> > reproducing the problem there (using prctl to force SIGBUS) and then
> > fiddling around with the code.
> 
> Here goes further analysis.  Running utf-test under gdb I set a break
> point at line 807 of utf-test.c and after a number of iterations of
> the enclosing loop I get:
> 
> Breakpoint 1, test_utf_conversions (pool=0x20001018028)
> at 
> /extra/mjc/debian/subversion/subversion-1.9.4/subversion/tests/libsvn_subr/utf-test.c:807
> 807 SVN_ERR_ASSERT(0 == strcmp(result->data, tc->result));
> (gdb) print tc->source
> $20 = 0x120003117 "\330\064\335\036"
> (gdb) print result->data
> $21 = 0x20001018218 "\360\220\204\236"
> (gdb) print tc->result
> $22 = 0x12000373c "\360\235\204\236"
> (gdb) cont
> Continuing.
> svn_tests: E235000: In file 
> '/extra/mjc/debian/subversion/subversion-1.9.4/subversion/tests/libsvn_subr/utf-test.c'
>  line 807: assertion failed (0 == strcmp(result->data, tc->result))
> FAIL:  lt-utf-test 9: test svn_utf__utf{16,32}_to_utf8
> [Inferior 1 (process 24827) exited with code 01]
> 
> 
> 
> On this iteration I believe it has just read line 769 of utf-test.c,
> i.e., this line:
> 
> { UTF_16_BE, "\xD8\x34" "\xDD\x1E" "\0\0", "\xf0\x9d\x84\x9e" }, /* 
> U+01D11E */
> 
> into variable tc, and it passes tc->source (which is at address
> 0x120003117, i.e., misaligned for apr_uint16_t type) to
> svn_utf__utf16_to_utf8() and the result is incorrect and results in
> the failed assert.

Thanks (again) for the very thorough analysis.  Sorry it's taken so long
for me to circle back to this.

If I understand correctly, the easiest fix would be to memcpy()
tc->source to an appropriately sized apr_uint16_t/apr_int32_t array.
That could then be used for the svn_utf__utf{16,32}_to_utf8() calls.

If so, could you try the attached patch?

Cheers,
-- 
James
GPG Key: 4096R/91BF BF4D 6956 BD5D F7B7  2D23 DFE6 91AE 331B A3DB
Index: trunk/subversion/tests/libsvn_subr/utf-test.c
===
--- trunk/subversion/tests/libsvn_subr/utf-test.c	(revision 1825748)
+++ trunk/subversion/tests/libsvn_subr/utf-test.c	(working copy)
@@ -752,8 +752,10 @@
   {
 svn_boolean_t sixteenbit;
 svn_boolean_t bigendian;
+apr_size_t sourcelen;
 const char *source;
 const char *result;
+svn_boolean_t counted;
   } tests[] = {
 
 #define UTF_32_LE FALSE, FALSE
@@ -762,33 +764,37 @@
 #define UTF_16_BE TRUE, TRUE
 
 /* Normal character conversion */
-{ UTF_32_LE, "t\0\0\0" "e\0\0\0" "s\0\0\0" "t\0\0\0" "\0\0\0\0", "test" },
-{ UTF_32_BE, "\0\0\0t" "\0\0\0e" "\0\0\0s" "\0\0\0t" "\0\0\0\0", "test" },
-{ UTF_16_LE, "t\0" "e\0" "s\0" "t\0" "\0\0", "test" },
-{ UTF_16_BE, "\0t" "\0e" "\0s" "\0t" "\0\0", "test" },
+{ UTF_32_LE, 4, "t\0\0\0" "e\0\0\0" "s\0\0\0" "t\0\0\0" "\0\0\0\0", "test", FALSE },
+{ UTF_32_BE, 4, "\0\0\0t" "\0\0\0e" "\0\0\0s" "\0\0\0t" "\0\0\0\0", "test", FALSE },
+{ UTF_16_LE, 4, "t\0" "e\0" "s\0" "t\0" "\0\0", "test", FALSE },
+{ UTF_16_BE, 4, "\0t" "\0e" "\0s" "\0t" "\0\0", "test", FALSE },
 
 /* Valid surrogate pairs */
-{ UTF_16_LE, "\x00\xD8" "\x00\xDC" "\0\0", "\xf0\x90\x80\x80" }, /* U+01 */
-{ UTF_16_LE, "\x34\xD8" "\x1E\xDD" "\0\0", "\xf0\x9d\x84\x9e" }, /* U+01D11E */
-{ UTF_16_LE, "\xFF\xDB" "\xFD\xDF" "\0\0", "\xf4\x8f\xbf\xbd" }, /* U+10FFFD */
+{ UTF_16_LE, 2, "\x00\xD8" "\x00\xDC" "\0\0", "\xf0\x90\x80\x80", FALSE }, /* U+01 */
+{ UTF_16_LE, 2, "\x34\xD8" "\x1E\xDD" "\0\0", "\xf0\x9d\x84\x9e", FALSE }, /* U+01D11E */
+{ UTF_16_LE, 2, "\xFF\xDB" "\xFD\xDF" "\0\0", "\xf4\x8f\xbf\xbd", FALSE }, /* U+10FFFD */
 
-{ UTF_16_BE, "\xD8\x00" "\xDC\x00" "\0\0", "\xf0\x90\x80\x80" }, /* U+01 */
-{ UTF_16_BE, "\xD8\x34" "\xDD\x1E" "\0\0", "\xf0\x9d\x84\x9e" }, /* U+01D11E */
-{ UTF_16_BE, "\xDB\xFF" "\xDF\xFD" "\0\0", "\xf4\x8f\xbf\xbd" }, /* U+10FFFD */
+{ UTF_16_BE, 2, "\xD8\x00" "\xDC\x00" "\0\0", "\xf0\x90\x80\x80", FALSE }, /* U+01 */
+{ UTF_16_BE, 2, "\xD8\x34" "\xDD\x1E" "\0\0", "\xf0\x9d\x84\x9e", FALSE }, /* U+01D11E */
+{ UTF_16_BE, 2, "\xDB\xFF" "\xDF\xFD" "\0\0", "\xf4\x8f\xbf\xbd", FALSE }, /* U+10FFFD */
 
 /* Swapped, single and trailing surrogate pairs */
-{ UTF_16_LE, "*\0" "\x00\xDC" "\x00\xD8" "*\0\0\0", "*\xed\xb0\x80" "\xed\xa0\x80*" },
-{ UTF_16_LE, "*\0" "\x1E\xDD" "*\0\0\0", "*\xed\xb4\x9e*" },
-{ UTF_16_LE, "*\0" "\xFF\xDB" "*\0\0\0", "*\xed\xaf\xbf*" },
-{ UTF_16_LE, "\x1E\xDD" "\0\0", "\xed\xb4\x9e" },
-{ UTF_16_LE, "\xFF\xDB" "\0\0", "\xed\xaf\xbf" },
+{ UTF_16_LE, 4, "*\0" "\x00\xDC" "\x00\xD8" 

Bug#823133: subversion FTBFS on Alpha: misaligned strings in the test suite

2016-05-02 Thread Michael Cree
On Mon, May 02, 2016 at 12:01:18AM -0400, James McCoy wrote:
> Thanks for the analysis and working on a patch.  I'll note that
> _Alignas, if it had worked, would likely have been too new a compiler
> feature to use.
> 
> It looks like Debian's ia64 porterbox is still running, so I can try
> reproducing the problem there (using prctl to force SIGBUS) and then
> fiddling around with the code.

Here goes further analysis.  Running utf-test under gdb I set a break
point at line 807 of utf-test.c and after a number of iterations of
the enclosing loop I get:

Breakpoint 1, test_utf_conversions (pool=0x20001018028)
at 
/extra/mjc/debian/subversion/subversion-1.9.4/subversion/tests/libsvn_subr/utf-test.c:807
807   SVN_ERR_ASSERT(0 == strcmp(result->data, tc->result));
(gdb) print tc->source
$20 = 0x120003117 "\330\064\335\036"
(gdb) print result->data
$21 = 0x20001018218 "\360\220\204\236"
(gdb) print tc->result
$22 = 0x12000373c "\360\235\204\236"
(gdb) cont
Continuing.
svn_tests: E235000: In file 
'/extra/mjc/debian/subversion/subversion-1.9.4/subversion/tests/libsvn_subr/utf-test.c'
 line 807: assertion failed (0 == strcmp(result->data, tc->result))
FAIL:  lt-utf-test 9: test svn_utf__utf{16,32}_to_utf8
[Inferior 1 (process 24827) exited with code 01]



On this iteration I believe it has just read line 769 of utf-test.c,
i.e., this line:

{ UTF_16_BE, "\xD8\x34" "\xDD\x1E" "\0\0", "\xf0\x9d\x84\x9e" }, /* 
U+01D11E */

into variable tc, and it passes tc->source (which is at address
0x120003117, i.e., misaligned for apr_uint16_t type) to
svn_utf__utf16_to_utf8() and the result is incorrect and results in
the failed assert.

On previous iterations of the test loop it happened to pass the test
even with misaligned tc->source strings, and it is instructive to
disassemble svn_utf__utf16_to_utf8() to see why this can happen:

1040svn_error_t *
1041svn_utf__utf16_to_utf8(const svn_string_t **result,
1042   const apr_uint16_t *utf16str,
1043   apr_size_t utf16len,
1044   svn_boolean_t big_endian,
1045   apr_pool_t *result_pool,
1046   apr_pool_t *scratch_pool)
1047{
   0x020ed990 <+0>: ldahgp,12(t12)
   0x020ed994 <+4>: lda gp,10976(gp)
   0x020ed998 <+8>: lda sp,-160(sp)
   0x020ed9a0 <+16>:stq s0,8(sp)
   0x020ed9a4 <+20>:mov a1,s0
   0x020ed9a8 <+24>:stq s1,16(sp)
   0x020ed9ac <+28>:mov a2,s1
   0x020ed9b0 <+32>:stq ra,0(sp)
   0x020ed9b4 <+36>:stq s2,24(sp)
   0x020ed9b8 <+40>:stq s3,32(sp)
   0x020ed9bc <+44>:stq s4,40(sp)
   0x020ed9c0 <+48>:stq s5,48(sp)
   0x020ed9c4 <+52>:stq fp,56(sp)
   0x020ed9c8 <+56>:stq a0,152(sp)
   0x020ed9cc <+60>:stq a3,128(sp)
   0x020ed9d0 <+64>:stq a4,144(sp)

1048  static const apr_uint16_t endiancheck = 0xa55a;
1049  const svn_boolean_t arch_big_endian =
1050(((const char*))[sizeof(endiancheck) - 1] == '\x5a');
1051  const svn_boolean_t swap_order = (!big_endian != !arch_big_endian);
1052
1053  apr_uint16_t lead_surrogate;
1054  apr_size_t length;
1055  apr_size_t offset;
1056  svn_membuf_t ucs4buf;
1057  svn_membuf_t resultbuf;
1058  svn_string_t *res;
1059
1060  if (utf16len == SVN_UTF__UNKNOWN_LENGTH)
   0x020ed99c <+12>:lda t0,1(a2)
   0x020ed9d4 <+68>:beq t0,0x20edbd0 


1061{
1062  const apr_uint16_t *endp = utf16str;
1063  while (*endp++)
   0x020edbd0 <+576>:   mov a1,s1
   0x020edbd4 <+580>:   unop
   0x020edbe0 <+592>:   lda s1,2(s1)
   0x020edbe4 <+596>:   lda t1,-2(s1)
   0x020edbe8 <+600>:   ldq_u   t0,-2(s1)
   0x020edbec <+604>:   extwl   t0,t1,t0
   0x020edbf0 <+608>:   bne t0,0x20edbe0 


1064;


In the loop immediately above (lines 1061--1064 of the source code)
we see the code is walking through the utf16 string (register a1 holds
argument 1, i.e., utf16str, which is first copied to register s1 and
used as the pointer incremented in the loop).  The load instructions
to get the utf16 character are the two instructions:

   0x020edbe8 <+600>:   ldq_u   t0,-2(s1)
   0x020edbec <+604>:   extwl   t0,t1,t0

The ldq_u zeroes the lowest three bits of the address and loads the
quadword (64-bit word) from the resultant address, and the extwl
extracts the 16-bit word identified by the lowest three bits of the
address (which are in register t1) from the quadword in register t0,
however this is only correct if the lowest three bits are not 0x7.

If the source string address has the lowest three bits 

Bug#823133: subversion FTBFS on Alpha: misaligned strings in the test suite

2016-05-01 Thread James McCoy
On Sun, May 01, 2016 at 09:37:50PM +1200, Michael Cree wrote:
> subversion FTBFS on Alpha due to a test suite failure [1]:

Yeah, I had noticed the failure but didn't have access to an alpha to
investigate.

> START: utf-test
> PASS:  lt-utf-test 1: test is_valid/last_valid
> PASS:  lt-utf-test 2: test last_valid/last_valid2
> PASS:  lt-utf-test 3: test svn_utf_cstring_to_utf8_ex2
> PASS:  lt-utf-test 4: test svn_utf_cstring_from_utf8_ex2
> PASS:  lt-utf-test 5: test svn_utf__normcmp
> PASS:  lt-utf-test 6: test svn_utf__glob
> PASS:  lt-utf-test 7: test svn_utf__fuzzy_escape
> PASS:  lt-utf-test 8: test svn_utf__is_normalized
> svn_tests: E235000: In file 
> '/«PKGBUILDDIR»/subversion/tests/libsvn_subr/utf-test.c' line 807: assertion 
> failed (0 == strcmp(result->data, tc->result))
> FAIL:  lt-utf-test 9: test svn_utf__utf{16,32}_to_utf8
> END: utf-test
> ELAPSED: utf-test 0:00:00.603655
> 
> This failure occurs because the test suite allocates char aligned
> strings in function test_utf_conversions() of file
> subversion/tests/libsvn_subr/utf-test.c and passes those strings
> to functions svn_utf__utf16_to_utf8() and svn_utf__utf32_to_utf8()
> which are defined (in file subversion/libsvn_subr/utf.c) to take
> apr_uint32_t * and apr_uint16_t * args respectively.

Thanks for the analysis and working on a patch.  I'll note that
_Alignas, if it had worked, would likely have been too new a compiler
feature to use.

It looks like Debian's ia64 porterbox is still running, so I can try
reproducing the problem there (using prctl to force SIGBUS) and then
fiddling around with the code.

Cheers,
-- 
James
GPG Key: 4096R/331BA3DB 2011-12-05 James McCoy 



Bug#823133: subversion FTBFS on Alpha: misaligned strings in the test suite

2016-05-01 Thread Michael Cree
Source: subversion
Version: 1.9.4-1
Severity: important
Justification: fails to build from source (but built in the past)
Tags: patch
User: debian-al...@lists.debian.org
Usertags: alpha

subversion FTBFS on Alpha due to a test suite failure [1]:

START: utf-test
PASS:  lt-utf-test 1: test is_valid/last_valid
PASS:  lt-utf-test 2: test last_valid/last_valid2
PASS:  lt-utf-test 3: test svn_utf_cstring_to_utf8_ex2
PASS:  lt-utf-test 4: test svn_utf_cstring_from_utf8_ex2
PASS:  lt-utf-test 5: test svn_utf__normcmp
PASS:  lt-utf-test 6: test svn_utf__glob
PASS:  lt-utf-test 7: test svn_utf__fuzzy_escape
PASS:  lt-utf-test 8: test svn_utf__is_normalized
svn_tests: E235000: In file 
'/«PKGBUILDDIR»/subversion/tests/libsvn_subr/utf-test.c' line 807: assertion 
failed (0 == strcmp(result->data, tc->result))
FAIL:  lt-utf-test 9: test svn_utf__utf{16,32}_to_utf8
END: utf-test
ELAPSED: utf-test 0:00:00.603655

This failure occurs because the test suite allocates char aligned
strings in function test_utf_conversions() of file
subversion/tests/libsvn_subr/utf-test.c and passes those strings
to functions svn_utf__utf16_to_utf8() and svn_utf__utf32_to_utf8()
which are defined (in file subversion/libsvn_subr/utf.c) to take
apr_uint32_t * and apr_uint16_t * args respectively.  On
architectures with strict alignment requirements this is undefined
behaviour.  On Alpha, gcc likes to add bugs..., I mean, _optimise_
code, when the args are declared to be aligned by using a particular
CPU load instruction that neither traps nor gives correct results
if the args are not actually aligned.

I attach a patch that aligns the test strings in utf-test.c to be
aligned to apr_uint32_t (i.e. 32-bit alignment).  With that patch
subversion builds successfully to completion on Alpha.

Cheers
Michael.

[1] 
https://buildd.debian.org/status/fetch.php?pkg=subversion=alpha=1.9.4-1=1461928473
Index: subversion-1.9.4/subversion/tests/libsvn_subr/utf-test.c
===
--- subversion-1.9.4.orig/subversion/tests/libsvn_subr/utf-test.c   
2016-05-01 21:05:58.763536825 +1200
+++ subversion-1.9.4/subversion/tests/libsvn_subr/utf-test.c2016-05-01 
21:05:58.755722518 +1200
@@ -745,7 +745,7 @@
   {
 svn_boolean_t sixteenbit;
 svn_boolean_t bigendian;
-const char *source;
+const _Alignas(apr_int32_t) char *source;
 const char *result;
   } tests[] = {