Thanks very much,

For Tcl versions older than 8.6.11, is the failure simply a new test
exposing an old problem? Is that correct?
That would explain why I see no test failures after building v4.99.22
with Tcl 8.6.9: 4.99.23 introduced the tests, so 4.99.22 may well have
the same problem with nothing to reveal it.

To test Tcl 8.6.11, the easiest route for me is to jump to bullseye
(Debian v11), which provides 8.6.11+dfsg-1.

Strangely, I get the same ns_strcoll segfault with that version too.
But if I temporarily remove misc.test, all other tests pass.

# uname -a
Linux ip-172-0-1-190 5.10.0-12-cloud-amd64 #1 SMP Debian 5.10.103-1 (2022-03-07) x86_64 GNU/Linux

# cat /etc/debian_version
11.3

# ls -l /lib/x86_64-linux-gnu/libc.so.6
lrwxrwxrwx 1 root root 12 Mar 17 21:37 /lib/x86_64-linux-gnu/libc.so.6 -> libc-2.31.so

# apt-cache policy tcl8.6
tcl8.6:
  Installed: 8.6.11+dfsg-1

# git clone https://bitbucket.org/naviserver/naviserver.git
Cloning into 'naviserver'...

# cd naviserver
# git checkout tags/naviserver-4.99.23
Note: switching to 'tags/naviserver-4.99.23'.

# ./autogen.sh --with-tcl=/usr/lib/tcl8.6 --enable-rpath --enable-threads --enable-symbols
# make

Compiler warning for reference:

gcc   -Wall -fPIC -g -O2
-fdebug-prefix-map=/build/tcl8.6-qxVr7a/tcl8.6-8.6.11+dfsg=.
-fstack-protector-strong -Wformat -Werror=format-security
-fno-unit-at-a-time -pipe -Wdate-time -D_FORTIFY_SOURCE=2 -DNDEBUG
-DSYSTEM_MALLOC -DTCL_NO_DEPRECATED -std=c99 -I../include
-I"/usr/include/tcl8.6"   -DHAVE_CONFIG_H   -c -o tclenv.o tclenv.c
In file included from /usr/include/string.h:495,
                 from ../include/nsthread.h:378,
                 from ../include/ns.h:46,
                 from nsd.h:38,
                 from tclenv.c:37:
In function ‘strncat’,
    inlined from ‘PutEnv’ at tclenv.c:349:13:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:136:10: warning: ‘__builtin_strncat’ specified bound depends on the length of the source argument [-Wstringop-overflow=]
  136 |   return __builtin___strncat_chk (__dest, __src, __len, __bos (__dest));
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
tclenv.c: In function ‘PutEnv’:
tclenv.c:314:23: note: length computed here
  314 |         valueLength = strlen(value) + 1;
      |                       ^~~~~~~~~~~~~

# make memcheck TESTFLAGS="-verbose start -file misc.test"
---- ns_random-1.1 start
---- ns_fmttime-1.0 start
---- ns_fmttime-1.1 start
---- ns_trim-0.0 start
---- ns_trim-0.1 start
---- ns_trim-0.2 start
---- ns_trim-1.1 start
---- ns_trim-1.2 start
---- ns_trim-1.3 start
---- ns_trim-1.4 start
---- ns_trim-1.5 start
---- ns_trim-2.1 start
---- ns_trim-2.2 start
---- ns_quotehtml start
---- ns_strcoll-1.0.0 start
==37899== Thread 2:
==37899== Invalid read of size 8
==37899==    at 0x49E1361: strcoll_l (strcoll_l.c:260)
==37899==    by 0x48DA9FF: NsTclStrcollObjCmd (tclmisc.c:2802)
==37899==    by 0x4BBC4A1: TclNRRunCallbacks (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==37899==    by 0x4BBD71F: ??? (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==37899==    by 0x4C794D8: Tcl_FSEvalFileEx (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==37899==    by 0x4C818AD: Tcl_MainEx (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==37899==    by 0x4B745AF: NsThreadMain (thread.c:232)
==37899==    by 0x4B75A48: ThreadMain (pthread.c:870)
==37899==    by 0x521BEA6: start_thread (pthread_create.c:477)
==37899==    by 0x4A4DDEE: clone (clone.S:95)
==37899==  Address 0x18 is not stack'd, malloc'd or (recently) free'd
==37899==
==37899==
==37899== Process terminating with default action of signal 11 (SIGSEGV)
==37899==  Access not within mapped region at address 0x18
==37899==    at 0x49E1361: strcoll_l (strcoll_l.c:260)
==37899==    by 0x48DA9FF: NsTclStrcollObjCmd (tclmisc.c:2802)
==37899==    by 0x4BBC4A1: TclNRRunCallbacks (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==37899==    by 0x4BBD71F: ??? (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==37899==    by 0x4C794D8: Tcl_FSEvalFileEx (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==37899==    by 0x4C818AD: Tcl_MainEx (in /usr/lib/x86_64-linux-gnu/libtcl8.6.so)
==37899==    by 0x4B745AF: NsThreadMain (thread.c:232)
==37899==    by 0x4B75A48: ThreadMain (pthread.c:870)
==37899==    by 0x521BEA6: start_thread (pthread_create.c:477)
==37899==    by 0x4A4DDEE: clone (clone.S:95)
==37899==  If you believe this happened as a result of a stack
==37899==  overflow in your program's main thread (unlikely but
==37899==  possible), you can try to increase the size of the
==37899==  main thread stack using the --main-stacksize= flag.
==37899==  The main thread stack size used in this run was 8388608.
==37899==
==37899== HEAP SUMMARY:
==37899==     in use at exit: 12,499,085 bytes in 8,840 blocks
==37899==   total heap usage: 12,059 allocs, 3,219 frees, 29,314,210 bytes allocated
==37899==
==37899== LEAK SUMMARY:
==37899==    definitely lost: 131 bytes in 1 blocks
==37899==    indirectly lost: 0 bytes in 0 blocks
==37899==      possibly lost: 10,466,879 bytes in 2,978 blocks
==37899==    still reachable: 2,032,075 bytes in 5,861 blocks
==37899==         suppressed: 0 bytes in 0 blocks
==37899== Rerun with --leak-check=full to see details of leaked memory
==37899==
==37899== For lists of detected and suppressed errors, rerun with: -s
==37899== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault
make: *** [Makefile:273: memcheck] Error 139
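
One possible reading of the trace, offered as a guess and not a diagnosis:
address 0x18 is what I would expect if strcoll_l() were handed a NULL
locale_t, since on x86_64 glibc the LC_COLLATE slot (index 3, 8 bytes per
pointer) sits at offset 0x18. That could happen if newlocale() fails, for
example because the requested locale is not generated on this image. I have
not checked what tclmisc.c:2802 actually passes, so treat this as an
assumption. A minimal standalone sketch of a guarded call follows; the
helper name safe_strcoll is hypothetical and this is not NaviServer's code:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <string.h>

/*
 * Hypothetical helper (not part of NaviServer): compare two strings using
 * the collation rules of the named locale.
 *
 * Returns 0 on success and stores the strcoll_l() result in *resultPtr.
 * Returns -1 when the locale cannot be created, so that a NULL locale_t
 * is never passed on to strcoll_l().
 */
static int
safe_strcoll(const char *localeName, const char *a, const char *b,
             int *resultPtr)
{
    locale_t loc;

    assert(localeName != NULL && a != NULL && b != NULL && resultPtr != NULL);

    /* newlocale() returns (locale_t)0 when the locale is unavailable. */
    loc = newlocale(LC_COLLATE_MASK, localeName, (locale_t)0);
    if (loc == (locale_t)0) {
        return -1;
    }

    *resultPtr = strcoll_l(a, b, loc);
    freelocale(loc);

    return 0;
}
```

Calling safe_strcoll("C", "abc", "abd", &r) should succeed on any system,
while a locale name that is not installed should yield -1 instead of a
crash. Running "locale -a" on the box would show which locales exist.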



On Wed, 6 Apr 2022 at 16:58, Gustaf Neumann <neum...@wu.ac.at> wrote:

>
> On 06.04.22 16:46, David Osborne wrote:
>
>
> On Wed, 6 Apr 2022 at 14:53, Gustaf Neumann <neum...@wu.ac.at> wrote:
>
>> Hi David,
>>
>> i will setup a VM for testing in your configuration, but first i have to
>> understand, what pt1/pt2 means.
>>
>
>
> *Sorry, that is just an abbreviation for "part1" and "part2" of a 2 part
> email.*
>
> ok, i thought there is a version called "Debian Buster pt1".... but could
> not find insights via googling :)
>
>
> *"tcl8.6" debian supplied package version 8.6.9+dfsg-2*
>
> This seems to be a part of the problem. Tcl 8.6.9 was released in Nov 2018
> and probably has some issues with UTF-8 which were fixed in later releases.
>
> i have just now installed NaviServer on a fresh Debian Buster machine
> using my usual install script [1] (using Tcl 8.6.11) and everything looks
> ok. It is not unlikely that the problem with ns_strcoll is related, since
> one has to translate the "internal" UTF-8 to the external variant before
> calling "strcoll_l()", so, when this step is broken, then there might be
> some invalid memory around.
>
> For you, it would be best to use a newer version of Tcl. There are newer
> Debian packages of Tcl around...
>
>     https://packages.debian.org/search?keywords=tcl
>
> Is this an option for you?
>
> Not sure how NaviServer could address the problem. Deactivating the
> ns_strcoll command in NaviServer when it is compiled with Tcl 8.6.9 or
> older is probably not a good option, since the UTF-to-external conversion
> is now all over the place and the problem will pop up at other places. We
> can consider deactivating the UTF-to-external conversion altogether for
> older Tcl versions (requires several changes, including the PostgreSQL
> driver) ... but then many tests will fail and would have to be deactivated
> as well.
>
> What do you think?
>
> -gn
>
_______________________________________________
naviserver-devel mailing list
naviserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/naviserver-devel
