Hi
ASan (AddressSanitizer) detects a read of invalid memory
(heap-buffer-overflow) in vim-7.4.683 when running:
$ vim -u NONE \
-c 'set re=1' \
-c 'e crash.txt' \
-c 'call search(getline("."))'
where crash.txt is the attached file, which contains invalid UTF-8 sequences.
Here is the report from ASan:
=================================================================
==5174== ERROR: AddressSanitizer: heap-buffer-overflow on address
0x606200019d00 at pc 0x6ed824 bp 0x7fff850bb440 sp 0x7fff850bb438
READ of size 1 at 0x606200019d00 thread T0
#0 0x6ed823 in regmatch /home/pel/sb/vim/src/regexp.c:4785
#1 0x6ea35f in regtry /home/pel/sb/vim/src/regexp.c:4098
#2 0x6e9f79 in bt_regexec_both /home/pel/sb/vim/src/regexp.c:3987
#3 0x6e960b in bt_regexec_multi /home/pel/sb/vim/src/regexp.c:3798
#4 0x726574 in vim_regexec_multi /home/pel/sb/vim/src/regexp.c:8273
#5 0x759596 in searchit /home/pel/sb/vim/src/search.c:733
#6 0x49f4eb in search_cmn /home/pel/sb/vim/src/eval.c:16330
#7 0x49fd88 in f_search /home/pel/sb/vim/src/eval.c:16480
#8 0x482d01 in call_func /home/pel/sb/vim/src/eval.c:8742
#9 0x481d81 in get_func_tv /home/pel/sb/vim/src/eval.c:8542
#10 0x46f898 in ex_call /home/pel/sb/vim/src/eval.c:3505
#11 0x5018c7 in do_one_cmd /home/pel/sb/vim/src/ex_docmd.c:2940
#12 0x4f868b in do_cmdline /home/pel/sb/vim/src/ex_docmd.c:1133
#13 0x4f7650 in do_cmdline_cmd /home/pel/sb/vim/src/ex_docmd.c:738
#14 0x8a8266 in exe_commands /home/pel/sb/vim/src/main.c:2906
#15 0x8a1c82 in main /home/pel/sb/vim/src/main.c:945
#16 0x7fd837ae0ec4 in __libc_start_main
/build/buildd/eglibc-2.19/csu/libc-start.c:287
#17 0x40e538 in _start ??:?
0x606200019d00 is located 0 bytes to the right of 4096-byte region
[0x606200018d00,0x606200019d00)
allocated by thread T0 here:
#0 0x7fd83a87241a in malloc ??:?
#1 0x614099 in lalloc /home/pel/sb/vim/src/misc2.c:926
#2 0x613e59 in alloc /home/pel/sb/vim/src/misc2.c:821
#3 0x8b3c3c in mf_alloc_bhdr /home/pel/sb/vim/src/memfile.c:952
#4 0x8b2052 in mf_new /home/pel/sb/vim/src/memfile.c:392 (discriminator 1)
#5 0x5c7d49 in ml_new_data /home/pel/sb/vim/src/memline.c:3545
#6 0x5b91c1 in ml_open /home/pel/sb/vim/src/memline.c:408
#7 0x40e78e in open_buffer /home/pel/sb/vim/src/buffer.c:98
#8 0x8a77bc in create_windows /home/pel/sb/vim/src/main.c:2679
#9 0x8a196a in main /home/pel/sb/vim/src/main.c:869
#10 0x7fd837ae0ec4 in __libc_start_main
/build/buildd/eglibc-2.19/csu/libc-start.c:287
Shadow bytes around the buggy address:
0x0c0cbfffb350: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c0cbfffb360: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c0cbfffb370: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c0cbfffb380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c0cbfffb390: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c0cbfffb3a0:[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c0cbfffb3b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c0cbfffb3c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c0cbfffb3d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c0cbfffb3e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c0cbfffb3f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Heap righ redzone: fb
Freed Heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack partial redzone: f4
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
ASan internal: fe
==5174== ABORTING
The attached patch fixes the bug, but I'm not 100% sure that
it's the correct way to deal with invalid UTF-8 in a regexp.
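
One plausible reading of the overflow, sketched below under stated
assumptions: if an invalid byte is decoded as a code point whose encoded
length is larger than the number of bytes actually present, then
"i += utf_char2len(inpc)" can step over the terminating NUL, and the
"reginput[i] != NUL" test never sees it. The helpers sketch_ptr2char() and
sketch_char2len() are simplified, hypothetical stand-ins written for this
illustration; they are not Vim's mb_ptr2char() and utf_char2len().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in: number of bytes needed to encode code point c. */
static int sketch_char2len(int c)
{
    if (c < 0x80)
        return 1;
    if (c < 0x800)
        return 2;
    if (c < 0x10000)
        return 3;
    return 4;
}

/* Hypothetical stand-in decoder (assumption, not Vim's code): ASCII is
 * returned as is, a valid two-byte sequence is decoded, and any other
 * byte is returned as its raw value. */
static int sketch_ptr2char(const unsigned char *p)
{
    if (p[0] < 0x80)
        return p[0];
    if (p[0] >= 0xC2 && p[0] <= 0xDF && (p[1] & 0xC0) == 0x80)
        return ((p[0] & 0x1F) << 6) | (p[1] & 0x3F);
    return p[0];
}

/* Loop shape before the patch: stops only if a NUL sits exactly at index i.
 * cap keeps this sketch itself from reading outside buf. */
static size_t scan_until_nul(const unsigned char *buf, size_t cap)
{
    size_t i = 0;

    assert(buf != NULL);
    while (i < cap && buf[i] != 0)
        i += (size_t)sketch_char2len(sketch_ptr2char(buf + i));
    return i;
}

/* Loop shape after the patch: stops once i reaches the string length. */
static size_t scan_bounded(const unsigned char *buf)
{
    size_t len;
    size_t i = 0;

    assert(buf != NULL);
    len = strlen((const char *)buf);
    while (i < len)
        i += (size_t)sketch_char2len(sketch_ptr2char(buf + i));
    return i;
}
```

With the bytes { 'a', 0xE9, 0x00, 'X', 'Y', 0x00, 0x00 }, the string ends
at index 2. scan_until_nul() jumps from index 1 to index 3 and keeps going
until index 5, while scan_bounded() stops at index 3 without reading
anything after the terminator.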
I found the bug using the fuzzer "american fuzzy lop",
available at http://lcamtuf.coredump.cx/afl/
Regards
Dominique
diff -r 1f42458bf2e7 src/regexp.c
--- a/src/regexp.c Wed Mar 25 20:24:04 2015 +0100
+++ b/src/regexp.c Mon Mar 30 23:38:48 2015 +0200
@@ -4779,10 +4779,12 @@
opndc = mb_ptr2char(opnd);
if (enc_utf8 && utf_iscomposing(opndc))
{
+ int reginput_len = STRLEN(reginput);
+
/* When only a composing char is given match at any
* position where that composing char appears. */
status = RA_NOMATCH;
- for (i = 0; reginput[i] != NUL; i += utf_char2len(inpc))
+ for (i = 0; i < reginput_len; i += utf_char2len(inpc))
{
inpc = mb_ptr2char(reginput + i);
if (!utf_iscomposing(inpc))