On Fri, Nov 13, 2009 at 3:13 PM, Brian Wang <brian.wang.0...@gmail.com> wrote: > On Fri, Nov 13, 2009 at 8:12 PM, Gustavo Sverzut Barbieri > <barbi...@profusion.mobi> wrote: >> On Fri, Nov 13, 2009 at 4:49 AM, Brian Wang <brian.wang.0...@gmail.com> >> wrote: >>> [snip] >>>> OK. With the help of gdbserver/gdb, I am able to find where it's looping. >>>> However, I don't know what causes it. Here it goes: >>>> [svn r43601] >>>> evas/src/lib/canvas/evas_object_textblock.c:3152 >>> >>> The line number is off... I've put some of my stuff in the front of >>> the source code... My bad. >>> >>>> evas_object_textblock_text_markup_get() calls >>>> escape = _escaped_char_match(p, &adv); >>>> _escaped_char_match() returns "" and 'adv' is set to 0 >>>> thus the looping. >>>> >>>> Here is the gdb backtrace: >>>> -------------------------------------- >>>> (gdb) bt >>>> #0 _escaped_char_match (s=0x780a1 "追 ", adv=0xbec8c490) >>>> at evas_object_textblock.c:2732 >>>> #1 0x4008338c in evas_object_textblock_text_markup_get (obj=0x75380) >>>> at evas_object_textblock.c:3152 >>>> #2 0x40826be0 in _edje_part_recalc_single (ed=0x0, ep=0x0, desc=0x75ae0, >>>> chosen_desc=0x40300000, rel1_to_x=0x0, rel1_to_y=0x40884c2c, >>>> rel2_to_x=0x3, rel2_to_y=0x751c0, confine_to=0x0, params=0x749ac, >>>> flags=3) >>>> at edje_calc.c:651 >>>> #3 0x408271c8 in _edje_part_recalc (ed=0x10, ep=0x748bc, flags=1) >>>> at edje_calc.c:1721 >>>> #4 0x40828e10 in _edje_recalc_do (ed=0x780a1) at edje_calc.c:224 >>>> #5 0x408473b4 in edje_object_size_min_restricted_calc ( >>>> obj=<value optimized out>, minw=0x0, minh=0x4025e27c, restrictedw=0, >>>> restrictedh=-1) at edje_util.c:2362 >>>> #6 0x40847618 in edje_object_size_min_calc (obj=0x780a1, minw=0xbec8c490, >>>> minh=0x40144860) at edje_util.c:2311 >>>> #7 0x40210a6c in _sizing_eval (obj=0x5d060) at elm_label.c:55 >>>> #8 0x40210e10 in elm_label_label_set (obj=0x5d060, label=0x8c88 " 追 ") >>>> at elm_label.c:121 >>>> #9 0x00008b1c in elm_main (argc=1, argv=0xbec8cd04) at >>>> elm-label-bug-test.c:36 >>>> #10 0x00008bac in main (argc=1, argv=0xbec8cd04) at elm-label-bug-test.c:64 >>>> -------------------------------------- >>>> >>>> I don't know what's special about the string that makes it end up the >>>> condition. >>>> Checking if (strlen(escape)==0 && adv==0) seems to terminate the loop. >>>> But I totally have no clue what's going on here... Fixing it without >>>> knowing what's causing the condition is wrong. >>>> >>>> The string "追 " in UTF-8 is of value: 0xe8 0xbf 0xbd 0x20 >>>> >>>> Is the info above enough to track down the problem? >>> >>> I've tracked it down a bit. >>> On x86, via gdb, >>> ------------------------ >>> (gdb) p escape_strings[sizeof(escape_strings)] >>> $4 = 102 'f' >>> ------------------------ >>> >>> On my arm, via gdb, >>> ------------------------ >>> (gdb) p escape_strings[sizeof(escape_strings)] >>> $12 = 0 '\0' >>> ------------------------ >> >> As you said below, it is invalid read. >> >> >>> At the very last check of >>> while ((*mc) && (*sc)), >>> map_itr is equal to map_end and the difference is the on x86 *mc is >>> non-zero and on my arm, *mc is zero. >>> Therefore, on arm, the while loop is skipped all together and 'match' >>> is still 1. And hence the looping... >>> >>> sizeof(escape_strings) on both platforms are the same: 1551. >>> I would say that it is invalid read for both cases. The x86 case gets >>> lucky and got away with it. >>> I'm surprised that valgrind did not catch this. Or maybe I'm wrong... >>> char a[] = "s"; >>> sizeof(a) == 2 >>> accessing a[2] is out of bounds >> >> this is correct. As why it does not warn, maybe there is another valid >> string allocated right after, thus you end reading something valid and >> valgrind will not warn you. > > I guess so too. I thought valgrind is pretty good at catching > out-of-bounds accesses..
well, there are some flags to make GCC produce code with garbage before and after arrays, I guess it's fortify source or something like that. mudflap should help as well. >>> We may check if (map_itr < map_end) after the first >>> _advance_after_end_of_string() inside _escaped_char_match() >>> or we may decrease map_end by 1 since every escape character is >>> already terminated by a null character. >>> Thus, my proposed patch (sorry for the bad part at the front, which is >>> only for my own usage): >>> ------------------------------------------------ >>> Index: src/lib/canvas/evas_object_textblock.c >>> =================================================================== >>> --- src/lib/canvas/evas_object_textblock.c (revision 43601) >>> +++ src/lib/canvas/evas_object_textblock.c (working copy) >>> @@ -1608,6 +1608,18 @@ >>> _layout_word_start(char *str, int start) >>> { >>> int p, tp, chr = 0; >>> + >>> +#if 1 >>> + // >>> + // coolbrian: break if the word is not within the ASCII range >>> + // @note This is good for breaking up Chinese words, which are >>> made of Chinese characters. >>> + // Chinese characters do not look strange if they are not >>> grouped to form a 'phrase'. >>> + // Also, Chinese 'phrases' are of too many patterns and >>> probably need a dictionary to look up. >>> + // That would be too much. >>> + // >>> + if (((unsigned char)str[start]) >= 0x80) >>> + return start; >>> +#endif >>> >>> p = start; >>> chr = evas_common_font_utf8_get_next((unsigned char *)(str), &p); >>> @@ -2709,7 +2721,7 @@ >>> const char *map_itr, *map_end, *mc, *sc; >>> >>> map_itr = escape_strings; >>> - map_end = map_itr + sizeof(escape_strings); >>> + map_end = map_itr + sizeof(escape_strings) - 1; >>> >>> while (map_itr < map_end) >> >> It's weird: >> >> map_end = map_itr + sizeof(escape_strings) >> >> is fine, as we compare for less than, so this address would never be >> used. The last run should be on escape_strings: >> "⊥\0" "\xe2\x8a\xa5\0" >> >> Thus: >> >> while (map_itr < map_end) >> >> --> yes, we're at map_itr = ⊥\0 >> >> >> { >> const char *escape; >> int match; >> >> escape = map_itr; >> >> --> escape = ⊥\0 >> >> >> _advance_after_end_of_string(&map_itr); >> mc = map_itr; >> >> --> map_itr = \xe2\x8a\xa5\0 >> >> >> sc = s; >> match = 1; >> while ((*mc) && (*sc)) >> { >> >> --> enters >> >> >> if ((unsigned char)*sc < (unsigned char)*mc) return NULL; >> >> --> I have no clue why this > > I don't either... > >> >> >> if (*sc != *mc) match = 0; >> >> --> fails, thus match = 0 >> >> >> mc++; >> sc++; >> } >> if (match) >> >> --> skipped, as match = 0 >> >> { >> *adv = mc - map_itr; >> return escape; >> } >> _advance_after_end_of_string(&map_itr); >> >> --> map_itr was \xe2\x8a\xa5\0, thus it's now one byte after, thus >> map_itr == escape_strings + sizeof(escape_strings) and thus the next >> while (map_itr < map_end) will fail. > > No. map_itr == escape_strings + sizeof(escape_strings) -1 > Thus the next while (map_itr < map_end) will pass. So that's the bug. If map_itr was \xe2\x8a\xa5\0 and _advance_after_end_of_string(&map_itr); looks for the first \0 (the last char) then increments one, we clearly should be correct! :-( > If we simplify the string a bit: > char escape_strings[] = "\xe2\x8a\0"; // sizeof it == 4 > say escape_strings is at address 0x0 > map_itr = 0; > map_end = 0 + 4 = 4; > _advance_after_end_of_string(&map_itr); > map_itr=3; what?! how advance after of string is returning 3??? it should read 3 times, stop since it found 0, then increment one... please debug that one :-) -- Gustavo Sverzut Barbieri http://profusion.mobi embedded systems -------------------------------------- MSN: barbi...@gmail.com Skype: gsbarbieri Mobile: +55 (19) 9225-2202 ------------------------------------------------------------------------------ Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day trial. Simplify your report design, integration and deployment - and focus on what you do best, core application coding. Discover what's new with Crystal Reports now. http://p.sf.net/sfu/bobj-july _______________________________________________ enlightenment-devel mailing list enlightenment-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/enlightenment-devel