Hello.

On 10 April 2012 at 22:29, Martin Sucha <such...@uniba.sk> wrote:
> Hi Tobias,
>
> This seems to be useful functionality, but I have a few comments:
>
> On Tuesday 10 April 2012 19:09:01 Tobias Börtitz wrote:
>> Over the above I added the letter_check(wchar_t) function to the libc. It
>> checks whether the assigned character contains a printable character
>> (a-zA-Z) or not.
> First of all, HelenOS uses unicode strings so I think checking for (a-zA-Z) is
> not correct. A word may start with different characters as well, e.g.
> "čučoriedková" is a word that would not match the check.
>
> Secondly, I'm not familiar with unicode, but I think there is no definition of
> what a unicode "printable character" is - even some characters that are not
> printable by themselves may be printed when used in a surrogate pair (correct
> me if I'm wrong). What about checking whether a character is not a word
> delimiter instead?
>
> As for what is a word delimiter, I'd suggest using more characters as word
> delimiters (e.g. punctuation characters such as comma, em dash, etc. or
> various types of spaces that are present in unicode).

I agree. If you replaced the last switch and the call to letter_check() in
pt_is_word_beginning with something like:

	if (is_word_delimiter(end_str[0])) {
		ret = false;
		goto exit;
	}
	ret = is_word_delimiter(*prev_char);

where is_word_delimiter() would check for space, tab, comma etc., your patch
would correctly jump over words. At least, it would jump correctly over the
examples in /textdemo, regardless of the language.

One more thing, Tobias: please do put a space between "if" and the condition.

Cheers,
- Vojta

>
> Regards,
> Martin Sucha
>
>
> _______________________________________________
> HelenOS-devel mailing list
> HelenOS-devel@lists.modry.cz
> http://lists.modry.cz/cgi-bin/listinfo/helenos-devel
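
For illustration, a minimal sketch of what the suggested is_word_delimiter()
could look like. The exact set of delimiters, the is_word_beginning() wrapper
and its parameter names are assumptions made for this example; none of it is
taken from the HelenOS tree or from Tobias's patch.

```c
#include <stdbool.h>
#include <wchar.h>

/*
 * Hypothetical helper: returns true if c separates words.
 * The delimiter set is an assumption and is deliberately small.
 */
bool is_word_delimiter(wchar_t c)
{
	switch (c) {
	case L' ':
	case L'\t':
	case L'\n':
	case L'\r':
	case L',':
	case L'.':
	case L';':
	case L':':
	case L'!':
	case L'?':
	case L'(':
	case L')':
	case L'"':
	case 0x00A0: /* no-break space */
	case 0x2013: /* en dash */
	case 0x2014: /* em dash */
	case 0x3000: /* ideographic space */
		return true;
	default:
		break;
	}

	/* U+2000..U+200A: the fixed-width Unicode spaces */
	return c >= 0x2000 && c <= 0x200A;
}

/*
 * Hypothetical wrapper mirroring the snippet above: a word begins where
 * a non-delimiter character follows a delimiter.
 */
bool is_word_beginning(wchar_t prev, wchar_t cur)
{
	if (is_word_delimiter(cur))
		return false;

	return is_word_delimiter(prev);
}
```

Testing for delimiters instead of letters means that a word such as
"čučoriedková" is handled without any knowledge of which characters count as
letters; anything not listed as a delimiter is treated as part of a word.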