Hi Romick,

FWIW, iswalnum() returns zero for those special characters, as they are
Unicode noncharacters (0xFDD7 lies in U+FDD0..U+FDEF) or private use code
points. My colleagues have tested this across multiple platforms.
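
For anyone who wants to repeat the check, here is a minimal sketch of the kind of test involved. The helper name is_alnum_in_locale() is made up for illustration, and this is not necessarily the exact program my colleagues ran; it only uses setlocale() and iswalnum() from the standard C library.

```c
#include <locale.h>
#include <stddef.h>
#include <wctype.h>

/* Hypothetical helper: returns 1 if iswalnum() classifies c as
 * alphanumeric under the named locale, 0 if it does not, and -1 if the
 * locale could not be selected.
 * Note: this changes the process-wide LC_CTYPE setting. */
int is_alnum_in_locale(const char *locale_name, wint_t c) {
    if (setlocale(LC_CTYPE, locale_name) == NULL) {
        return -1;
    }
    return iswalnum(c) != 0;
}
```

Calling is_alnum_in_locale("en_US.UTF-8", 0xFDD7) on each platform and comparing the results is one way to see whether the behaviour differs. The locale name "en_US.UTF-8" is an assumption and may not be installed everywhere.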

So, I don't think that's the problem.

Thanks for the suggestion though!

David Adam
[email protected]

On Sat, 28 May 2016, Romick wrote:
> On Thu, May 26, 2016 at 06:58:12AM +0800, David Adam wrote:
> > Hello,
> > 
> > I help work on fish-shell, an alternative command-line shell. Recently 
> > we've had some reports of strange behaviour on newer versions of DragonFly 
> > BSD, which as far as I can tell come down to unusual behaviours of wide 
> > character functions in the UTF-8 locale.
> 
> I looked at the parser in fish-shell: you use special characters directly
> in the input stream to mark different things, such as BRACKET_BEGIN,
> BRACKET_END, BRACKET_SEP, INTERNAL_SEPARATOR and so on.
> 
> This is fine until you meet a locale in which those characters are
> full members of the alphabet.
> You see, the Unicode range is 0x0 to 0x10FFFF, and the character
> INTERNAL_SEPARATOR has the code 0xFDD7.
> 
> In DragonFly BSD, the iswalnum() function checks all locales
> simultaneously, so you have three choices:
> 1) use your own iswalnum():
> ===
> diff --git a/src/common.h b/src/common.h
> index e59dfc0..e8c01c3 100644
> --- a/src/common.h
> +++ b/src/common.h
> @@ -769,4 +769,8 @@ __attribute__((noinline)) void debug_thread_error(void);
>  /// specified base, return -1.
>  long convert_digit(wchar_t d, int base);
>  
> +inline int iswalnum(wchar_t chr) {
> +     return((chr >= L'a' && chr <= L'z') || (chr >= L'A' && chr <= L'Z') || iswdigit(chr));
> +}
> +
>  #endif
> ===
> 
> 2) use bigger values for your special characters (I have not tested this).
> 
> 3) something else:)
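
For reference, option 1 above can also be sketched as a standalone, locale-independent predicate. The name ascii_iswalnum() is hypothetical; a distinct name is used because defining a function called iswalnum() next to the declaration in <wctype.h> can conflict with the library's own. The sketch assumes the wide execution character set is ASCII-compatible, so that the letter ranges are contiguous.

```c
#include <wchar.h>

/* Hypothetical ASCII-only replacement for iswalnum(): accepts only
 * a-z, A-Z and 0-9, regardless of the current locale.
 * Assumes an ASCII-compatible wide character set. */
int ascii_iswalnum(wchar_t c) {
    return (c >= L'a' && c <= L'z') ||
           (c >= L'A' && c <= L'Z') ||
           (c >= L'0' && c <= L'9');
}
```

With this definition, 0xFDD7 (INTERNAL_SEPARATOR) is never treated as alphanumeric, but neither is any non-ASCII letter, which may or may not be acceptable for a given caller.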
