Re: Patch 7.4.1783

Bram Moolenaar Tue, 26 Apr 2016 12:38:41 -0700

Kazunobu Kuriyama wrote:

> > > 2016-04-25 5:03 GMT+09:00 Bram Moolenaar <[email protected]>:
> > > > I do not see a clue why this would be different on OS/X.
> > >
> > > As the failure message above indicates, it looks like the functions
> > > isalpha(), isalnum() and ispunct() of OS X accept a wider range of 8-bit
> > > characters as class members.  In other words, in contrast to Linux, these
> > > functions don't assume the standard C locale to determine their behavior.
> > >
> > > While Linux's man page talks about the C locale
> > > (http://linux.die.net/man/3/isalpha), OS X's man page doesn't mention it
> > > (https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man3/isalpha.3.html).
> > >
> > > Actually, when I ran the test like this:
> > >
> > >     $ LC_CTYPE=C make test_alot_utf8
> > >
> > > then the test succeeded.
> > >
> > > So, I feel we need to add something like this to test_regexp_utf8.vim
> > > (please see the attached patch for details, because it contains a long
> > > string):
> > >
> > > if has('osx')
> > >   lang ctype C
> > > endif
> > >
> > > But I'd rather like to wait for a day or two for someone with a better
> > > explanation and solution :-)
> >
> > Well, that may fix the test, but the regexp behavior will still differ
> > between systems.  I'd rather avoid that.  Otherwise some plugins might
> > break on OS/X (and lots of people won't have a chance to try it out).
> >
> 
> That has also been my concern, and I've been looking for another solution
> since then :)
> 
> TL;DR.  Hopefully, the attached patch fixes the issue.
> 
> After sending my previous email, I made a small C program to mimic
> test_regexp_utf8.vim and examined the behavior of those ctype functions:
> the differences in the resulting character classes and their locale
> dependency.
> 
> Having done that, I concluded that the test failure indeed came from a
> behavioral difference in the ctype functions, and found that simple set
> operations were enough to solve the problem.
> 
> So I added some extra conditions to some of the `if` statements in regexp.c
> and regexp_nfa.c where isalpha(), islower(), isalnum() and ispunct() were
> called, so that the resulting character classes would match what vim
> expected.
> 
> I tested the patch with several different locales, and confirmed that it
> works on my OS X machine.
> 
> (If you want to examine the test program source code and its raw output
> data, just let me know.  I'll send them to you in another email.)


Thanks.  I think we can leave out the #ifdef and use "< 128" instead of
isascii().
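
To illustrate the "< 128" idea, here is a minimal sketch of what such a
guard could look like.  The helper names are hypothetical and are not the
identifiers used in regexp.c or regexp_nfa.c; the only assumption is that
the character class should contain ASCII members only, whatever LC_CTYPE
is set to.

```c
#include <ctype.h>

/*
 * Hypothetical wrappers, for illustration only.  Each one rejects any
 * value outside 0..127 before asking the C library, so the result no
 * longer depends on how the platform classifies 8-bit characters.
 */
int ascii_isalpha(int c)
{
    return c >= 0 && c < 128 && isalpha(c);
}

int ascii_islower(int c)
{
    return c >= 0 && c < 128 && islower(c);
}

int ascii_isalnum(int c)
{
    return c >= 0 && c < 128 && isalnum(c);
}

int ascii_ispunct(int c)
{
    return c >= 0 && c < 128 && ispunct(c);
}
```

With these, a value such as 0xE9 is rejected by ascii_isalpha() on every
system, while 'a' is still accepted.  The plain range check also avoids
relying on isascii(), which is not part of standard C.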

It appears other programs say that what is matched depends on the
locale.  Although that can be useful, it's a nasty dependency, because
the locale is global to the whole program.  What if you have two files
in different locales?  I don't think there is an isalpha() function
that takes a locale argument.
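
A rough sketch of why the global locale is awkward, using only standard
C calls.  The function names are made up for this example.  Switching
LC_CTYPE with setlocale() changes what isalpha() reports for every
caller in the process, so the same check can give different answers
depending on what ran earlier.

```c
#include <ctype.h>
#include <locale.h>

/*
 * Count the 8-bit values (128..255) that isalpha() accepts under the
 * LC_CTYPE setting currently in effect for the whole process.
 */
int count_high_alpha(void)
{
    int c;
    int n = 0;

    for (c = 128; c < 256; ++c)
        if (isalpha(c))
            ++n;
    return n;
}

/*
 * Switch LC_CTYPE for the whole process, then count again.
 * Returns -1 when the requested locale is not available.
 * Note: the change stays in effect after this function returns.
 */
int count_high_alpha_in(const char *locale_name)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    return count_high_alpha();
}
```

In the "C" locale the count should be zero, since standard C limits
isalpha() there to the 52 ASCII letters.  With another locale name (which
ones exist is system dependent) the count may be larger, and it stays
that way for all later callers until setlocale() is called again.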

-- 
Save the plankton - eat a whale.

 /// Bram Moolenaar -- [email protected] -- http://www.Moolenaar.net   \\\
///        sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\  an exciting new programming language -- http://www.Zimbu.org        ///
 \\\            help me help AIDS victims -- http://ICCF-Holland.org    ///

