Re: tr A-Z a-z in locales other than C

2011-06-07 Thread Andrey Chernov
On Tue, Jun 07, 2011 at 11:17:12PM +0200, Jilles Tjoelker wrote: > In FreeBSD, upper case sorts before lower case, so cases can be > distinguished this way but all letters may require two ranges. In most > other operating systems the cases go together so a single range is > sufficient, but cases ca

Re: tr A-Z a-z in locales other than C

2011-06-06 Thread Andrey Chernov
On Tue, Jun 07, 2011 at 12:41:05AM +0200, Jilles Tjoelker wrote: > > There is a related issue with ranges in regular expressions, glob and > fnmatch (likewise unspecified by POSIX outside the POSIX locale), but > this is less likely to cause problems. > You care about ports, but suggested change

Re: CFT: BSD grep

2008-08-26 Thread Andrey Chernov
On Tue, Aug 26, 2008 at 08:25:01PM +0200, Gabor Kovesdan wrote: > Hello all, > > I've reviewed BSD grep based on your comments and the bug reports I > received. The new version is committed to the ports tree as > textproc/bsdgrep and there is a base patch available: > http://kovesdan.org/patches

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-07-07 Thread Andrey Chernov
On Mon, Jul 07, 2008 at 10:06:31PM +0200, Kris Kennaway wrote: > What regression suites do other implementations have? e.g. the GNU > textutils. They basically have regex tests, but nothing locale specific, since locale ordering is different from platform to platform (until Unicode Collation A

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-24 Thread Andrey Chernov
On Wed, Jun 25, 2008 at 01:04:20AM +0400, Andrey Chernov wrote: > > if ((s = mbstowcs(NULL, f->base, 0)) == -1) > > return (0); > > The same here. Check EILSEQ and return 1 BTW, do you realize that this code malloc()s the _whole_file_ into memory (which

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-24 Thread Andrey Chernov
On Tue, Jun 24, 2008 at 10:32:17PM +0200, Gabor Kovesdan wrote: > ch = fgetwc(f); You must clear errno beforehand and somehow handle a possible EILSEQ after fgetwc(), perhaps by returning ret = 1 (binary); I am not sure. fgetwc() returns WEOF in that case, which is not a true end of f
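
The errno handling described in this message can be sketched as follows. This is only an illustration, not code from bsdgrep: the helper name stream_looks_binary() and its return convention (1 = binary, 0 = clean end of file, -1 = other read error) are assumptions made for the example.

```c
#include <errno.h>
#include <stdio.h>
#include <wchar.h>

/*
 * Hypothetical helper (not from the bsdgrep source).
 * Reads the stream with fgetwc() and tells a real end of file apart
 * from WEOF caused by an invalid multibyte sequence.
 * Returns 1 on EILSEQ (treat as binary), 0 on a genuine end of file,
 * -1 on any other read error.
 */
int
stream_looks_binary(FILE *f)
{
	wint_t ch;

	for (;;) {
		errno = 0;		/* clear before every call */
		ch = fgetwc(f);
		if (ch != WEOF)
			continue;	/* valid character, keep reading */
		if (errno == EILSEQ)
			return (1);	/* WEOF, but not a true end of file */
		if (ferror(f))
			return (-1);	/* some other I/O error */
		return (0);		/* genuine end of file */
	}
}
```

A caller would normally invoke setlocale(LC_ALL, "") first, so that fgetwc() decodes according to the user's locale.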

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-22 Thread Andrey Chernov
On Sun, Jun 22, 2008 at 02:58:17PM +0200, Gabor Kovesdan wrote: > Andrey Chernov wrote: > > On Wed, Jun 18, 2008 at 12:40:24PM +0200, Dag-Erling Smørgrav wrote: > > > >> For grep, I believe it should simply be a matter of calling setlocale(), > >> using wi

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-18 Thread Andrey Chernov
On Wed, Jun 18, 2008 at 11:14:16AM +0200, Konrad Jankowski wrote: > I think the best place for this type of information is currently my SoC > wiki. > http://wiki.freebsd.org/KonradJankowski/Collation > I know currently it has very little information, however. > I can also create another page dedic

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-18 Thread Andrey Chernov
On Wed, Jun 18, 2008 at 12:40:24PM +0200, Dag-Erling Smørgrav wrote: > For grep, I believe it should simply be a matter of calling setlocale(), > using wide strings, and using a multibyte regex engine (for appropriate > values of "simply"). See my previous reply for more details. Using wide strin

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-18 Thread Andrey Chernov
On Wed, Jun 18, 2008 at 11:39:10AM +0200, Dag-Erling Smørgrav wrote: > Does that mean our wcsxfrm() doesn't work? IIUC, it should convert > wide strings to strings that can be compared directly with strcmp()? (directly with wcscmp()) For single-byte locales wcsxfrm() and wcscoll() work, but for
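
The relationship between wcsxfrm() and wcscmp() discussed here can be sketched as below. It is a minimal illustration under assumptions: the helper name compare_by_xfrm() and its ok out-parameter are invented for the example, and it says nothing about which locales a given libc actually supports.

```c
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical helper: compare two wide strings by transforming each
 * with wcsxfrm() and comparing the resulting keys with wcscmp().
 * In a locale where collation works, the sign of the result should
 * match wcscoll(a, b). Sets *ok to 0 if the transform or the
 * allocation fails, 1 otherwise.
 */
int
compare_by_xfrm(const wchar_t *a, const wchar_t *b, int *ok)
{
	size_t na = wcsxfrm(NULL, a, 0);	/* key length, without the NUL */
	size_t nb = wcsxfrm(NULL, b, 0);
	wchar_t *ka, *kb;
	int r = 0;

	*ok = 0;
	if (na == (size_t)-1 || nb == (size_t)-1)
		return (0);
	ka = malloc((na + 1) * sizeof(*ka));
	kb = malloc((nb + 1) * sizeof(*kb));
	if (ka != NULL && kb != NULL) {
		wcsxfrm(ka, a, na + 1);
		wcsxfrm(kb, b, nb + 1);
		r = wcscmp(ka, kb);		/* plain comparison of the keys */
		*ok = 1;
	}
	free(ka);
	free(kb);
	return (r);
}
```

Transforming once and comparing keys pays off when the same strings are compared many times, as in a sort; for a single comparison wcscoll() is the more direct call.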

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-18 Thread Andrey Chernov
On Wed, Jun 18, 2008 at 10:22:31AM +0200, Dag-Erling Smørgrav wrote: > I think part of the problem is that there aren't enough people who truly > understand localization. I think I understand most of it, but I'm > pretty sure I *don't* understand how collation works, or is supposed to > work. Am

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-18 Thread Andrey Chernov
On Tue, Jun 17, 2008 at 12:58:12PM +0200, Gabor Kovesdan wrote: > >> Yes, and once this is done, sort will work out of the box, if it uses > >> strcoll. Already tried on a prototype. > >> > > > > Only GNU sort for multibyte chars. BSD sort is programmed too badly and > > can't be fixed even f

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-17 Thread Andrey Chernov
On Tue, Jun 17, 2008 at 10:54:42AM +0200, Konrad Jankowski wrote: > Diomidis Spinellis wrote: > > Gabor Kovesdan wrote: > >> In case of sort, I understand that it should explicitly handle wide > >> characters due to the different alphabet of the different languages > >> and yes, that seems to be

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-17 Thread Andrey Chernov
On Tue, Jun 17, 2008 at 12:08:38PM +0200, Dag-Erling Smørgrav wrote: > I hadn't noticed... ISTR it was an issue back when jphoward wrote his > BSD-licensed grep. BSD grep has enough problems (though not fatal ones, as in BSD sort) even with the single-byte locales we support initially in our regex (old pre-m

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-17 Thread Andrey Chernov
On Tue, Jun 17, 2008 at 11:46:07AM +0400, Andrey Chernov wrote: > On Tue, Jun 17, 2008 at 09:21:52AM +0200, Gabor Kovesdan wrote: > > Sorry for the possibly silly question, but what do we mean by localization > > here in the case of grep? As far as I see, it works with wide chars,

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-17 Thread Andrey Chernov
On Tue, Jun 17, 2008 at 09:21:52AM +0200, Gabor Kovesdan wrote: > Sorry for the possibly silly question, but what do we mean by localization > here in the case of grep? As far as I see, it works with wide chars, > because the regex library is aware of those. What other aspect needs to > be taken into

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-16 Thread Andrey Chernov
On Tue, Jun 17, 2008 at 04:28:10AM +0400, Andrey Chernov wrote: > BSD grep does not even bother to call setlocale(). I can't say whether it can > be healed simply by adding that call; a test suite run is needed. Quick source inspection reveals that BSD grep operates with single

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-16 Thread Andrey Chernov
On Tue, Jun 17, 2008 at 04:22:25AM +0400, Andrey Chernov wrote: > On Mon, Jun 16, 2008 at 02:36:23PM +0200, Dag-Erling Smørgrav wrote: > > > > Please note that BSD grep is not localized (and can't be per design) > > > > and works only with standard C locale. It may

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-16 Thread Andrey Chernov
On Mon, Jun 16, 2008 at 02:36:23PM +0200, Dag-Erling Smørgrav wrote: > > > Please note that BSD grep is not localized (and can't be per design) > > > and works only with standard C locale. It may not affect ports > > > system processing but surely affects real texts handling. > > That is very tro

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-15 Thread Andrey Chernov
On Sun, Jun 15, 2008 at 09:17:01PM +0200, Kövesdán Gábor wrote: > > Yes, of course, I haven't forgotten about your suggestion. First, I'd > like to process the trivial errors, which come up like this one and make > some tests myself. Then I'll think about this idea and ask portmgr to do > an exp-r