Re: strcase?

2011-05-16 Thread Bruno Haible
Simon Josefsson wrote:
> What seems to be missing here is a clarification of what it means for a
> platform to be "modern" -- I recall we discussed this before, but I
> cannot find that anything ended up in the manual.
> 
> I think we need to name some platforms that we consider "modern" and
> some platforms that we consider obsolete for the "obsolete" keyword to
> be well defined and understandable by maintainers.

I've now reworked my proposal from January and committed a doc section
"Target platforms".

It gives no definite answer, because there are various degrees of support.
But at some point there's a consensus that platform XY is obsolete.

Bruno
-- 
In memoriam The victims of the Zaklopača massacre 




Re: strcase?

2011-05-16 Thread Bruno Haible
Eric Blake wrote:
> > A few modules use the status "obsolete" in the meaning of "deprecated",
> > however. Maybe it will help to avoid the confusion if we fix these. Here
> > is a proposed patch.
> 
> Looks reasonable to me.

Thanks for the review. Applied and pushed.

> "obsolete" means you need it only if targeting
> otherwise-obsolete platforms with no replacement needed on modern
> platforms, "deprecated" means you should be prepared for the module to
> disappear by switching to its modern counterpart module.

This is a good explanation; you can put it into the documentation if the
question comes up again.

Bruno
-- 
In memoriam The victims of the Zaklopača massacre 




Re: strcase?

2011-05-09 Thread Simon Josefsson
Eric Blake  writes:

> On 05/09/2011 06:54 AM, Bruno Haible wrote:
>> Hi Simon,
>> 
>>> I noticed the 'strcase' module is marked as obsolete.  ... What
>>> should be used instead?
>> 
>> You are confusing "obsolete" with "deprecated". As explained in [1][2], the
>> meaning of "obsolete" in gnulib is that you don't need it, *not* that it will
>> go away.

Thanks -- I forgot this aspect.  I don't have a strong opinion on your
patch.

>> [1] http://www.gnu.org/software/gnulib/manual/html_node/Obsolete-modules.html
>> [2] http://lists.gnu.org/archive/html/bug-gnulib/2011-01/msg00541.html
>> 
>> A few modules use the status "obsolete" in the meaning of "deprecated",
>> however. Maybe it will help to avoid the confusion if we fix these. Here
>> is a proposed patch.
>
> Looks reasonable to me.  "obsolete" means you need it only if targeting
> otherwise-obsolete platforms with no replacement needed on modern
> platforms, "deprecated" means you should be prepared for the module to
> disappear by switching to its modern counterpart module.

What seems to be missing here is a clarification of what it means for a
platform to be "modern" -- I recall we discussed this before, but I
cannot find that anything ended up in the manual.

I think we need to name some platforms that we consider "modern" and
some platforms that we consider obsolete for the "obsolete" keyword to
be well defined and understandable by maintainers.

For lack of a better place, I think the list could go into the "Obsolete
modules" section.

/Simon



Re: strcase?

2011-05-09 Thread Eric Blake
On 05/09/2011 06:54 AM, Bruno Haible wrote:
> Hi Simon,
> 
>> I noticed the 'strcase' module is marked as obsolete.  ... What
>> should be used instead?
> 
> You are confusing "obsolete" with "deprecated". As explained in [1][2], the
> meaning of "obsolete" in gnulib is that you don't need it, *not* that it will
> go away.
> 
> [1] http://www.gnu.org/software/gnulib/manual/html_node/Obsolete-modules.html
> [2] http://lists.gnu.org/archive/html/bug-gnulib/2011-01/msg00541.html
> 
> A few modules use the status "obsolete" in the meaning of "deprecated",
> however. Maybe it will help to avoid the confusion if we fix these. Here
> is a proposed patch.

Looks reasonable to me.  "obsolete" means you need it only if targeting
otherwise-obsolete platforms with no replacement needed on modern
platforms, "deprecated" means you should be prepared for the module to
disappear by switching to its modern counterpart module.

-- 
Eric Blake   ebl...@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org





Re: strcase?

2011-05-09 Thread Bruno Haible
Hi Simon,

> I noticed the 'strcase' module is marked as obsolete.  ... What
> should be used instead?

You are confusing "obsolete" with "deprecated". As explained in [1][2], the
meaning of "obsolete" in gnulib is that you don't need it, *not* that it will
go away.

[1] http://www.gnu.org/software/gnulib/manual/html_node/Obsolete-modules.html
[2] http://lists.gnu.org/archive/html/bug-gnulib/2011-01/msg00541.html

A few modules use the status "obsolete" in the meaning of "deprecated",
however. Maybe it will help to avoid the confusion if we fix these. Here
is a proposed patch.


2011-05-09  Bruno Haible  

Fix confusion regarding deprecated modules.
* modules/calloc (Status, Notice): Mark module as deprecated, not
obsolete.
* modules/fnmatch-posix (Status, Notice): Likewise.
* modules/getdate (Status, Notice): Likewise.
* modules/getopt (Status, Notice): Likewise.
* modules/malloc (Status, Notice): Likewise.
* modules/pipe (Status, Notice): Likewise.
* modules/realloc (Status, Notice): Likewise.
* modules/rename-dest-slash (Status, Notice): Likewise.
* modules/unictype/bidicategory-all (Status, Notice): Likewise.
* modules/unictype/bidicategory-byname (Status, Notice): Likewise.
* modules/unictype/bidicategory-name (Status, Notice): Likewise.
* modules/unictype/bidicategory-of (Status, Notice): Likewise.
* modules/unictype/bidicategory-test (Status, Notice): Likewise.

--- modules/calloc.orig Mon May  9 14:49:19 2011
+++ modules/calloc  Mon May  9 14:48:57 2011
@@ -2,10 +2,10 @@
 calloc() function that is glibc compatible.
 
 Status:
-obsolete
+deprecated
 
 Notice:
-This module is obsolete. Use the module 'calloc-gnu' instead.
+This module is deprecated. Use the module 'calloc-gnu' instead.
 
 Files:
 
--- modules/fnmatch-posix.orig  Mon May  9 14:49:19 2011
+++ modules/fnmatch-posix   Mon May  9 14:48:56 2011
@@ -2,10 +2,10 @@
 fnmatch() function: wildcard matching.
 
 Status:
-obsolete
+deprecated
 
 Notice:
-This module is obsolete. Use the module 'fnmatch' instead.
+This module is deprecated. Use the module 'fnmatch' instead.
 
 Files:
 
--- modules/getdate.orig    Mon May  9 14:49:19 2011
+++ modules/getdate Mon May  9 14:48:58 2011
@@ -2,10 +2,10 @@
 Convert a date/time string to linear time.
 
 Status:
-obsolete
+deprecated
 
 Notice:
-This module is obsolete. Use the module 'parse-datetime' instead.
+This module is deprecated. Use the module 'parse-datetime' instead.
 
 Files:
 doc/getdate.texi
--- modules/getopt.orig Mon May  9 14:49:19 2011
+++ modules/getopt  Mon May  9 14:48:56 2011
@@ -2,10 +2,10 @@
 Process command line arguments.
 
 Status:
-obsolete
+deprecated
 
 Notice:
-This module is obsolete. Use the module 'getopt-posix' or 'getopt-gnu' instead.
+This module is deprecated. Use the module 'getopt-posix' or 'getopt-gnu' instead.
 
 Files:
 
--- modules/malloc.orig Mon May  9 14:49:20 2011
+++ modules/malloc  Mon May  9 14:48:57 2011
@@ -7,10 +7,10 @@
 have side effects on the compilation of the main modules in lib/.
 
 Status:
-obsolete
+deprecated
 
 Notice:
-This module is obsolete. Use the module 'malloc-gnu' instead.
+This module is deprecated. Use the module 'malloc-gnu' instead.
 
 Files:
 
--- modules/pipe.orig   Mon May  9 14:49:20 2011
+++ modules/pipe    Mon May  9 14:48:58 2011
@@ -2,10 +2,10 @@
 Creation of subprocesses, communicating via pipes.
 
 Status:
-obsolete
+deprecated
 
 Notice:
-This module is obsolete. Use the module 'spawn-pipe' instead.
+This module is deprecated. Use the module 'spawn-pipe' instead.
 
 Files:
 lib/pipe.h
--- modules/realloc.orig    Mon May  9 14:49:20 2011
+++ modules/realloc Mon May  9 14:48:56 2011
@@ -7,10 +7,10 @@
 have side effects on the compilation of the main modules in lib/.
 
 Status:
-obsolete
+deprecated
 
 Notice:
-This module is obsolete. Use the module 'realloc-gnu' instead.
+This module is deprecated. Use the module 'realloc-gnu' instead.
 
 Files:
 
--- modules/rename-dest-slash.orig  Mon May  9 14:49:20 2011
+++ modules/rename-dest-slash   Mon May  9 14:49:05 2011
@@ -2,10 +2,10 @@
 rename() function: change the name or location of a file.
 
 Status:
-obsolete
+deprecated
 
 Notice:
-This module is obsolete; use the rename module instead.
+This module is deprecated. Use the 'rename' module instead.
 
 Files:
 
--- modules/unictype/bidicategory-all.orig  Mon May  9 14:49:20 2011
+++ modules/unictype/bidicategory-all   Mon May  9 14:48:57 2011
@@ -2,10 +2,10 @@
 Unicode character bidi category functions.
 
 Status:
-obsolete
+deprecated
 
 Notice:
-This module is obsolete. Use the module 'unictype/bidiclass-all' i

strcase?

2011-05-09 Thread Simon Josefsson
I noticed the 'strcase' module is marked as obsolete.  I don't see
anything in NEWS about that, doc/posix-functions/strcasecmp still refers
to 'strcase', and the 'strcase' module is used by some other modules
(argp, strptime).  Is the module really meant to be obsolete?  What
should be used instead?

/Simon



Re: problem in strcase

2008-01-04 Thread Bruno Haible
Simon Josefsson wrote:

> The strcase module fails to build.  It seems to need stuff from the
> string *.m4 file.  How about this patch?

Oops, my mistake from 2007-12-02. I'm committing this fix. Thanks
for the report.

2008-01-04  Bruno Haible  <[EMAIL PROTECTED]>

* m4/strcase.m4 (gl_FUNC_STRCASECMP, gl_FUNC_STRNCASECMP):
Require gl_HEADER_STRINGS_H_DEFAULTS, not
gl_HEADER_STRING_H_DEFAULTS.

diff --git a/m4/strcase.m4 b/m4/strcase.m4
index ae24215..79c525c 100644
--- a/m4/strcase.m4
+++ b/m4/strcase.m4
@@ -1,5 +1,5 @@
-# strcase.m4 serial 8
-dnl Copyright (C) 2002, 2005-2007 Free Software Foundation, Inc.
+# strcase.m4 serial 9
+dnl Copyright (C) 2002, 2005-2008 Free Software Foundation, Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
 dnl with or without modifications, as long as this notice is preserved.
@@ -12,7 +12,7 @@ AC_DEFUN([gl_STRCASE],
 
 AC_DEFUN([gl_FUNC_STRCASECMP],
 [
-  AC_REQUIRE([gl_HEADER_STRING_H_DEFAULTS])
+  AC_REQUIRE([gl_HEADER_STRINGS_H_DEFAULTS])
   AC_REPLACE_FUNCS(strcasecmp)
   if test $ac_cv_func_strcasecmp = no; then
     HAVE_STRCASECMP=0
@@ -22,7 +22,7 @@ AC_DEFUN([gl_FUNC_STRCASECMP],
 
 AC_DEFUN([gl_FUNC_STRNCASECMP],
 [
-  AC_REQUIRE([gl_HEADER_STRING_H_DEFAULTS])
+  AC_REQUIRE([gl_HEADER_STRINGS_H_DEFAULTS])
   AC_REPLACE_FUNCS(strncasecmp)
   if test $ac_cv_func_strncasecmp = no; then
     gl_PREREQ_STRNCASECMP





problem in strcase

2008-01-04 Thread Simon Josefsson
The strcase module fails to build.  It seems to need stuff from the
string *.m4 file.  How about this patch?

2008-01-04  Simon Josefsson  <[EMAIL PROTECTED]>

* modules/strcase (Depends-on): Add string, because strcase needs
gl_HEADER_STRING_H_DEFAULTS.

diff --git a/modules/strcase b/modules/strcase
index 0023f46..01f3e5c 100644
--- a/modules/strcase
+++ b/modules/strcase
@@ -7,6 +7,7 @@ lib/strncasecmp.c
 m4/strcase.m4
 
 Depends-on:
+string
 strings
 
 configure.ac:

/Simon

[EMAIL PROTECTED]:~/t$ ~/gnulib/gnulib-tool --test strcase
Module list with included dependencies:
  include_next
  link-warning
  strcase
  strings
File list:
  build-aux/link-warning.h
  lib/dummy.c
  lib/strcasecmp.c
  lib/strings.in.h
  lib/strncasecmp.c
  m4/gnulib-common.m4
  m4/include_next.m4
  m4/onceonly_2_57.m4
  m4/strcase.m4
  m4/strings_h.m4
executing aclocal -I glm4
configure.ac:97: warning: gl_HEADER_STRING_H_DEFAULTS is m4_require'd but not m4_defun'd
glm4/strcase.m4:13: gl_FUNC_STRCASECMP is expanded from...
glm4/strcase.m4:7: gl_STRCASE is expanded from...
configure.ac:27: gl_INIT is expanded from...
configure.ac:97: the top level
glm4/strcase.m4:23: gl_FUNC_STRNCASECMP is expanded from...
executing autoconf
configure.ac:97: warning: gl_HEADER_STRING_H_DEFAULTS is m4_require'd but not m4_defun'd
glm4/strcase.m4:13: gl_FUNC_STRCASECMP is expanded from...
glm4/strcase.m4:7: gl_STRCASE is expanded from...
configure.ac:27: gl_INIT is expanded from...
configure.ac:97: the top level
glm4/strcase.m4:23: gl_FUNC_STRNCASECMP is expanded from...
configure:3650: error: possibly undefined macro: gl_HEADER_STRING_H_DEFAULTS
  If this token and others are legitimate, please use m4_pattern_allow.
  See the Autoconf documentation.
[EMAIL PROTECTED]:~/t$ 




Re: new module mbscasecmp, reduce goal of module strcase

2007-02-04 Thread Bruno Haible
Addendum #3:

--- m4/strcase.m4   5 Feb 2007 02:09:22 -   1.9
+++ m4/strcase.m4   5 Feb 2007 02:46:45 -
@@ -1,4 +1,4 @@
-# strcase.m4 serial 7
+# strcase.m4 serial 8
 dnl Copyright (C) 2002, 2005-2007 Free Software Foundation, Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
@@ -35,7 +35,6 @@
 
 # Prerequisites of lib/strcasecmp.c.
 AC_DEFUN([gl_PREREQ_STRCASECMP], [
-  AC_REQUIRE([gl_FUNC_MBRTOWC])
   :
 ])
 
--- modules/strcase 5 Feb 2007 02:15:46 -   1.12
+++ modules/strcase 5 Feb 2007 02:46:45 -
@@ -5,10 +5,8 @@
 lib/strcasecmp.c
 lib/strncasecmp.c
 m4/strcase.m4
-m4/mbrtowc.m4
 
 Depends-on:
-mbuiter
 string
 
 configure.ac:





Re: new module mbscasecmp, reduce goal of module strcase

2007-02-04 Thread Bruno Haible
Addendum #2:

--- m4/string_h.m4  5 Feb 2007 02:15:46 -   1.10
+++ m4/string_h.m4  5 Feb 2007 02:25:52 -
@@ -32,6 +32,7 @@
   HAVE_DECL_MEMRCHR=1; AC_SUBST([HAVE_DECL_MEMRCHR])
   HAVE_STPCPY=1;   AC_SUBST([HAVE_STPCPY])
   HAVE_STPNCPY=1;  AC_SUBST([HAVE_STPNCPY])
+  HAVE_STRCASECMP=1;   AC_SUBST([HAVE_STRCASECMP])
   HAVE_DECL_STRNCASECMP=1; AC_SUBST([HAVE_DECL_STRNCASECMP])
   HAVE_STRCHRNUL=1;AC_SUBST([HAVE_STRCHRNUL])
   HAVE_DECL_STRDUP=1;  AC_SUBST([HAVE_DECL_STRDUP])
@@ -41,7 +42,6 @@
   HAVE_STRPBRK=1;  AC_SUBST([HAVE_STRPBRK])
   HAVE_STRSEP=1;   AC_SUBST([HAVE_STRSEP])
   HAVE_DECL_STRTOK_R=1;AC_SUBST([HAVE_DECL_STRTOK_R])
-  REPLACE_STRCASECMP=0;AC_SUBST([REPLACE_STRCASECMP])
   REPLACE_STRCASESTR=0;AC_SUBST([REPLACE_STRCASESTR])
 ])
 





Re: new module mbscasecmp, reduce goal of module strcase

2007-02-04 Thread Bruno Haible
Addendum to the last patch:

--- m4/strcase.m4   27 Jan 2007 14:43:17 -  1.8
+++ m4/strcase.m4   5 Feb 2007 02:04:43 -
@@ -1,4 +1,4 @@
-# strcase.m4 serial 6
+# strcase.m4 serial 7
 dnl Copyright (C) 2002, 2005-2007 Free Software Foundation, Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
@@ -13,11 +13,11 @@
 AC_DEFUN([gl_FUNC_STRCASECMP],
 [
   AC_REQUIRE([gl_HEADER_STRING_H_DEFAULTS])
-  dnl No known system has a strcasecmp() function that works correctly in
-  dnl multibyte locales. Therefore we use our version always.
-  AC_LIBOBJ(strcasecmp)
-  REPLACE_STRCASECMP=1
-  gl_PREREQ_STRCASECMP
+  AC_REPLACE_FUNCS(strcasecmp)
+  if test $ac_cv_func_strcasecmp = no; then
+    HAVE_STRCASECMP=0
+    gl_PREREQ_STRCASECMP
+  fi
 ])
 
 AC_DEFUN([gl_FUNC_STRNCASECMP],
--- modules/string  5 Feb 2007 01:57:07 -   1.8
+++ modules/string  5 Feb 2007 02:04:43 -
@@ -44,6 +43,7 @@
  -e 's|@''HAVE_DECL_MEMRCHR''@|$(HAVE_DECL_MEMRCHR)|g' \
  -e 's|@''HAVE_STPCPY''@|$(HAVE_STPCPY)|g' \
  -e 's|@''HAVE_STPNCPY''@|$(HAVE_STPNCPY)|g' \
+ -e 's|@''HAVE_STRCASECMP''@|$(HAVE_STRCASECMP)|g' \
  -e 's|@''HAVE_DECL_STRNCASECMP''@|$(HAVE_DECL_STRNCASECMP)|g' \
  -e 's|@''HAVE_STRCHRNUL''@|$(HAVE_STRCHRNUL)|g' \
  -e 's|@''HAVE_DECL_STRDUP''@|$(HAVE_DECL_STRDUP)|g' \
@@ -53,7 +53,6 @@
  -e 's|@''HAVE_STRPBRK''@|$(HAVE_STRPBRK)|g' \
  -e 's|@''HAVE_STRSEP''@|$(HAVE_STRSEP)|g' \
  -e 's|@''HAVE_DECL_STRTOK_R''@|$(HAVE_DECL_STRTOK_R)|g' \
- -e 's|@''REPLACE_STRCASECMP''@|$(REPLACE_STRCASECMP)|g' \
  -e 's|@''REPLACE_STRCASESTR''@|$(REPLACE_STRCASESTR)|g' \
  < $(srcdir)/string_.h; \
} > [EMAIL PROTECTED]





new module mbscasecmp, reduce goal of module strcase

2007-02-04 Thread Bruno Haible
This creates a module for the function mbscasecmp(), a variant of strcasecmp()
that works with multibyte strings.

The module strcase now NO LONGER takes care of providing an internationalized
strcasecmp()!! It only provides a replacement for platforms which don't have
this function.
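
To illustrate the difference between a byte-wise strcasecmp() and a comparison
that works on multibyte characters, here is a minimal sketch. It is not the
mbscasecmp() from this patch (which is built on mbuiter); the names
sketch_mbscasecmp and next_char are hypothetical, and the code assumes only
the ISO C functions mbrtowc() and towlower(). Invalid or incomplete sequences
are simply treated as single bytes, which is a simplification.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: decode one character of the current locale's
   encoding from S, which has AVAIL bytes left (counting the terminating
   NUL).  Stores the character in *WC and returns the number of bytes
   consumed; returns 0 when the terminating NUL is reached.  */
static size_t
next_char (const char *s, size_t avail, mbstate_t *state, wchar_t *wc)
{
  size_t len = mbrtowc (wc, s, avail, state);
  if (len == (size_t) -1 || len == (size_t) -2)
    {
      /* Invalid or incomplete sequence: take the byte as it is and
         restart from a clean conversion state.  */
      memset (state, 0, sizeof *state);
      *wc = (wchar_t) (unsigned char) *s;
      return 1;
    }
  return len;
}

/* Sketch only: compare S1 and S2 character by character, ignoring case
   according to the current locale.  Returns a value less than, equal to
   or greater than zero.  */
int
sketch_mbscasecmp (const char *s1, const char *s2)
{
  assert (s1 != NULL && s2 != NULL);

  size_t avail1 = strlen (s1) + 1;
  size_t avail2 = strlen (s2) + 1;
  mbstate_t st1, st2;
  memset (&st1, 0, sizeof st1);
  memset (&st2, 0, sizeof st2);

  for (;;)
    {
      wchar_t wc1, wc2;
      size_t n1 = next_char (s1, avail1, &st1, &wc1);
      size_t n2 = next_char (s2, avail2, &st2, &wc2);

      /* End of one or both strings: the shorter string sorts first.  */
      if (n1 == 0 || n2 == 0)
        return (n1 != 0) - (n2 != 0);

      wint_t c1 = towlower ((wint_t) wc1);
      wint_t c2 = towlower ((wint_t) wc2);
      if (c1 != c2)
        return c1 < c2 ? -1 : 1;

      s1 += n1; avail1 -= n1;
      s2 += n2; avail2 -= n2;
    }
}
```

In the "C" locale this behaves like strcasecmp() on ASCII input; a caller that
wants multibyte behaviour would first call setlocale (LC_ALL, "").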

2007-02-04  Bruno Haible  <[EMAIL PROTECTED]>

New module mbscasecmp, reduced goal of strcasecmp.
* modules/mbscasecmp: New file.
* lib/mbscasecmp.c: New file, copied from lib/strcasecmp.c.
(mbscasecmp): Renamed from strcasecmp.
* lib/strcasecmp.c: Don't include mbuiter.h.
(strcasecmp): Remove support for multibyte locales.
* lib/string_.h (strcasecmp): Don't rename. Declare only if missing.
Change the conditional link warning.
(mbscasecmp): New declaration.
* m4/mbscasecmp.m4: New file.
* m4/string_h.m4 (gl_STRING_MODULE_INDICATOR_DEFAULTS): Initialize
GNULIB_MBSCASECMP.
* modules/string (string.h): Also substitute GNULIB_MBSCASECMP.
* MODULES.html.sh (Internationalization functions): Add mbscasecmp.

== modules/mbscasecmp ==
Description:
mbscasecmp() function: case-insensitive string comparison.

Files:
lib/mbscasecmp.c
m4/mbscasecmp.m4
m4/mbrtowc.m4

Depends-on:
mbuiter
string

configure.ac:
gl_FUNC_MBSCASECMP
gl_STRING_MODULE_INDICATOR([mbscasecmp])

Makefile.am:
lib_SOURCES += mbscasecmp.c

Include:
<string.h>

License:
LGPL

Maintainer:
Bruno Haible

= m4/mbscasecmp.m4 =
# mbscasecmp.m4 serial 1
dnl Copyright (C) 2007 Free Software Foundation, Inc.
dnl This file is free software; the Free Software Foundation
dnl gives unlimited permission to copy and/or distribute it,
dnl with or without modifications, as long as this notice is preserved.

AC_DEFUN([gl_FUNC_MBSCASECMP],
[
  gl_PREREQ_MBSCASECMP
])

# Prerequisites of lib/mbscasecmp.c.
AC_DEFUN([gl_PREREQ_MBSCASECMP], [
  AC_REQUIRE([gl_FUNC_MBRTOWC])
  :
])

--- MODULES.html.sh 5 Feb 2007 01:36:34 -   1.183
+++ MODULES.html.sh 5 Feb 2007 01:52:10 -
@@ -2163,6 +2163,7 @@
   func_module mbschr
   func_module mbsrchr
   func_module mbsstr
+  func_module mbscasecmp
   func_module mbswidth
   func_module memcasecmp
   func_module memcoll
--- lib/mbscasecmp.c    5 Feb 2007 01:40:45 -   1.1
+++ lib/mbscasecmp.c    5 Feb 2007 01:52:10 -
@@ -31,13 +31,13 @@
 
 #define TOLOWER(Ch) (isupper (Ch) ? tolower (Ch) : (Ch))
 
-/* Compare strings S1 and S2, ignoring case, returning less than, equal to or
-   greater than zero if S1 is lexicographically less than, equal to or greater
-   than S2.
+/* Compare the character strings S1 and S2, ignoring case, returning less than,
+   equal to or greater than zero if S1 is lexicographically less than, equal to
+   or greater than S2.
Note: This function may, in multibyte locales, return 0 for strings of
different lengths!  */
 int
-strcasecmp (const char *s1, const char *s2)
+mbscasecmp (const char *s1, const char *s2)
 {
   if (s1 == s2)
 return 0;
--- lib/strcasecmp.c    26 Jan 2007 22:16:55 -  1.13
+++ lib/strcasecmp.c    5 Feb 2007 01:52:11 -
@@ -1,7 +1,5 @@
 /* Case-insensitive string comparison function.
Copyright (C) 1998-1999, 2005-2007 Free Software Foundation, Inc.
-   Written by Bruno Haible <[EMAIL PROTECTED]>, 2005,
-   based on earlier glibc code.
 
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
@@ -25,79 +23,41 @@
 #include 
 #include 
 
-#if HAVE_MBRTOWC
-# include "mbuiter.h"
-#endif
-
 #define TOLOWER(Ch) (isupper (Ch) ? tolower (Ch) : (Ch))
 
 /* Compare strings S1 and S2, ignoring case, returning less than, equal to or
greater than zero if S1 is lexicographically less than, equal to or greater
than S2.
-   Note: This function may, in multibyte locales, return 0 for strings of
-   different lengths!  */
+   Note: This function does not work with multibyte strings!  */
+
 int
 strcasecmp (const char *s1, const char *s2)
 {
-  if (s1 == s2)
+  const unsigned char *p1 = (const unsigned char *) s1;
+  const unsigned char *p2 = (const unsigned char *) s2;
+  unsigned char c1, c2;
+
+  if (p1 == p2)
 return 0;
 
-  /* Be careful not to look at the entire extent of s1 or s2 until needed.
- This is useful because when two strings differ, the difference is
- most often already in the very few first characters.  */
-#if HAVE_MBRTOWC
-  if (MB_CUR_MAX > 1)
+  do
 {
-  mbui_iterator_t iter1;
-  mbui_iterator_t iter2;
+  c1 = TOLOWER (*p1);
+  c2 = TOLOWER (*p2);
 
-  mbui_init (iter1, s1);
-  mbui_init (iter2, s2);
+  if (c1 == '\0')
+   break;
 
-  while (mbui_avail (iter1) && mbui_av

Re: strstr, strcase, strcasestr, and i18n

2007-02-04 Thread Bruno Haible
Paul Eggert wrote:
> >   - strstr: This function's behaviour is not clearly defined. POSIX says
> > that it compares a "string" with a "sequence of bytes". Which a priori
> > is nonsense, since the elements of strings are characters.
> 
> No, elements of "character strings" are characters.  Elements of "strings"
> are bytes.  See:
> 
> http://www.opengroup.org/susv3/basedefs/xbd_chap03.html#tag_03_92
> http://www.opengroup.org/susv3/basedefs/xbd_chap03.html#tag_03_367

It's hard to know POSIX as well as you do :-)

> So strstr's behavior is clearly defined: it operates on strings (i.e.,
> byte strings), not character strings.

Indeed. And strstr cannot be specified to consider "character strings",
without breaking backward compatibility :-(

> > It was tempting to make a clear API nomenclature: c-str* for the C locale
> > emulation, str* for the internationalized functions. But if you're right
> > with strstr, then we should find new names for the internationalized versions
> > of these functions.
> 
> I think we have to find new names, yes.

Yup. It appears that Microsoft did their homework regarding str* functions
and multibyte strings, while the ISO C and POSIX communities didn't. I'll be
adding the following functions to gnulib, attempting to fix the hole that
ISO C and POSIX left.

  mbschr      like strchr
  mbsrchr     like strrchr
  mbsstr      like strstr
  mbscasecmp  like strcasecmp
  mbscasestr  like strcasestr
  mbscspn     like strcspn
  mbspbrk     like strpbrk
  mbsspn      like strspn
  mbstok_r    like strtok_r

The prefix "mbs" coincides with the precedent "mbswidth" in gnulib and 
with the precedent "mbspbrk", "mbsrchr" on HP-UX.

It does not conflict with the Microsoft names, since Microsoft uses "_mbs",
but the functions have the same calling convention as Microsoft's functions,
except that MS uses 'unsigned char *' as multibyte string type.
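
As a rough idea of what "like strchr, but for multibyte strings" can mean,
here is a minimal sketch that only looks at character boundaries. It is not
the gnulib implementation; the name sketch_mbschr is hypothetical, and the
code assumes that C is a non-NUL ASCII character and relies only on the ISO C
function mbrlen().

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Sketch only: find the first occurrence of the ASCII character C in S,
   stepping through S one character (not one byte) at a time according to
   the current locale's encoding.  Returns NULL if C does not occur.
   Unlike strchr, this sketch does not support C == '\0'.  */
const char *
sketch_mbschr (const char *s, char c)
{
  assert (s != NULL);

  mbstate_t state;
  memset (&state, 0, sizeof state);
  size_t remaining = strlen (s);

  while (remaining > 0)
    {
      size_t len = mbrlen (s, remaining, &state);
      if (len == (size_t) -1 || len == (size_t) -2 || len == 0)
        {
          /* Invalid or incomplete sequence: treat one byte as one
             character and restart from a clean conversion state.  */
          memset (&state, 0, sizeof state);
          len = 1;
        }

      /* Only a single-byte character can be equal to C; bytes inside a
         longer character are skipped without being compared.  */
      if (len == 1 && *s == c)
        return s;

      s += len;
      remaining -= len;
    }
  return NULL;
}
```

In the "C" locale every character is one byte, so the result matches strchr()
for ASCII input. The difference only shows up after setlocale() has selected
an encoding in which trailing bytes of a character can look like ASCII.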

Bruno





Re: strstr, strcase, strcasestr, and i18n

2007-02-02 Thread Paul Eggert
Bruno Haible <[EMAIL PROTECTED]> writes:

>   - strstr: This function's behaviour is not clearly defined. POSIX says
> that it compares a "string" with a "sequence of bytes". Which a priori
> is nonsense, since the elements of strings are characters.

No, elements of "character strings" are characters.  Elements of "strings"
are bytes.  See:

http://www.opengroup.org/susv3/basedefs/xbd_chap03.html#tag_03_92
http://www.opengroup.org/susv3/basedefs/xbd_chap03.html#tag_03_367

So strstr's behavior is clearly defined: it operates on strings (i.e.,
byte strings), not character strings.
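
A small sketch of what "operates on byte strings" implies in practice. The
example bytes are an assumption chosen for illustration: in the BIG5 encoding
the character U+529F is encoded as 0xA5 0x5C, and 0x5C is also the ASCII
backslash. The helper name bytewise_match_offset is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sketch only: return the byte offset at which strstr finds NEEDLE in
   HAYSTACK, or -1 if there is no match.  strstr compares bytes and knows
   nothing about character boundaries.  */
ptrdiff_t
bytewise_match_offset (const char *haystack, const char *needle)
{
  assert (haystack != NULL && needle != NULL);

  const char *hit = strstr (haystack, needle);
  return hit != NULL ? hit - haystack : -1;
}
```

Called as bytewise_match_offset ("\xA5\x5C", "\\"), this reports a match at
offset 1, in the middle of the two-byte character, although the text contains
no backslash character when read as BIG5.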

>   - strcase (strcasecmp, strncasecmp): Here POSIX talks about two strings,
> but doesn't mention LC_CTYPE explicitly. Rather it says the results are
> "unspecified" in real locales.

This also makes sense, once one realizes that plain "string" means
byte string in POSIX.  These functions are well-defined on byte
strings only (and in the POSIX locale to boot).  The results are
unspecified if you use a multibyte locale.

> It was tempting to make a clear API nomenclature: c-str* for the C locale
> emulation, str* for the internationalized functions. But if you're right
> with strstr, then we should find new names for the internationalized versions
> of these functions.

I think we have to find new names, yes.




strstr, strcase, strcasestr, and i18n

2007-02-01 Thread Bruno Haible
I wrote:
> I think it's time for me to report a glibc bug on strstr and strcasestr, 
> then...

Paul Eggert wrote:
> But now that you mention it, why is there a c-strstr module, or a
> fancy strstr replacement that looks at multibyte characters?

The situation is indeed a bit messy.

Since <ctype.h>, strtod, strtold are locale dependent, but sometimes
one needs the locale independent functionality, so we added c-ctype,
c-strtod, c-strtold.

I thought this could be extended to more str* functions easily, but the
situation is not so easy. The problematic modules are:

  - strstr: This function's behaviour is not clearly defined. POSIX says
that it compares a "string" with a "sequence of bytes". Which a priori
is nonsense, since the elements of strings are characters.

  - strcase (strcasecmp, strncasecmp): Here POSIX talks about two strings,
but doesn't mention LC_CTYPE explicitly. Rather it says the results are
"unspecified" in real locales. Also strncasecmp does not make sense for
multibyte locales.

  - strcasestr: This function is not specified by POSIX. All known legacy
implementations do not care about multibyte locales.

It was tempting to make a clear API nomenclature: c-str* for the C locale
emulation, str* for the internationalized functions. But if you're right
with strstr, then we should find new names for the internationalized versions
of these functions.

Bruno





Re: [bug-gnulib] Question concerning c-ctype, c-strcase, c-strcasestr and c-strstr modules

2006-11-14 Thread Bruno Haible
Yoann Vandoorselaere wrote:
> > I don't think Chinese users will find it nice if you exclude them from
> > correct functioning of your program because of "performance" or "library size".
> 
> I don't think you are qualified to decide in place of the application
> developer whether the application should handle localized input or not.

Hehe, it's my role as gettext maintainer to encourage internationalization :-)

> I'm not advocating to not use them: I'm advocating to let the developer
> choose. Some of the library/program using GnuLib are used in embedded
> system where size matter, and where you won't see anything else than
> standard ASCII as input.

OK, embedded systems. What I can offer, as a compromise, is to introduce
flags like
   NO_CHINESE_USERS
   NO_JAPANESE_USERS
   NO_KOREAN_USERS
   NO_TURKISH_USERS
   UTF_8_ALL_THE_WAY
so that
  - when the first three are defined or the last one is defined, strstr uses
the byte-for-byte implementation,
  - when additionally NO_TURKISH_USERS is defined, strcasestr uses the
byte-for-byte implementation,
  - when UTF_8_ALL_THE_WAY is defined, iconv becomes a trivial nop.
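
The selection rules above could be expressed with the preprocessor roughly as
follows. This is only a sketch of the proposal: the flag names are the ones
listed above, while the SKETCH_* macros and the two functions are hypothetical
names introduced for the illustration, and nothing like this exists in gnulib.

```c
#include <assert.h>
#include <stdbool.h>

/* Byte-for-byte strstr is selected when the three CJK flags are all
   defined, or when UTF_8_ALL_THE_WAY is defined.  */
#if (defined NO_CHINESE_USERS && defined NO_JAPANESE_USERS \
     && defined NO_KOREAN_USERS) || defined UTF_8_ALL_THE_WAY
# define SKETCH_STRSTR_BYTEWISE 1
#else
# define SKETCH_STRSTR_BYTEWISE 0
#endif

/* Byte-for-byte strcasestr additionally requires NO_TURKISH_USERS.  */
#if SKETCH_STRSTR_BYTEWISE && defined NO_TURKISH_USERS
# define SKETCH_STRCASESTR_BYTEWISE 1
#else
# define SKETCH_STRCASESTR_BYTEWISE 0
#endif

/* True if this build would use the byte-for-byte strstr.  */
bool
sketch_strstr_is_bytewise (void)
{
  return SKETCH_STRSTR_BYTEWISE != 0;
}

/* True if this build would use the byte-for-byte strcasestr.  */
bool
sketch_strcasestr_is_bytewise (void)
{
  /* By construction, a byte-wise strcasestr implies a byte-wise strstr.  */
  assert (!SKETCH_STRCASESTR_BYTEWISE || SKETCH_STRSTR_BYTEWISE);
  return SKETCH_STRCASESTR_BYTEWISE != 0;
}
```

Compiled without any of the flags, both functions return false, that is, the
locale-aware implementations stay in use.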

With names chosen like this, the user of gnulib or of your software will
know explicitly which compromises he's making.

Would you be satisfied with that?

Bruno




Re: [bug-gnulib] Re: [bug-gnulib] Re: Question concerning c-ctype, c-strcase, c-strcasestr and c-strstr modules

2006-11-14 Thread Bruno Haible
Yoann Vandoorselaere wrote:
> "However, if we have a platform missing strcasestr, then using
> c_strcasestr as the substitute implementation is probably okay, because
> that platform would probably be broken in other areas, such as locale
> support, ...

Solaris 9 and Solaris 10 (which also doesn't have strcasestr) really
are not broken in the area of locale support. Solaris has had good locale
support for years.

> such that a locale-aware replacement strcasestr would not be
> worth the effort."

It's not a question of effort. (The "effort" for you is to put an identifier
into the list of modules that you import from gnulib.) It's a question whether
your application can possibly be handling Chinese text from a Chinese user,
or whether it's handling ASCII English in all cases. In the first case,
you need strcasestr; in the latter case, you need c_strcasestr.

> Also, depending on the program or library using the fallback GnuLib
> module, you might not want a version of the function supporting locale,
> for performance reason or because of the number of dependencies it would
> bring, and the resulting library size. 

I don't think Chinese users will find it nice if you exclude them from
correct functioning of your program because of "performance" or "library size".

> As an example, on a recent Linux system, importing the strcasestr module
> generate a library more than twice the size of one importing
> c_strcasestr.

Yes. This is because even on a recent Linux system, the libc's strcasestr()
does not work right for Chinese strings (in GB18030 or BIG5 encoding for
example).

Bruno




Re: [bug-gnulib] Question concerning c-ctype, c-strcase, c-strcasestr and c-strstr modules

2006-11-14 Thread Yoann Vandoorselaere
On Tue, 2006-11-14 at 13:38 +0100, Bruno Haible wrote:
> Yoann Vandoorselaere wrote:
> > "However, if we have a platform missing strcasestr, then using
> > c_strcasestr as the substitute implementation is probably okay, because
> > that platform would probably be broken in other areas, such as locale
> > support, ...
> 
> Solaris 9 and Solaris 10 (which also doesn't have strcasestr) really
> are not broken in the area of locale support. Solaris has good locale
> support for years.
> 
> > such that a locale-aware replacement strcasestr would not be
> > worth the effort."
> 
> It's not a question of effort. (The "effort" for you is to put an identifier
> into the list of modules that you import from gnulib.) It's a question whether
> your application can possibly be handling Chinese text from a Chinese user,
> or whether it's handling ASCII English in all cases. In the first case,
> you need strcasestr; in the latter case, you need c_strcasestr.

The point here would be to modify c_strcasestr so that it can serve as a
replacement module. :-)

> > Also, depending on the program or library using the fallback GnuLib
> > module, you might not want a version of the function supporting locale,
> > for performance reason or because of the number of dependencies it would
> > bring, and the resulting library size. 
> 
> I don't think Chinese users will find it nice if you exclude them from
> correct functioning of your program because of "performance" or "library size".

I don't think you are qualified to decide in place of the application
developer whether the application should handle localized input or not.

I'm not advocating to not use them: I'm advocating to let the developer
choose. Some of the library/program using GnuLib are used in embedded
system where size matter, and where you won't see anything else than
standard ASCII as input.

-- 
Yoann Vandoorselaere | Responsable R&D / CTO | PreludeIDS Technologies
Tel: +33 (0)8 70 70 21 58  Fax: +33(0)4 78 42 21 58
http://www.prelude-ids.com





Re: [bug-gnulib] Question concerning c-ctype, c-strcase, c-strcasestr and c-strstr modules

2006-11-14 Thread Yoann Vandoorselaere
On Tue, 2006-11-14 at 14:58 +0100, Bruno Haible wrote:
> Yoann Vandoorselaere wrote:
> > > I don't think Chinese users will find it nice if you exclude them from
> > > correct functioning of your program because of "performance" or "library size".
> > 
> > I don't think you are qualified to decide in place of the application
> > developer whether the application should handle localized input or not.
> 
> Hehe, it's my role as gettext maintainer to encourage internationalization :-)
> 
> > I'm not advocating to not use them: I'm advocating to let the developer
> > choose. Some of the library/program using GnuLib are used in embedded
> > system where size matter, and where you won't see anything else than
> > standard ASCII as input.
> 
> OK, embedded systems. What I can offer, as a compromise, is to introduce
> flags like
>    NO_CHINESE_USERS
>    NO_JAPANESE_USERS
>    NO_KOREAN_USERS
>    NO_TURKISH_USERS
>    UTF_8_ALL_THE_WAY
> so that
>   - when the first three are defined or the last one is defined, strstr uses
> the byte-for-byte implementation,
>   - when additionally NO_TURKISH_USERS is defined, strcasestr uses the
> byte-for-byte implementation,
>   - when UTF_8_ALL_THE_WAY is defined, iconv becomes a trivial nop.
> 
> With names chosen like this, the user of gnulib or of your software will
> know explicitly which compromises he's making.
> 
> Would you be satisfied with that?

The ability to disable localized input will certainly be useful to
some specific projects. However, I find the proposed flag names harsh.

Using "HANDLING" or "INPUT" in place of "USERS" sounds more appropriate
to me, don't you think?






Re: [bug-gnulib] Re: Question concerning c-ctype, c-strcase, c-strcasestr and c-strstr modules

2006-11-14 Thread Bruno Haible
Yoann Vandoorselaere wrote:
> Solaris 9 apparently lacks the strcasestr() function.

If the program needs strcasestr(), then it needs the 'strcasestr' module.
It defines a replacement for strcasestr().

> Could we modify the
> c-strcasestr module so that it provides a replacement for platforms
> lacking the function, now that one has been identified?

The c-strcasestr module defines a variant of strcasestr that ignores the
user's locale. c_strcasestr() and strcasestr() are not equivalent.
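
To illustrate what "ignores the user's locale" means in practice, here is a
minimal sketch of a byte-wise, ASCII-only search. The names ascii_tolower and
ascii_strcasestr are hypothetical and this is not gnulib's implementation; the
sketch assumes an ASCII-compatible execution character set and uses a naive
quadratic scan. Because it works byte by byte, it knows nothing about
multibyte characters, which is exactly why it cannot stand in for a
locale-aware strcasestr() in general.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fold only 'A'..'Z', never consulting the locale.
   Assumes an ASCII-compatible execution character set.  */
static int
ascii_tolower (int c)
{
  return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c;
}

/* Hypothetical sketch: find the first occurrence of NEEDLE in HAYSTACK,
   comparing byte by byte and ignoring ASCII case only.
   Returns NULL if there is no match.  */
static const char *
ascii_strcasestr (const char *haystack, const char *needle)
{
  if (*needle == '\0')
    return haystack;

  for (; *haystack != '\0'; haystack++)
    {
      const unsigned char *h = (const unsigned char *) haystack;
      const unsigned char *n = (const unsigned char *) needle;

      while (*n != '\0' && ascii_tolower (*h) == ascii_tolower (*n))
        {
          h++;
          n++;
        }
      if (*n == '\0')
        return haystack;
    }
  return NULL;
}

/* Usage example: searching an ASCII protocol header.  */
static void
demo_ascii_strcasestr (void)
{
  const char *text = "Content-Type: TEXT/plain";

  assert (ascii_strcasestr (text, "text/PLAIN") == text + 14);
  assert (ascii_strcasestr (text, "html") == NULL);
}
```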

Bruno




Re: [bug-gnulib] Re: Question concerning c-ctype, c-strcase, c-strcasestr and c-strstr modules

2006-11-14 Thread Yoann Vandoorselaere
On Tue, 2006-11-14 at 11:40 +0100, Bruno Haible wrote:
> Yoann Vandoorselaere wrote:
> > Solaris 9 apparently lacks the strcasestr() function.
> 
> If the program needs strcasestr(), then it needs the 'strcasestr' module.
> It defines a replacement for strcasestr().
> 
> > Could we modify the
> > c-strcasestr module so that it provides a replacement for platforms
> > lacking the function, now that one has been identified?
> 
> The c-strcasestr module defines a variant of strcasestr that ignores the
> user's locale. c_strcasestr() and strcasestr() are not equivalent.

Quoting Eric Blake:

"However, if we have a platform missing strcasestr, then using
c_strcasestr as the substitute implementation is probably okay, because
that platform would probably be broken in other areas, such as locale
support, such that a locale-aware replacement strcasestr would not be
worth the effort."


Also, depending on the program or library using the fallback GnuLib
module, you might not want a locale-aware version of the function,
either for performance reasons or because of the number of dependencies
it would bring in and the resulting library size.

As an example, on a recent Linux system, importing the strcasestr module
generates a library more than twice the size of one importing the
c-strcasestr module.






Re: Question concerning c-ctype, c-strcase, c-strcasestr and c-strstr modules

2006-11-13 Thread Yoann Vandoorselaere
On Fri, 2006-09-15 at 05:35 -0600, Eric Blake wrote:
> According to Yoann Vandoorselaere on 9/15/2006 5:29 AM:
> > Hi,
> > 
> > The c-ctype, c-strcase, c-strcasestr and c-strstr modules seem only to
> > implement their replacement functions using a "c_" prefix. 
> > 
> > However, there is no autoconf test implemented by these modules that
> > redefines the original function (in case it is missing) to point to
> > its GnuLib replacement.
> > 
> > Is this behavior expected?
> 
> Which platform is missing one of these?  Most implementations have a
> pretty full-featured  these days; and even on platforms like
> cygwin, where  is not POSIX compliant because it does not honor
> alternate locales, the functions still exist and work as if LC_ALL=C.  But
> if you identify which platform is missing which function, we can probably
> improve the module to serve as a replacement.

Solaris 9 apparently lacks the strcasestr() function. Could we modify the
c-strcasestr module so that it provides a replacement for platforms
lacking the function, now that one has been identified?






Re: [bug-gnulib] Question concerning c-ctype, c-strcase, c-strcasestr and c-strstr modules

2006-09-15 Thread Bruno Haible
Yoann Vandoorselaere wrote:
> The c-ctype, c-strcase, c-strcasestr and c-strstr modules seem only to
> implement their replacement functions using a "c_" prefix. 
> 
> However, there is no autoconf test implemented by these modules that
> redefines the original function (in case it is missing) to point to
> its GnuLib replacement.
> 
> Is this behavior expected?

Yes. These modules are not "replacement" functions; they define functions
of their own, not specified by POSIX.

c-ctype, c-strcase and c-strcasestr are useful if you explicitly don't
want locale-dependent behaviour (for example, if you know that the strings
are ASCII strings and you want to treat 'i' and 'I' the same, even when
operating in a Turkish locale).
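
As a minimal sketch of that use case, assuming an ASCII-compatible execution
character set: the helpers below (ascii_tolower, matches_keyword) are
hypothetical names, not the c-ctype API. They fold only 'A'..'Z', so 'I'
always maps to 'i' regardless of the current locale, whereas the result of
tolower('I') may depend on the locale in effect.

```c
#include <assert.h>

/* Hypothetical stand-in for a locale-independent tolower:
   only 'A'..'Z' are folded; every other value is returned unchanged.  */
static int
ascii_tolower (int c)
{
  return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c;
}

/* Return 1 if S equals KEYWORD when ASCII case is ignored, else 0.
   KEYWORD is expected to be a lowercase ASCII string.  */
static int
matches_keyword (const char *s, const char *keyword)
{
  while (*keyword != '\0')
    {
      if (ascii_tolower ((unsigned char) *s) != (unsigned char) *keyword)
        return 0;
      s++;
      keyword++;
    }
  return *s == '\0';
}

/* Usage example: matching an ASCII keyword such as an HTML tag name.  */
static void
demo_matches_keyword (void)
{
  assert (matches_keyword ("TITLE", "title"));
  assert (matches_keyword ("Title", "title"));
  assert (!matches_keyword ("TITLES", "title"));
}
```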

c-strstr is a speedup for strstr, applicable in certain cases (see
c-strstr.h for details).

Bruno




Re: Question concerning c-ctype, c-strcase, c-strcasestr and c-strstr modules

2006-09-15 Thread Eric Blake
According to Yoann Vandoorselaere on 9/15/2006 5:40 AM:
> 
> I recall a platform missing strcasestr, although I can't remember which.
> Anyway, what's the purpose of these modules if they are not used
> anywhere?

The c_* modules ARE used, particularly in gettext, in situations where
locales do work and strcasestr honors locales at the expense of extra
processor utilization.  The point of Bruno's modules is that sometimes you
_want_ LC_ALL=C semantics, but without having to change the locale and
restore it afterwards.
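
For comparison, here is a sketch of the alternative that these modules make
unnecessary: switching the process-wide locale to "C", calling the
locale-dependent function, and switching back. The function name
casecmp_in_c_locale is hypothetical. Note that setlocale() affects the whole
process, so this pattern is not safe when other threads rely on the locale.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/* Hypothetical sketch: compare S1 and S2 with strcasecmp() under "C" locale
   rules by temporarily changing the global locale.  *OK is set to 1 if the
   locale could be saved and switched, and to 0 otherwise (in which case the
   return value is meaningless).  */
static int
casecmp_in_c_locale (const char *s1, const char *s2, int *ok)
{
  const char *current = setlocale (LC_ALL, NULL);
  char *saved;
  int result = 0;

  *ok = 0;
  if (current == NULL)
    return 0;

  /* The string returned by setlocale() may be overwritten by the next
     call, so keep a private copy.  */
  saved = malloc (strlen (current) + 1);
  if (saved == NULL)
    return 0;
  strcpy (saved, current);

  if (setlocale (LC_ALL, "C") != NULL)
    {
      result = strcasecmp (s1, s2);
      setlocale (LC_ALL, saved);   /* restore the caller's locale */
      *ok = 1;
    }

  free (saved);
  return result;
}

/* Usage example.  */
static void
demo_casecmp_in_c_locale (void)
{
  int ok;

  assert (casecmp_in_c_locale ("Content-Type", "content-type", &ok) == 0);
  assert (ok);
}
```

A c_* function gives the same "C" locale result for ASCII input with none of
the saving, switching and restoring shown here.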

However, if we have a platform missing strcasestr, then using c_strcasestr
as the substitute implementation is probably okay, because that platform
would probably be broken in other areas, such as locale support, such that
a locale-aware replacement strcasestr would not be worth the effort.

--
Life is short - so eat dessert first!

Eric Blake [EMAIL PROTECTED]




Re: Question concerning c-ctype, c-strcase, c-strcasestr and c-strstr modules

2006-09-15 Thread Yoann Vandoorselaere
On Fri, 2006-09-15 at 05:35 -0600, Eric Blake wrote:
> According to Yoann Vandoorselaere on 9/15/2006 5:29 AM:
> > Hi,
> > 
> > The c-ctype, c-strcase, c-strcasestr and c-strstr modules seem only to
> > implement their replacement functions using a "c_" prefix. 
> > 
> > However, there is no autoconf test implemented by these modules that
> > redefines the original function (in case it is missing) to point to
> > its GnuLib replacement.
> > 
> > Is this behavior expected?
> 
> Which platform is missing one of these?  Most implementations have a
> pretty full-featured  these days; and even on platforms like
> cygwin, where  is not POSIX compliant because it does not honor
> alternate locales, the functions still exist and work as if LC_ALL=C.  But
> if you identify which platform is missing which function, we can probably
> improve the module to serve as a replacement.

I recall a platform missing strcasestr, although I can't remember which.
Anyway, what's the purpose of these modules if they are not used
anywhere?






Re: Question concerning c-ctype, c-strcase, c-strcasestr and c-strstr modules

2006-09-15 Thread Eric Blake
According to Yoann Vandoorselaere on 9/15/2006 5:29 AM:
> Hi,
> 
> The c-ctype, c-strcase, c-strcasestr and c-strstr modules seem only to
> implement their replacement functions using a "c_" prefix. 
> 
> However, there is no autoconf test implemented by these modules that
> redefines the original function (in case it is missing) to point to
> its GnuLib replacement.
> 
> Is this behavior expected?

Which platform is missing one of these?  Most implementations have a
pretty full-featured  these days; and even on platforms like
cygwin, where  is not POSIX compliant because it does not honor
alternate locales, the functions still exist and work as if LC_ALL=C.  But
if you identify which platform is missing which function, we can probably
improve the module to serve as a replacement.





Question concerning c-ctype, c-strcase, c-strcasestr and c-strstr modules

2006-09-15 Thread Yoann Vandoorselaere
Hi,

The c-ctype, c-strcase, c-strcasestr and c-strstr modules seem only to
implement their replacement functions using a "c_" prefix. 

However, there is no autoconf test implemented by these modules that
redefines the original function (in case it is missing) to point to
its GnuLib replacement.

Is this behavior expected?

Regards,






Re: new module 'c-strcase'

2005-10-11 Thread Bruno Haible
Paul Eggert wrote:
> OK, I see now.

OK. In the absence of other objections, I committed both modules.

Bruno



___
bug-gnulib mailing list
bug-gnulib@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-gnulib


Re: new module 'c-strcase'

2005-10-11 Thread Paul Eggert
Bruno Haible <[EMAIL PROTECTED]> writes:

> The result is then well-defined but not related to the behaviour of
> the C locale on such systems, and the name of the module would be a
> misnomer :-)

OK, I see now.  I had understood "C" to stand for the traditional C
behavior and not the less-well-specified behavior of the
POSIX-specified C locale.

>> Such machines do exist.  They are unlikely targets for big GNU
>> apps but are potential targets for this module.
>
> just for info, what are these machines? The 10-year-old CRAY?

I was thinking more of digital signal processors.  For example, I
believe the Freescale (formerly Motorola) compilers define CHAR_BIT to
be 32 for some of their CPUs.  Admittedly I'm not a DSP weenie so I
don't know how popular those systems actually are.  Perhaps we could
ask the GNU Radio folks.




Re: new module 'c-strcase'

2005-10-11 Thread Bruno Haible
Paul Eggert wrote:
> > More precisely, one of the string arguments must be an ASCII string;
> > the other one can also contain non-ASCII characters (but then the
> > comparison result will be nonzero).
>
> Why is this restriction needed?

It is needed to guarantee that the result is equivalent to the comparison
result in the C locale. On a system where the C locale has UTF-8 encoding,

  c_strcasecmp ("François", "FRANÇOIS") != 0

although

  setlocale (LC_ALL, "C");
  strcasecmp ("François", "FRANÇOIS") == 0.
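
The first half of this can be checked with a small byte-wise sketch. The
function ascii_strcasecmp below is a hypothetical stand-in for c_strcasecmp,
not the posted implementation, and the string literals spell out the UTF-8
bytes explicitly (0xC3 0xA7 for the lowercase c-cedilla, 0xC3 0x87 for the
uppercase one) so that the example does not depend on the source file
encoding.

```c
#include <assert.h>

/* Hypothetical helper: fold only 'A'..'Z'.
   Assumes an ASCII-compatible execution character set.  */
static int
ascii_tolower (int c)
{
  return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c;
}

/* Hypothetical stand-in for c_strcasecmp: compare byte by byte after
   folding ASCII uppercase letters; bytes outside ASCII are compared as-is.
   Returns a negative, zero or positive value.  */
static int
ascii_strcasecmp (const char *s1, const char *s2)
{
  const unsigned char *p1 = (const unsigned char *) s1;
  const unsigned char *p2 = (const unsigned char *) s2;
  int c1, c2;

  do
    {
      c1 = ascii_tolower (*p1++);
      c2 = ascii_tolower (*p2++);
    }
  while (c1 != '\0' && c1 == c2);

  return (c1 > c2) - (c1 < c2);
}

/* Usage example: ASCII letters fold, the UTF-8 bytes of the cedilla do not.  */
static void
demo_ascii_strcasecmp (void)
{
  assert (ascii_strcasecmp ("Francois", "FRANCOIS") == 0);
  assert (ascii_strcasecmp ("Fran\xc3\xa7ois", "FRAN\xc3\x87OIS") != 0);
}
```

This is why the header comment asks for at least one argument to be an ASCII
string: only then does the byte-wise result agree with what strcasecmp()
would report in the "C" locale on such systems.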

> Doesn't the code simply
> compare bytes after converting 'A'-'Z' to 'a'-'z'?  In that case,
> it is not really required that one argument must be an ASCII string;
> both strings can be non-ASCII but the result is still well-defined.

The result is then well-defined but not related to the behaviour of
the C locale on such systems, and the name of the module would be a
misnomer :-)

> >   return c1 - c2;
>
> A nit: in theory this could result in integer overflow.
> The following would be portable to machines where char == int.
>
>return UCHAR_MAX <= INT_MAX ? c1 - c2 : c1 < c2 ? -1 : c1 > c2;
>
> Such machines do exist.  They are unlikely targets for big GNU
> apps but are potential targets for this module.

OK, fixed. But just for info, what are these machines? The 10-year-old
CRAY?
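
The pattern used for the fix can be sketched as a standalone helper. The name
compare_uchar is hypothetical; the logic mirrors the patch below. When
UCHAR_MAX fits in an int, both operands are promoted to int and the
subtraction cannot overflow; otherwise the difference may not be
representable as an int, so only its sign is returned.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: three-way comparison of two unsigned char values
   that stays correct even on platforms where 'char' is as wide as 'int'.  */
static int
compare_uchar (unsigned char c1, unsigned char c2)
{
  if (UCHAR_MAX <= INT_MAX)
    /* Usual case: the difference of the promoted values fits in an int.  */
    return c1 - c2;
  else
    /* Exotic case: return only the sign of the difference.  */
    return (c1 > c2 ? 1 : c1 < c2 ? -1 : 0);
}

/* Usage example.  */
static void
demo_compare_uchar (void)
{
  assert (compare_uchar ('a', 'b') < 0);
  assert (compare_uchar ('b', 'a') > 0);
  assert (compare_uchar ('x', 'x') == 0);
}
```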

Bruno


2005-10-11  Bruno Haible  <[EMAIL PROTECTED]>

* strcasecmp.c: Include limits.h.
(strcasecmp): Avoid integer overflow on exotic platforms.
* strncasecmp.c: Include limits.h.
(strncasecmp): Avoid integer overflow on exotic platforms.
Reported by Paul Eggert.

diff -c -3 -r1.10 strcasecmp.c
*** strcasecmp.c17 Aug 2005 14:01:07 -  1.10
--- strcasecmp.c11 Oct 2005 12:47:19 -
***
*** 25,30 
--- 25,31 
  #include "strcase.h"
  
  #include <ctype.h>
+ #include <limits.h>
  
  #if HAVE_MBRTOWC
  # include "mbuiter.h"
***
*** 93,98 
}
while (c1 == c2);
  
!   return c1 - c2;
  }
  }
--- 94,105 
}
while (c1 == c2);
  
!   if (UCHAR_MAX <= INT_MAX)
!   return c1 - c2;
!   else
!   /* On machines where 'char' and 'int' are types of the same size, the
!  difference of two 'unsigned char' values - including the sign bit -
!  doesn't fit in an 'int'.  */
!   return (c1 > c2 ? 1 : c1 < c2 ? -1 : 0);
  }
  }
diff -c -3 -r1.6 strncasecmp.c
*** strncasecmp.c   19 Sep 2005 17:28:15 -  1.6
--- strncasecmp.c   11 Oct 2005 12:47:19 -
***
*** 1,5 
  /* strncasecmp.c -- case insensitive string comparator
!Copyright (C) 1998, 1999 Free Software Foundation, Inc.
  
 This program is free software; you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
--- 1,5 
  /* strncasecmp.c -- case insensitive string comparator
!Copyright (C) 1998, 1999, 2005 Free Software Foundation, Inc.
  
 This program is free software; you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
***
*** 23,28 
--- 23,29 
  #include "strcase.h"
  
  #include <ctype.h>
+ #include <limits.h>
  
  #define TOLOWER(Ch) (isupper (Ch) ? tolower (Ch) : (Ch))
  
***
*** 54,58 
  }
while (c1 == c2);
  
!   return c1 - c2;
  }
--- 55,65 
  }
while (c1 == c2);
  
!   if (UCHAR_MAX <= INT_MAX)
! return c1 - c2;
!   else
! /* On machines where 'char' and 'int' are types of the same size, the
!difference of two 'unsigned char' values - including the sign bit -
!doesn't fit in an 'int'.  */
! return (c1 > c2 ? 1 : c1 < c2 ? -1 : 0);
  }





Re: new module 'c-strcase'

2005-10-10 Thread Paul Eggert
Bruno Haible <[EMAIL PROTECTED]> writes:

> More precisely, one of the string arguments must be an ASCII string;
> the other one can also contain non-ASCII characters (but then the
> comparison result will be nonzero).

Why is this restriction needed?  Doesn't the code simply
compare bytes after converting 'A'-'Z' to 'a'-'z'?  In that case,
it is not really required that one argument must be an ASCII string;
both strings can be non-ASCII but the result is still well-defined.

>   return c1 - c2;

A nit: in theory this could result in integer overflow.
The following would be portable to machines where char == int.

   return UCHAR_MAX <= INT_MAX ? c1 - c2 : c1 < c2 ? -1 : c1 > c2;

Such machines do exist.  They are unlikely targets for big GNU
apps but are potential targets for this module.




new module 'c-strcase'

2005-10-10 Thread Bruno Haible
Hi,

While locale-dependent string comparison and searching functions are
fine for many purposes, in some areas it is more important to have
locale-independent behaviour.

Here is a module for case-insensitive string comparison. Used in GNU gettext
for half a year.

Objections? Comments?


=== modules/c-strcase ===
Description:
Case-insensitive string comparison functions in C locale.

Files:
lib/c-strcase.h
lib/c-strcasecmp.c
lib/c-strncasecmp.c

Depends-on:
c-ctype

configure.ac:

Makefile.am:
lib_SOURCES += c-strcase.h c-strcasecmp.c c-strncasecmp.c

Include:
"c-strcase.h"

License:
LGPL

Maintainer:
Bruno Haible

=== lib/c-strcase.h ===
/* Case-insensitive string comparison functions in C locale.
   Copyright (C) 1995-1996, 2001, 2003, 2005 Free Software Foundation, Inc.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 2, or (at your option)
   any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software Foundation,
   Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.  */

#ifndef C_STRCASE_H
#define C_STRCASE_H

#include <stddef.h>


/* The functions defined in this file assume the "C" locale and a character
   set without diacritics (ASCII-US or EBCDIC-US or something like that).
   Even if the "C" locale on a particular system is an extension of the ASCII
   character set (like on BeOS, where it is UTF-8, or on AmigaOS, where it
   is ISO-8859-1), the functions in this file recognize only the ASCII
   characters.  More precisely, one of the string arguments must be an ASCII
   string; the other one can also contain non-ASCII characters (but then
   the comparison result will be nonzero).  */


#ifdef __cplusplus
extern "C" {
#endif


/* Compare strings S1 and S2, ignoring case, returning less than, equal to or
   greater than zero if S1 is lexicographically less than, equal to or greater
   than S2.  */
extern int c_strcasecmp (const char *s1, const char *s2);

/* Compare no more than N characters of strings S1 and S2, ignoring case,
   returning less than, equal to or greater than zero if S1 is
   lexicographically less than, equal to or greater than S2.  */
extern int c_strncasecmp (const char *s1, const char *s2, size_t n);


#ifdef __cplusplus
}
#endif


#endif /* C_STRCASE_H */
=== lib/c-strcasecmp.c ===
/* c-strcasecmp.c -- case insensitive string comparator in C locale
   Copyright (C) 1998, 1999, 2005 Free Software Foundation, Inc.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 2, or (at your option)
   any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software Foundation,
   Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.  */

#ifdef HAVE_CONFIG_H
# include <config.h>
#endif

/* Specification.  */
#include "c-strcase.h"

#include "c-ctype.h"

int
c_strcasecmp (const char *s1, const char *s2)
{
  register const unsigned char *p1 = (const unsigned char *) s1;
  register const unsigned char *p2 = (const unsigned char *) s2;
  unsigned char c1, c2;

  if (p1 == p2)
    return 0;

  do
    {
      c1 = c_tolower (*p1);
      c2 = c_tolower (*p2);

      if (c1 == '\0')
        break;

      ++p1;
      ++p2;
    }
  while (c1 == c2);

  return c1 - c2;
}
=== lib/c-strncasecmp.c ===
/* c-strncasecmp.c -- case insensitive string comparator in C locale
   Copyright (C) 1998, 1999, 2005 Free Software Foundation, Inc.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 2, or (at your option)
   any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU G