On some platforms, the apr_is* functions (apr_isalpha(), apr_isspace(), etc.) are expensive. On AIX, each call to apr_isspace() calls an internal AIX function named __is_wctype_std(). The "wctype" implies (I believe) "wide character"; the call appears to be some sort of check to determine the charset of the string being scanned. A single call is not particularly expensive, but the HTTP server makes hundreds of calls to apr_is* functions and they add up quickly, consuming several percent of the CPU cycles on a typical request.

It is possible to give the isspace, isalpha, etc. routines a hint at compile time about the charset that will be passed to them. If the charset is 8859-1, a table lookup is done, saving one or perhaps two function calls and the processing inside __is_wctype_std().

The following patch places #define USE_C_LOCALE at the top of each file that deals with 8859-1 (the charset of URLs). This causes the _ILS_MACROS and _C_LOCALE_ONLY macros to be set in apr_lib.h prior to including ctype.h, which results in the apr_is* functions using simple table lookups rather than expensive functions.

This is a shotgun approach: if we #define USE_C_LOCALE at the top of a file, then ALL the apr_is* functions called from that file will expect only 8859-1 charsets (apr_is* functions called elsewhere still go through __is_wctype_std()).

Another way of solving this problem is to craft a new set of apr_is* functions that are specific to parsing URLs, in that they can be compiled to expect only 8859-1 charsets.

A problem common to both of these approaches is that apr_is* functions are called from within other APR functions that have no knowledge of the charset being used.

Looking for some ideas.

Bill

Index: modules/http/http_protocol.c
===================================================================
RCS file: /home/cvs/httpd-2.0/modules/http/http_protocol.c,v
retrieving revision 1.462
diff -u -u -r1.462 http_protocol.c
--- modules/http/http_protocol.c 11 Oct 2002 15:29:20 -0000 1.462
+++ modules/http/http_protocol.c 29 Oct 2002 18:21:54 -0000
@@ -62,8 +62,8 @@
  * Code originally by Rob McCool; much redone by Robert S. Thau
  * and the Apache Software Foundation.
  */
-
+#define USE_C_LOCALE
 #include "apr.h"
 #include "apr_strings.h"
 #include "apr_buckets.h"
Index: server/protocol.c
===================================================================
RCS file: /home/cvs/httpd-2.0/server/protocol.c,v
retrieving revision 1.119
diff -u -u -r1.119 protocol.c
--- server/protocol.c 2 Oct 2002 13:41:45 -0000 1.119
+++ server/protocol.c 29 Oct 2002 18:21:54 -0000
@@ -62,7 +62,7 @@
  * Code originally by Rob McCool; much redone by Robert S. Thau
  * and the Apache Software Foundation.
  */
-
+#define USE_C_LOCALE
 #include "apr.h"
 #include "apr_strings.h"
 #include "apr_buckets.h"
Index: server/util.c
===================================================================
RCS file: /home/cvs/httpd-2.0/server/util.c,v
retrieving revision 1.131
diff -u -u -r1.131 util.c
--- server/util.c 14 Oct 2002 00:12:02 -0000 1.131
+++ server/util.c 29 Oct 2002 18:21:54 -0000
@@ -68,7 +68,8 @@
  * #define DEBUG to trace all cfg_open*()/cfg_closefile() calls
  * #define DEBUG_CFG_LINES to trace every line read from the config files
  */
+#define USE_C_LOCALE
 #include "apr.h"
 #include "apr_strings.h"
Index: server/vhost.c
===================================================================
RCS file: /home/cvs/httpd-2.0/server/vhost.c,v
retrieving revision 1.78
diff -u -u -r1.78 vhost.c
--- server/vhost.c 2 Sep 2002 01:37:54 -0000 1.78
+++ server/vhost.c 29 Oct 2002 18:21:54 -0000
@@ -60,8 +60,9 @@
  * http_vhost.c: functions pertaining to virtual host addresses
  * (configuration and run-time)
  */
-
+#define USE_C_LOCALE
+
 #include "apr.h"
 #include "apr_strings.h"
 #include "apr_lib.h"
Index: srclib/apr/include/apr_lib.h
===================================================================
RCS file: /home/cvs/apr/include/apr_lib.h,v
retrieving revision 1.58
diff -u -u -r1.58 apr_lib.h
--- srclib/apr/include/apr_lib.h 17 Jul 2002 02:53:25 -0000 1.58
+++ srclib/apr/include/apr_lib.h 29 Oct 2002 18:21:54 -0000
@@ -58,9 +58,16 @@
  * @file apr_lib.h
  * @brief APR general purpose library routines
  */
-
 #include "apr.h"
 #include "apr_errno.h"
+
+/* Give the isalpha, isspace, et al. hints if the charset is ISO-8859-1. Some implementations
+ * use simple table look-ups if the charset is ISO-8859-1
+ */
+#ifdef USE_C_LOCALE
+#define _ILS_MACROS
+#define _C_LOCALE_ONLY
+#endif
 #if APR_HAVE_CTYPE_H
 #include <ctype.h>
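
For the second approach mentioned above (a dedicated set of classification routines for URL parsing), here is a rough sketch of what a locale-independent table lookup might look like. None of this exists in APR: the url_is* names, the URL_CT_* flag bits and url_ctype_init() are all hypothetical, and the sketch assumes an ASCII-based platform and C-locale classification only (every byte above 0x7F is treated as belonging to no class).

```c
/* Hypothetical flag bits, one per character class; not part of APR. */
#define URL_CT_SPACE 0x01
#define URL_CT_ALPHA 0x02
#define URL_CT_DIGIT 0x04

/* One entry per possible byte value; zero means "in no class". */
static unsigned char url_ctype_table[256];

/* Fill the table once with C-locale classifications.
 * Assumes ASCII, where '\t'..'\r', 'a'..'z', 'A'..'Z' and '0'..'9'
 * are each contiguous ranges.
 */
static void url_ctype_init(void)
{
    int c;

    /* C-locale isspace(): space, \t, \n, \v, \f, \r */
    url_ctype_table[' '] |= URL_CT_SPACE;
    for (c = '\t'; c <= '\r'; c++)
        url_ctype_table[c] |= URL_CT_SPACE;

    for (c = 'a'; c <= 'z'; c++)
        url_ctype_table[c] |= URL_CT_ALPHA;
    for (c = 'A'; c <= 'Z'; c++)
        url_ctype_table[c] |= URL_CT_ALPHA;

    for (c = '0'; c <= '9'; c++)
        url_ctype_table[c] |= URL_CT_DIGIT;
}

/* Each test is a single array index and mask: no function call and
 * no locale check. The cast keeps high-bit chars from indexing
 * the table with a negative value.
 */
#define url_isspace(c) (url_ctype_table[(unsigned char)(c)] & URL_CT_SPACE)
#define url_isalpha(c) (url_ctype_table[(unsigned char)(c)] & URL_CT_ALPHA)
#define url_isdigit(c) (url_ctype_table[(unsigned char)(c)] & URL_CT_DIGIT)
```

url_ctype_init() would have to be called once at startup before any of the macros are used. This does not address the problem of apr_is* calls made from inside other APR functions; it only covers call sites that are changed to use the new macros.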