apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread William A Rowe Jr
On Wed, Nov 25, 2015 at 10:17 AM, Jim Jagielski wrote:
> What is the current status? Is this on hold?
It is looking for a good name. I'm happy with apr_token_strcasecmp to best indicate its use-case and provenance. Does that work for everyone?
It is looking for clearer docs. Spent 20 hours

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread Mikhail T.
On 25.11.2015 12:42, William A Rowe Jr wrote:
> If the script switches setlocale to turkish, for example, our
> forced-lowercase content-type conversion
> will cause "IMAGE/GIF" to become "ımage/gıf", clearly not what the
> specs intended.
I'm sorry, could you elaborate on this? Would not strtolow

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread William A Rowe Jr
On Nov 25, 2015 12:00, "Mikhail T." wrote:
>
> On 25.11.2015 12:42, William A Rowe Jr wrote:
>>
>> If the script switches setlocale to turkish, for example, our forced-lowercase content-type conversion
>> will cause "IMAGE/GIF" to become "ımage/gıf", clearly not what the specs intended.
>
> I'm so

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread Mikhail T.
On 25.11.2015 13:16, William A Rowe Jr wrote:
>
> Two variables, LC_CTYPE and LC_COLLATE control this text processing
> behavior. The above is the correct lower case transliteration for
> Turkish. In German, the upper case correspondence of sharp-S ß is
> 'SS', but multi-char translation is not p

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread Jim Jagielski
> On Nov 25, 2015, at 12:42 PM, William A Rowe Jr wrote:
>
> On Wed, Nov 25, 2015 at 10:17 AM, Jim Jagielski wrote:
> What is the current status? Is this on hold?
>
> It is looking for a good name. I'm happy with apr_token_strcasecmp
> to best indicate its use-case and provenance. Does that

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread William A Rowe Jr
On Wed, Nov 25, 2015 at 1:12 PM, Jim Jagielski wrote:
>
> > On Nov 25, 2015, at 12:42 PM, William A Rowe Jr wrote:
> >
> > On Wed, Nov 25, 2015 at 10:17 AM, Jim Jagielski wrote:
> > What is the current status? Is this on hold?
> >
> > It is looking for a good name. I'm happy with apr_token_s

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread Jim Jagielski
My point is that we use it to compare, for example, "FoobARski!" with "foOBArsKi!", not "Ébana?" with "ébana?" or "ebana?"
In that way I mean "ascii"
Heck, we may as well say that we really aren't comparing "strings" at all, just arrays of 8bit characters :)
Anyway, that was my final post about
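
The comparison described in this message (matching "FoobARski!" against "foOBArsKi!" while leaving bytes such as "É" and "é" distinct) can be sketched as a plain byte loop that never consults the C locale. This is only an illustration, not the code proposed in the thread: the names ascii_fold() and ascii_strcasecmp() are hypothetical, and the sketch assumes an ASCII execution character set, so it would not be correct as written on an EBCDIC platform.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fold an ASCII upper-case letter to lower case.
 * Every byte outside 'A'..'Z' is returned unchanged, so no locale
 * setting can influence the result. Assumes ASCII, where 'A'..'Z'
 * is one contiguous range. */
static unsigned char ascii_fold(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

/* Hypothetical name, not an APR function. Returns 0 when the two
 * NUL-terminated strings match ignoring ASCII case, otherwise the
 * difference between the first pair of folded bytes that differ. */
int ascii_strcasecmp(const char *s1, const char *s2)
{
    const unsigned char *a = (const unsigned char *)s1;
    const unsigned char *b = (const unsigned char *)s2;

    assert(s1 != NULL && s2 != NULL);
    for (;; a++, b++) {
        int diff = (int)ascii_fold(*a) - (int)ascii_fold(*b);
        /* Stop at the first mismatch or at the shared terminator. */
        if (diff != 0 || *a == '\0')
            return diff;
    }
}
```

With this sketch, two strings that differ only in the case of A to Z compare equal, while any two strings that differ in a byte above 0x7F compare unequal, whatever encoding those bytes belong to.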

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread Jacob Champion
My two cents: I agree that another "name mangled" abbreviation is not particularly helpful, but I also agree with Jim's concern: "apr_token" made me immediately wonder what made this exclusive to HTTP tokens. Unfortunately I don't have much of an alternative suggestion. I have seen other frameworks

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread William A Rowe Jr
On Wed, Nov 25, 2015 at 1:50 PM, Jim Jagielski wrote:
> My point is that we use it to compare, for example,
> "FoobARski!" with "foOBArsKi!", not "Ébana?" with "ébana?" or "ebana?"
>
> In that way I mean "ascii"
But that isn't precisely what you wrote. It happens to be ASCII here because we a

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread William A Rowe Jr
On Wed, Nov 25, 2015 at 2:06 PM, Jacob Champion wrote:
> My two cents: I agree that another "name mangled" abbreviation is not
> particularly helpful, but I also agree with Jim's concern: "apr_token" made
> me immediately wonder what made this exclusive to HTTP tokens.
> Unfortunately I don't hav

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread Jim Jagielski
In general, strcmp() is not implemented via strcmp.c (although if you do a source code search for strcmp, that's what you'll get). Most of the time it's implemented in assembly (strcmp.s) or simply leverages memcmp() where you aren't doing a byte by byte comparison but are doing a native memory wor

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread Jim Jagielski
In a library that has:
    apr_pstrdup()
    apr_pstrndup()
    apr_pstrmemdup()
and apr_pstrmemdup() and apr_pstrndup() are functionally the same, as well as:
    apr_strnatcasecmp()
    apr_strnatcmp()
neither of which use an 'n' variable to determine string size, yet is c

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread Jacob Champion
On Nov 25, 2015 1:10 PM, "Jim Jagielski" wrote:
> ... I think we are WAY overthinking naming here.
I overthink naming constantly, so there's an excellent chance that you're absolutely correct! That said... your list only ended up convincing me that APR needs better naming conventions. ;-D (I do

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread William A Rowe Jr
On Wed, Nov 25, 2015 at 3:10 PM, Jim Jagielski wrote:
> In a library that has:
>
>     apr_pstrdup()
>     apr_pstrndup()
>     apr_pstrmemdup()
>
which are all semantically and mechanically different...
> and apr_pstrmemdup() and apr_pstrndup() are functionally
> the same,
Are y

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread Christophe JAILLET
Hi,
just in case, GNOME has a set of functions g_ascii_... (see https://developer.gnome.org/glib/2.28/glib-String-Utility-Functions.html#g-ascii-strcasecmp)
I'm also waiting for feedback about the naming convention, I'd like to get this into APR yesterday and start building on it, but it's

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread William A Rowe Jr
On Wed, Nov 25, 2015 at 3:52 PM, Christophe JAILLET <christophe.jail...@wanadoo.fr> wrote:
> Hi,
>
> just in case, GNOME has a set of functions g_ascii_...
> (see https://developer.gnome.org/glib/2.28/glib-String-Utility-Functions.html#g-ascii-strcasecmp)
Interesting, does anyone know of

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread Christophe JAILLET
On 25/11/2015 at 22:02, Jim Jagielski wrote:
In general, strcmp() is not implemented via strcmp.c (although if you do a source code search for strcmp, that's what you'll get). Most of the time it's implemented in assembly (strcmp.s) or simply leverages memcmp() where you aren't doing a byte by by

RE: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread Bert Huijben
Sent: Wednesday, 25 November 2015 22:55
To: httpd
Subject: Re: apr_token_* conclusions (was: Better casecmpstr[n]?)
On Wed, Nov 25, 2015 at 3:52 PM, Christophe JAILLET <mailto:christophe.jail...@wanadoo.fr> wrote:
Hi,
just in case, GNOME has a set of functions g_ascii_... (see

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread Mikhail T.
On 25.11.2015 14:10, Mikhail T. wrote:
>>
>> Two variables, LC_CTYPE and LC_COLLATE control this text processing
>> behavior. The above is the correct lower case transliteration for
>> Turkish. In German, the upper case correspondence of sharp-S ß is
>> 'SS', but multi-char translation is not pro

RE: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread Bert Huijben
alphabet of ASCII characters.
Bert
From: Mikhail T. [mailto:mi+t...@aldan.algebra.com]
Sent: Wednesday, 25 November 2015 23:19
To: dev@httpd.apache.org
Subject: Re: apr_token_* conclusions (was: Better casecmpstr[n]?)
On 25.11.2015 14:10, Mikhail T. wrote:
Two variables

RE: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread Bert Huijben
...@qqmail.nl]
Sent: Thursday, 26 November 2015 00:22
To: dev@httpd.apache.org
Subject: RE: apr_token_* conclusions (was: Better casecmpstr[n]?)
The example was the other way around. Changing SS to ß is not a valid transform, but the other way is. There are also transforms on the combined AE characters

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread Mikhail T.
On 25.11.2015 18:21, Bert Huijben wrote:
> That Turkish ‘I’ problem is the only case I know of where the
> collation actually changes behavior within the usual western alphabet
> of ASCII characters.
Argh, yes, I see now, what the problem would be... Thank you,
-mi
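
The Turkish 'I' problem discussed in this exchange comes from lower-casing through the locale-aware C library. A minimal sketch of the alternative, assuming an ASCII execution character set, is to lower-case only the bytes 'A' to 'Z' by hand, so that "IMAGE/GIF" becomes "image/gif" whatever setlocale() call a script or module has made. The function name ascii_strtolower() is hypothetical and is not taken from APR or httpd.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: lower-case a NUL-terminated string in place,
 * touching only the ASCII letters 'A'..'Z'. It never calls tolower(),
 * so LC_CTYPE has no effect on the result. Assumes ASCII. */
void ascii_strtolower(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (*s >= 'A' && *s <= 'Z')
            *s = (char)(*s + ('a' - 'A'));
    }
}
```

Bytes above 0x7F are left untouched, so the sketch neither produces nor interprets characters such as the dotless ı; whether that is the right behavior for a given header or token is a separate protocol question.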

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread William A Rowe Jr
On Nov 25, 2015 4:19 PM, "Mikhail T." wrote:
>
>> So, the concern is, some hypothetical header, such as X-ASSIGN-TO may, after going through the locale-aware strtolower() unexpectedly become x-aßign-to?
>
> I just tested the above on both FreeBSD and Linux, and the results are encouraging:
>>

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread William A Rowe Jr
On Wed, Nov 25, 2015 at 6:45 PM, William A Rowe Jr wrote:
> On Nov 25, 2015 4:19 PM, "Mikhail T." wrote:
> >
> > Thus, I contend, using C-library will not cause invalid results, and the
> > only reason to have Apache's own implementation is performance, but not
> > correctness.
>
> Well almost but w

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-25 Thread William A Rowe Jr
On Wed, Nov 25, 2015 at 9:44 PM, William A Rowe Jr wrote:
> LANG="ku_TR.iso88599";
> 64 = @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
>  ^   @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`ABCDEFGHİJKLMNOPQRSTUVWXYZ{|}~
>  v   @abcdefghıjklmnopqrstuvwxyz[\]^_`abcdefghijklmnopqrstuvw
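
A table like the one quoted above (character codes from 64 upward, with one row for the upper-case mapping and one for the lower-case mapping) can be produced by running toupper() and tolower() over the range under the active locale. The following is a sketch of one way to do that, not the program actually used in the thread; probe_case_rows() and the ROW_* constants are hypothetical names, and the range 64..126 is an assumption read off the quoted rows.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define ROW_FIRST 64   /* '@' in ASCII (assumed start of the quoted rows) */
#define ROW_LAST  126  /* '~' in ASCII (assumed end of the quoted rows) */
#define ROW_LEN   (ROW_LAST - ROW_FIRST + 1)

/* Hypothetical helper: fill upper[] and lower[] (each at least
 * ROW_LEN + 1 bytes) with toupper()/tolower() of codes 64..126,
 * using whatever LC_CTYPE locale is active when it is called. */
void probe_case_rows(char *upper, char *lower)
{
    int c;

    assert(upper != NULL && lower != NULL);
    for (c = ROW_FIRST; c <= ROW_LAST; c++) {
        upper[c - ROW_FIRST] = (char)toupper(c);
        lower[c - ROW_FIRST] = (char)tolower(c);
    }
    upper[ROW_LEN] = '\0';
    lower[ROW_LEN] = '\0';
}
```

To probe a specific locale, a caller would first invoke setlocale(LC_CTYPE, ...) from <locale.h> with a locale name installed on the system; in the default "C" locale the rows contain only the plain ASCII mappings.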

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-26 Thread Jim Jagielski
ascii? ascii? ascii? :-)
> On Nov 25, 2015, at 4:52 PM, Christophe JAILLET wrote:
>
> Hi,
>
> just in case, GNOME has a set of functions g_ascii_...
> (see https://developer.gnome.org/glib/2.28/glib-String-Utility-Functions.html#g-ascii-strcasecmp)
>
>> I'm also waiting for fe

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-26 Thread Jim Jagielski
already had these in the time we still had ebcdic support on trunk.
> (We removed that support years ago, but the code should still live on a
> branch)
>
> Bert
>
> From: William A Rowe Jr [mailto:wr...@rowe-clan.net]
> Sent: Wednesday, 25 November 2015 22:55
> To: ht

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-26 Thread William A Rowe Jr
uld still live on a
> branch)
> >
> > Bert
> >
> > From: William A Rowe Jr [mailto:wr...@rowe-clan.net]
> > Sent: Wednesday, 25 November 2015 22:55
> > To: httpd
> > Subject: Re: apr_token_* conclusions (was: Better casecmpstr[n]?)
> >
> > On Wed, No

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-30 Thread Yann Ylavic
Sorry for the late reply, I was AFK for a while... Regarding the name, I'm fine with ap[r]_cstr[n]casecmp(), ap[r]_casecmpcstr[n]() or ap[r]_cstr_*() (if we need a set of functions in this area). I think we all agree that the new function(s) would help protocol "validation" being agnostic wrt the locale,

Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

2015-11-30 Thread William A Rowe Jr
I've hijacked Yann's thoughts and replied on a dev@apr thread. There is merit in httpd's deliberations but the issue is sufficiently larger than 'just us ourselves'.