on.
>
> Just what I was thinking. Attached is the same program with one more pair
> of functions added (and an easy way to add more "candidates" to the
> main-driver). I changed the FOR-loop define to obtain repeatable results:
>
> # Test 1 -- equal strings:
> foreach m
On Tue, Nov 24, 2015 at 10:07 AM, Mikhail T. wrote:
> On 24.11.2015 10:08, William A Rowe Jr wrote:
>
> As long as this function is promoted for fast ASCII-specific token
> recognition and has no other unexpected equalities, it serves a useful
> purpose.
>
> Because of this, I'd suggest renaming
On 24.11.2015 10:08, William A Rowe Jr wrote:
> As long as this function is promoted for fast ASCII-specific token
> recognition and has no other unexpected equalities, it serves a useful
> purpose.
Because of this, I'd suggest renaming it to something that emphasizes
it being ASCII-only.
-mi
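
To illustrate why the name matters, here is a minimal sketch of an ASCII-only comparison. The names ascii_fold and ascii_only_strcasecmp are hypothetical and are not taken from the thread or from any library; the point is only that bytes outside 'A'..'Z' are never folded, so characters that a locale might treat as a case pair still compare as different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fold only ASCII 'A'..'Z' to lowercase.
 * Every other byte value, including bytes >= 0x80, is returned unchanged. */
static unsigned char ascii_fold(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

/* Hypothetical name, chosen to make the ASCII-only behavior explicit. */
int ascii_only_strcasecmp(const char *s1, const char *s2)
{
    const unsigned char *p1 = (const unsigned char *)s1;
    const unsigned char *p2 = (const unsigned char *)s2;

    assert(s1 != NULL && s2 != NULL);

    while (ascii_fold(*p1) == ascii_fold(*p2)) {
        if (*p1 == '\0') {
            return 0;           /* both strings ended together */
        }
        ++p1;
        ++p2;
    }
    /* First differing position decides the ordering. */
    return (int)ascii_fold(*p1) - (int)ascii_fold(*p2);
}
```

With this sketch, "Keep-Alive" and "keep-alive" compare equal, while the single bytes 0xC9 and 0xE9 (an upper/lowercase pair in Latin-1) do not.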
On Tue, Nov 24, 2015 at 6:40 AM, Jim Jagielski wrote:
> It really depends on the OS and the version of the OS. In
> my test cases on OSX, CentOS5 and CentOS6, I see
> measurable improvements.
>
Part of the reason for your differences... on this console here I have:
$ set | grep -E "^L[AC]"
LA
Jagielski wrote:
>>>> We use str[n]casecmp quite a bit. The rub is that some
>>>> platforms use a sensible implementation (such as OSX) which
>>>> uses an upper-lowercase map and is v. fast, and others
>>>> use a brain-dead version which does an actual
On Tue, Nov 24, 2015 at 6:10 AM, Mikhail T. wrote:
>
> Attached is the same program with one more pair of
> functions added (and an easy way to add more "candidates" to the
> main-driver). I changed the FOR-loop define to obtain repeatable results:
This test program kills str[n]casecmp()'s inlini
On Tue, Nov 24, 2015 at 4:12 AM, Mikhail T. wrote:
> On 23.11.2015 19:43, Yann Ylavic wrote:
>
>> That's expected (or at least not cared about in this test case). We simply
>> want res to not be optimized out, so print it before leaving, without any
>> particular relevance for its value (string.h a
R-loop define to obtain repeatable results:
# Test 1 -- equal strings:
foreach m ( 0 1 2 )
foreach? ./strncasecmp $m 1 a A 7
foreach? end
string.h (nb=1, len=7)
time = 6.975845 : res = 0
optimized (nb=1, len=7)
time = 1.492
On Nov 23, 2015 21:12, "Mikhail T." wrote:
>
> On 23.11.2015 19:43, Yann Ylavic wrote:
>>
>> No measured difference in my tests, I guess it depends on likelihood to
>> fail/succeed early in the string or not.
>
> ? I don't see where it wins anything -- but I do see where it loses a
> little...
>
>> T
On 23.11.2015 19:43, Yann Ylavic wrote:
> No measured difference in my tests, I guess it depends on likelihood to
> fail/succeed early in the string or not.
? I don't see where it wins anything -- but I do see where it loses a
little...
> That's expected (or at least not cared about in this test
On Mon, Nov 23, 2015 at 11:42 PM, Yann Ylavic wrote:
> except -Os I always have better
> results with the "optimized" version
To reach better performances with -Os, we could possibly use:
int ap_strcasecmp(const char *s1, const char *s2)
{
const unsigned char *ps1 = (const unsigned char *) s
== '\0') {
break;
}
}
return (0);
}
#define PROG argv[0]
#define METHOD argv[1]
#define NB argv[2]
#define S1 argv[3]
#define S2 argv[4]
#define LEN argv[5]
/* The ++ are here to try to prevent some optimization done by gcc */
#define FOR for (i=0; i S1 S2
On Tue, Nov 24, 2015 at 1:24 AM, Mikhail T. wrote:
>
> Is there really a gain in inc- and decrementing this way? Would it not be
> easier to read with the explicit increments -- and, incidentally, no
> decrements at all?
No measured difference in my tests, I guess it depends on likelihood
to fail
esult (for either of the
methods) can depend on the number of iterations (!!):
./strncasecmp 1 27 aCaa Ac 2
Optimized (nb=27, len=2)
time = 0.01 : res = 32
./strncasecmp 1 26 aCaa Ac 2
Optimized (nb=26, len=2)
time = 0.01 : res = 0
./strncasecmp 0 27 aCaa Ac 2
On Tue, Nov 24, 2015 at 1:07 AM, Mikhail T. wrote:
>
> BTW, if the program measures its own time, should it not use getrusage()
> instead of gettimeofday()?
Well, it measures the time spent in the relevant code, with a
monotonic clock, that should be fair enough.
We don't care about the whole pro
On 23.11.2015 19:05, Yann Ylavic wrote:
> Here is the correct (new) test, along with the diff wrt the original
> (Christophe's) test.c.
BTW, if the program measures its own time, should it not use getrusage()
instead of gettimeofday()?
-mi
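
As a rough illustration of the two options being compared here: gettimeofday() reports wall-clock time, while getrusage() reports the CPU time charged to the process. The sketch below wraps each in a small helper. The function names (wall_seconds, cpu_seconds, spin) are hypothetical and are not part of the test program discussed in the thread.

```c
#include <stddef.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Wall-clock seconds since the epoch, via gettimeofday(). */
double wall_seconds(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* User + system CPU seconds consumed so far by this process,
 * via getrusage(RUSAGE_SELF). */
double cpu_seconds(void)
{
    struct rusage ru;

    getrusage(RUSAGE_SELF, &ru);
    return (double)(ru.ru_utime.tv_sec + ru.ru_stime.tv_sec)
         + (double)(ru.ru_utime.tv_usec + ru.ru_stime.tv_usec) / 1e6;
}

/* Hypothetical stand-in for the code being measured. The volatile
 * accumulator keeps the compiler from removing the loop. */
unsigned long spin(unsigned long iterations)
{
    volatile unsigned long acc = 0;
    unsigned long i;

    for (i = 0; i < iterations; ++i) {
        acc += 1;
    }
    return acc;
}
```

Sampling either helper before and after a call such as spin(nb) and subtracting gives the elapsed time by that measure. The two can differ noticeably when the machine is busy, since only the wall-clock figure includes time spent waiting to be scheduled.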
if (*METHOD == '0') {
printf(" (nb=%d, len=%d)\n", nb, len);
+gettimeofday(&tvs, NULL);
if (len == 0) {
FOR {
/* really use the result of the function */
@@ -107,9 +112,11 @@ int main(int argc, char *argv[])
On 23.11.2015 17:43, Yann Ylavic wrote:
> with attachment...
There is a mistake somewhere in the optimized version:
./o 1 1 aa1a 0
Optimized (nb=1, len=0)
time = 0.611311 : res = 0
The result should not be zero. Indeed, the string.h version is correct:
./o 0
Please note that the changes in ap_str[n]casecmp(), i.e.:
++ps1;
++ps2;
were a first try/change which (obviously) did nothing.
You may ignore it.
On Mon, Nov 23, 2015 at 11:43 PM, Yann Ylavic wrote:
> with attachment...
>
> On Mon, Nov 23, 2015 at 11:42 PM, Yann Ylavic wrote:
>> I
ar *) s2;
while (ucharmap[*ps1] == ucharmap[*ps2]) {
if (*ps1 == '\0') {
return (0);
}
++ps1;
++ps2;
}
return (ucharmap[*ps1] - ucharmap[*--ps2]);
}
int ap_strncasecmp(const char *s1, const char *s2, int n)
{
const unsigned char *p
I modified your test program a bit (to measure time from it, see
attached), tried with -O{2,3,s}, and except -Os I always have better
results with the "optimized" version, eg:
$ ./a-O3.out 0 15000 xcxcxcxcxcxcxcxcxcxcwwaa
xcxcxcxcxcxcxcxcxcxcwwaa 0
(nb=1500
armap[*--ps2]);
}
if (*ps1++ == '\0') {
break;
}
}
return (0);
}
#define PROG argv[0]
#define METHOD argv[1]
#define NB argv[2]
#define S1 argv[3]
#define S2 argv[4]
#define LEN argv[5]
/* The ++ are here to try to prevent some optim
Hi Christophe,
On Mon, Nov 23, 2015 at 9:12 PM, Christophe JAILLET wrote:
>
> I tried to do some but the benefit of the optimized version is not that
> clear, at least on my system:
>gcc 5.2.1
>Linux linux 4.2.0-18-generic #22-Ubuntu SMP Fri Nov 6 18:25:50 UTC 2015
> x86_64 x86_64 x86_64
-lowercase map and is v. fast, and others
use a brain-dead version which does an actual tolower() of
each char in the string as it tests. We actually try to
handle this in many cases by doing a switch/case test on the
1st char to fast path the strncasecmp, resulting in ugly code.
This is crazy.
I
uses an upper-lowercase map and is v. fast, and others
>> use a brain-dead version which does an actual tolower() of
>> each char in the string as it tests. We actually try to
>> handle this in many cases by doing a switch/case test on the
>> 1st char to fast path the strncase
On 20/11/2015 18:17, Jim Jagielski wrote:
> Ideally, it would be in apr
+1
This could also be even more interesting, because of apr_table_ functions.
CJ
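
For readers skimming the thread, the kind of table-driven comparison being proposed can be sketched roughly as follows. This is not the code posted in the thread: the names fold_map, fold_map_init and sketch_strncasecmp are hypothetical, the table is built lazily (which is not thread-safe), and an ASCII execution character set is assumed.

```c
#include <stddef.h>

/* Hypothetical 256-entry fold table: identity for every byte except
 * ASCII 'A'..'Z', which map to 'a'..'z'. */
static unsigned char fold_map[256];
static int fold_map_ready = 0;

static void fold_map_init(void)
{
    int i;

    for (i = 0; i < 256; ++i) {
        fold_map[i] = (unsigned char)i;
    }
    /* Assumes 'A'..'Z' are contiguous, as they are in ASCII. */
    for (i = 'A'; i <= 'Z'; ++i) {
        fold_map[i] = (unsigned char)(i + ('a' - 'A'));
    }
    fold_map_ready = 1;
}

/* Compare at most n bytes, ignoring ASCII case only. */
int sketch_strncasecmp(const char *s1, const char *s2, size_t n)
{
    const unsigned char *p1 = (const unsigned char *)s1;
    const unsigned char *p2 = (const unsigned char *)s2;

    if (!fold_map_ready) {
        fold_map_init();        /* lazy init; NOT thread-safe in this sketch */
    }

    while (n-- > 0) {
        if (fold_map[*p1] != fold_map[*p2]) {
            return (int)fold_map[*p1] - (int)fold_map[*p2];
        }
        if (*p1 == '\0') {
            break;              /* both strings ended before n bytes */
        }
        ++p1;
        ++p2;
    }
    return 0;
}
```

Each byte costs one table lookup rather than a tolower() call, which is the property the thread attributes to the faster platform implementations. A real implementation would more likely use a constant, precomputed table instead of the lazy initialization shown here.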
e this in many cases by doing a switch/case test on the
>> 1st char to fast path the strncasecmp, resulting in ugly code.
>>
>> This is crazy.
>>
>> I propose an ap_strncasecmp/ap_strcasecmp which we should use.
>> Ideally, it would be in apr but no need to wait f
Pay special attention to:
The *strncasecmp*() function shall compare, *while ignoring differences in
case*, not more than *n* bytes from the string pointed to by *s1* to the
string pointed to by *s2*.
In the POSIX locale, *strcasecmp*() and *strncasecmp*() shall *behave as if
the strings had
s v. fast, and others
> use a brain-dead version which does an actual tolower() of
> each char in the string as it tests. We actually try to
> handle this in many cases by doing a switch/case test on the
> 1st char to fast path the strncasecmp, resulting in ugly code.
>
> This is cr
tolower() of
> each char in the string as it tests. We actually try to
> handle this in many cases by doing a switch/case test on the
> 1st char to fast path the strncasecmp, resulting in ugly code.
>
> This is crazy.
>
> I propose an ap_strncasecmp/ap_strcasecmp which we should u
string as it tests. We actually try to
handle this in many cases by doing a switch/case test on the
1st char to fast path the strncasecmp, resulting in ugly code.
This is crazy.
I propose an ap_strncasecmp/ap_strcasecmp which we should use.
Ideally, it would be in apr but no need to wait for that
to
many cases by doing a switch/case test on the
1st char to fast path the strncasecmp, resulting in ugly code.
This is crazy.
I propose an ap_strncasecmp/ap_strcasecmp which we should use.
Ideally, it would be in apr but no need to wait for that
to happen :)
Unless people have heartburn about this