[EMAIL PROTECTED] writes:
> Here is the cache hit case including your strlen+memcpy as 'LENCPY':

> $ gcc -O3 -std=c99 -DSTRING='"This is a very long sentence that is expected to be very slow."' -DN=1 -o x x.c y.c strlcpy.c ; ./x
> NONE: 696157 us
> MEMCPY: 825118 us
> STRNCPY: 7983159 us
> STRLCPY: 10787462 us
> LENCPY: 6048339 us

It appears that these results are a bit platform-dependent; on my x86_64
(Xeon) Fedora 5 box, I get

$ gcc -O3 -std=c99 -DSTRING='"This is a very long sentence that is expected to be very slow."' -DN=1 x.c y.c strlcpy.c
$ ./a.out
NONE: 358679 us
MEMCPY: 619255 us
STRNCPY: 8932551 us
STRLCPY: 9212371 us
LENCPY: 13910413 us

I'm not sure why the lencpy method sucks so badly on this machine :-(.

Anyway, I looked at glibc's strncpy and determined that on this machine
the only real optimization that's been done to it is to unroll the data
copying loop four times.  I did the same to strlcpy (attached) and got
numbers like these:

$ gcc -O3 -std=c99 -DSTRING='"This is a very long sentence that is expected to be very slow."' -DN=1 x.c y.c strlcpy.c
$ ./a.out
NONE: 359317 us
MEMCPY: 619636 us
STRNCPY: 8933507 us
STRLCPY: 7644576 us
LENCPY: 13917927 us

$ gcc -O3 -std=c99 -DSTRING='"This is a very long sentence that is expected to be very slow."' -DN="(1024*1024)" x.c y.c strlcpy.c
$ ./a.out
NONE: 502960 us
MEMCPY: 5382528 us
STRNCPY: 9733890 us
STRLCPY: 8740892 us
LENCPY: 15358616 us

$ gcc -O3 -std=c99 -DSTRING='"short"' -DN=1 x.c y.c strlcpy.c
$ ./a.out
NONE: 358426 us
MEMCPY: 618533 us
STRNCPY: 6704926 us
STRLCPY: 867336 us
LENCPY: 10115883 us

$ gcc -O3 -std=c99 -DSTRING='"short"' -DN="(1024*1024)" x.c y.c strlcpy.c
$ ./a.out
NONE: 502746 us
MEMCPY: 5365171 us
STRNCPY: 7983610 us
STRLCPY: 5557277 us
LENCPY: 11533066 us

So the unroll seems to get us to the point of not losing compared to the
original strncpy code for any string length, and so I propose doing
that, if it holds up on other architectures.

			regards, tom lane


size_t
strlcpy(char *dst, const char *src, size_t siz)
{
    char *d = dst;
    const char *s = src;
    size_t n = siz;

    /* Copy as many bytes as will fit */
    if (n != 0)
    {
        while (n > 4)
        {
            if ((*d++ = *s++) == '\0')
                goto done;
            if ((*d++ = *s++) == '\0')
                goto done;
            if ((*d++ = *s++) == '\0')
                goto done;
            if ((*d++ = *s++) == '\0')
                goto done;
            n -= 4;
        }
        while (--n != 0)
        {
            if ((*d++ = *s++) == '\0')
                goto done;
        }
    }

    /* Not enough room in dst, add NUL and traverse rest of src */
    if (siz != 0)
        *d = '\0';      /* NUL-terminate dst */
    while (*s++)
        ;

done:
    return(s - src - 1);    /* count does not include NUL */
}
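
For comparison, the 'LENCPY' rows above refer to the strlen+memcpy approach. The benchmark sources (x.c, y.c) are not part of this message, so what follows is only a minimal sketch of what such a function might look like, assuming it is meant to have the same semantics as strlcpy. The name lencpy_sketch is made up for illustration and is not the code that was timed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical strlen+memcpy copy with strlcpy-like semantics.
 * Name and exact shape are assumptions, not the benchmarked code.
 */
size_t
lencpy_sketch(char *dst, const char *src, size_t siz)
{
    size_t len;

    assert(src != NULL && (dst != NULL || siz == 0));

    len = strlen(src);          /* first pass: measure src */

    if (siz != 0)
    {
        /* copy at most siz - 1 bytes so the NUL always fits */
        size_t ncopy = (len < siz) ? len : siz - 1;

        memcpy(dst, src, ncopy);    /* second pass: bulk copy */
        dst[ncopy] = '\0';
    }

    return len;     /* length of src, not the number of bytes copied */
}
```

As with strlcpy, a caller can detect truncation by checking whether the return value is >= siz. Note that this version always walks src twice, whereas the unrolled loop above touches each source byte only once when the string fits; that may or may not be related to the timings shown.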