Re: ascii mode speed issues, --with-profiling

Glenn Maynard Sat, 27 Oct 2001 11:55:52 -0700

On Sat, Oct 27, 2001 at 02:56:07PM +0400, Alexander V. Lukyanov wrote:
> I've rewrote crlf_to_lf so it is more clear and easier to understand.
> 
> int crlf_to_lf(char *buf, int s)
> {
>    char *store=mem_crlf(buf,s);
>    if(!store)
>       return s;
> 
>    int retsize=s-1;
>    s-=store+1-buf;
>    buf=store+1;
> 
>    while(s>1)
>    {
>       char *crlf=mem_crlf(buf,s);
>       if(!crlf)
>          break;
> 
>       memmove(store, buf, crlf-buf);
>       store+=crlf-buf;
>       retsize--;
>       s-=crlf+1-buf;
>       buf=crlf+1;
>    }
>    memmove(store,buf,s);
>    return retsize;
> }


Each of us finds his own version easier to understand, but only because
he wrote it.  I can't quickly make sense of that one, either. :)

Originally, I used memmove/memchr for speed--they're well-optimized.  In
retrospect, that approach also iterates over the input buffer twice
(though in a cache-friendly way).

Here's one that anyone can understand:

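int crlf_to_lf(char *buf, int s)
{
        char *store = buf;      /* write position; never passes buf */
        int retsiz = s;         /* output size; one less per CRLF */
        while(s) {
                /* s-- runs on every pass, so s always counts the bytes
                 * left after the current one; buf[1] is only read when
                 * at least two bytes remained. */
                if(s-- > 1 && buf[0] == '\r' && buf[1] == '\n') {
                        /* drop the CR; the LF is copied next pass */
                        buf++;
                        retsiz--;
                        continue;
                }
                *store++ = *buf++;
        }
        return retsiz;
}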
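
For anyone who wants to check the edge cases (a lone CR, a CR at the very
end of the buffer, CR followed by CRLF), here is a rough sketch.  The loop
is repeated so the fragment compiles on its own; converts_to() is a
hypothetical helper made up for this illustration, not anything in the
tree.

```c
#include <assert.h>
#include <string.h>

/* Same byte-at-a-time loop as above, repeated so this compiles alone. */
static int crlf_to_lf(char *buf, int s)
{
        char *store = buf;
        int retsiz = s;
        while(s) {
                if(s-- > 1 && buf[0] == '\r' && buf[1] == '\n') {
                        buf++;
                        retsiz--;
                        continue;
                }
                *store++ = *buf++;
        }
        return retsiz;
}

/* Hypothetical helper (illustration only): run crlf_to_lf on a copy of
 * `in` and return 1 if the result equals `want`, 0 otherwise. */
static int converts_to(const char *in, int in_len,
                       const char *want, int want_len)
{
        char tmp[64];
        int n;

        assert(in_len >= 0 && in_len <= (int)sizeof tmp);
        memcpy(tmp, in, (size_t)in_len);
        n = crlf_to_lf(tmp, in_len);
        return n == want_len && memcmp(tmp, want, (size_t)want_len) == 0;
}
```

Only a CR immediately followed by LF is dropped; a bare CR, including one
that is the last byte of the buffer, is copied through unchanged.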

This one is about twice as fast when the buffer is a lot of short
lines--which is the typical case of NLIST.  The memchr/memmove approach is
about 30% faster for longer strings (i.e. 80 characters/line, which would
be the case if it was used for LIST); apparently the word-reading
optimizations used by memchr and memmove offset the penalty of iterating
twice.

At 64 bytes/line for 10 megs of data, on my system, it's still 300ms.
The speed difference doesn't really matter.
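
A minimal sketch of how a timing like this could be reproduced.  This is
not the harness the numbers above came from; make_lines() and
time_one_pass() are hypothetical names invented for the example, and
clock() only measures CPU time at whatever resolution the platform offers.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* The simple byte loop, repeated so this fragment compiles on its own. */
static int crlf_to_lf(char *buf, int s)
{
        char *store = buf;
        int retsiz = s;
        while(s) {
                if(s-- > 1 && buf[0] == '\r' && buf[1] == '\n') {
                        buf++;
                        retsiz--;
                        continue;
                }
                *store++ = *buf++;
        }
        return retsiz;
}

/* Hypothetical helper: build `nlines` lines of `line_len` bytes each,
 * where the last two bytes of every line are "\r\n".  Caller frees. */
static char *make_lines(int line_len, int nlines)
{
        size_t total;
        char *buf;
        int i;

        if(line_len < 2 || nlines < 0)
                return NULL;
        total = (size_t)line_len * (size_t)nlines;
        buf = malloc(total ? total : 1);
        if(!buf)
                return NULL;
        for(i = 0; i < nlines; i++) {
                char *line = buf + (size_t)i * (size_t)line_len;
                memset(line, 'x', (size_t)line_len - 2);
                line[line_len - 2] = '\r';
                line[line_len - 1] = '\n';
        }
        return buf;
}

/* Hypothetical helper: convert one generated buffer, store the elapsed
 * CPU time in *ms, and return the converted size (-1 on failure). */
static int time_one_pass(int line_len, int nlines, double *ms)
{
        char *buf = make_lines(line_len, nlines);
        clock_t start, end;
        int out;

        if(!buf)
                return -1;
        start = clock();
        out = crlf_to_lf(buf, line_len * nlines);
        end = clock();
        *ms = 1000.0 * (double)(end - start) / CLOCKS_PER_SEC;
        free(buf);
        return out;
}
```

Calling time_one_pass(64, 10 * 1024 * 1024 / 64, &ms) roughly corresponds
to the 64 bytes/line, 10 megs case; the figure will vary with the machine
and compiler flags, and a single pass is noisy, so averaging several runs
is advisable.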

-- 
Glenn Maynard
