On Sat, 02 Mar 2013 17:29:38 -0500, Andrei Alexandrescu <seewebsiteforem...@erdani.org> wrote:

On 3/2/13 2:32 PM, Steven Schveighoffer wrote:
This is not a personal attack, please don't take it that way.

Not at all. I'm just confused about this exchange. You mention some correct facts ("MySplitter finished in 6 seconds instead of 10"), then some mistaken suppositions ("a string-specific version could be written"), and derive conclusions from both ("Maybe there is a bad constraint somewhere"). Hard to deal with the mix.

I didn't realize that was the situation. I assumed there wasn't a version of splitter for strings that took array-specific shortcuts. My corrected statement should read "the existing string-specific version could be improved."

My conclusion of "Maybe there is a bad constraint somewhere" was not derived from that, it was based on your statement elsewhere that "I wrote a custom splitter specialized for arrays. It's as fast."

Given that my tests have shown I can quite easily write a faster splitter than the implementation Phobos selects, and that you said you wrote one that's "as fast," I took that to mean you had written one more optimized than the chosen version (and assumed you had included it in Phobos with the intent that it would be chosen for strings). That led me to suggest that, due to some bug, the "fast" implementation wasn't being chosen. I didn't realize that "as fast" didn't mean "as fast as yours"; I actually don't know what you meant by that now.
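
To illustrate what I mean by a constraint hiding a fast path, here is a made-up overload set (these are not the actual std.algorithm declarations, just a sketch of the mechanism):

import std.range.primitives : isInputRange;
import std.traits : isSomeChar;

// Generic implementation: accepts any input range, including strings.
auto mySplit(Range, Sep)(Range input, Sep sep)
    if (isInputRange!Range)
{
    return input; // stand-in for the generic brute-force search
}

// Array fast path: more specialized, so it normally wins for strings.
// If its constraint were accidentally too strict (say, it rejected
// immutable(char)), a call with a string would still compile; it would
// just quietly use the generic overload above instead.
auto mySplit(C1, C2)(C1[] input, C2[] sep)
    if (isSomeChar!C1 && isSomeChar!C2)
{
    return input; // stand-in for the array-specific shortcuts
}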

My anecdotal tests with hand writing a custom splitter range to handle the OP's program gave me a 40% savings. Whether it's find or not, I'm not sure, but there definitely is room for improvement.

I think I understand where the confusion comes from. If you're referring to MySplitter, that's not comparable. It uses this at the core:

for(; i + 1 < source.length; i++)
{
     if(source[i] == '\r' && source[i + 1] == '\n')
     {
         found = true;
         break;
     }
     ...
}

This is not close to the code that would work with arrays. We could specialize things for small arrays, but that hasn't been done yet. My point is that it's not comparable with the classic brute-force subsequence search.
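
A simplified sketch of that classic brute-force search over generic forward ranges (not the actual std.algorithm code) would look something like this:

import std.range.primitives : empty, front, popFront, save, isForwardRange;

// At each haystack position, walk a saved copy of the needle element by
// element until it either runs out (match) or mismatches.
R1 bruteForceFind(R1, R2)(R1 haystack, R2 needle)
    if (isForwardRange!R1 && isForwardRange!R2)
{
    while (!haystack.empty)
    {
        auto h = haystack.save;
        auto n = needle.save;
        while (!n.empty && !h.empty && h.front == n.front)
        {
            h.popFront();
            n.popFront();
        }
        if (n.empty)
            return haystack;   // needle matches starting at this position
        haystack.popFront();
    }
    return haystack;           // empty result means "not found"
}

For narrow strings, front and popFront here also decode UTF, which adds per-element overhead that the two-character test above never pays.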

Very good point; here is a new version that takes any string as a separator:

struct MySplitter
{
    private string s;          // current front; a null .ptr marks the range empty
    private string separator;  // the separator being searched for
    private string source;     // not-yet-consumed remainder of the input
    this(string src, string sep)
    {
        source = src;
        separator = sep;
        popFront();
    }

    @property string front()
    {
        return s;
    }

    @property bool empty()
    {
        return s.ptr is null;
    }

    void popFront()
    {
        if(!source.length)
        {
            // Nothing left to scan: yield whatever remains (an empty slice
            // produces one final empty element; a null source means done).
            s = source;
            source = null;
        }
        else
        {
            size_t i = 0;
            bool found = false;
            for(; i + separator.length <= source.length; i++)
            {
                if(source[i] == separator[0])
                {
                    // First byte matches; verify the rest of the separator.
                    found = true;
                    for(size_t j = i + 1, k = 1; k < separator.length; ++j, ++k)
                        if(source[j] != separator[k])
                        {
                            found = false;
                            break;
                        }
                    if(found)
                        break;
                }
            }
            if(found)
            {
                s = source[0..i];
                source = source[i + separator.length..$];
            }
            else
            {
                // No separator remains: the rest of the input is the last
                // element, and the range ends after it.
                s = source;
                source = null;
            }
        }
    }
}
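
For reference, a quick usage sketch with toy input (not the OP's actual data):

void main()
{
    import std.stdio : writeln;

    // Toy CRLF-delimited input standing in for the OP's file contents.
    foreach (line; MySplitter("alpha\r\nbeta\r\ngamma", "\r\n"))
        writeln(line); // prints alpha, beta, gamma on separate lines
}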

Takes 7 seconds on my machine instead of 6, but not 10 like std.algorithm.splitter. I don't even like the loop that well; it looks crude, and I can probably optimize it further.
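
If I do optimize it, one direction might be to let the C runtime scan for the separator's first byte; a sketch of such a helper (untested against the OP's data, and assuming a non-empty separator):

import core.stdc.string : memchr, memcmp;

// Returns the index where sep starts in haystack, or size_t.max if absent.
size_t findSeparator(string haystack, string sep)
{
    assert(sep.length > 0);
    size_t start = 0;
    // Only positions where the whole separator still fits are candidates.
    while (start + sep.length <= haystack.length)
    {
        auto hit = memchr(haystack.ptr + start, sep[0],
                          haystack.length - sep.length - start + 1);
        if (hit is null)
            return size_t.max; // the first byte never occurs again
        immutable i = cast(size_t)
            (cast(const(char)*) hit - cast(const(char)*) haystack.ptr);
        if (memcmp(haystack.ptr + i, sep.ptr, sep.length) == 0)
            return i;          // the whole separator matches here
        start = i + 1;         // false start, keep scanning
    }
    return size_t.max;
}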

And MySplitter does not use any of the tricks you mentioned.

Is this sufficiently comparable, or am I missing something else?

-Steve
