Re: Which option is faster...

monarch_dodra Mon, 05 Aug 2013 10:05:31 -0700

On Monday, 5 August 2013 at 15:18:42 UTC, John Colvin wrote:

better:

foreach (...)
{
    auto tmp = std.string.toLower(fext[0]);
    if(tmp == "doc" || tmp == "docx"
       || tmp == "xls" || tmp == "xlsx"
       || tmp == "ppt" || tmp == "pptx")
    {
        continue;
    }
}
but still not super-fast as (unless the compiler is very clever) it still means multiple passes over tmp. Also, it converts the whole string to lower case even when it's not necessary.
If you have large numbers of possible matches you will probably want to be clever with your data structures / algorithms. E.g.
You could create a tree-like structure to quickly eliminate possibilities as you read successive letters. You read one character, follow the appropriate branch, check if there are any further branches, if not then no match and break. Else, read the next character and follow the appropriate branch and so on... Infeasible for large (or even medium-sized) character-sets without hashing, but might be pretty fast for a-z and a large number of short strings.
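
A minimal sketch of the tree-like structure described above, assuming only the letters a-z matter. The names Node, insert, contains and buildOfficeTrie are hypothetical and not from the thread. It lower-cases one character at a time, so it stops at the first character that has no branch and never allocates a lower-cased copy of the input.

```d
import std.ascii : toLower;

/// One trie node for the letters a-z (hypothetical helper for illustration).
struct Node
{
    Node*[26] next;   // one child per letter, null when there is no branch
    bool terminal;    // true if a stored word ends at this node
}

/// Insert a lower-case a-z word into the trie rooted at `root`.
void insert(Node* root, string word)
{
    auto node = root;
    foreach (char c; word)
    {
        assert(c >= 'a' && c <= 'z', "only a-z is supported");
        auto idx = c - 'a';
        if (node.next[idx] is null)
            node.next[idx] = new Node;
        node = node.next[idx];
    }
    node.terminal = true;
}

/// Case-insensitive lookup: lower-cases one character at a time and
/// returns as soon as a character has no matching branch.
bool contains(const(Node)* root, const(char)[] s)
{
    auto node = root;
    foreach (char c; s)
    {
        immutable l = toLower(c);
        if (l < 'a' || l > 'z')
            return false;
        node = node.next[l - 'a'];
        if (node is null)
            return false;
    }
    return node.terminal;
}

/// Build the trie for the six extensions used in this thread.
Node* buildOfficeTrie()
{
    auto root = new Node;
    foreach (w; ["doc", "docx", "xls", "xlsx", "ppt", "pptx"])
        insert(root, w);
    return root;
}
```

Build the trie once outside the loop, then call contains(root, fext[0]) inside it. Whether this beats the simple comparisons for only six strings is something to measure, not assume.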


Arguably, you'd do even better with:
foreach (...)
{
    auto tmp = std.string.toLower(fext[0]);
    switch(tmp)
    {
        case "doc", "docx":
        case "xls", "xlsx":
        case "ppt", "pptx":
            continue;
        default:
    }
}

Since it gives the compiler more wiggle room to optimize things as it wishes. For example, it *could* (who knows :D !) implement the switch as a hash table, or a tree.
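
For comparison, this is roughly what the hash table variant looks like when written by hand instead of left to the compiler. It is only a sketch: officeExts and inOfficeSet are hypothetical names, and nothing here says the compiler actually lowers a string switch this way.

```d
import std.string : toLower;

/// Hypothetical lookup table; only key presence matters, the value is unused.
private immutable bool[string] officeExts;

/// Module constructor: fills the table once at program start.
shared static this()
{
    officeExts = [
        "doc": true, "docx": true,
        "xls": true, "xlsx": true,
        "ppt": true, "pptx": true,
    ];
}

/// True if `ext`, compared case-insensitively, is one of the stored extensions.
bool inOfficeSet(string ext)
{
    // `in` yields a pointer to the value, or null if the key is absent.
    return (ext.toLower() in officeExts) !is null;
}
```

Note that this still lower-cases the whole string first, so it shares that cost with the switch version.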

BTW, a very convenient "tree-like" structure that could be used is a "heap": it is a basic binary tree, but stored inside an array. You could build it during compilation, and then simply search it.
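
One caveat before the sketch: a strictly heap-ordered tree only guarantees the order between a parent and its children, so it cannot be searched quickly for an arbitrary key. The example below therefore assumes the intent is a table stored in an array and searched by binary search. The names sortedExts and isOfficeExt are hypothetical. The table is written out in sorted order and the ordering is checked during compilation.

```d
import std.algorithm : isSorted;
import std.range : assumeSorted;
import std.string : toLower;

/// Extensions listed in sorted order (hypothetical table for illustration).
static immutable string[] sortedExts =
    ["doc", "docx", "ppt", "pptx", "xls", "xlsx"];

// Evaluated at compile time: a mis-ordered table fails the build.
static assert(sortedExts.isSorted);

/// Binary search over the sorted table, case-insensitive via toLower.
bool isOfficeExt(string ext)
{
    return assumeSorted(sortedExts).contains(ext.toLower());
}
```

With six entries the binary search needs at most three string comparisons, but for a table this small a plain linear scan may well be just as fast.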


A possible optimization is to first switch on string length:

foreach (...)
{
    auto tmp = std.string.toLower(fext[0]);
    switch(tmp.length)
    {
        case 3:
        switch(tmp)
        {
            case "doc", "xls", "ppt":
                continue;
            default:
        }
        break;

        case 4:
        switch(tmp)
        {
            case "docx", "xlsx", "pptx":
                continue;
            default:
        }
        break;

        default:
    }
}

That said, I'm not even sure this would be faster, so a bench would be called for. Furthermore, I'd really be tempted to say that at this point, we are in the realm of premature optimization.
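
A rough sketch of such a bench, assuming a recent Phobos that provides std.datetime.stopwatch. The function names, the sample inputs and the iteration count are all made up for illustration, and the three functions return a bool instead of using continue so that they can be called outside a loop.

```d
import std.datetime.stopwatch : benchmark;
import std.string : toLower;

/// The chain of comparisons from the quoted message.
bool viaIfChain(string ext)
{
    auto tmp = ext.toLower();
    return tmp == "doc" || tmp == "docx"
        || tmp == "xls" || tmp == "xlsx"
        || tmp == "ppt" || tmp == "pptx";
}

/// The plain string switch.
bool viaSwitch(string ext)
{
    switch (ext.toLower())
    {
        case "doc", "docx":
        case "xls", "xlsx":
        case "ppt", "pptx":
            return true;
        default:
            return false;
    }
}

/// The variant that switches on the length first.
bool viaLengthSwitch(string ext)
{
    auto tmp = ext.toLower();
    bool found = false;
    switch (tmp.length)
    {
        case 3:
            switch (tmp)
            {
                case "doc", "xls", "ppt":
                    found = true;
                    break;
                default:
                    break;
            }
            break;

        case 4:
            switch (tmp)
            {
                case "docx", "xlsx", "pptx":
                    found = true;
                    break;
                default:
                    break;
            }
            break;

        default:
            break;
    }
    return found;
}

/// Hypothetical sample input mixing matches and non-matches.
immutable string[] samples = ["doc", "DOCX", "txt", "Xlsx", "jpeg", "ppt", "d"];

size_t hits;   // accumulated so the calls have an observable effect

/// Runs each variant `iterations` times over `samples` and returns
/// the three total durations in the order ifChain, switch, lengthSwitch.
auto runBench(uint iterations = 100_000)
{
    return benchmark!(
        () { foreach (s; samples) hits += viaIfChain(s); },
        () { foreach (s; samples) hits += viaSwitch(s); },
        () { foreach (s; samples) hits += viaLengthSwitch(s); }
    )(iterations);
}
```

Print the three durations returned by runBench() and compare them on your own data and compiler flags. All three variants pay for the same toLower call, so any difference comes from the comparison strategy alone.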
