On Monday, 5 August 2013 at 15:18:42 UTC, John Colvin wrote:
better:
foreach (...)
{
    // lower-case once, then compare against each known extension
    auto tmp = std.string.toLower(fext[0]);
    if (tmp == "doc" || tmp == "docx"
        || tmp == "xls" || tmp == "xlsx"
        || tmp == "ppt" || tmp == "pptx")
    {
        continue;
    }
}
but it is still not super-fast: unless the compiler is very
clever, every comparison is another pass over tmp. Also, it
converts the whole string to lower case even when that is not
necessary.
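One way to avoid the lower-cased copy is to compare case-insensitively, character by character, and bail out at the first mismatch. A minimal sketch, assuming the extensions are plain ASCII; `equalsIgnoreAsciiCase` and `isOfficeExtension` are hypothetical helper names, not something from the code above:

```d
import std.ascii : toLower;

/// Returns true if `s` equals `lowered`, ignoring ASCII case.
/// Assumes `lowered` is already lower-case ASCII. Does not allocate.
bool equalsIgnoreAsciiCase(const(char)[] s, const(char)[] lowered)
{
    if (s.length != lowered.length)
        return false; // cheap rejection before looking at any character
    foreach (i, c; s)
    {
        if (toLower(c) != lowered[i])
            return false; // stop at the first mismatch
    }
    return true;
}

/// Hypothetical helper: checks against the extension list used above.
bool isOfficeExtension(const(char)[] ext)
{
    static immutable string[] candidates =
        ["doc", "docx", "xls", "xlsx", "ppt", "pptx"];
    foreach (candidate; candidates)
    {
        if (equalsIgnoreAsciiCase(ext, candidate))
            return true;
    }
    return false;
}
```

This still walks the candidate list one by one, so it only removes the conversion cost, not the multiple passes.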
If you have large numbers of possible matches you will probably
want to be clever with your data structures / algorithms. E.g.
you could create a tree-like structure to quickly eliminate
possibilities as you read successive letters. You read one
character and follow the appropriate branch; if there is no
such branch, there is no match and you break. Otherwise, you
read the next character, follow the appropriate branch, and
so on. Infeasible for large (or even medium-sized)
character-sets without hashing, but might be pretty fast for
a-z and a large number of short strings.
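The tree-like structure described here is usually called a trie. A rough sketch for the a-z case only; `TrieNode`, `trieInsert` and `trieContains` are names made up for this example, and anything outside a-z (after ASCII lower-casing) is simply treated as "no match":

```d
import std.ascii : toLower;

/// One node of a trie over the lower-case ASCII letters a-z.
final class TrieNode
{
    TrieNode[26] children; // one branch per letter, null when absent
    bool isWord;           // true if a stored string ends at this node
}

/// Adds `word` (assumed to be lower-case a-z) to the trie rooted at `root`.
void trieInsert(TrieNode root, string word)
{
    auto node = root;
    foreach (char c; word)
    {
        assert(c >= 'a' && c <= 'z', "this sketch only handles a-z");
        immutable idx = c - 'a';
        if (node.children[idx] is null)
            node.children[idx] = new TrieNode;
        node = node.children[idx];
    }
    node.isWord = true;
}

/// Case-insensitive (ASCII) lookup in a single pass over `s`.
bool trieContains(TrieNode root, const(char)[] s)
{
    auto node = root;
    foreach (char c; s)
    {
        immutable lc = toLower(c);
        if (lc < 'a' || lc > 'z')
            return false;            // character outside the alphabet
        node = node.children[lc - 'a'];
        if (node is null)
            return false;            // no such branch: no match
    }
    return node.isWord;              // input ended: match only on a stored word
}
```

Each character is lower-cased and consumed exactly once, so there is no separate conversion pass and no repeated comparison of the same prefix.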
Arguably, you'd do even better with:
foreach (...)
{
    auto tmp = std.string.toLower(fext[0]);
    switch (tmp)
    {
        // empty cases fall through to the shared `continue`
        case "doc", "docx":
        case "xls", "xlsx":
        case "ppt", "pptx":
            continue;
        default:
            break;
    }
}
This gives the compiler more wiggle room to optimize things
as it wishes. For example, it *could* (who knows :D !) implement
the switch as a hash table, or a tree.
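If you would rather not rely on what the compiler does with the switch, the hash-table variant can be written out by hand with a built-in associative array used as a set. This is only a sketch; `makeSet` and `inOfficeSet` are hypothetical names, and the input is assumed to be lower-cased already:

```d
/// Builds a hash set (an associative array used as a set) from `words`.
bool[string] makeSet(const string[] words)
{
    bool[string] set;
    foreach (w; words)
        set[w] = true;
    return set;
}

/// Expects an already lower-cased extension; does one hashed lookup.
bool inOfficeSet(const bool[string] set, string lowered)
{
    return (lowered in set) !is null;
}
```

You would build the set once, outside the loop, and only call `inOfficeSet` inside it.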
BTW, a very convenient "tree-like" structure that could be used
is a binary search tree stored inside an array, laid out the way
a "heap" is (a true heap is only ordered parent-to-child, so it
cannot be searched by key). You could build it during
compilation, and then simply search it.
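One array-backed option that does support lookup by key is a plain sorted array probed with binary search, which behaves like an implicit balanced search tree. The sketch below swaps that in; the table is written in sorted order by hand and only checked at compile time, and `sortedExts` and `isOfficeExtSorted` are names invented for the example:

```d
import std.algorithm.sorting : isSorted;
import std.range : assumeSorted;

// Kept in sorted order by hand; the static assert verifies it at compile time.
static immutable string[] sortedExts =
    ["doc", "docx", "ppt", "pptx", "xls", "xlsx"];
static assert(sortedExts.isSorted);

/// Expects an already lower-cased extension; binary search over the table.
bool isOfficeExtSorted(string lowered)
{
    return assumeSorted(sortedExts).contains(lowered);
}
```

For six entries this is unlikely to beat a linear scan; it mainly pays off as the table grows.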
A possible optimization is to first switch on string length:
foreach (...)
{
    auto tmp = std.string.toLower(fext[0]);
    // dispatch on length first, so each inner switch only
    // sees candidates of the right size
    switch (tmp.length)
    {
        case 3:
            switch (tmp)
            {
                case "doc", "xls", "ppt":
                    continue;
                default:
                    break;
            }
            break;
        case 4:
            switch (tmp)
            {
                case "docx", "xlsx", "pptx":
                    continue;
                default:
                    break;
            }
            break;
        default:
            break;
    }
}
That said, I'm not even sure this would be faster, so a benchmark
would be called for. Furthermore, I'd really be tempted to say
that at this point, we are in the realm of premature optimization.
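For what it's worth, a quick benchmark could look something like the sketch below, using `benchmark` from `std.datetime.stopwatch`. The sample strings, the `hits` sink and the function names are all made up for illustration, and the inputs are assumed to be lower-cased already so that only the matching strategy is measured:

```d
import core.time : Duration;
import std.datetime.stopwatch : benchmark;

bool viaIfChain(string tmp)
{
    return tmp == "doc" || tmp == "docx"
        || tmp == "xls" || tmp == "xlsx"
        || tmp == "ppt" || tmp == "pptx";
}

bool viaSwitch(string tmp)
{
    switch (tmp)
    {
        case "doc", "docx", "xls", "xlsx", "ppt", "pptx":
            return true;
        default:
            return false;
    }
}

// Hypothetical sample input; a real measurement should use real file names.
immutable string[] samples = ["doc", "txt", "xlsx", "jpeg", "pptx", "d"];

size_t hits; // sink for the results, so the calls are not trivially dead code

void runIfChain() { foreach (s; samples) hits += viaIfChain(s); }
void runSwitch()  { foreach (s; samples) hits += viaSwitch(s); }

/// Runs each variant `iterations` times; result[0] is the if-chain,
/// result[1] is the switch.
Duration[2] timeBoth(uint iterations)
{
    return benchmark!(runIfChain, runSwitch)(iterations);
}
```

Printing the two durations from something like `timeBoth(1_000_000)` over a few runs gives a rough comparison; the numbers will depend heavily on the compiler and its optimization flags.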