Re: Which option is faster...

JS Mon, 05 Aug 2013 10:36:36 -0700

On Monday, 5 August 2013 at 13:59:24 UTC, jicman wrote:


Greetings!

I have this code,

foreach (...)
{

  if (std.string.tolower(fext[0]) == "doc" ||
    std.string.tolower(fext[0]) == "docx" ||
    std.string.tolower(fext[0]) == "xls" ||
    std.string.tolower(fext[0]) == "xlsx" ||
    std.string.tolower(fext[0]) == "ppt" ||
    std.string.tolower(fext[0]) == "pptx")
   continue;
}

foreach (...)
{
  if (std.string.tolower(fext[0]) == "doc")
    continue;
  if (std.string.tolower(fext[0]) == "docx")
    continue;
  if (std.string.tolower(fext[0]) == "xls")
    continue;
  if (std.string.tolower(fext[0]) == "xlsx")
    continue;
  if (std.string.tolower(fext[0]) == "ppt")
    continue;
  if (std.string.tolower(fext[0]) == "pptx")
   continue;
  ...
  ...
}

thanks.

josé

Both are slow, those trying to optimize out tolower are missingthe boat. A good compiler should optimize that out.

The issue is that n compare are done in those cases with eachsuccessive compare getting slower and slower. if the match is docthen it happens in one compare in both cases. If it's pptx thenit is 6. So if there are many pptx matches then the code willperform slower than for doc.

This can be good or bad behavior depending on if you areguarantee that your order is representative of what actuallyhappens(many doc's few pptx's).

To get O(1) behavior, you must use some sort of jump list. A hashtable can provide such behavior. If there are many comparisonsneeded then at some point this will be, on average, faster.

If performance is critical and the string length is small enoughthen the fastest method is to use a direct lookup table.

For n character of char. It is 256^n. For n = 4, this is 4GB ofmemory! (This is one reason to use a hash table). Since you arenot using the full ascii you could compact this a bit, say 36^n,for n = 4 it is only 1.6MB of memory. The cost is packing thestrings for lookup and computing the index.

One way to speed up the loop is to also compare for what is notused. In this case you can compare just on the first char.



   if (fext[0][0]) == "d")
   {
      ...
   }
   if (fext[0][0]) == "x")
   {
      ...
   }
   if (fext[0][0]) == "p")
   {
      ...
   }

This pares down the results. In this case it is only 3comparisions + 2 comparisons = 5 for pptx rather than 6.

You can try and develop your own hash to get optimal amortizedperformance.

Re: Which option is faster...

Reply via email to