On Mon, Aug 05, 2013 at 07:30:36PM +0200, JS wrote: > On Monday, 5 August 2013 at 13:59:24 UTC, jicman wrote: > > > >Greetings! > > > >I have this code, > > > >foreach (...) > >{ > > > > if (std.string.tolower(fext[0]) == "doc" || > > std.string.tolower(fext[0]) == "docx" || > > std.string.tolower(fext[0]) == "xls" || > > std.string.tolower(fext[0]) == "xlsx" || > > std.string.tolower(fext[0]) == "ppt" || > > std.string.tolower(fext[0]) == "pptx") > > continue; > >} > > > >foreach (...) > >{ > > if (std.string.tolower(fext[0]) == "doc") > > continue; > > if (std.string.tolower(fext[0]) == "docx") > > continue; > > if (std.string.tolower(fext[0]) == "xls") > > continue; > > if (std.string.tolower(fext[0]) == "xlsx") > > continue; > > if (std.string.tolower(fext[0]) == "ppt") > > continue; > > if (std.string.tolower(fext[0]) == "pptx") > > continue; > > ... > > ... > >} > > > >thanks. > > > >josé > > > Both are slow, those trying to optimize out tolower are missing the > boat. A good compiler should optimize that out. > > The issue is that n compare are done in those cases with each > successive compare getting slower and slower. if the match is doc > then it happens in one compare in both cases. If it's pptx then it > is 6. So if there are many pptx matches then the code will perform > slower than for doc. > > > This can be good or bad behavior depending on if you are guarantee > that your order is representative of what actually happens(many > doc's few pptx's). > > To get O(1) behavior, you must use some sort of jump list. A hash > table can provide such behavior. If there are many comparisons > needed then at some point this will be, on average, faster. [...]
There's no need to reinvent the wheel here. std.regex.ctRegex will create the equivalent of jump tables for you. Besides, I doubt the OP's real performance bottleneck is in this code; sounds like the program is I/O bound. Compared to I/O, even repeatedly calling std.string.tolower repeatedly is just roundoff error in terms of total execution time. Using a profiler will reveal the *real* bottleneck. Based on past experience, I'm expecting that it will reveal that the bottleneck isn't anywhere we're predicting it is. :) T -- VI = Visual Irritation