On 26.03.2012 22:14, Jay Norwood wrote:
The sse4 capabilities include a range mode of string matching,
that lets you match characters in a 16 byte string based on a 16
byte set of start and stop character ranges. See the
_SIDD_CMP_RANGES mode in the table.


For example, the pattern in some of our examples for finding the
start of a word is a-zA-Z, and for other characters in the word
a-zA-Z0-9. Either of these patterns could be tested for match on
a 16 byte input in a single operation in the sse4 engine.

http://msdn.microsoft.com/en-us/library/bb531465.aspx

Nice idea.
Though I can assure you that the biggest problem is not in comparing things at all. The most time is spent accessing various lookup tables, saving/loading state, updating match location and so on. Especially so because of Unicode. Also add an overhead of UTF decoding into the mix (that I plan to avoid). To put it bluntly: R-T version spends around 20% of time doing actual matching at best (averaged, depends on pattern), half of this could be avoided by optimizing "framework" code.
C-T version spends around 35% on on matching, but still the main
inefficiency in the "framework" part of code. C-T matcher admittedly has very simple framework.

Speaking more of run-time version of regex, it is essentially running a VM that executes instructions that do various kinds of match-this, match-that. The VM dispatch code is quite slow, the optimal _threaded_ code requires either doing it in _assembly_ or _computed_ goto in the language. The VM _dispatch_ takes up to 30% of time in the default matcher.


Looking at the msft intrinsics, it seems like the D ones could be
more efficient and elegant looking using D slices, since they are
passing the string and length of string as separate parameters.

It would be good if the D regex processing could detect simple
range match patterns and use the sse4 extensions when available.

There is also an article from intel where they demo use of these
instructions for xml parsing.

http://software.intel.com/en-us/articles/xml-parsing-accelerator-with-intel-streaming-simd-extensions-4-intel-sse4/


Worth looking into.


--
Dmitry Olshansky

Reply via email to