> The code won't compile but it will if references to
> classes which have nothing to do with the pattern
> matching are removed. If this is no good I'll isolate
> the ORO code for you but it will have to be later,
> just ask.

There's too much extraneous stuff in there for me to spend the time working through it. Whenever you have the time to whittle it down to the bare essentials, please post again.

In the meantime, independent of the ultimate solution, I recommend that you not instantiate a new compiler and matcher every time parse() is called, and that you not recompile SERVER_ACTIVE_HTML_PATTERN on every call either. You should compile SERVER_ACTIVE_HTML_PATTERN only once and reuse the compiled pattern; otherwise you are wasting cycles. Likewise, you should instantiate the matcher only once and use it in parse() as needed; otherwise you are again wasting cycles, this time in object creation. The normal way of doing this is to make the compiled pattern a static variable, compiled in a static initializer (with READ_ONLY_MASK if it is to be shared between threads), and to make the matcher non-static unless instances of the class will never be used in different threads, in which case making it static will do fine.

Finally, the timings must be taken around individual calls to contains() in order to determine whether it is the matching that is consuming the time. Change the loop to while(true) {...} and time around matcher.contains(), breaking if the result is false and recording each match time in addition to the total for all calls. There's a lot of other stuff going on in MarkedUpHTML() and parse() that contributes to execution time (forget about the 3,000 seconds; 12 seconds alone is way too much to find 4 matches in an HTML page).

All that said, the culprit is still probably a suboptimal regular expression, as you suspected, but it helps to eliminate these other factors that perturb the measurements and our ability to isolate the behavior.

daniel
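
P.S. In case it helps, here is a rough sketch of the structure described above: a pattern compiled once in a static field, one matcher per instance, and timing around each individual match call. To keep it runnable without the ORO jar, it uses java.util.regex as a stand-in, so it is not ORO code; with ORO the corresponding pieces would be Perl5Compiler.compile(expression, Perl5Compiler.READ_ONLY_MASK) and Perl5Matcher.contains(input, pattern). The class name TimedParser, the helper methods, and the "<%.*?%>" expression are all made up for illustration, since your real expression isn't isolated yet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: java.util.regex stands in for the ORO classes.
class TimedParser {
    // Compiled once when the class loads and shared by all instances.
    // The expression is a hypothetical placeholder, not the real
    // SERVER_ACTIVE_HTML_PATTERN.
    private static final Pattern SERVER_ACTIVE_HTML_PATTERN =
            Pattern.compile("<%.*?%>", Pattern.DOTALL);

    // One matcher per instance, reused across parse() calls.
    // A matcher carries state, so it must not be shared between threads.
    private final Matcher matcher = SERVER_ACTIVE_HTML_PATTERN.matcher("");

    // Nanoseconds spent in each individual match attempt of the last
    // parse(), including the final attempt that found nothing.
    private final List<Long> matchTimes = new ArrayList<>();

    List<String> parse(String html) {
        List<String> matches = new ArrayList<>();
        matchTimes.clear();
        matcher.reset(html);
        while (true) {
            // Time only the match call itself, nothing else in the loop.
            long start = System.nanoTime();
            boolean found = matcher.find();
            matchTimes.add(System.nanoTime() - start);
            if (!found) {
                break;
            }
            matches.add(matcher.group());
        }
        return matches;
    }

    List<Long> lastMatchTimes() {
        return matchTimes;
    }

    long lastTotalNanos() {
        long total = 0;
        for (long t : matchTimes) {
            total += t;
        }
        return total;
    }
}
```

If the per-call times are small but parse() as a whole is still slow, the time is going somewhere other than the matching; if one call dominates, that points back at the expression itself.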