> I am writing a multithreaded Apache log parser that uses the Boost
> 1_29_0 regex split function to separate elements in the entry. Each
> thread parses a separate log file. The code seems to be working
> correctly on a 1-CPU system, but when I use a 14-CPU Sun server, I
> see massive locking (LCK column of prstat -amLvu username), and
> performance suffers horribly (as measured by the lines processed per
> second). I spent a lot of time checking to see where the locking was
> occurring. I went so far as to compile the code with Sun's Forte 6u2
> and use their analysis tools to identify the problem area. I've
> compiled all code (including Boost) with both gcc 3.2.2 and Forte to
> create 64-bit binaries, if that makes any difference.
>
> If I read the Forte analysis tools correctly, the place I'm seeing
> all the locking is the call to malloc in the void *operator
> new(unsigned long), which is called by
> boost::re_detail::match_results_base and _priv_match_data. Those are
> in turn called by query_match_aux, which is called by reg_grep2.
> Assuming I'm reading it right...
>
> At this point it seems like the issue is either with the library or
> my usage of it. Has anyone seen this before? Any pointers on what I
> may be doing wrong and how to fix it would be appreciated.

The locking is occurring in your runtime library rather than in boost.regex as such. You have two choices:

1) Use a custom allocator, one that draws on thread-specific memory pools,
for the match_results class instance that you are using.

2) Wait for the next release (probably still a couple of months away), which
will use much less dynamic memory allocation (almost none at all in
recursive mode).
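
To make option 1 concrete, here is a minimal sketch of an allocator that takes its memory from a per-thread pool, so allocations never contend on the global malloc lock. All names in it (ThreadArena, local_arena, ArenaAllocator, arena_bytes_after_reserve) are hypothetical and not part of Boost. It assumes a C++11 compiler with thread_local, which the compilers mentioned in this thread predate, and it uses a std::vector as a stand-in for match_results so that it compiles without Boost.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical per-thread pool: a fixed buffer with bump-pointer allocation.
// Nothing is freed individually; the owner calls reset() between log lines.
class ThreadArena {
public:
    void* allocate(std::size_t bytes, std::size_t align) {
        assert(align != 0 && (align & (align - 1)) == 0);  // power of two
        std::size_t start = (used_ + align - 1) / align * align;
        if (start + bytes > sizeof(buffer_)) throw std::bad_alloc();
        used_ = start + bytes;
        return buffer_ + start;
    }
    void reset() { used_ = 0; }
    std::size_t used() const { return used_; }

private:
    alignas(std::max_align_t) unsigned char buffer_[64 * 1024];
    std::size_t used_ = 0;
};

// One arena per thread, so no lock is needed to allocate from it.
inline ThreadArena& local_arena() {
    thread_local ThreadArena arena;
    return arena;
}

// Minimal C++11 allocator that forwards to the calling thread's arena.
template <class T>
struct ArenaAllocator {
    typedef T value_type;
    ArenaAllocator() noexcept {}
    template <class U>
    ArenaAllocator(const ArenaAllocator<U>&) noexcept {}
    T* allocate(std::size_t n) {
        return static_cast<T*>(
            local_arena().allocate(n * sizeof(T), alignof(T)));
    }
    void deallocate(T*, std::size_t) noexcept {}  // reclaimed by reset()
};

template <class T, class U>
bool operator==(const ArenaAllocator<T>&, const ArenaAllocator<U>&) noexcept {
    return true;
}
template <class T, class U>
bool operator!=(const ArenaAllocator<T>&, const ArenaAllocator<U>&) noexcept {
    return false;
}

// Usage sketch: a container whose storage comes from this thread's arena.
// Returns how many arena bytes the reservation consumed.
std::size_t arena_bytes_after_reserve(std::size_t n) {
    local_arena().reset();
    std::vector<int, ArenaAllocator<int> > fields;
    fields.reserve(n);
    return local_arena().used();
}
```

With Boost.Regex the same allocator type would be supplied as the allocator template argument of match_results in place of the default; the exact parameter list in 1_29_0 should be checked against that release's headers. Objects allocated this way must not be used after the arena is reset, and must stay on the thread that created them.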

John.

In case anyone else is interested, I ended up stumbling across the Hoard project (http://www.hoard.org) that provides an open-source malloc replacement that completely resolved the locking I was seeing.

FYI,

Matt

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
