In message <[EMAIL PROTECTED]>, "Robert Edgar" writes:
>Hi there,
>I am trying to load and parse a web server log file (about 3mb on average)
>with a regex as follows
>"(.*)\\s(.*)\\s(.*)\\s\\[([^\\]]+)\\]\\s\"(.*)\\s(.*)\\s(.*)\"\\s(.*)\\s(.*)
>\\s\"(.*)\"\\s\"(.*)\"\\s\"(.*)\""
>
>problem is it is taking forever, I am getting only about 15 lines a second
>
>code is something like
>
>while((logEntry = bufferedreader.readLine()) != null){
>  if (matcher.contains(logEntry, pattern)) {
>    MatchResult result=matcher.getMatch();
>  }
>}

Half of the problem is calling readLine(). You're converting char arrays
into strings and then back again when you search for a match, so every
line is copied and converted for no benefit. If the files are about 3MB,
you'll do better by reading the entire file into a char array first and
then searching that array for matches.

The other half of the problem may be the regular expression, which is
causing a lot of backtracking. Each greedy .* first runs to the end of
the input and then gives characters back one at a time until the rest of
the pattern fits, and the pattern has a dozen of them. Try replacing .*
with \S* and [^\s"]* where appropriate, so that each group stops at its
own delimiter.

daniel
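
For illustration, here is a minimal sketch of both suggestions. It uses
java.util.regex rather than the matcher and MatchResult classes in the
quoted code, so treat it as an example under stated assumptions and not
as the original poster's setup. The class name LogParseSketch, the
helpers readAll and parseAll, and the log layout are hypothetical. Fields
are separated by a literal space here, and the bracketed and quoted
fields exclude newlines, so a match cannot run across a line break when
the whole file is searched as one buffer.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: read the whole log once, then search the buffer.
final class LogParseSketch {

    // Same twelve groups as the quoted pattern, but every .* is replaced
    // by a class that cannot cross its own delimiter (\S*, [^\s"]*,
    // [^"\n]*), which removes most of the backtracking.
    // Assumption: fields are separated by exactly one space.
    static final Pattern ENTRY = Pattern.compile(
        "^(\\S*) (\\S*) (\\S*) \\[([^\\]\\n]+)\\] "
        + "\"(\\S*) (\\S*) ([^\\s\"]*)\" (\\S*) (\\S*)"
        + " \"([^\"\\n]*)\" \"([^\"\\n]*)\" \"([^\"\\n]*)\"",
        Pattern.MULTILINE); // ^ matches at the start of every line

    // Read the entire file in one call instead of line by line.
    // Assumption: the log is a single-byte encoding such as ISO-8859-1.
    static String readAll(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1);
    }

    // Find every entry in the buffer; lines that do not fit are skipped.
    static List<String[]> parseAll(CharSequence text) {
        List<String[]> entries = new ArrayList<>();
        Matcher m = ENTRY.matcher(text);
        while (m.find()) {
            String[] fields = new String[m.groupCount()];
            for (int i = 0; i < fields.length; i++) {
                fields[i] = m.group(i + 1); // groups are numbered from 1
            }
            entries.add(fields);
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java LogParseSketch <logfile>");
            return;
        }
        List<String[]> entries = parseAll(readAll(Paths.get(args[0])));
        System.out.println(entries.size() + " entries matched");
    }
}
```

The pattern is compiled once and a single Matcher walks the buffer with
find(), so no per-line String is created before matching. If the real
log uses tabs or repeated spaces between fields, the separators in ENTRY
would need to change accordingly.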