Hi,

Here is what I found regarding regex, using the test/* testsuite.
Object sizes first:

   text    data     bss     dec     hex filename
  41385       0      32   41417    a1c9 uClibc/libc/misc/regex/regex.os
  20740       4     293   21037    522d uClibc/libc/misc/regex/regex_old.os

"New" regex results

testregex:
TEST    testregex basic, 533 tests, 6 errors
TEST    testregex categorize, 21 tests, 0 errors
TEST    testregex forcedassoc, 47 tests, 9 errors
TEST    testregex interpretation, 148 tests, 6 signals, 36 errors
TEST    testregex leftassoc, 16 tests, 8 errors
TEST    testregex nullsubexpr, 112 tests, 2 signals, 3 errors
TEST    testregex repetition, 81 tests, 5 errors
TEST    testregex rightassoc, 20 tests, 4 errors

tst-regex2:
test 0 pattern 0 '.?.?.?.?.?.?.?Log\.13'
regexec without REG_NOSUB did not find the correct match
test 0 pattern 1 '(.?)(.?)(.?)(.?)(.?)(.?)(.?)Log\.13'
regexec without REG_NOSUB did not find the correct match
test 0 pattern 2 '((((((((((.?))))))))))...
regexec without REG_NOSUB did not find the correct match
test 1 pattern 0 '.?.?.?.?.?.?.?Log\.13'
 0.109511s
test 1 pattern 1 '(.?)(.?)(.?)(.?)(.?)(.?)(.?)Log\.13'
 0.110819s
test 1 pattern 2 '((((((((((.?))))))))))...
 0.111298s
test 2 pattern 0 '.?.?.?.?.?.?.?Log\.13'
re_search did not find the correct match(found 'angeLog.13 for earlier changes.
' instead)
test 2 pattern 1 '(.?)(.?)(.?)(.?)(.?)(.?)(.?)Log\.13'
re_search did not find the correct match(found 'angeLog.13 for earlier changes.
' instead)
test 2 pattern 2 '((((((((((.?))))))))))...
re_search did not find the correct match(found 'angeLog.13 for earlier changes.
' instead)
test 3 pattern 0 '.?.?.?.?.?.?.?Log\.13'
re_search did not find the correct match(found 'angeLog.13 for earlier changes.
' instead)
test 3 pattern 1 '(.?)(.?)(.?)(.?)(.?)(.?)(.?)Log\.13'
re_search did not find the correct match(found 'angeLog.13 for earlier changes.
' instead)
test 3 pattern 2 '((((((((((.?))))))))))...
re_search did not find the correct match(found 'angeLog.13 for earlier changes.
' instead)

tst-regexloc:
match from 0 to 9    [WRONG]

"Old" regex results

testregex:
TEST    testregex basic, 538 tests, 1 error
TEST    testregex categorize, 21 tests, 0 errors
TEST    testregex forcedassoc, 47 tests, 9 errors
TEST    testregex interpretation, 167 tests, 6 signals, 17 errors
TEST    testregex leftassoc, 16 tests, 8 errors
TEST    testregex nullsubexpr, 83 tests, 32 errors
TEST    testregex repetition, 79 tests, 8 errors
TEST    testregex rightassoc, 20 tests, 4 errors

tst-regex2:
test 0 pattern 0 '.?.?.?.?.?.?.?Log\.13'
 1.312488s
test 0 pattern 1 '(.?)(.?)(.?)(.?)(.?)(.?)(.?)Log\.13'
 3.084833s
test 0 pattern 2 '((((((((((.?))))))))))...
 19.417234s
test 1 pattern 0 '.?.?.?.?.?.?.?Log\.13'
 1.156149s
test 1 pattern 1 '(.?)(.?)(.?)(.?)(.?)(.?)(.?)Log\.13'
 3.210494s
test 1 pattern 2 '((((((((((.?))))))))))...
 19.149326s
test 2 pattern 0 '.?.?.?.?.?.?.?Log\.13'
 1.094042s
test 2 pattern 1 '(.?)(.?)(.?)(.?)(.?)(.?)(.?)Log\.13'
 3.047473s
test 2 pattern 2 '((((((((((.?))))))))))...
 19.800388s
test 3 pattern 0 '.?.?.?.?.?.?.?Log\.13'
 1.129805s
test 3 pattern 1 '(.?)(.?)(.?)(.?)(.?)(.?)(.?)Log\.13'
 4.097457s
test 3 pattern 2 '((((((((((.?))))))))))...
 18.404386s

tst-regexloc:
match from 0 to 6  [correct]

Conclusion: so far the new regex does not look like an improvement
overall. It has more failures than the old one, although there are
cases where it works correctly and the old one does not.

It is also twice as big (41385 vs. 20740 bytes of text).

The source of both regex implementations does not look good. It is not
very readable. There is also a lot of cruft: #ifdef emacs, #if defined _AIX,
and some glibc-flavored optimizations ("let's use alloca *and* malloc!
that way the code will be twice as big, and twice as hard to debug!").

REGEX_OLD is slow. tst-regex2 built against glibc takes ~0.022499s
per pattern test, while uClibc takes ~20 seconds on the third pattern
("pattern 2" in the output above).

The new regex is much faster, but it failed
almost every test in tst-regex2!


I guess we need to take a look at the new regex and try to fix it.

If that proves too difficult, I vote for disabling it in the config
system. There is no point in giving people something which is buggier
and bigger.


I will also try to soup up the test infrastructure. So far it looks like
it wasn't used much (lots of failures, problematic behavior
of the testing infrastructure itself, inadequate docs).


On Thursday 11 December 2008 16:51, Bernhard Reutner-Fischer wrote:
> On Thu, Dec 11, 2008 at 02:17:53AM +0100, Denys Vlasenko wrote:
> >On Tuesday 09 December 2008 22:00, Rob Landley wrote:
> >> Right now, there are still two "old" linuxthreads branches in uClibc, and
> >> as far as I can tell we'll be supporting them in perpetuity.  (For a
> >> definition of "support" that involves leaving them alone unless somebody
> >> complains.)
> >
> >We need to stop doing that, though.
> >
> >We have "old" and "new" vfprintf, "old" and "new" regex,
> >"old" and "new" threads, "old" and "new" fnmatch.
> >
> >Let's just decide on something, and disable things which are
> >really "old". Because currently, I need telepathic powers
> >to figure out which regex to choose:
> >
> >Sometimes "old" is actually old and better be dropped,
> >other times "old" is actually "stable and recommended",
> >and "new" is "work in progress, and maybe developer was hit
> >by the bus, nobody knows". For one, I do not know
> >what vfprintf or regex to choose, I use rand().
> >
> >But users are even less likely than we to know what to choose.
> >We, as developers, need to help them.
> >
> >It's not like disabling or even rm'ing one or the other
> >is irreversible. It all will still be in a history.
> >
> >Just IMHO. Bernhard, I guess it's up to you to decide this.
> 
> Let's leave the complete libpthread/* out of this particular discussion
> for now.
> 
> As for the rest, i currently don't have a strong opinion on which
> of the old or new variants to keep, which malloc impl (of the 3) to
> drop. I'm open to suggestions.
> 
> Any preferences / thoughts?

--
vda
