On Sat, Oct 26, 2013 at 11:38 AM, Peter Bex <peter....@xs4all.nl> wrote:

> On Sat, Oct 26, 2013 at 10:37:36AM -0700, Matt Welland wrote:
> > This regex is so slow that you don't need a timer to see the impact (at
> > least not on my machine with chicken 4.8.0):
> >
> >  (string-match "[a-z][a-z0-9\\-_.]{0,20}"
> "a012345678901234567890123456789")
> >
> > Changing the {0,20} to + makes it run normally fast so I just replaced
> the
> > regex with a string-length and modified the "{0,20}" to "+" . I don't
> > necessarily need a fix for this but it seems like a possible symptom of a
> > deeper problem so I thought I'd report it.
>
> Hi Matt,
>
> Thanks for your report.  I'm afraid this is a known problem with
> Irregex - to avoid producing a state machine with too many states,
> it will always use a backtracking implementation for all repetition
> counts.
>
> I think it's best to take a look at how to fix this upstream first.
> Maybe Alex has an idea of how to do that.
>
> > Pre-compiling the regex didn't seem to make any difference.
>
> That's because the backtracker matches really slowly.
>
> > Just for completeness I compared with Ruby and the ruby equivalent
> > is (in human terms) instant.
> >
> > "a012345678901234567890123456789".match(/[a-z][a-z0-9]{0,20}/)
>
> Thanks for that!  We need more of such real-world test cases.
> Ideally I'd love to have a complete library of regex examples, which
> we can use to optimize irregex further.
>
> By the way, I note that your Ruby regex isn't exactly the same as the
> one you used in CHICKEN.  Does it make a difference if you use the
> same regex?
>

Ah, yes, I did initially compare the exact same regex and saw no
discernible speed impact in Ruby. I did a few variations to see if there
was any sensitivity to ^ or $ at the beginning or end of the regex or
variations to the contents of the character group. This seems to be pretty
much a function of the match count {0,20}.

Some coding for performance hints on the regex page would likely be helpful.



>
> Cheers,
> Peter
> --
> http://www.more-magic.net
>



-- 
Matt
-=-
90% of the nations wealth is held by 2% of the people. Bummer to be in the
majority...
_______________________________________________
Chicken-users mailing list
Chicken-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-users

Reply via email to