Need to talk to Apple people first, one Ping only.

/be

> On Jan 2, 2014, at 3:35 PM, Andreas Gal <[email protected]> wrote:
> 
> 
> Sounds like a solid plan. It combines the best of both worlds (we don't have 
> to reinvent the wheel but we minimize how much code we import). The fact that 
> the code is pretty stable definitely supports this approach.
> 
> Andreas
> 
>> On Jan 2, 2014, at 12:28 PM, Luke Wagner <[email protected]> wrote:
>> 
>> I don't think a pure (2) approach is our cheapest option.  Even with Yarr, 
>> it took Chris a whole bunch of work to import and it also took Dave/Dave a 
>> long time each time they pulled a new version.  It sounds like irregexp 
>> would be much worse.  Furthermore, having a whole hunk of code you can't 
>> just change means everybody goes to lengths to avoid touching it and it 
>> becomes a big sad sinkhole.
>> 
>> Perhaps we could use a modified (2) approach: fork irregexp.  In particular, 
>> we'd:
>> - significantly refactor the code to use SM rooting, assembler, Vector, 
>> LifoAlloc, etc APIs
>> - declare open season on stylistic refactorings to make irregexp match SM
>> 
>> The obvious concern is that we'd miss updates/fixes in V8.  However, looking 
>> at the V8 svn repo, the irregexp files change infrequently (almost nothing 
>> in the last 6 months) so we could just as well, every month or so, just look 
>> at all the changes to the 9 *regexp* files and manually apply the diffs.
>> 
>> One thing, though, is we'd really need an owner for this code who took the 
>> time to fully understand irregexp so they could fix problems as they came up 
>> and review patches.
>> 
>> Cheers,
>> Luke
>> 
>> ----- Original Message -----
>>> Back in 2010, we imported the YARR regular expression engine from JSC [0].
>>> It has served us well over the years, but with all the optimizations to the
>>> rest of the engine, regular expression performance is becoming a bottleneck
>>> again. When YARR is able to JIT a regular expression, performance is mostly
>>> on par with V8. However, when we can't compile a regexp, we're stuck in the
>>> interpreter and become very slow.
>>> 
>>> Unfortunately, YARR is unable to JIT some regular expressions used in
>>> popular JS libraries like jQuery [1]. The main problem is that YARR can't
>>> compile regexps with nested parenthesized groups. As I understand it, this
>>> is a pretty fundamental issue that requires a major refactoring. The
>>> upstream WebKit bug has had no activity for over 3 years [2].
>>> 
>>> There's also a problem with "quantity 1 subpatterns that are copies" that
>>> affects a Peacekeeper email validation regular expression [3] and is the
>>> only reason for us being slower than Chrome on the Peacekeeper
>>> stringValidateForm test [4].
>>> 
>>> To address these issues, we have the following options:
>>> (1) Fix YARR ourselves, either upstream or locally.
>>> (2) Switch from YARR to V8's irregexp engine.
>>> (3) Write something ourselves, probably based on V8's irregexp.
>>> 
>>> (1) will be hard; I don't think we have somebody familiar enough with YARR
>>> to do a refactoring of this size. It could be an option though.
>>> 
>>> For (2), we'd have to write a layer mapping V8's macro assembler calls to
>>> our own macro assembler. Unfortunately, unlike SM and JSC, V8 has more
>>> platform-specific code and we'd have to do this work for different
>>> platforms. I'm not sure what other dependencies there are on other parts of
>>> the V8 engine.
>>> 
>>> Personally, I like (3): it's not a small task, but it'd finally give us a
>>> regexp engine that integrates well with the rest of the engine. This also
>>> means we can dump JSC's macro-assembler (JM used it as well, but is also
>>> gone) and use the one we wrote for the baseline/Ion JITs. It'd also
>>> integrate much better than YARR in terms of code style and data structures.
>>> If we base it on irregexp, we should be able to avoid most pitfalls or
>>> design problems.
>>> 
>>> What do people think?
>>> 
>>> Jan
>>> 
>>> [0] https://bugzilla.mozilla.org/show_bug.cgi?id=564953
>>> [1] https://bugzilla.mozilla.org/show_bug.cgi?id=929507
>>> [2] https://bugs.webkit.org/show_bug.cgi?id=42264
>>> [3] https://bugs.webkit.org/show_bug.cgi?id=122891
>>> [4] https://bugzilla.mozilla.org/show_bug.cgi?id=692009
>>> _______________________________________________
>>> dev-tech-js-engine-internals mailing list
>>> [email protected]
>>> https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals