Re: Summer of Code: Regexp

Bram Moolenaar Sat, 29 Mar 2008 05:44:35 -0700


Andrei Aiordachioaie wrote:

> On Mar 20, 1:30 pm, Bram Moolenaar <[EMAIL PROTECTED]> wrote:
> >
> > Let's do the fast regexp work first.  It's easy to underestimate how
> > much work this stuff is.
> 
> I looked at the updated regexp code that Xiaozhou Liu has maintained,
> and it looks a lot closer to being included. The problems I see so far
> with the new engine are:
> - the three test cases that fail, but of course there may be more bugs
> - compatibility with the old engine.
> 
> From what I've seen of the test cases, it seems that the NFA
> implementation is not greedy, as it should be. I will look more into
> it.
> 
> So for the project, I want to extend the test-suite to compare the way
> regexps are handled in the old vs the new engine. Maybe this uncovers
> other bugs. Then, the largest portion of the project would be fixing
> the found bugs. And if that takes little time, I could work on the old
> regexp engine bugs. Do you have any other ideas? Would this be enough
> for a 2.5 months project?

Another big task is to merge the code, removing things that were
duplicated.  My current idea is to first move everything into regexp.c,
then remove the duplicated stuff, then clean it up and perhaps move the
two engines to separate files.  This should be done in small steps,
making sure everything still works after each step.

There currently are quite a few variables global to regexp.c, which
makes this difficult.  One can't simply make them local: passing them
around in function calls would decrease performance.
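To illustrate the trade-off, here is a minimal sketch of one common
alternative to file-level globals: collecting the state in a single
struct and passing one pointer around.  All names below (match_ctx_T,
ctx_init, ctx_skip_white, ctx_at_bol) are made up for this example and
are not taken from regexp.c.  Whether the extra argument costs
measurable time would have to be checked with a profiler.

```c
#include <stddef.h>

/* Hypothetical container for state that would otherwise be global. */
typedef struct {
    const char *line;   /* start of the line being matched */
    const char *input;  /* current position within that line */
    int         ic;     /* non-zero: ignore case */
} match_ctx_T;

/* Set up the context for matching on "line". */
static void ctx_init(match_ctx_T *ctx, const char *line, int ic)
{
    ctx->line = line;
    ctx->input = line;
    ctx->ic = ic;
}

/* Advance over spaces and tabs; returns the number of bytes skipped. */
static size_t ctx_skip_white(match_ctx_T *ctx)
{
    const char *start = ctx->input;

    while (*ctx->input == ' ' || *ctx->input == '\t')
        ++ctx->input;
    return (size_t)(ctx->input - start);
}

/* Non-zero when the current position is at the start of the line. */
static int ctx_at_bol(const match_ctx_T *ctx)
{
    return ctx->input == ctx->line;
}
```

Every function then takes one extra pointer instead of many separate
arguments, and the two engines could each keep their own context.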

> The todo list mentions using regexp search in the gtk find&replace
> dialog. That might also deserve some attention, though I imagine it's
> pretty straightforward.

I would call that a separate task.  The regexp task should focus on
improving the regexp code itself, not the many places where it is used.
The only exception is that the interface should be changed to allow for
two kinds of result: just checking whether there is a match (can be done
much quicker with a DFA) and figuring out exactly what text is matched
(including sub matches).  For a Vim script line "if a =~ pattern" we
only need the first.  For a ":s" command we need the second.
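As a rough sketch of what such a two-level interface could look like,
the example below has one entry point that only answers yes/no and one
that also reports where the match is.  The names (match_span_T,
toy_has_match, toy_find_match) are invented for this example, and
strstr() on a literal string stands in for both engines; a real version
would run a DFA for the first call and a submatch-tracking matcher for
the second.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: where the matched text starts and ends. */
typedef struct {
    const char *start;  /* first byte of the match, NULL if no match */
    const char *end;    /* one past the last byte, NULL if no match */
} match_span_T;

/* Cheap query: is there a match at all?  Enough for "if a =~ pattern". */
static int toy_has_match(const char *pat, const char *text)
{
    return strstr(text, pat) != NULL;
}

/* Full query: also report the matched text, as ":s" would need. */
static int toy_find_match(const char *pat, const char *text,
                          match_span_T *span)
{
    const char *p = strstr(text, pat);

    span->start = p;
    span->end = (p != NULL) ? p + strlen(pat) : NULL;
    return p != NULL;
}
```

A caller that only needs a condition uses toy_has_match() and never
pays for position tracking; a substitute-like caller uses
toy_find_match() and reads span.start and span.end.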

-- 
hundred-and-one symptoms of being an internet addict:
178. You look for an icon to double-click to open your bedroom window.

 /// Bram Moolenaar -- [EMAIL PROTECTED] -- http://www.Moolenaar.net   \\\
///        sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\        download, build and distribute -- http://www.A-A-P.org        ///
 \\\            help me help AIDS victims -- http://ICCF-Holland.org    ///
