Hi all,

I'm Ian, one of the two students working on improving the regexp engine in Vim for this year's Google Summer of Code. I haven't had a whole lot to contribute yet, but now that work is underway, I'll probably pop up here with plenty of questions over the coming days.
Right now we're working on getting things set up and building a test suite, but I thought I would spark some discussion on a design decision that will come up once this phase is finished: whether to implement the new model ourselves or use an existing engine, like TRE: <http://laurikari.net/tre/>. I'm tempted to implement one ourselves, as it's an intellectually stimulating prospect, but that doesn't mean I won't listen to reason if TRE or another option is far better.

I don't know much about the internals of TRE, but according to previous posts to this list, it uses three engines: a slow one for handling backreferences (presumably similar to Vim's current engine), a fast one for most cases (what we are looking to implement), and one for its 'fuzzy matching' feature.

I have a couple of questions to start things off.

First: I couldn't see much need for 'fuzzy matching' in Vim, but some of you are probably much better acquainted with regexp use cases than I am. Would this be a useful feature to have available?

Second: We might have to do some gymnastics to work with multibyte characters, as discussed here: <http://tech.groups.yahoo.com/group/vimdev/message/46408>. I haven't worked with multibyte characters before, so I'm not clear on the subtleties. Would translating to wide characters before passing text to the engine cause much of a performance hit, or be excessively complicated to implement?

On a side note, TRE's main page says it supports both wide characters and multibyte characters. I couldn't find a version history, so I'm not sure whether this is a new feature that Nikolai isn't aware of, or whether we need something more.

I'm interested to hear what you all have to say. We don't need to make this decision until the middle of next week at the earliest, but I thought I would get the discussion going now.

Ian
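
P.S. To make the second question more concrete, here is a minimal sketch of the kind of translation step I have in mind, assuming the text is UTF-8. The function name utf8_to_wide is made up for illustration and does not come from Vim or TRE, and I'm using uint32_t code points rather than wchar_t to keep the sketch platform-independent. It skips some validation (overlong encodings, surrogates), so please read it as a picture of the idea rather than something to merge.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from Vim or TRE): decode a NUL-terminated UTF-8
 * string into an array of code points. Returns the number of code points
 * written, or (size_t)-1 on malformed input or if `out` is too small.
 * For brevity it does not reject overlong encodings or surrogates.
 */
size_t utf8_to_wide(const char *src, uint32_t *out, size_t out_len)
{
    const unsigned char *p = (const unsigned char *)src;
    size_t n = 0;

    while (*p != 0) {
        uint32_t cp;
        int extra;  /* number of continuation bytes expected */

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return (size_t)-1;       /* stray continuation or invalid lead byte */
        p++;

        while (extra-- > 0) {
            if ((*p & 0xC0) != 0x80)
                return (size_t)-1;    /* truncated or malformed sequence */
            cp = (cp << 6) | (*p & 0x3F);
            p++;
        }

        if (n >= out_len)
            return (size_t)-1;        /* output buffer too small */
        out[n++] = cp;
    }
    return n;
}
```

The conversion itself is a single pass over the bytes, so my guess is that the cost would come less from the decoding and more from having to do it (and allocate a buffer) for every line handed to the engine, plus mapping match offsets back to byte positions afterwards. Whether that adds up to a noticeable hit in practice is exactly what I'm hoping someone here can tell me.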