On 15 May 2012 22:15, Piotr Piastucki <[email protected]> wrote: > On 05/14/2012 11:51 PM, Kai Willadsen wrote: >> Also, I do see significant performance gains from using the inline >> optimisations, so if it doesn't affect the quality of matches too >> much, I definitely want to get it in. > > The idea is not to affect the quality at all.
I understand that that's the idea. I just hadn't spent the time to convince myself that it worked as intended. > The following lines in filediff module exclude all matches shorter than 3: > if o[0] == "equal": > if (o[2]-o[1] < 3) or (o[4]-o[3] < 3): > back = o[4]-o[3], o[2]-o[1] > continue > > I changed the preprocessing to match this rule. Right. This is thoroughly sensible. > The default preprocessing in Myers matcher removes all items that do not <snip> Sure, this is great. In fact, I used some of your explanation here to clarify the commit message. > This is more or less what the code > does, it keeps and shifts 2 most recent results (matched_2, matched_1) to > make sure previous characters will not be added more than once when a match > is found. So this was where I wasn't convinced. I just felt like there should be a simpler way of implementing the k-mer checking, so I went ahead and wrote one. It's not the nicest thing out there, but it is (in my opinion) clearer code. In the new version it's also possible to change the k-mer size from 3 by flicking some constants around, if we ever wanted to do so. I did some commit doctoring and I've pushed the results to head. Please let me know if you spot any issues, though it does produce identical results in my testing. Your performance improvement is pretty fantastic - about 330% faster in my tests. I haven't yet updated my Cython branch to match the new logic, but I'm hoping that it should be possible to squeeze another 2x speed-up out of that based on some previous testing. This is definitely an awesome improvement, so thanks! In fact, if enough people hit it without anything breaking, I think this should get cherry-picked onto a 1.6.x release branch. cheers, Kai _______________________________________________ meld-list mailing list [email protected] http://mail.gnome.org/mailman/listinfo/meld-list
