On Wed, Jul 21, 2010 at 5:54 AM, Robert Haas <robertmh...@gmail.com> wrote:
> This patch still needs some work. It includes a bunch of stylistic > changes that aren't relevant to the purpose of the patch. There's no > reason that I can see to change the existing levenshtein_internal > function to take text arguments instead of char *, or to change !m to > m == 0 in existing code, or to change the whitespace in the comments > of that function. All of those need to be reverted before we can > consider committing this. > I changed arguments of function from char * to text * in order to avoid text_to_cstring call. Same benefit can be achived by replacing char * with char * and length. I changed !m to m == 0 because Itagaki asked me to make it conforming coding style. Do you think there is no reason to fix coding style in existing code? > There is a huge amount of duplicated code here. I think it makes > sense to have the multibyte version of the function be separate, but > perhaps we can merge the less-than-or-equal to bits into the main > code, so that we only have two copies instead of four. Perhaps we > can't just add a max_d argument max_distance to levenshtein_internal; > and if this value is >=0 then it represents the max allowable > distance, but if it is <0 then there is no limit. Sure, that might > slow down the existing code a bit, but it might not be significant. > I'd at least like to see some numbers showing that it is significant > before we go to this much trouble. > In these case we should add many checks of max_d in levenshtein_internal function which make code more complex. Actually, we can merge all four functions into one function. But such function will have many checks about multibyte encoding and max_d. So, I see four cases here: 1) one function with checks for multibyte encoding and max_d 2) two functions with checks for multibyte encoding 3) two functions with checks for max_d 4) four separate functions If you prefer case number 3 you should argue your position little more. > The code doesn't follow the project coding style. Braces must be > uncuddled. Comment paragraphs will be reflowed unless they begin and > end with ------. Function definitions should have the type > declaration on one line and the function name at the start of the > next. > > Freeing memory with pfree is likely a waste of time; is there any > reason not to just rely on the memory context reset, as the original > coding did? > Ok, I'll fix this things. > I think we might need to remove the acknowledgments section from this > code. If everyone who touches this code adds their name, we're > quickly going to have a mess. If we're not going to remove the > acknowledgments section, then please add my name, too, because I've > already patched this code once... > In that case I think we can leave original acknowledgments section. ---- With best regards, Alexander Korotkov.