klimek added inline comments.
================ Comment at: clangd/FuzzyMatch.cpp:69 + : NPat(std::min<int>(MaxPat, Pattern.size())), NWord(0), + ScoreScale(0.5f / NPat) { + memcpy(Pat, Pattern.data(), NPat); ---------------- Why .5? ================ Comment at: clangd/FuzzyMatch.cpp:88 + return None; + return ScoreScale * std::min(2 * NPat, std::max<int>(0, Score[NPat][NWord])); +} ---------------- Why 2 * NPat? ================ Comment at: clangd/FuzzyMatch.cpp:92 +// Segmentation of words and patterns +// We mark each character as the Head or Tail of a segment, or a Separator. +// e.g. XMLHttpRequest_Async ---------------- Do you mean "part of the Head or Tail"? Also, explain that these are the CharRoles. A reader reads this first, and will search for what CharRole means in the code later. CharRole is defined in a different file, without comments, so figuring out how that all relates is super hard :) ================ Comment at: clangd/FuzzyMatch.cpp:101-103 +// 2. A characters segment role can be determined by the Types of itself and +// its neighbors. e.g. the middle entry in (Upper, Upper, Lower) is a Head. +// The Empty type is used to represent the start/end of string. ---------------- I think this is the only place where the roles are hinted at. Explain what roles mean and what we need them for. ================ Comment at: clangd/FuzzyMatch.cpp:108 +namespace { +enum CharType { Empty, Lower, Upper, Punctuation }; +// 4 packed CharTypes per entry, indexed by char. ---------------- I'd spell out the numbers, as they are important (here and for CharRole). ================ Comment at: clangd/FuzzyMatch.cpp:110 +// 4 packed CharTypes per entry, indexed by char. +constexpr uint8_t CharTypes[] = { + 0x00, 0x00, 0x00, 0x00, // Control characters ---------------- Finding bugs in these will be hard :) ================ Comment at: clangd/FuzzyMatch.cpp:120 + 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, // Bytes over 127 -> Lower. + 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, // This includes UTF-8. + 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, ---------------- Can you expand in the comment why this works for utf-8? ================ Comment at: clangd/FuzzyMatch.cpp:137 +} // namespace +void FuzzyMatcher::calculateRoles(const char *Text, CharRole *Out, int N) { + int Types = PackedLookup<CharType>(CharTypes, Text[0]); ---------------- The body of this needs more comments on what it does. I can slowly figure it out by doing bit math, but it should be spelled out what's expected to be in each value at each point. ================ Comment at: clangd/FuzzyMatch.cpp:207 + +ScoreT FuzzyMatcher::matchBonus(int P, int W) { + // Forbidden: matching the first pattern character in the middle of a segment. ---------------- Perhaps add assert(LPat[P] == LWord[W]); ================ Comment at: clangd/FuzzyMatch.cpp:212 + ScoreT S = 0; + // Bonus: pattern is part of a word prefix. + if (P == W) ---------------- Why does P == W imply that? ================ Comment at: clangd/FuzzyMatch.cpp:215 + ++S; + // Bonus: case matches, or an asserted word break matches an actual. + if (Pat[P] == Word[W] || (PatRole[P] == Head && WordRole[W] == Head)) ---------------- This is the first time the term "asserted word break" shows up, perhaps explain this when explaining the roles. ================ Comment at: clangd/FuzzyMatch.cpp:218 + ++S; + // Penalty: matching inside a word where the previous didn't match. + if (WordRole[W] == Tail && P && !Matched[P - 1][W - 1]) ---------------- The previous what didn't match? ================ Comment at: clangd/FuzzyMatch.h:31 +public: + FuzzyMatcher(llvm::StringRef Pattern); + ---------------- Document that patterns larger than MaxPat will be silently cut. ================ Comment at: clangd/FuzzyMatch.h:39 + +private: + constexpr static int MaxPat = 63; ---------------- I find most of the abbreviations here non-intuitive, and thus needing comments (Pat I get is for pattern :) N - what does it mean? Number of characters in the pattern? I'd use Length instead. LPat and LWord (no idea what L could stand for). ================ Comment at: clangd/FuzzyMatch.h:50 + + int NPat, NWord; + char Pat[MaxPat]; ---------------- I'd use a StringRef instead, and call the storage *Storage or something. ================ Comment at: clangd/FuzzyMatch.h:60 + float ScoreScale; + bool IsSubstring; +}; ---------------- Comment that this is not actually used inside the algorithm, just for debugging. https://reviews.llvm.org/D40060 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits