[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-06-05 Thread Ilya Biryukov via Phabricator via cfe-commits
ilya-biryukov added inline comments.



Comment at: unittests/clangd/QualityTests.cpp:129
+  Test.Code = R"cpp(
+#include "foo.h"
+int ::test_func_in_header_and_cpp() {

sammccall wrote:
> ilya-biryukov wrote:
> > sammccall wrote:
> > > this is not needed, the `#include` is implicit in TestTU
> > > 
> > > (and so you don't need to specify HeaderFilename either)
> > Done.
> > 
> > I did not expect the `TestTU` to do that, actually. Is this implicit 
> > `#include` something we want?
> > Maybe we should just have a default convention for naming the headers 
> > instead and the default `Code` value that only includes the header?
> > E.g. say that a file `test_header.h` is provided by TestTU and let the 
> > tests specify `#include "test_header.h"` if needed. WDYT?
> Missed this - definitely making sure there's a #include in the cc file that 
> lines up to the name/location of the header file is one of the main things I 
> wanted this helper to do. Maybe the documentation can be more clear?
SG, I'd definitely mention that the file is included and the way it's done, 
i.e. via `-include` flag passed to clang.



Comment at: unittests/clangd/QualityTests.cpp:139
+  WithProximity.ProximityScore = 0.2;
+  EXPECT_LT(Default.evaluate(), WithProximity.evaluate());
 }

sammccall wrote:
> nit: can you EXPECT_GT and reverse the order? (default is on the right in all 
> tests above, even the positive signals for symbolquality)
I always use `<` and `<=` in all the code I write. That gives the "natural" 
order of the arguments. (argument to the left is less)
So it's completely opposite for me: I find the `Default.evaluate()` on the left 
side to be more readable.

Feel free to change it, though. Just wanted to point out this style can be more 
or less readable based on the personal preferences.


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46943



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-06-04 Thread Sam McCall via Phabricator via cfe-commits
sammccall added inline comments.



Comment at: unittests/clangd/QualityTests.cpp:129
+  Test.Code = R"cpp(
+#include "foo.h"
+int ::test_func_in_header_and_cpp() {

ilya-biryukov wrote:
> sammccall wrote:
> > this is not needed, the `#include` is implicit in TestTU
> > 
> > (and so you don't need to specify HeaderFilename either)
> Done.
> 
> I did not expect the `TestTU` to do that, actually. Is this implicit 
> `#include` something we want?
> Maybe we should just have a default convention for naming the headers instead 
> and the default `Code` value that only includes the header?
> E.g. say that a file `test_header.h` is provided by TestTU and let the tests 
> specify `#include "test_header.h"` if needed. WDYT?
Missed this - definitely making sure there's a #include in the cc file that 
lines up to the name/location of the header file is one of the main things I 
wanted this helper to do. Maybe the documentation can be more clear?


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46943



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-06-04 Thread Sam McCall via Phabricator via cfe-commits
sammccall added a comment.

oops, just saw this is closed - will fix these nits as I'm touching this file 
soon




Comment at: clangd/Quality.h:68
+  /// Proximity between the best declaration and the query location. [0-1] 
score
+  /// where 1 is closest
+  float ProximityScore = 0;

(nit: trailing .)



Comment at: unittests/clangd/QualityTests.cpp:139
+  WithProximity.ProximityScore = 0.2;
+  EXPECT_LT(Default.evaluate(), WithProximity.evaluate());
 }

nit: can you EXPECT_GT and reverse the order? (default is on the right in all 
tests above, even the positive signals for symbolquality)


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46943



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-06-04 Thread Phabricator via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rCTE333906: [clangd] Boost scores for decls from current file 
in completion (authored by ibiryukov, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D46943?vs=149765&id=149771#toc

Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46943

Files:
  clangd/Quality.cpp
  clangd/Quality.h
  unittests/clangd/QualityTests.cpp

Index: unittests/clangd/QualityTests.cpp
===
--- unittests/clangd/QualityTests.cpp
+++ unittests/clangd/QualityTests.cpp
@@ -58,17 +58,46 @@
 }
 
 TEST(QualityTests, SymbolRelevanceSignalExtraction) {
-  auto AST = TestTU::withHeaderCode(R"cpp(
+  TestTU Test;
+  Test.HeaderCode = R"cpp(
+int test_func_in_header();
+int test_func_in_header_and_cpp();
+)cpp";
+  Test.Code = R"cpp(
+int ::test_func_in_header_and_cpp() {
+}
+int test_func_in_cpp();
+
 [[deprecated]]
-int f() { return 0; }
-  )cpp")
- .build();
-
-  SymbolRelevanceSignals Relevance;
-  Relevance.merge(CodeCompletionResult(&findDecl(AST, "f"), /*Priority=*/42,
-   nullptr, false, /*Accessible=*/false));
-  EXPECT_EQ(Relevance.NameMatch, SymbolRelevanceSignals().NameMatch);
-  EXPECT_TRUE(Relevance.Forbidden);
+int test_deprecated() { return 0; }
+  )cpp";
+  auto AST = Test.build();
+
+  SymbolRelevanceSignals Deprecated;
+  Deprecated.merge(CodeCompletionResult(&findDecl(AST, "test_deprecated"),
+/*Priority=*/42, nullptr, false,
+/*Accessible=*/false));
+  EXPECT_EQ(Deprecated.NameMatch, SymbolRelevanceSignals().NameMatch);
+  EXPECT_TRUE(Deprecated.Forbidden);
+
+  // Test proximity scores.
+  SymbolRelevanceSignals FuncInCpp;
+  FuncInCpp.merge(CodeCompletionResult(&findDecl(AST, "test_func_in_cpp"),
+   CCP_Declaration));
+  /// Decls in the current file should get a proximity score of 1.0.
+  EXPECT_FLOAT_EQ(FuncInCpp.ProximityScore, 1.0);
+
+  SymbolRelevanceSignals FuncInHeader;
+  FuncInHeader.merge(CodeCompletionResult(&findDecl(AST, "test_func_in_header"),
+  CCP_Declaration));
+  /// Decls outside current file currently don't get a proximity score boost.
+  EXPECT_FLOAT_EQ(FuncInHeader.ProximityScore, 0.0);
+
+  SymbolRelevanceSignals FuncInHeaderAndCpp;
+  FuncInHeaderAndCpp.merge(CodeCompletionResult(
+  &findDecl(AST, "test_func_in_header_and_cpp"), CCP_Declaration));
+  /// Decls in both header **and** the main file get the same boost.
+  EXPECT_FLOAT_EQ(FuncInHeaderAndCpp.ProximityScore, 1.0);
 }
 
 // Do the signals move the scores in the direction we expect?
@@ -104,6 +133,10 @@
   SymbolRelevanceSignals PoorNameMatch;
   PoorNameMatch.NameMatch = 0.2f;
   EXPECT_LT(PoorNameMatch.evaluate(), Default.evaluate());
+
+  SymbolRelevanceSignals WithProximity;
+  WithProximity.ProximityScore = 0.2;
+  EXPECT_LT(Default.evaluate(), WithProximity.evaluate());
 }
 
 TEST(QualityTests, SortText) {
Index: clangd/Quality.cpp
===
--- clangd/Quality.cpp
+++ clangd/Quality.cpp
@@ -8,6 +8,8 @@
 //===-===//
 #include "Quality.h"
 #include "index/Index.h"
+#include "clang/AST/ASTContext.h"
+#include "clang/Basic/SourceManager.h"
 #include "clang/Sema/CodeCompleteConsumer.h"
 #include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/MathExtras.h"
@@ -17,9 +19,18 @@
 namespace clangd {
 using namespace llvm;
 
+static bool hasDeclInMainFile(const Decl &D) {
+  auto &SourceMgr = D.getASTContext().getSourceManager();
+  for (auto *Redecl : D.redecls()) {
+auto Loc = SourceMgr.getSpellingLoc(Redecl->getLocation());
+if (SourceMgr.isWrittenInMainFile(Loc))
+  return true;
+  }
+  return false;
+}
+
 void SymbolQualitySignals::merge(const CodeCompletionResult &SemaCCResult) {
   SemaCCPriority = SemaCCResult.Priority;
-
   if (SemaCCResult.Availability == CXAvailability_Deprecated)
 Deprecated = true;
 }
@@ -60,12 +71,25 @@
   if (SemaCCResult.Availability == CXAvailability_NotAvailable ||
   SemaCCResult.Availability == CXAvailability_NotAccessible)
 Forbidden = true;
+
+  if (SemaCCResult.Declaration) {
+// We boost things that have decls in the main file.
+// The real proximity scores would be more general when we have them.
+float DeclProximity =
+hasDeclInMainFile(*SemaCCResult.Declaration) ? 1.0 : 0.0;
+ProximityScore = std::max(DeclProximity, ProximityScore);
+  }
 }
 
 float SymbolRelevanceSignals::evaluate() const {
   if (Forbidden)
 return 0;
-  return NameMatch;
+
+  float Score = NameMatch;
+  // Proximity scores are [0,1] and we translate them into a multiplier in the
+  // range from 1 to 2.
+  Score *= 

[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-06-04 Thread Ilya Biryukov via Phabricator via cfe-commits
ilya-biryukov added inline comments.



Comment at: clangd/Quality.h:52
   unsigned References = 0;
+  float ProximityScore = 0.0; /// Proximity score, in a [0,1] interval.
 

sammccall wrote:
> this belongs in `SymbolRelevanceSignals` rather than this struct. In 
> principle, SymbolQualitySignals should all be stuff we can store in a global 
> index (which is the point of the split). I should probably add a comment to 
> that effect :-)
> 
> You could be a little more specific on the semantics, e.g. "Proximity between 
> the best declaration and the query location. [0-1] score where 1 is closest."
Hah, I totally missed the `SymbolRelevanceSignals`. Thanks, moved it there and 
added a comment per suggestions.



Comment at: unittests/clangd/QualityTests.cpp:121
 
+TEST(QualityTests, BoostCurrentFileDecls) {
+  TestTU Test;

sammccall wrote:
> consider merging into SymbolRelevanceSignalsExtraction test, which tests the 
> same entrypoint.
> 
> If not, move up next to that one.
Merged them together. It is now a bit more verbose. In case you have more 
suggestions on how to properly test this, I'm happy to address them in a 
follow-up change.



Comment at: unittests/clangd/QualityTests.cpp:129
+  Test.Code = R"cpp(
+#include "foo.h"
+int ::test_func_in_header_and_cpp() {

sammccall wrote:
> this is not needed, the `#include` is implicit in TestTU
> 
> (and so you don't need to specify HeaderFilename either)
Done.

I did not expect the `TestTU` to do that, actually. Is this implicit `#include` 
something we want?
Maybe we should just have a default convention for naming the headers instead 
and the default `Code` value that only includes the header?
E.g. say that a file `test_header.h` is provided by TestTU and let the tests 
specify `#include "test_header.h"` if needed. WDYT?



Comment at: unittests/clangd/TestTU.cpp:80
   continue;
-if (Result) {
+if (Result && ND->getCanonicalDecl() != Result) {
   ADD_FAILURE() << "Multiple Decls named " << QName;

sammccall wrote:
> well, I definitely wanted to flag this as an error (for the tests where this 
> function was introduced).
> 
> Actually I think this is wrong for your test anyway - don't you want to test 
> that no matter which decl the code completion result refers to, you get the 
> same boost?
> 
> I'd suggest adding a `findDecls()` function that returns a 
> vector, and implementing `findDecl()` in terms of it. In the 
> `test_func_in_header_and_cpp` test, you could loop over `findDecls()` and run 
> the assertions each time.
> 
> (I guess findDecls() should assert that the returned vector is non-empty? 
> Slightly weird but might catch accidentally vacuous tests)
This function got reimplemented in one of the recent changes and works for 
header decls too now.
I actually think it's fine if it returns one of the overladed Decls in case 
there are multiple overloads (i.e. multiple Decls in the AST) for the **same** 
function. It does assert for multiple overloads of the same thing currently, 
which is probably what we want.


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46943



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-06-04 Thread Ilya Biryukov via Phabricator via cfe-commits
ilya-biryukov updated this revision to Diff 149765.
ilya-biryukov marked 5 inline comments as done.
ilya-biryukov added a comment.

- Address review comments


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46943

Files:
  clangd/Quality.cpp
  clangd/Quality.h
  unittests/clangd/QualityTests.cpp

Index: unittests/clangd/QualityTests.cpp
===
--- unittests/clangd/QualityTests.cpp
+++ unittests/clangd/QualityTests.cpp
@@ -58,17 +58,46 @@
 }
 
 TEST(QualityTests, SymbolRelevanceSignalExtraction) {
-  auto AST = TestTU::withHeaderCode(R"cpp(
+  TestTU Test;
+  Test.HeaderCode = R"cpp(
+int test_func_in_header();
+int test_func_in_header_and_cpp();
+)cpp";
+  Test.Code = R"cpp(
+int ::test_func_in_header_and_cpp() {
+}
+int test_func_in_cpp();
+
 [[deprecated]]
-int f() { return 0; }
-  )cpp")
- .build();
-
-  SymbolRelevanceSignals Relevance;
-  Relevance.merge(CodeCompletionResult(&findDecl(AST, "f"), /*Priority=*/42,
-   nullptr, false, /*Accessible=*/false));
-  EXPECT_EQ(Relevance.NameMatch, SymbolRelevanceSignals().NameMatch);
-  EXPECT_TRUE(Relevance.Forbidden);
+int test_deprecated() { return 0; }
+  )cpp";
+  auto AST = Test.build();
+
+  SymbolRelevanceSignals Deprecated;
+  Deprecated.merge(CodeCompletionResult(&findDecl(AST, "test_deprecated"),
+/*Priority=*/42, nullptr, false,
+/*Accessible=*/false));
+  EXPECT_EQ(Deprecated.NameMatch, SymbolRelevanceSignals().NameMatch);
+  EXPECT_TRUE(Deprecated.Forbidden);
+
+  // Test proximity scores.
+  SymbolRelevanceSignals FuncInCpp;
+  FuncInCpp.merge(CodeCompletionResult(&findDecl(AST, "test_func_in_cpp"),
+   CCP_Declaration));
+  /// Decls in the current file should get a proximity score of 1.0.
+  EXPECT_FLOAT_EQ(FuncInCpp.ProximityScore, 1.0);
+
+  SymbolRelevanceSignals FuncInHeader;
+  FuncInHeader.merge(CodeCompletionResult(&findDecl(AST, "test_func_in_header"),
+  CCP_Declaration));
+  /// Decls outside current file currently don't get a proximity score boost.
+  EXPECT_FLOAT_EQ(FuncInHeader.ProximityScore, 0.0);
+
+  SymbolRelevanceSignals FuncInHeaderAndCpp;
+  FuncInHeaderAndCpp.merge(CodeCompletionResult(
+  &findDecl(AST, "test_func_in_header_and_cpp"), CCP_Declaration));
+  /// Decls in both header **and** the main file get the same boost.
+  EXPECT_FLOAT_EQ(FuncInHeaderAndCpp.ProximityScore, 1.0);
 }
 
 // Do the signals move the scores in the direction we expect?
@@ -104,6 +133,10 @@
   SymbolRelevanceSignals PoorNameMatch;
   PoorNameMatch.NameMatch = 0.2f;
   EXPECT_LT(PoorNameMatch.evaluate(), Default.evaluate());
+
+  SymbolRelevanceSignals WithProximity;
+  WithProximity.ProximityScore = 0.2;
+  EXPECT_LT(Default.evaluate(), WithProximity.evaluate());
 }
 
 TEST(QualityTests, SortText) {
Index: clangd/Quality.h
===
--- clangd/Quality.h
+++ clangd/Quality.h
@@ -64,6 +64,9 @@
   // 0-1 fuzzy-match score for unqualified name. Must be explicitly assigned.
   float NameMatch = 1;
   bool Forbidden = false; // Unavailable (e.g const) or inaccessible (private).
+  /// Proximity between the best declaration and the query location. [0-1] score
+  /// where 1 is closest
+  float ProximityScore = 0;
 
   void merge(const CodeCompletionResult &SemaResult);
 
Index: clangd/Quality.cpp
===
--- clangd/Quality.cpp
+++ clangd/Quality.cpp
@@ -8,6 +8,8 @@
 //===-===//
 #include "Quality.h"
 #include "index/Index.h"
+#include "clang/AST/ASTContext.h"
+#include "clang/Basic/SourceManager.h"
 #include "clang/Sema/CodeCompleteConsumer.h"
 #include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/MathExtras.h"
@@ -17,9 +19,18 @@
 namespace clangd {
 using namespace llvm;
 
+static bool hasDeclInMainFile(const Decl &D) {
+  auto &SourceMgr = D.getASTContext().getSourceManager();
+  for (auto *Redecl : D.redecls()) {
+auto Loc = SourceMgr.getSpellingLoc(Redecl->getLocation());
+if (SourceMgr.isWrittenInMainFile(Loc))
+  return true;
+  }
+  return false;
+}
+
 void SymbolQualitySignals::merge(const CodeCompletionResult &SemaCCResult) {
   SemaCCPriority = SemaCCResult.Priority;
-
   if (SemaCCResult.Availability == CXAvailability_Deprecated)
 Deprecated = true;
 }
@@ -60,12 +71,25 @@
   if (SemaCCResult.Availability == CXAvailability_NotAvailable ||
   SemaCCResult.Availability == CXAvailability_NotAccessible)
 Forbidden = true;
+
+  if (SemaCCResult.Declaration) {
+// We boost things that have decls in the main file.
+// The real proximity scores would be more general when we have them.
+float DeclProx

[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-05-24 Thread Sam McCall via Phabricator via cfe-commits
sammccall accepted this revision.
sammccall added a comment.
This revision is now accepted and ready to land.

Nice, simple and will admit refinements later.

Just test nits and a trivial organizational thing.




Comment at: clangd/Quality.cpp:22
 
+namespace {
+bool hasDeclInMainFile(const Decl &D) {

nit: per coding style use static for functions
(I'm not sure it's a great rule, but since the ns only has this function...)



Comment at: clangd/Quality.h:52
   unsigned References = 0;
+  float ProximityScore = 0.0; /// Proximity score, in a [0,1] interval.
 

this belongs in `SymbolRelevanceSignals` rather than this struct. In principle, 
SymbolQualitySignals should all be stuff we can store in a global index (which 
is the point of the split). I should probably add a comment to that effect :-)

You could be a little more specific on the semantics, e.g. "Proximity between 
the best declaration and the query location. [0-1] score where 1 is closest."



Comment at: unittests/clangd/QualityTests.cpp:96
 
 TEST(QualityTests, SymbolRelevanceSignalsSanity) {
   SymbolRelevanceSignals Default;

please add a test for proximity here



Comment at: unittests/clangd/QualityTests.cpp:121
 
+TEST(QualityTests, BoostCurrentFileDecls) {
+  TestTU Test;

consider merging into SymbolRelevanceSignalsExtraction test, which tests the 
same entrypoint.

If not, move up next to that one.



Comment at: unittests/clangd/QualityTests.cpp:129
+  Test.Code = R"cpp(
+#include "foo.h"
+int ::test_func_in_header_and_cpp() {

this is not needed, the `#include` is implicit in TestTU

(and so you don't need to specify HeaderFilename either)



Comment at: unittests/clangd/QualityTests.cpp:155
+
+  /// Check the boost in proximity translates into a better score.
+  EXPECT_LE(FuncInHeader.evaluate(), FuncInCpp.evaluate());

this tests end-to-end, but the other tests verify input -> signals and signal 
-> score separately.

I'd prefer to keep (only) doing that, for consistency and because it's 
important we know/assert precisely what each half does so we can actually debug.



Comment at: unittests/clangd/TestTU.cpp:80
   continue;
-if (Result) {
+if (Result && ND->getCanonicalDecl() != Result) {
   ADD_FAILURE() << "Multiple Decls named " << QName;

well, I definitely wanted to flag this as an error (for the tests where this 
function was introduced).

Actually I think this is wrong for your test anyway - don't you want to test 
that no matter which decl the code completion result refers to, you get the 
same boost?

I'd suggest adding a `findDecls()` function that returns a vector, 
and implementing `findDecl()` in terms of it. In the 
`test_func_in_header_and_cpp` test, you could loop over `findDecls()` and run 
the assertions each time.

(I guess findDecls() should assert that the returned vector is non-empty? 
Slightly weird but might catch accidentally vacuous tests)


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46943



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-05-24 Thread Ilya Biryukov via Phabricator via cfe-commits
ilya-biryukov updated this revision to Diff 148381.
ilya-biryukov added a comment.

- Simplify the change by boosting any Decls that have a redecl in the current 
file


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46943

Files:
  clangd/Quality.cpp
  clangd/Quality.h
  unittests/clangd/QualityTests.cpp
  unittests/clangd/TestTU.cpp

Index: unittests/clangd/TestTU.cpp
===
--- unittests/clangd/TestTU.cpp
+++ unittests/clangd/TestTU.cpp
@@ -1,5 +1,4 @@
-//===--- TestTU.cpp - Scratch source files for testing *-
-//C++-*-===//
+//===--- TestTU.cpp - Scratch source files for testing ---===//
 //
 // The LLVM Compiler Infrastructure
 //
@@ -78,11 +77,11 @@
 auto *ND = dyn_cast(D);
 if (!ND || ND->getNameAsString() != QName)
   continue;
-if (Result) {
+if (Result && ND->getCanonicalDecl() != Result) {
   ADD_FAILURE() << "Multiple Decls named " << QName;
   assert(false && "QName is not unique");
 }
-Result = ND;
+Result = cast(ND->getCanonicalDecl());
   }
   if (!Result) {
 ADD_FAILURE() << "No Decl named " << QName << " in AST";
Index: unittests/clangd/QualityTests.cpp
===
--- unittests/clangd/QualityTests.cpp
+++ unittests/clangd/QualityTests.cpp
@@ -118,6 +118,45 @@
   EXPECT_LT(sortText(0, "a"), sortText(0, "z"));
 }
 
+TEST(QualityTests, BoostCurrentFileDecls) {
+  TestTU Test;
+  Test.HeaderFilename = "foo.h";
+  Test.HeaderCode = R"cpp(
+int test_func_in_header();
+int test_func_in_header_and_cpp();
+)cpp";
+  Test.Code = R"cpp(
+#include "foo.h"
+int ::test_func_in_header_and_cpp() {
+}
+int test_func_in_cpp();
+  )cpp";
+
+  ParsedAST AST = Test.build();
+
+  SymbolQualitySignals FuncInCpp;
+  FuncInCpp.merge(CodeCompletionResult(&findDecl(AST, "test_func_in_cpp"),
+   CCP_Declaration));
+  /// Decls in the current file should get a proximity score of 1.0.
+  EXPECT_FLOAT_EQ(FuncInCpp.ProximityScore, 1.0);
+
+  SymbolQualitySignals FuncInHeader;
+  FuncInHeader.merge(CodeCompletionResult(&findDecl(AST, "test_func_in_header"),
+  CCP_Declaration));
+  /// Decls outside current file currently don't get a proximity score boost.
+  EXPECT_FLOAT_EQ(FuncInHeader.ProximityScore, 0.0);
+
+  SymbolQualitySignals FuncInHeaderAndCpp;
+  FuncInHeaderAndCpp.merge(CodeCompletionResult(
+  &findDecl(AST, "test_func_in_header_and_cpp"), CCP_Declaration));
+  /// Decls in both header **and** the main file get the same boost.
+  EXPECT_FLOAT_EQ(FuncInHeaderAndCpp.ProximityScore, 1.0);
+
+  /// Check the boost in proximity translates into a better score.
+  EXPECT_LE(FuncInHeader.evaluate(), FuncInCpp.evaluate());
+  EXPECT_FLOAT_EQ(FuncInHeaderAndCpp.evaluate(), FuncInCpp.evaluate());
+}
+
 } // namespace
 } // namespace clangd
 } // namespace clang
Index: clangd/Quality.h
===
--- clangd/Quality.h
+++ clangd/Quality.h
@@ -49,6 +49,7 @@
//quality and relevance. Untangle this.
   bool Deprecated = false;
   unsigned References = 0;
+  float ProximityScore = 0.0; /// Proximity score, in a [0,1] interval.
 
   void merge(const CodeCompletionResult &SemaCCResult);
   void merge(const Symbol &IndexResult);
Index: clangd/Quality.cpp
===
--- clangd/Quality.cpp
+++ clangd/Quality.cpp
@@ -8,6 +8,8 @@
 //===-===//
 #include "Quality.h"
 #include "index/Index.h"
+#include "clang/AST/ASTContext.h"
+#include "clang/Basic/SourceManager.h"
 #include "clang/Sema/CodeCompleteConsumer.h"
 #include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/MathExtras.h"
@@ -17,9 +19,27 @@
 namespace clangd {
 using namespace llvm;
 
+namespace {
+bool hasDeclInMainFile(const Decl &D) {
+  auto &SourceMgr = D.getASTContext().getSourceManager();
+  for (auto *Redecl : D.redecls()) {
+auto Loc = SourceMgr.getSpellingLoc(Redecl->getLocation());
+if (SourceMgr.isWrittenInMainFile(Loc))
+  return true;
+  }
+  return false;
+}
+} // namespace
+
 void SymbolQualitySignals::merge(const CodeCompletionResult &SemaCCResult) {
   SemaCCPriority = SemaCCResult.Priority;
-
+  if (SemaCCResult.Declaration) {
+// We boost things that have decls in the main file.
+// The real proximity scores would be more general when we have them.
+float DeclProximity =
+hasDeclInMainFile(*SemaCCResult.Declaration) ? 1.0 : 0.0;
+ProximityScore = std::max(DeclProximity, ProximityScore);
+  }
   if (SemaCCResult.Availability == CXAvailability_Deprecated)
 Deprecated = true;
 }
@@ -41,6 +61,10 @@
 // Priority 80 is a really bad score.
 Score *= 2 - std::min

[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-05-24 Thread Ilya Biryukov via Phabricator via cfe-commits
ilya-biryukov added a comment.

After an offline chat: we decided to boost sema results that have **any** decls 
in the main file. Even though it introduces some unwanted inconsistencies in 
some cases (e.g. see the comment), most of us agree that's a very simple 
implementation that does boost useful things, including stuff from association 
header.
To get better results, we need to design and build the real proximity scoring 
that handles associated headers and other things.


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46943



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-05-23 Thread Ilya Biryukov via Phabricator via cfe-commits
ilya-biryukov added a comment.

In https://reviews.llvm.org/D46943#1109199, @ioeric wrote:

> > - Symbols store their paths as URIs ⇒ we need to parse them in order to 
> > apply heuristics. We could avoid that by writing a version of 
> > header-matching that also works on URIs, but that would mean more 
> > complexity.
>
> Yeah, this is a bit unfortunate. It's probably OK to parse URIs; they are not 
> that expensive after all (we might want to do some measurement though).


Yeah. I really wish we had microbenchmarks for things like completion.

>> - Merging quality signals from headers now requires an extra paramater: name 
>> of the source file. I wonder if it's worth extracting builders for symbol 
>> qualities into separate classes to keep the current nice signatures, i.e. 
>> `merge(Symbol& IndexResult)`.
> 
> I'm not very familiar with `SymbolQualitySignals`. But if we decide to use 
> main file name as a signal, it might make sense to pass it in through the 
> constructor?

That's possible, but that essentially turns `SymbolQualitySignals` from a data 
class to a stateful builder.

>> - How should we match the header with the main file?  Our options are:
>>   - (proposed by Eric) use main file regex from clang-format for that. I'm 
>> not entirely sure it suits us well, since the regex is designed to work on 
>> paths inside #include directive, but we're getting ours from the Symbols and 
>> Sema AST Decls. Moreover, it means we're gonna read .clang-format files to 
>> get that style.
> 
> I think the ".clang-format problem" is not specific to the header matching 
> here. We would eventually need proper format style support in clangd anyway, 
> as clangd provides formatting features (e.g. reformat and include insertion).

Yeah, `IncludeMainRegex` does look like a useful setting from clang-format. And 
maybe using clang-format settings is a good idea there. I'm just a bit weary of 
adding this logic in this change along with the completion changes.
So I'd go with a simple heuristic for starters to solve a problem at hand. 
Happy to improve it to use clang-format regex with a follow-up change if 
everyone agrees that's a good idea. I mostly feel uneasy about the added 
complexity to the current change.




Comment at: clangd/MatchingHeader.cpp:44
+  auto HeaderNoExt = llvm::sys::path::filename(Header);
+  return SourceNoExt.equals_lower(HeaderNoExt);
+}

ioeric wrote:
> Why `equals_lower`?
A former-windows-developer habbit. I don't think it hurts in any way if we do 
`equals_lower` here, we'll merely work in those strange cases where the 
extensions are not lower-case.
This is also consistent with isHeaderFile/isSourceFile (moved from ClangdServer)




Comment at: clangd/MatchingHeader.h:1
+//===--- MatchingHeader.h - Match source and header files *- 
C++-*-===//
+//

ioeric wrote:
> I wonder if we could merge this into Headers.h
Thanks, I totally forgot we have `Headers.h`.  Will move the helpers there.



Comment at: clangd/MatchingHeader.h:24
+/// header for a \p Source.
+bool isMainHeaderFor(PathRef Header, PathRef Source);
+

ioeric wrote:
> We might want to briefly explain what a matching header is and what the 
> heuristics are.
My idea was to not elaborate before we agree on what those heuristics should be.
Given the heuristics are so simple and there are suggestions to change them, 
documenting the current behavior seems like a bad idea. I'd rather keep them a 
black box for now.
Does that make sense? Maybe add a comment that the heuristics are likely to 
change, so the users shouldn't rely on them too much?



Comment at: clangd/Quality.cpp:28-29
+struct DeclFiles {
+  bool AllDeclsInMainFile = false;
+  bool HasDeclsInMatchingHeader = false;
+};

ioeric wrote:
> Could we merge these two flags into something like 
> `IsDeclaredInMainHeaderOrFile`?
The two flags are built differently.
**Any** decl in the matching header gives a boost. Otherwise, **all** decls 
should be in the main file to get a boost.
The second one is built differently for the reasons outlined in the previous 
comments on why it might not be the best idea to boost completion items that 
have one of the inside the current file:
- It gives inconsistent ranking for different completion points (before/after 
declaration)
- The fact that a function has definition in the current file does not 
necessarily mean I want to call it more often than the others.



Comment at: clangd/Quality.cpp:40
+  assert(MainFile);
+  for (auto *Redecl : D->redecls()) {
+auto Loc = SM.getSpellingLoc(Redecl->getLocation());

ioeric wrote:
> I wonder if it's still necessary to check all `redecls`. Would it be 
> sufficient to check `D`, if `D` is the decl we referencing to?
I don't think it's sufficient. How could we compute the flags that this 
function returns by looking at just

[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-05-23 Thread Eric Liu via Phabricator via cfe-commits
ioeric added a comment.

In https://reviews.llvm.org/D46943#1107880, @ilya-biryukov wrote:

> I've added an initial version of testing for the matching header and wanted 
> to get feedback before proceeding further with tests and other changes.
>
> A few things that bug me so far:
>
> - We need to match headers of items from the index, not only from the Sema 
> results.


This sounds reasonable.

> - Symbols store their paths as URIs ⇒ we need to parse them in order to apply 
> heuristics. We could avoid that by writing a version of header-matching that 
> also works on URIs, but that would mean more complexity.

Yeah, this is a bit unfortunate. It's probably OK to parse URIs; they are not 
that expensive after all (we might want to do some measurement though).

> - Merging quality signals from headers now requires an extra paramater: name 
> of the source file. I wonder if it's worth extracting builders for symbol 
> qualities into separate classes to keep the current nice signatures, i.e. 
> `merge(Symbol& IndexResult)`.

I'm not very familiar with `SymbolQualitySignals`. But if we decide to use main 
file name as a signal, it might make sense to pass it in through the 
constructor?

> - How should we match the header with the main file?  Our options are:
>   - (proposed by Eric) use main file regex from clang-format for that. I'm 
> not entirely sure it suits us well, since the regex is designed to work on 
> paths inside #include directive, but we're getting ours from the Symbols and 
> Sema AST Decls. Moreover, it means we're gonna read .clang-format files to 
> get that style.

I think the ".clang-format problem" is not specific to the header matching 
here. We would eventually need proper format style support in clangd anyway, as 
clangd provides formatting features (e.g. reformat and include insertion).

> - Come up with our own heuristics. There is a similar place in ClangdServer 
> that matches a header with source and back. We could extend those heuristics 
> to also allow figuring out whether the paths are matching header/source. I 
> chose this option for initial implementation, since it's less work and it 
> seems easier to switch to clang-format's regex later.

Not against this option. Just want to point out that the heuristics would not 
work for test files (that include tested main headers) with the current 
matching.




Comment at: clangd/MatchingHeader.cpp:44
+  auto HeaderNoExt = llvm::sys::path::filename(Header);
+  return SourceNoExt.equals_lower(HeaderNoExt);
+}

Why `equals_lower`?



Comment at: clangd/MatchingHeader.h:1
+//===--- MatchingHeader.h - Match source and header files *- 
C++-*-===//
+//

I wonder if we could merge this into Headers.h



Comment at: clangd/MatchingHeader.h:24
+/// header for a \p Source.
+bool isMainHeaderFor(PathRef Header, PathRef Source);
+

We might want to briefly explain what a matching header is and what the 
heuristics are.



Comment at: clangd/Quality.cpp:28-29
+struct DeclFiles {
+  bool AllDeclsInMainFile = false;
+  bool HasDeclsInMatchingHeader = false;
+};

Could we merge these two flags into something like 
`IsDeclaredInMainHeaderOrFile`?



Comment at: clangd/Quality.cpp:40
+  assert(MainFile);
+  for (auto *Redecl : D->redecls()) {
+auto Loc = SM.getSpellingLoc(Redecl->getLocation());

I wonder if it's still necessary to check all `redecls`. Would it be sufficient 
to check `D`, if `D` is the decl we referencing to?


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46943



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-05-22 Thread Ilya Biryukov via Phabricator via cfe-commits
ilya-biryukov added a comment.

I've added an initial version of testing for the matching header and wanted to 
get feedback before proceeding further with tests and other changes.

A few things that bug me so far:

- We need to match headers of items from the index, not only from the Sema 
results.
- Symbols store their paths as URIs ⇒ we need to parse them in order to apply 
heuristics. We could avoid that by writing a version of header-matching that 
also works on URIs, but that would mean more complexity.
- Merging quality signals from headers now requires an extra paramater: name of 
the source file. I wonder if it's worth extracting builders for symbol 
qualities into separate classes to keep the current nice signatures, i.e. 
`merge(Symbol& IndexResult)`.
- How should we match the header with the main file?  Our options are:
  - (proposed by Eric) use main file regex from clang-format for that. I'm not 
entirely sure it suits us well, since the regex is designed to work on paths 
inside #include directive, but we're getting ours from the Symbols and Sema AST 
Decls. Moreover, it means we're gonna read .clang-format files to get that 
style.
  - Come up with our own heuristics. There is a similar place in ClangdServer 
that matches a header with source and back. We could extend those heuristics to 
also allow figuring out whether the paths are matching header/source. I chose 
this option for initial implementation, since it's less work and it seems 
easier to switch to clang-format's regex later.


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46943



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-05-22 Thread Ilya Biryukov via Phabricator via cfe-commits
ilya-biryukov updated this revision to Diff 148006.
ilya-biryukov added a comment.
Herald added a subscriber: mgorny.

- Move helpers tha switch header and source into separate files.
- Also uprank items coming from the matching headers


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46943

Files:
  clangd/AST.cpp
  clangd/AST.h
  clangd/CMakeLists.txt
  clangd/ClangdServer.cpp
  clangd/CodeComplete.cpp
  clangd/MatchingHeader.cpp
  clangd/MatchingHeader.h
  clangd/Quality.cpp
  clangd/Quality.h
  unittests/clangd/QualityTests.cpp
  unittests/clangd/TestTU.cpp

Index: unittests/clangd/TestTU.cpp
===
--- unittests/clangd/TestTU.cpp
+++ unittests/clangd/TestTU.cpp
@@ -1,5 +1,4 @@
-//===--- TestTU.cpp - Scratch source files for testing *-
-//C++-*-===//
+//===--- TestTU.cpp - Scratch source files for testing ---===//
 //
 // The LLVM Compiler Infrastructure
 //
@@ -78,11 +77,11 @@
 auto *ND = dyn_cast(D);
 if (!ND || ND->getNameAsString() != QName)
   continue;
-if (Result) {
+if (Result && ND->getCanonicalDecl() != Result) {
   ADD_FAILURE() << "Multiple Decls named " << QName;
   assert(false && "QName is not unique");
 }
-Result = ND;
+Result = cast(ND->getCanonicalDecl());
   }
   if (!Result) {
 ADD_FAILURE() << "No Decl named " << QName << " in AST";
Index: unittests/clangd/QualityTests.cpp
===
--- unittests/clangd/QualityTests.cpp
+++ unittests/clangd/QualityTests.cpp
@@ -37,15 +37,15 @@
   auto AST = Header.build();
 
   SymbolQualitySignals Quality;
-  Quality.merge(findSymbol(Symbols, "x"));
+  Quality.merge(findSymbol(Symbols, "x"), /*MainFilename=*/"foo.cpp");
   EXPECT_FALSE(Quality.Deprecated);
   EXPECT_EQ(Quality.SemaCCPriority, SymbolQualitySignals().SemaCCPriority);
   EXPECT_EQ(Quality.References, SymbolQualitySignals().References);
 
   Symbol F = findSymbol(Symbols, "f");
   F.References = 24; // TestTU doesn't count references, so fake it.
   Quality = {};
-  Quality.merge(F);
+  Quality.merge(F, /*MainFilename=*/"foo.cpp");
   EXPECT_FALSE(Quality.Deprecated); // FIXME: Include deprecated bit in index.
   EXPECT_EQ(Quality.SemaCCPriority, SymbolQualitySignals().SemaCCPriority);
   EXPECT_EQ(Quality.References, 24u);
@@ -118,6 +118,45 @@
   EXPECT_LT(sortText(0, "a"), sortText(0, "z"));
 }
 
+TEST(QualityTests, BoostCurrentFileDecls) {
+  TestTU Test;
+  Test.HeaderFilename = "foo.h";
+  Test.HeaderCode = R"cpp(
+int test_func_in_header();
+int test_func_in_header_and_cpp();
+)cpp";
+  Test.Code = R"cpp(
+#include "foo.h"
+int ::test_func_in_header_and_cpp() {
+}
+int test_func_in_cpp();
+
+int test() {
+  test_func_^
+}
+  )cpp";
+
+  ParsedAST AST = Test.build();
+
+  SymbolQualitySignals FuncInCpp;
+  FuncInCpp.merge(CodeCompletionResult(&findDecl(AST, "test_func_in_cpp"),
+   CCP_Declaration));
+  EXPECT_TRUE(FuncInCpp.AllDeclsInMainFile);
+
+  SymbolQualitySignals FuncInHeader;
+  FuncInHeader.merge(CodeCompletionResult(&findDecl(AST, "test_func_in_header"),
+  CCP_Declaration));
+  EXPECT_FALSE(FuncInHeader.AllDeclsInMainFile);
+
+  SymbolQualitySignals FuncInHeaderAndCpp;
+  FuncInHeaderAndCpp.merge(CodeCompletionResult(
+  &findDecl(AST, "test_func_in_header_and_cpp"), CCP_Declaration));
+  EXPECT_FALSE(FuncInHeaderAndCpp.AllDeclsInMainFile);
+
+  EXPECT_LE(FuncInHeader.evaluate(), FuncInCpp.evaluate());
+  EXPECT_FLOAT_EQ(FuncInHeader.evaluate(), FuncInHeaderAndCpp.evaluate());
+}
+
 } // namespace
 } // namespace clangd
 } // namespace clang
Index: clangd/Quality.h
===
--- clangd/Quality.h
+++ clangd/Quality.h
@@ -47,11 +47,13 @@
   unsigned SemaCCPriority = 0; // 1-80, 1 is best. 0 means absent.
// FIXME: this is actually a mix of symbol
//quality and relevance. Untangle this.
+  bool AllDeclsInMainFile = true;
+  bool HasDeclsInMatchingHeader = false;
   bool Deprecated = false;
   unsigned References = 0;
 
   void merge(const CodeCompletionResult &SemaCCResult);
-  void merge(const Symbol &IndexResult);
+  void merge(const Symbol &IndexResult, llvm::StringRef MainFilename);
 
   // Condense these signals down to a single number, higher is better.
   float evaluate() const;
Index: clangd/Quality.cpp
===
--- clangd/Quality.cpp
+++ clangd/Quality.cpp
@@ -7,7 +7,13 @@
 //
 //===-===//
 #include "Quality.h"
+#include "AST.h"
+#include "Logger.h"
+#include "MatchingHeader.h"
+#include "URI.h"
 #include "index/Index.h"
+#include "clang/AST/ASTContext.h"
+#include "cl

[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-05-17 Thread Ilya Biryukov via Phabricator via cfe-commits
ilya-biryukov added inline comments.



Comment at: clangd/Quality.cpp:24
+  if (SemaCCResult.Declaration)
+AllDeclsInMainFile = allDeclsInMainFile(SemaCCResult.Declaration);
   if (SemaCCResult.Availability == CXAvailability_Deprecated)

ioeric wrote:
> ioeric wrote:
> > ilya-biryukov wrote:
> > > sammccall wrote:
> > > > ioeric wrote:
> > > > > Could you explain why `AllDeclsInMainFile` is necessary? I think it 
> > > > > might still be fine to boost score for symbols with a declaration in 
> > > > > the main file even if it has redecls in other files (e.g. function 
> > > > > fwd in headers).
> > > > Agree. I think the better signal is (any) decl in main file.
> > > My intuition was that it does not make any sense to functions if there 
> > > definitions are in the same cpp file, because it does not give any useful 
> > > signals on whether those should be preferably called in the same file.
> > > Also, some defs may not be visible to the compiler at the point of 
> > > completion and, therefore, won't get a boost, even if they are in the 
> > > same file. This is inconsistent.
> > > 
> > > E.g., 
> > > 
> > > ```
> > > // === foo.h
> > > class Foo {
> > >   int foo();
> > >   int bar();
> > >   int baz();
> > >   int test();
> > > };
> > > 
> > > int glob_foo();
> > > int glob_bar();
> > > int glob_baz();
> > > 
> > > // === foo.cpp
> > > #include "foo.h"
> > > static some_local_func() {}
> > > 
> > > Foo::foo() {
> > > };
> > > 
> > > int ::glob_foo() {
> > > }
> > > 
> > > Foo::test() {
> > >^
> > >// out of all functions declared in headers, only foo and global_foo 
> > > will get a boost here. That does not make much sense, since:
> > >// - glob_bar() and Foo::bar() are also declared in the same file, but 
> > > compiler hasn't seen them yet, so they won't get a boost.
> > >// - glob_baz() and Foo::baz() are not declared in the same file, but 
> > > they are so close to `bar` in the header, 
> > >//   and it doesn't seem to make sense to rank them differently.
> > > }
> > > 
> > > Foo::bar() {
> > > }
> > > 
> > > int ::glob_bar() {
> > > }
> > > ```
> > > 
> > > Why upranking decls **only** in the current file is still useful?
> > >  - Those are usually helpers that are very relevant to the file. 
> > > Especially true for small files.
> > >  - As a side-effect, we'll uprank local vars and parameters (they can't 
> > > have decls in other files :-)), that seems useful. Maybe such implicit 
> > > effects are not desirable, though.
> > > 
> > > I should've definitely documented this better. If we decide this change 
> > > is useful, I'll add more docs.
> > > My intuition was that it does not make any sense to functions if there 
> > > definitions are in the same cpp file, because it does not give any useful 
> > > signals on whether those should be preferably called in the same file.
> > Not sure if I understand this. Could you explain why?
> > 
> > > Also, some defs may not be visible to the compiler at the point of 
> > > completion and, therefore, won't get a boost, even if they are in the 
> > > same file. This is inconsistent.
> > I think this is somewhat expected as that symbol might not be visible to me 
> > e.g. a helper function defined after me.
> > 
> > > out of all functions declared in headers, only foo and global_foo will 
> > > get a boost here. That does not make much sense, since: 
> > I think you made a very good point here: for a .cc main file, symbols 
> > declared/defined in the main header (e.g. `x.h` for `x.cc`) are also 
> > interesting and should get a boost, so instead of only boosting symbols 
> > that are only declared in the main file, I would suggest also boost symbols 
> > in its corresponding header. We would need some heuristics to identify main 
> > header though (clang-format uses base file name sans extension which seems 
> > to work pretty well).
> > 
> > Thanks for the example! I think it's very useful for discussions.
> > 
> > 
> > 
> >> I think this is somewhat expected as that symbol might not be visible to 
> >> me e.g. a helper function defined after me.
> By "me", I mean I am a decl ;)
> 
> 
>> My intuition was that it does not make any sense to functions if there 
>> definitions are in the same cpp file, because it does not give any useful 
>> signals on whether those should be preferably called in the same file.
> Not sure if I understand this. Could you explain why?
If I'm implementing a class, relevance of all class methods for me is probably 
the same, no matter if I have a definition for some of those in the current 
file or not.
Probably the same point you make later about boosting symbols in the matching 
header.

> I think this is somewhat expected as that symbol might not be visible to me 
> e.g. a helper function defined after me.

A helper function defined later **only** in the current file is not visible in 
completion and this is (arguably) fine, since you can't refer to it anymore.

[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-05-17 Thread Eric Liu via Phabricator via cfe-commits
ioeric added inline comments.



Comment at: clangd/Quality.cpp:24
+  if (SemaCCResult.Declaration)
+AllDeclsInMainFile = allDeclsInMainFile(SemaCCResult.Declaration);
   if (SemaCCResult.Availability == CXAvailability_Deprecated)

ioeric wrote:
> ilya-biryukov wrote:
> > sammccall wrote:
> > > ioeric wrote:
> > > > Could you explain why `AllDeclsInMainFile` is necessary? I think it 
> > > > might still be fine to boost score for symbols with a declaration in 
> > > > the main file even if it has redecls in other files (e.g. function fwd 
> > > > in headers).
> > > Agree. I think the better signal is (any) decl in main file.
> > My intuition was that it does not make any sense to functions if there 
> > definitions are in the same cpp file, because it does not give any useful 
> > signals on whether those should be preferably called in the same file.
> > Also, some defs may not be visible to the compiler at the point of 
> > completion and, therefore, won't get a boost, even if they are in the same 
> > file. This is inconsistent.
> > 
> > E.g., 
> > 
> > ```
> > // === foo.h
> > class Foo {
> >   int foo();
> >   int bar();
> >   int baz();
> >   int test();
> > };
> > 
> > int glob_foo();
> > int glob_bar();
> > int glob_baz();
> > 
> > // === foo.cpp
> > #include "foo.h"
> > static some_local_func() {}
> > 
> > Foo::foo() {
> > };
> > 
> > int ::glob_foo() {
> > }
> > 
> > Foo::test() {
> >^
> >// out of all functions declared in headers, only foo and global_foo 
> > will get a boost here. That does not make much sense, since:
> >// - glob_bar() and Foo::bar() are also declared in the same file, but 
> > compiler hasn't seen them yet, so they won't get a boost.
> >// - glob_baz() and Foo::baz() are not declared in the same file, but 
> > they are so close to `bar` in the header, 
> >//   and it doesn't seem to make sense to rank them differently.
> > }
> > 
> > Foo::bar() {
> > }
> > 
> > int ::glob_bar() {
> > }
> > ```
> > 
> > Why upranking decls **only** in the current file is still useful?
> >  - Those are usually helpers that are very relevant to the file. Especially 
> > true for small files.
> >  - As a side-effect, we'll uprank local vars and parameters (they can't 
> > have decls in other files :-)), that seems useful. Maybe such implicit 
> > effects are not desirable, though.
> > 
> > I should've definitely documented this better. If we decide this change is 
> > useful, I'll add more docs.
> > My intuition was that it does not make any sense to functions if there 
> > definitions are in the same cpp file, because it does not give any useful 
> > signals on whether those should be preferably called in the same file.
> Not sure if I understand this. Could you explain why?
> 
> > Also, some defs may not be visible to the compiler at the point of 
> > completion and, therefore, won't get a boost, even if they are in the same 
> > file. This is inconsistent.
> I think this is somewhat expected as that symbol might not be visible to me 
> e.g. a helper function defined after me.
> 
> > out of all functions declared in headers, only foo and global_foo will get 
> > a boost here. That does not make much sense, since: 
> I think you made a very good point here: for a .cc main file, symbols 
> declared/defined in the main header (e.g. `x.h` for `x.cc`) are also 
> interesting and should get a boost, so instead of only boosting symbols that 
> are only declared in the main file, I would suggest also boost symbols in its 
> corresponding header. We would need some heuristics to identify main header 
> though (clang-format uses base file name sans extension which seems to work 
> pretty well).
> 
> Thanks for the example! I think it's very useful for discussions.
> 
> 
> 
>> I think this is somewhat expected as that symbol might not be visible to me 
>> e.g. a helper function defined after me.
By "me", I mean I am a decl ;)




Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46943



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-05-17 Thread Eric Liu via Phabricator via cfe-commits
ioeric added inline comments.



Comment at: clangd/Quality.cpp:24
+  if (SemaCCResult.Declaration)
+AllDeclsInMainFile = allDeclsInMainFile(SemaCCResult.Declaration);
   if (SemaCCResult.Availability == CXAvailability_Deprecated)

ilya-biryukov wrote:
> sammccall wrote:
> > ioeric wrote:
> > > Could you explain why `AllDeclsInMainFile` is necessary? I think it might 
> > > still be fine to boost score for symbols with a declaration in the main 
> > > file even if it has redecls in other files (e.g. function fwd in headers).
> > Agree. I think the better signal is (any) decl in main file.
> My intuition was that it does not make any sense to functions if there 
> definitions are in the same cpp file, because it does not give any useful 
> signals on whether those should be preferably called in the same file.
> Also, some defs may not be visible to the compiler at the point of completion 
> and, therefore, won't get a boost, even if they are in the same file. This is 
> inconsistent.
> 
> E.g., 
> 
> ```
> // === foo.h
> class Foo {
>   int foo();
>   int bar();
>   int baz();
>   int test();
> };
> 
> int glob_foo();
> int glob_bar();
> int glob_baz();
> 
> // === foo.cpp
> #include "foo.h"
> static some_local_func() {}
> 
> Foo::foo() {
> };
> 
> int ::glob_foo() {
> }
> 
> Foo::test() {
>^
>// out of all functions declared in headers, only foo and global_foo will 
> get a boost here. That does not make much sense, since:
>// - glob_bar() and Foo::bar() are also declared in the same file, but 
> compiler hasn't seen them yet, so they won't get a boost.
>// - glob_baz() and Foo::baz() are not declared in the same file, but they 
> are so close to `bar` in the header, 
>//   and it doesn't seem to make sense to rank them differently.
> }
> 
> Foo::bar() {
> }
> 
> int ::glob_bar() {
> }
> ```
> 
> Why upranking decls **only** in the current file is still useful?
>  - Those are usually helpers that are very relevant to the file. Especially 
> true for small files.
>  - As a side-effect, we'll uprank local vars and parameters (they can't have 
> decls in other files :-)), that seems useful. Maybe such implicit effects are 
> not desirable, though.
> 
> I should've definitely documented this better. If we decide this change is 
> useful, I'll add more docs.
> My intuition was that it does not make any sense to functions if there 
> definitions are in the same cpp file, because it does not give any useful 
> signals on whether those should be preferably called in the same file.
Not sure if I understand this. Could you explain why?

> Also, some defs may not be visible to the compiler at the point of completion 
> and, therefore, won't get a boost, even if they are in the same file. This is 
> inconsistent.
I think this is somewhat expected as that symbol might not be visible to me 
e.g. a helper function defined after me.

> out of all functions declared in headers, only foo and global_foo will get a 
> boost here. That does not make much sense, since: 
I think you made a very good point here: for a .cc main file, symbols 
declared/defined in the main header (e.g. `x.h` for `x.cc`) are also 
interesting and should get a boost, so instead of only boosting symbols that 
are only declared in the main file, I would suggest also boost symbols in its 
corresponding header. We would need some heuristics to identify main header 
though (clang-format uses base file name sans extension which seems to work 
pretty well).

Thanks for the example! I think it's very useful for discussions.





Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46943



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-05-17 Thread Ilya Biryukov via Phabricator via cfe-commits
ilya-biryukov added inline comments.



Comment at: clangd/AST.h:32
+/// Returns true iff all redecls of \p D are in the main file.
+bool allDeclsInMainFile(const Decl *D);
 

sammccall wrote:
> Do you expect this to be reused?
> If so, unit test? Otherwise this seems small enough to move where it's used.
I'd say we could reuse it, yes. Even though the function is pretty 
straightforward, there are details that should be consistent across the whole 
codebase (i.e. which locations to check, spelling, expansion?)
I'll add unittests.


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46943



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-05-17 Thread Ilya Biryukov via Phabricator via cfe-commits
ilya-biryukov updated this revision to Diff 147301.
ilya-biryukov added a comment.

- Move tests to QualityTests.cpp
- Fix findDecl() assertion on redecls of the same thing
- Fix file comment of TestTU.cpp


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46943

Files:
  clangd/AST.cpp
  clangd/AST.h
  clangd/Quality.cpp
  clangd/Quality.h
  unittests/clangd/QualityTests.cpp
  unittests/clangd/TestTU.cpp

Index: unittests/clangd/TestTU.cpp
===
--- unittests/clangd/TestTU.cpp
+++ unittests/clangd/TestTU.cpp
@@ -1,5 +1,4 @@
-//===--- TestTU.cpp - Scratch source files for testing *-
-//C++-*-===//
+//===--- TestTU.cpp - Scratch source files for testing ---===//
 //
 // The LLVM Compiler Infrastructure
 //
@@ -78,11 +77,11 @@
 auto *ND = dyn_cast(D);
 if (!ND || ND->getNameAsString() != QName)
   continue;
-if (Result) {
+if (Result && ND->getCanonicalDecl() != Result) {
   ADD_FAILURE() << "Multiple Decls named " << QName;
   assert(false && "QName is not unique");
 }
-Result = ND;
+Result = cast(ND->getCanonicalDecl());
   }
   if (!Result) {
 ADD_FAILURE() << "No Decl named " << QName << " in AST";
Index: unittests/clangd/QualityTests.cpp
===
--- unittests/clangd/QualityTests.cpp
+++ unittests/clangd/QualityTests.cpp
@@ -118,6 +118,45 @@
   EXPECT_LT(sortText(0, "a"), sortText(0, "z"));
 }
 
+TEST(QualityTests, BoostCurrentFileDecls) {
+  TestTU Test;
+  Test.HeaderFilename = "foo.h";
+  Test.HeaderCode = R"cpp(
+int test_func_in_header();
+int test_func_in_header_and_cpp();
+)cpp";
+  Test.Code = R"cpp(
+#include "foo.h"
+int ::test_func_in_header_and_cpp() {
+}
+int test_func_in_cpp();
+
+int test() {
+  test_func_^
+}
+  )cpp";
+
+  ParsedAST AST = Test.build();
+
+  SymbolQualitySignals FuncInCpp;
+  FuncInCpp.merge(CodeCompletionResult(&findDecl(AST, "test_func_in_cpp"),
+   CCP_Declaration));
+  EXPECT_TRUE(FuncInCpp.AllDeclsInMainFile);
+
+  SymbolQualitySignals FuncInHeader;
+  FuncInHeader.merge(CodeCompletionResult(&findDecl(AST, "test_func_in_header"),
+  CCP_Declaration));
+  EXPECT_FALSE(FuncInHeader.AllDeclsInMainFile);
+
+  SymbolQualitySignals FuncInHeaderAndCpp;
+  FuncInHeaderAndCpp.merge(CodeCompletionResult(
+  &findDecl(AST, "test_func_in_header_and_cpp"), CCP_Declaration));
+  EXPECT_FALSE(FuncInHeaderAndCpp.AllDeclsInMainFile);
+
+  EXPECT_LE(FuncInHeader.evaluate(), FuncInCpp.evaluate());
+  EXPECT_FLOAT_EQ(FuncInHeader.evaluate(), FuncInHeaderAndCpp.evaluate());
+}
+
 } // namespace
 } // namespace clangd
 } // namespace clang
Index: clangd/Quality.h
===
--- clangd/Quality.h
+++ clangd/Quality.h
@@ -47,6 +47,7 @@
   unsigned SemaCCPriority = 0; // 1-80, 1 is best. 0 means absent.
// FIXME: this is actually a mix of symbol
//quality and relevance. Untangle this.
+  bool AllDeclsInMainFile = false;
   bool Deprecated = false;
   unsigned References = 0;
 
Index: clangd/Quality.cpp
===
--- clangd/Quality.cpp
+++ clangd/Quality.cpp
@@ -7,6 +7,7 @@
 //
 //===-===//
 #include "Quality.h"
+#include "AST.h"
 #include "index/Index.h"
 #include "clang/Sema/CodeCompleteConsumer.h"
 #include "llvm/Support/FormatVariadic.h"
@@ -19,7 +20,8 @@
 
 void SymbolQualitySignals::merge(const CodeCompletionResult &SemaCCResult) {
   SemaCCPriority = SemaCCResult.Priority;
-
+  if (SemaCCResult.Declaration)
+AllDeclsInMainFile = allDeclsInMainFile(SemaCCResult.Declaration);
   if (SemaCCResult.Availability == CXAvailability_Deprecated)
 Deprecated = true;
 }
@@ -41,6 +43,10 @@
 // Priority 80 is a really bad score.
 Score *= 2 - std::min(80, SemaCCPriority) / 40;
 
+  // Things declared in the main file get a large boost.
+  if (AllDeclsInMainFile)
+Score *= 2;
+
   if (Deprecated)
 Score *= 0.1;
 
Index: clangd/AST.h
===
--- clangd/AST.h
+++ clangd/AST.h
@@ -26,7 +26,10 @@
 ///
 /// The returned location is usually the spelling location where the name of the
 /// decl occurs in the code.
-SourceLocation findNameLoc(const clang::Decl* D);
+SourceLocation findNameLoc(const Decl *D);
+
+/// Returns true iff all redecls of \p D are in the main file.
+bool allDeclsInMainFile(const Decl *D);
 
 } // namespace clangd
 } // namespace clang
Index: clangd/AST.cpp
===
--- clangd/AST.cpp
+++ clangd/AST.cpp
@@ -17,8 +17,8 @@
 namespace clangd {
 using names

[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-05-17 Thread Ilya Biryukov via Phabricator via cfe-commits
ilya-biryukov added inline comments.



Comment at: clangd/Quality.cpp:24
+  if (SemaCCResult.Declaration)
+AllDeclsInMainFile = allDeclsInMainFile(SemaCCResult.Declaration);
   if (SemaCCResult.Availability == CXAvailability_Deprecated)

sammccall wrote:
> ioeric wrote:
> > Could you explain why `AllDeclsInMainFile` is necessary? I think it might 
> > still be fine to boost score for symbols with a declaration in the main 
> > file even if it has redecls in other files (e.g. function fwd in headers).
> Agree. I think the better signal is (any) decl in main file.
My intuition was that it does not make any sense to functions if there 
definitions are in the same cpp file, because it does not give any useful 
signals on whether those should be preferably called in the same file.
Also, some defs may not be visible to the compiler at the point of completion 
and, therefore, won't get a boost, even if they are in the same file. This is 
inconsistent.

E.g., 

```
// === foo.h
class Foo {
  int foo();
  int bar();
  int baz();
  int test();
};

int glob_foo();
int glob_bar();
int glob_baz();

// === foo.cpp
#include "foo.h"
static some_local_func() {}

Foo::foo() {
};

int ::glob_foo() {
}

Foo::test() {
   ^
   // out of all functions declared in headers, only foo and global_foo will 
get a boost here. That does not make much sense, since:
   // - glob_bar() and Foo::bar() are also declared in the same file, but 
compiler hasn't seen them yet, so they won't get a boost.
   // - glob_baz() and Foo::baz() are not declared in the same file, but they 
are so close to `bar` in the header, 
   //   and it doesn't seem to make sense to rank them differently.
}

Foo::bar() {
}

int ::glob_bar() {
}
```

Why upranking decls **only** in the current file is still useful?
 - Those are usually helpers that are very relevant to the file. Especially 
true for small files.
 - As a side-effect, we'll uprank local vars and parameters (they can't have 
decls in other files :-)), that seems useful. Maybe such implicit effects are 
not desirable, though.

I should've definitely documented this better. If we decide this change is 
useful, I'll add more docs.


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46943



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-05-16 Thread Sam McCall via Phabricator via cfe-commits
sammccall added inline comments.



Comment at: clangd/AST.h:32
+/// Returns true iff all redecls of \p D are in the main file.
+bool allDeclsInMainFile(const Decl *D);
 

Do you expect this to be reused?
If so, unit test? Otherwise this seems small enough to move where it's used.



Comment at: clangd/Quality.cpp:24
+  if (SemaCCResult.Declaration)
+AllDeclsInMainFile = allDeclsInMainFile(SemaCCResult.Declaration);
   if (SemaCCResult.Availability == CXAvailability_Deprecated)

ioeric wrote:
> Could you explain why `AllDeclsInMainFile` is necessary? I think it might 
> still be fine to boost score for symbols with a declaration in the main file 
> even if it has redecls in other files (e.g. function fwd in headers).
Agree. I think the better signal is (any) decl in main file.



Comment at: unittests/clangd/CodeCompleteTests.cpp:965
 
+TEST(CompletionTest, BoostCurrentFileDecls) {
+  MockFSProvider FS;

This should really be unit tested in QualityTests. I think instead rather than 
in addition - I'd like to get away from code completion ranking tests for every 
quality signal, it's really indirect and hard to maintain/review.



Comment at: unittests/clangd/CodeCompleteTests.cpp:966
+TEST(CompletionTest, BoostCurrentFileDecls) {
+  MockFSProvider FS;
+  FS.Files[testPath("foo.h")] = R"cpp(

(in QualityTests, we should be able to continue using TestTU)


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46943



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-05-16 Thread Eric Liu via Phabricator via cfe-commits
ioeric added inline comments.



Comment at: clangd/Quality.cpp:24
+  if (SemaCCResult.Declaration)
+AllDeclsInMainFile = allDeclsInMainFile(SemaCCResult.Declaration);
   if (SemaCCResult.Availability == CXAvailability_Deprecated)

Could you explain why `AllDeclsInMainFile` is necessary? I think it might still 
be fine to boost score for symbols with a declaration in the main file even if 
it has redecls in other files (e.g. function fwd in headers).


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46943



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

2018-05-16 Thread Ilya Biryukov via Phabricator via cfe-commits
ilya-biryukov created this revision.
ilya-biryukov added reviewers: ioeric, sammccall.
Herald added subscribers: mgrang, jkorous, MaskRay, klimek.

This should, arguably, give better ranking.


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D46943

Files:
  clangd/AST.cpp
  clangd/AST.h
  clangd/Quality.cpp
  clangd/Quality.h
  unittests/clangd/CodeCompleteTests.cpp

Index: unittests/clangd/CodeCompleteTests.cpp
===
--- unittests/clangd/CodeCompleteTests.cpp
+++ unittests/clangd/CodeCompleteTests.cpp
@@ -962,6 +962,48 @@
   }
 }
 
+TEST(CompletionTest, BoostCurrentFileDecls) {
+  MockFSProvider FS;
+  FS.Files[testPath("foo.h")] = R"cpp(
+int test_func_in_header();
+int test_func_in_header_and_cpp();
+)cpp";
+
+  MockCompilationDatabase CDB;
+  IgnoreDiagnostics DiagConsumer;
+  ClangdServer Server(CDB, FS, DiagConsumer, ClangdServer::optsForTest());
+  llvm::StringLiteral Text = R"cpp(
+#include "foo.h"
+int ::test_func_in_header_and_cpp() {
+}
+int test_func_in_cpp();
+
+int test() {
+  test_func_^
+}
+  )cpp";
+
+  auto Results =
+  completions(Server, Text, {}, clangd::CodeCompleteOptions()).items;
+  std::sort(Results.begin(), Results.end(),
+[](const CompletionItem &L, const CompletionItem &R) {
+  return L.sortText < R.sortText;
+});
+  ASSERT_THAT(Results,
+  UnorderedElementsAre(Named("test_func_in_cpp"),
+   Named("test_func_in_header"),
+   Named("test_func_in_header_and_cpp")));
+
+  auto &FuncInCpp = Results[0];
+  auto &FuncInHeader = Results[1];
+  auto &FuncInHeaderAndCpp = Results[2];
+
+  EXPECT_LE(FuncInHeader.scoreInfo->finalScore,
+FuncInCpp.scoreInfo->finalScore);
+  EXPECT_FLOAT_EQ(FuncInHeader.scoreInfo->finalScore,
+  FuncInHeaderAndCpp.scoreInfo->finalScore);
+}
+
 } // namespace
 } // namespace clangd
 } // namespace clang
Index: clangd/Quality.h
===
--- clangd/Quality.h
+++ clangd/Quality.h
@@ -47,6 +47,7 @@
   unsigned SemaCCPriority = 0; // 1-80, 1 is best. 0 means absent.
// FIXME: this is actually a mix of symbol
//quality and relevance. Untangle this.
+  bool AllDeclsInMainFile = false;
   bool Deprecated = false;
   unsigned References = 0;
 
Index: clangd/Quality.cpp
===
--- clangd/Quality.cpp
+++ clangd/Quality.cpp
@@ -7,6 +7,7 @@
 //
 //===-===//
 #include "Quality.h"
+#include "AST.h"
 #include "index/Index.h"
 #include "clang/Sema/CodeCompleteConsumer.h"
 #include "llvm/Support/FormatVariadic.h"
@@ -19,7 +20,8 @@
 
 void SymbolQualitySignals::merge(const CodeCompletionResult &SemaCCResult) {
   SemaCCPriority = SemaCCResult.Priority;
-
+  if (SemaCCResult.Declaration)
+AllDeclsInMainFile = allDeclsInMainFile(SemaCCResult.Declaration);
   if (SemaCCResult.Availability == CXAvailability_Deprecated)
 Deprecated = true;
 }
@@ -41,6 +43,10 @@
 // Priority 80 is a really bad score.
 Score *= 2 - std::min(80, SemaCCPriority) / 40;
 
+  // Things declared in the main file get a large boost.
+  if (AllDeclsInMainFile)
+Score *= 2;
+
   if (Deprecated)
 Score *= 0.1;
 
Index: clangd/AST.h
===
--- clangd/AST.h
+++ clangd/AST.h
@@ -26,7 +26,10 @@
 ///
 /// The returned location is usually the spelling location where the name of the
 /// decl occurs in the code.
-SourceLocation findNameLoc(const clang::Decl* D);
+SourceLocation findNameLoc(const Decl *D);
+
+/// Returns true iff all redecls of \p D are in the main file.
+bool allDeclsInMainFile(const Decl *D);
 
 } // namespace clangd
 } // namespace clang
Index: clangd/AST.cpp
===
--- clangd/AST.cpp
+++ clangd/AST.cpp
@@ -17,8 +17,8 @@
 namespace clangd {
 using namespace llvm;
 
-SourceLocation findNameLoc(const clang::Decl* D) {
-  const auto& SM = D->getASTContext().getSourceManager();
+SourceLocation findNameLoc(const Decl *D) {
+  const auto &SM = D->getASTContext().getSourceManager();
   // FIXME: Revisit the strategy, the heuristic is limitted when handling
   // macros, we should use the location where the whole definition occurs.
   SourceLocation SpellingLoc = SM.getSpellingLoc(D->getLocation());
@@ -38,5 +38,15 @@
   return SpellingLoc;
 }
 
+bool allDeclsInMainFile(const Decl *D) {
+  const SourceManager &SM = D->getASTContext().getSourceManager();
+  for (auto *Redecl : D->redecls()) {
+auto Loc = Redecl->getLocation();
+if (!SM.isWrittenInMainFile(Loc))
+  return false;
+  }
+  return true;
+}
+
 } // namespace clangd
 } // namespace