I agree that code points are the right thing to use (at least for now). There
is one key advantage:
As we are only breaking strings, not joining them, clang-format will
rarely do the wrong thing with correctly formatted code. Currently, if we
encounter a multi-byte Unicode character, we end up breaking the string too
early. This affects basically any long/multiline comment or string containing
such characters. With this patch, authors using double-width characters won't
feel the joy of clang-format automatically breaking up their strings (in the
right place), but once they have manually broken them, clang-format will leave
them alone. To be fair, clang-format might still do the wrong thing in
situations like:
SomeFunction("string with double-width characters would bring this close to column 80", AnotherParameter);
However, I suspect those to be quite rare. And I agree with James that
accounting for display width might be a dangerous road to follow. After all,
double-width characters are not always double-width. I have seen font
renderers using 1.5 columns, and then what?
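
To make the byte vs. code point distinction concrete, here is a minimal sketch
of counting code points in a UTF-8 string. The name countCodePoints is
hypothetical and is not taken from the patch; the sketch assumes valid UTF-8
input and deliberately says nothing about display columns, which is exactly the
part that stays unsolved for double-width characters.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper for illustration only, not the code under review.
// Counts UTF-8 code points by skipping continuation bytes (10xxxxxx).
// Assumes Text is valid UTF-8; it does not compute display width.
std::size_t countCodePoints(const std::string &Text) {
  std::size_t Count = 0;
  for (unsigned char C : Text)
    if ((C & 0xC0) != 0x80) // Every byte except a continuation byte starts a code point.
      ++Count;
  return Count;
}
```

For the Cyrillic word from the unit test below, "\xD0\x93\xD0\xBB\xD1\x8F\xD0\xB6\xD1\x83",
the byte count is 10 while the code point count is 5, which is why measuring in
bytes breaks the line too early. A CJK character such as "\xE4\xB8\x80" is 3
bytes and 1 code point, but typically occupies 2 columns.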
================
Comment at: lib/Format/Utils.h:1
@@ +1,2 @@
+//===--- Utils.h - Format C++ code ----------------------------------------===//
+//
----------------
Please don't call this "Utils"; that is far too generic. How about "Encodings"?
I think hex/octal escape sequences are also a kind of encoding.
================
Comment at: unittests/Format/FormatTest.cpp:4931
@@ +4930,3 @@
+
+TEST_F(FormatTest, SplitsUTF8BlockComments) {
+ EXPECT_EQ("/* Гляжу,\n"
----------------
If I am correct, the Chinese characters are just numbers. I hope the Russian
characters don't mean anything offensive ;-)
================
Comment at: lib/Format/FormatToken.h:96
@@ -94,3 +95,3 @@
/// with the token.
unsigned TokenLength;
----------------
How about we make these slightly easier to understand and shorter?
What are the remaining usages of TokenLength? Would it make sense to rename
that to "ByteCount"? And would it then make sense to rename CodePointCount to
"TokenLength"? Or even just "Length", as we are already in a ...Token class?
http://llvm-reviews.chandlerc.com/D918