[PATCH] D66511: [clang-scan-deps] Skip UTF-8 BOM in source minimizer

2019-08-26 Thread Alexandre Ganea via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL369993: [clang-scan-deps] Skip UTF-8 BOM in source minimizer 
(authored by aganea, committed by ).
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D66511?vs=216311=217275#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66511/new/

https://reviews.llvm.org/D66511

Files:
  cfe/trunk/lib/Lex/DependencyDirectivesSourceMinimizer.cpp
  cfe/trunk/test/Lexer/minimize_source_to_dependency_directives_utf8bom.c


Index: cfe/trunk/lib/Lex/DependencyDirectivesSourceMinimizer.cpp
===
--- cfe/trunk/lib/Lex/DependencyDirectivesSourceMinimizer.cpp
+++ cfe/trunk/lib/Lex/DependencyDirectivesSourceMinimizer.cpp
@@ -834,7 +834,14 @@
   return lexDefault(Kind, Id.Name, First, End);
 }
 
+static void skipUTF8ByteOrderMark(const char *, const char *const End) {
+  if ((End - First) >= 3 && First[0] == '\xef' && First[1] == '\xbb' &&
+  First[2] == '\xbf')
+First += 3;
+}
+
 bool Minimizer::minimizeImpl(const char *First, const char *const End) {
+  skipUTF8ByteOrderMark(First, End);
   while (First != End)
 if (lexPPLine(First, End))
   return true;
Index: cfe/trunk/test/Lexer/minimize_source_to_dependency_directives_utf8bom.c
===
--- cfe/trunk/test/Lexer/minimize_source_to_dependency_directives_utf8bom.c
+++ cfe/trunk/test/Lexer/minimize_source_to_dependency_directives_utf8bom.c
@@ -0,0 +1,10 @@
+// Test UTF8 BOM at start of file
+// RUN: printf '\xef\xbb\xbf' > %t.c
+// RUN: echo '#ifdef TEST\n' >> %t.c
+// RUN: echo '#include ' >> %t.c
+// RUN: echo '#endif' >> %t.c
+// RUN: %clang_cc1 -DTEST -print-dependency-directives-minimized-source %t.c 
2>&1 | FileCheck %s
+
+// CHECK:  #ifdef TEST
+// CHECK-NEXT: #include 
+// CHECK-NEXT: #endif


Index: cfe/trunk/lib/Lex/DependencyDirectivesSourceMinimizer.cpp
===
--- cfe/trunk/lib/Lex/DependencyDirectivesSourceMinimizer.cpp
+++ cfe/trunk/lib/Lex/DependencyDirectivesSourceMinimizer.cpp
@@ -834,7 +834,14 @@
   return lexDefault(Kind, Id.Name, First, End);
 }
 
+static void skipUTF8ByteOrderMark(const char *, const char *const End) {
+  if ((End - First) >= 3 && First[0] == '\xef' && First[1] == '\xbb' &&
+  First[2] == '\xbf')
+First += 3;
+}
+
 bool Minimizer::minimizeImpl(const char *First, const char *const End) {
+  skipUTF8ByteOrderMark(First, End);
   while (First != End)
 if (lexPPLine(First, End))
   return true;
Index: cfe/trunk/test/Lexer/minimize_source_to_dependency_directives_utf8bom.c
===
--- cfe/trunk/test/Lexer/minimize_source_to_dependency_directives_utf8bom.c
+++ cfe/trunk/test/Lexer/minimize_source_to_dependency_directives_utf8bom.c
@@ -0,0 +1,10 @@
+// Test UTF8 BOM at start of file
+// RUN: printf '\xef\xbb\xbf' > %t.c
+// RUN: echo '#ifdef TEST\n' >> %t.c
+// RUN: echo '#include ' >> %t.c
+// RUN: echo '#endif' >> %t.c
+// RUN: %clang_cc1 -DTEST -print-dependency-directives-minimized-source %t.c 2>&1 | FileCheck %s
+
+// CHECK:  #ifdef TEST
+// CHECK-NEXT: #include 
+// CHECK-NEXT: #endif
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66511: [clang-scan-deps] Skip UTF-8 BOM in source minimizer

2019-08-24 Thread Alexandre Ganea via Phabricator via cfe-commits
aganea marked an inline comment as done.
aganea added inline comments.



Comment at: lib/Lex/DependencyDirectivesSourceMinimizer.cpp:822
 bool Minimizer::minimizeImpl(const char *First, const char *const End) {
+  skipUTF8ByteOrderMark(First, End);
   while (First != End)

dexonsmith wrote:
> Is skipping this the right thing, or should it also be copied to the output?
The code in `Lexer::InitLexer()` assumes the files are always encoded as UTF-8, 
it simply skips over the BOM like we do here.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66511/new/

https://reviews.llvm.org/D66511



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66511: [clang-scan-deps] Skip UTF-8 BOM in source minimizer

2019-08-23 Thread Duncan P. N. Exon Smith via Phabricator via cfe-commits
dexonsmith added inline comments.



Comment at: lib/Lex/DependencyDirectivesSourceMinimizer.cpp:822
 bool Minimizer::minimizeImpl(const char *First, const char *const End) {
+  skipUTF8ByteOrderMark(First, End);
   while (First != End)

Is skipping this the right thing, or should it also be copied to the output?


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66511/new/

https://reviews.llvm.org/D66511



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66511: [clang-scan-deps] Skip UTF-8 BOM in source minimizer

2019-08-23 Thread Alex Lorenz via Phabricator via cfe-commits
arphaman accepted this revision.
arphaman added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66511/new/

https://reviews.llvm.org/D66511



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D66511: [clang-scan-deps] Skip UTF-8 BOM in source minimizer

2019-08-20 Thread Alexandre Ganea via Phabricator via cfe-commits
aganea created this revision.
aganea added reviewers: arphaman, dexonsmith, Bigcheese.
aganea added a project: clang.
Herald added a subscriber: tschuett.

As per title.


Repository:
  rC Clang

https://reviews.llvm.org/D66511

Files:
  lib/Lex/DependencyDirectivesSourceMinimizer.cpp
  test/Lexer/minimize_source_to_dependency_directives_utf8bom.c


Index: test/Lexer/minimize_source_to_dependency_directives_utf8bom.c
===
--- test/Lexer/minimize_source_to_dependency_directives_utf8bom.c
+++ test/Lexer/minimize_source_to_dependency_directives_utf8bom.c
@@ -0,0 +1,10 @@
+// Test UTF8 BOM at start of file
+// RUN: printf '\xef\xbb\xbf' > %t.c
+// RUN: echo '#ifdef TEST\n' >> %t.c
+// RUN: echo '#include ' >> %t.c
+// RUN: echo '#endif' >> %t.c
+// RUN: %clang_cc1 -DTEST -print-dependency-directives-minimized-source %t.c 
2>&1 | FileCheck %s
+
+// CHECK:  #ifdef TEST
+// CHECK-NEXT: #include 
+// CHECK-NEXT: #endif
Index: lib/Lex/DependencyDirectivesSourceMinimizer.cpp
===
--- lib/Lex/DependencyDirectivesSourceMinimizer.cpp
+++ lib/Lex/DependencyDirectivesSourceMinimizer.cpp
@@ -812,7 +812,14 @@
   return lexDefault(Kind, Id.Name, First, End);
 }
 
+static void skipUTF8ByteOrderMark(const char *, const char *const End) {
+  if ((End - First) >= 3 && First[0] == '\xef' && First[1] == '\xbb' &&
+  First[2] == '\xbf')
+First += 3;
+}
+
 bool Minimizer::minimizeImpl(const char *First, const char *const End) {
+  skipUTF8ByteOrderMark(First, End);
   while (First != End)
 if (lexPPLine(First, End))
   return true;


Index: test/Lexer/minimize_source_to_dependency_directives_utf8bom.c
===
--- test/Lexer/minimize_source_to_dependency_directives_utf8bom.c
+++ test/Lexer/minimize_source_to_dependency_directives_utf8bom.c
@@ -0,0 +1,10 @@
+// Test UTF8 BOM at start of file
+// RUN: printf '\xef\xbb\xbf' > %t.c
+// RUN: echo '#ifdef TEST\n' >> %t.c
+// RUN: echo '#include ' >> %t.c
+// RUN: echo '#endif' >> %t.c
+// RUN: %clang_cc1 -DTEST -print-dependency-directives-minimized-source %t.c 2>&1 | FileCheck %s
+
+// CHECK:  #ifdef TEST
+// CHECK-NEXT: #include 
+// CHECK-NEXT: #endif
Index: lib/Lex/DependencyDirectivesSourceMinimizer.cpp
===
--- lib/Lex/DependencyDirectivesSourceMinimizer.cpp
+++ lib/Lex/DependencyDirectivesSourceMinimizer.cpp
@@ -812,7 +812,14 @@
   return lexDefault(Kind, Id.Name, First, End);
 }
 
+static void skipUTF8ByteOrderMark(const char *, const char *const End) {
+  if ((End - First) >= 3 && First[0] == '\xef' && First[1] == '\xbb' &&
+  First[2] == '\xbf')
+First += 3;
+}
+
 bool Minimizer::minimizeImpl(const char *First, const char *const End) {
+  skipUTF8ByteOrderMark(First, End);
   while (First != End)
 if (lexPPLine(First, End))
   return true;
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits