[PATCH] D71508: [DebugInfo] Duplicate file names in debug info

2019-12-13 Thread kamlesh kumar via Phabricator via cfe-commits
kamleshbhalui created this revision.
kamleshbhalui added reviewers: probinson, dblaikie, vsk, labath.
kamleshbhalui added a project: debug-info.
Herald added subscribers: cfe-commits, aprantl.
Herald added a project: clang.

strip/remove the dot slash before creating files for debug info.
This fixes https://bugs.llvm.org/show_bug.cgi?id=44170

I am unable to add a test for this.
since it requires to pass command like this.
$clang ./t.c -g -emit-llvm -S


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D71508

Files:
  clang/lib/CodeGen/CGDebugInfo.cpp


Index: clang/lib/CodeGen/CGDebugInfo.cpp
===
--- clang/lib/CodeGen/CGDebugInfo.cpp
+++ clang/lib/CodeGen/CGDebugInfo.cpp
@@ -409,6 +409,10 @@
 // If the location is not valid then use main input file.
 return TheCU->getFile();
 
+  if (!llvm::sys::path::is_absolute(FileName)) {
+  FileName = llvm::sys::path::remove_leading_dotslash(FileName);
+}
+
   // Cache the results.
   auto It = DIFileCache.find(FileName.data());
   if (It != DIFileCache.end()) {


Index: clang/lib/CodeGen/CGDebugInfo.cpp
===
--- clang/lib/CodeGen/CGDebugInfo.cpp
+++ clang/lib/CodeGen/CGDebugInfo.cpp
@@ -409,6 +409,10 @@
 // If the location is not valid then use main input file.
 return TheCU->getFile();
 
+  if (!llvm::sys::path::is_absolute(FileName)) {
+  FileName = llvm::sys::path::remove_leading_dotslash(FileName);
+}
+
   // Cache the results.
   auto It = DIFileCache.find(FileName.data());
   if (It != DIFileCache.end()) {
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71508: [DebugInfo] Duplicate file names in debug info

2019-12-14 Thread David Stenberg via Phabricator via cfe-commits
dstenb added a comment.

It should be possible to create a test for this using something like:

  RUN: rm -rf %t && mkdir %t
  RUN: cp %s %t/filename.c
  RUN: cd %t
  RUN: clang ./filename.c [...]




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71508/new/

https://reviews.llvm.org/D71508



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71508: [DebugInfo] Duplicate file names in debug info

2019-12-16 Thread Paul Robinson via Phabricator via cfe-commits
probinson added a comment.

Do we have a similar problem if the filespec has an embedded ./ or ../ in it?  
I'm thinking some broader canonicalization ought to be done here.
$ clang ./dir1/dir2/../dir3/file.c
should resolve to dir1/dir3/file.c shouldn't it?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71508/new/

https://reviews.llvm.org/D71508



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71508: [DebugInfo] Duplicate file names in debug info

2019-12-16 Thread kamlesh kumar via Phabricator via cfe-commits
kamleshbhalui added a comment.

In D71508#1785767 , @probinson wrote:

> Do we have a similar problem if the filespec has an embedded ./ or ../ in it?


problems  occur only when filespec starts with ./ otherwise it's fine.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71508/new/

https://reviews.llvm.org/D71508



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71508: [DebugInfo] Duplicate file names in debug info

2019-12-16 Thread Paul Robinson via Phabricator via cfe-commits
probinson added a comment.

In D71508#1786122 , @kamleshbhalui 
wrote:

> In D71508#1785767 , @probinson wrote:
>
> > Do we have a similar problem if the filespec has an embedded ./ or ../ in 
> > it?
>
>
> problems  occur only when filespec starts with ./ otherwise it's fine.


So do other kinds of paths get canonicalized elsewhere?  I'm thinking if 
there's already a place that undoes embedded .. then it should handle leading . 
as well.

Is a leading .. okay or a problem?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71508/new/

https://reviews.llvm.org/D71508



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71508: [DebugInfo] Duplicate file names in debug info

2019-12-16 Thread kamlesh kumar via Phabricator via cfe-commits
kamleshbhalui added a comment.

In D71508#1786148 , @probinson wrote:

> In D71508#1786122 , @kamleshbhalui 
> wrote:
>
> > In D71508#1785767 , @probinson 
> > wrote:
> >
> > > Do we have a similar problem if the filespec has an embedded ./ or ../ in 
> > > it?
> >
> >
> > problems  occur only when filespec starts with ./ otherwise it's fine.
>
>
> So do other kinds of paths get canonicalized elsewhere?  I'm thinking if 
> there's already a place that undoes embedded .. then it should handle leading 
> . as well.


other canonicalization happens here 
https://github.com/llvm-mirror/clang/blob/master/lib/CodeGen/CGDebugInfo.cpp#L541
and 
https://github.com/llvm-mirror/clang/blob/master/lib/CodeGen/CGDebugInfo.cpp#L441

but that does not undo any embedded . and .. anywhere,except leading ./

i.e. when we pass like
$clang .././p1.c -S -emit-llvm -g

IRGen gives single file entry like this
!1 = !DIFile(filename: ".././p1.c"

As you can see path is  not canonicalized but it atleast does not have 
duplicate file entry.

> Is a leading .. okay or a problem?

yes It's ok in the sense ,it does not create duplicate file entry in debug info.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71508/new/

https://reviews.llvm.org/D71508



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71508: [DebugInfo] Duplicate file names in debug info

2019-12-16 Thread Adrian Prantl via Phabricator via cfe-commits
aprantl added a comment.

Are we sure we want to canonicalize *before* applying -fdebug-prefix-map in 
`remapDIPath()`? Honest question, the answer could be yes :-)




Comment at: clang/lib/CodeGen/CGDebugInfo.cpp:412
 
+  if (!llvm::sys::path::is_absolute(FileName)) {
+  FileName = llvm::sys::path::remove_leading_dotslash(FileName);

I believe that this check is redundant and also part of remove_leading_dotslash?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71508/new/

https://reviews.llvm.org/D71508



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71508: [DebugInfo] Duplicate file names in debug info

2019-12-16 Thread David Blaikie via Phabricator via cfe-commits
dblaikie added a comment.

In D71508#1786212 , @kamleshbhalui 
wrote:

> In D71508#1786148 , @probinson wrote:
>
> > In D71508#1786122 , @kamleshbhalui 
> > wrote:
> >
> > > In D71508#1785767 , @probinson 
> > > wrote:
> > >
> > > > Do we have a similar problem if the filespec has an embedded ./ or ../ 
> > > > in it?
> > >
> > >
> > > problems  occur only when filespec starts with ./ otherwise it's fine.
> >
> >
> > So do other kinds of paths get canonicalized elsewhere?  I'm thinking if 
> > there's already a place that undoes embedded .. then it should handle 
> > leading . as well.
>
>
> other canonicalization happens here 
>  
> https://github.com/llvm-mirror/clang/blob/master/lib/CodeGen/CGDebugInfo.cpp#L541
>  and 
>  
> https://github.com/llvm-mirror/clang/blob/master/lib/CodeGen/CGDebugInfo.cpp#L441
>
> but that does not undo any embedded . and .. anywhere,except leading ./
>
> i.e. when we pass like
>  $clang .././p1.c -S -emit-llvm -g
>
> IRGen gives single file entry like this
>  !1 = !DIFile(filename: ".././p1.c"
>
> As you can see path is  not canonicalized but it atleast does not have 
> duplicate file entry.


Even if that .c file is referred to by a different path - what about a #include 
of that .c file (with some #ifdefs to avoid infinite include recursion) by the 
absolute path, for instance? does that still not end up with duplicate files 
with two different paths to the same file?

> 
> 
>> Is a leading .. okay or a problem?
> 
> yes It's ok in the sense ,it does not create duplicate file entry in debug 
> info.




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71508/new/

https://reviews.llvm.org/D71508



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71508: [DebugInfo] Duplicate file names in debug info

2019-12-16 Thread kamlesh kumar via Phabricator via cfe-commits
kamleshbhalui added a comment.

In D71508#1786804 , @dblaikie wrote:

> In D71508#1786212 , @kamleshbhalui 
> wrote:
>
> > In D71508#1786148 , @probinson 
> > wrote:
> >
> > > In D71508#1786122 , 
> > > @kamleshbhalui wrote:
> > >
> > > > In D71508#1785767 , @probinson 
> > > > wrote:
> > > >
> > > > > Do we have a similar problem if the filespec has an embedded ./ or 
> > > > > ../ in it?
> > > >
> > > >
> > > > problems  occur only when filespec starts with ./ otherwise it's fine.
> > >
> > >
> > > So do other kinds of paths get canonicalized elsewhere?  I'm thinking if 
> > > there's already a place that undoes embedded .. then it should handle 
> > > leading . as well.
> >
> >
> > other canonicalization happens here 
> >  
> > https://github.com/llvm-mirror/clang/blob/master/lib/CodeGen/CGDebugInfo.cpp#L541
> >  and 
> >  
> > https://github.com/llvm-mirror/clang/blob/master/lib/CodeGen/CGDebugInfo.cpp#L441
> >
> > but that does not undo any embedded . and .. anywhere,except leading ./
> >
> > i.e. when we pass like
> >  $clang .././p1.c -S -emit-llvm -g
> >
> > IRGen gives single file entry like this
> >  !1 = !DIFile(filename: ".././p1.c"
> >
> > As you can see path is  not canonicalized but it atleast does not have 
> > duplicate file entry.
>
>
> Even if that .c file is referred to by a different path - what about a 
> #include of that .c file (with some #ifdefs to avoid infinite include 
> recursion) by the absolute path, for instance? does that still not end up 
> with duplicate files with two different paths to the same file?


No, that does not happen. Because in that case we create file entry for first 
time with absolute path and later we always get the same file with 
getOrCreateFile.

The duplicate File entry only happens with the file passed by driver 
-main-file-name p1.cc which used  in creating compilation unit. 
https://github.com/llvm-mirror/clang/blob/master/lib/CodeGen/CGDebugInfo.cpp#L518

and while creating debuginfo for variable / function it refers to file by 
quering getOrCreateFile with relative path specified on commmand line and it 
sees that there is no such entry and ends up creating new file entry with this 
path hence duplication.
https://github.com/llvm-mirror/clang/blob/master/lib/CodeGen/CGDebugInfo.cpp#L406

> 
> 
>> 
>> 
>>> Is a leading .. okay or a problem?
>> 
>> yes It's ok in the sense ,it does not create duplicate file entry in debug 
>> info.




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71508/new/

https://reviews.llvm.org/D71508



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71508: [DebugInfo] Duplicate file names in debug info

2019-12-16 Thread kamlesh kumar via Phabricator via cfe-commits
kamleshbhalui added a comment.

In D71508#1786799 , @aprantl wrote:

> Are we sure we want to canonicalize *before* applying -fdebug-prefix-map in 
> `remapDIPath()`? Honest question, the answer could be yes :-)


it canonicalizes before apply -fdebug-prefix-map in `remapDIPath()`  only when 
creating compilation unit with filename passed by driver  option -main-file-name
https://github.com/llvm-mirror/clang/blob/master/lib/CodeGen/CGDebugInfo.cpp#L615

Other than that it first apply -fdebug-prefix-map in `remapDIPath()` then try 
to canonicalize.
https://github.com/llvm-mirror/clang/blob/master/lib/CodeGen/CGDebugInfo.cpp#L441


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71508/new/

https://reviews.llvm.org/D71508



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71508: [DebugInfo] Duplicate file names in debug info

2019-12-17 Thread Pavel Labath via Phabricator via cfe-commits
labath added a comment.

In D71508#1785767 , @probinson wrote:

> Do we have a similar problem if the filespec has an embedded ./ or ../ in it? 
>  I'm thinking some broader canonicalization ought to be done here.
>  $ clang ./dir1/dir2/../dir3/file.c
>  should resolve to dir1/dir3/file.c shouldn't it?


I would be very careful about aggressive canonicalization. Once you throw 
symlinks into the mix, there's very little things you can safely do. If `dir2` 
is a symlink then `dir2/..` can point pretty much anywhere. And if you use 
something like `realpath` there's no telling whether the final result actually 
be the thing you actually want to put into the debug info (your whole source 
tree could be a "symlink farm" with individual files pointing to random 
unpredictable locations).

It seems to me that the underlying problem here is that there are different 
kinds of canonicalization applied to the "main" and other files, which is 
something that the comment in CDDebugInfo::CreateCompileUnit seems to 
acknowledge. However, I don't know anything about this code to say how/if it is 
possible to fix that...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71508/new/

https://reviews.llvm.org/D71508



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits