> On Jul 31, 2014, at 10:18 AM, Zachary Turner <[email protected]> wrote:
> 
> Just to expand a little more since my last post, the PathSyntax to the 
> constructor and to SetFile() has a default value, which will do the right 
> thing 99% of the time by selecting Host syntax.  You only need to specify it 
> if you know what you're doing, so to speak.  But without being able to set 
> it, there would be no way to have FileSpec represent a path on a remote 
> machine.

So Jim changed my mind after speaking with him about specifying the PathSyntax, 
so I agree that is the right way to go.

> 
> As for the performance cost of normalization, SmallString<64> only does a 
> malloc if the string is longer than 64 characters.  I could try increasing 
> the number, I think 128 would cover probably 95% of all file paths.  I agree 
> in principle that it would be nice to not normalize the string on every file 
> set, one idea would be to never normalize it if PathSyntax = Posix (since 
> Normalized form is by definition Posix form) and always normalize it 
> otherwise.

I still think we should avoid making the copy if it is already normalized. I 
would be fine if posix never normalized but it does for other syntaxes. 

>  But then to avoid the copy I would need to change the SetFile()'s signature 
> to accept a SmallString instead of a const char*.  

Why can't Normalize and Denormalize return a bool: false if no normalization 
was a needed, and true if it was normalized. Then just have FileSpec::SetFile() 
use a llvm::StringRef after the call to normalize which will contain the raw 
"pathname" if no normalization was needed or the contents of "normalized" if it 
was normalized.


> I actually thinking changing raw strings to LLVM strings is a good change, 
> but it's probably going to result in some unrelated churn fixing up function 
> signatures elsewhere, so doing that iteratively as a separate change might 
> make more sense.

Feel free to make two FileSpec::SetFile() functions:

void
FileSpec::SetFile (const char *path_cstr);

void
FileSpec::SetFile (const llvm::StringRef &path);

Then the const char * version would call the llvm::StringRef version. We don't 
to allow just the llvm::StringRef version because if you pass in NULL StringRef 
will crash with an assert.

> Thoughts?

Let me know what you think about the above comments.

Greg

> 
> 
> On Wed, Jul 30, 2014 at 6:50 PM, Greg Clayton <[email protected]> wrote:
> Looks ok, a few questions:
> 
> Why does the user ever want to specify the PathSyntax? Shouldn't it just be 
> something that can be queried and be set automagically?
> 
> What if I did:
> 
> FileSpec f("C:\Users\me\foo.txt", false, ePathSyntaxPosix);
> 
> If I set it incorrectly, seems like it might "do the wrong thing" in some 
> calls. It might be nice to use a C++11 typed bitfield to store the PathSyntax 
> to keep the size of the FileSpec class down to as small as possible (uint8_t).
> 
> All the other functionality seems fine.
> 
> A few things to keep in mind:
> 
> All debug info is parsed and stored into FileSpec objects. They must match 
> exactly what the user types. What if the user types in:
> 
> (lldb) breakpoint set --file C:/Users/me/foo.txt
> 
> Note the wrong slashes. Are we alway normalizing a path to be native so that 
> this kind of compare will always work even if the user types it in wrong?
> 
> Seems like we shouldn't allow the user to specify the PathSyntax or we can 
> run into problems.
> 
> Also seems like we should check if a string needs to be normalized before 
> calling Normalize and allocating memory and copying and replacing things 
> which can be expensive (think parsing thousands of line table files).
> 
> So I would vote to:
> 1 - fix it so the user never specifies PathSyntax, but can query it if desired
> 2 - I would rather not normalize the path on every FileSpec::SetFile(...) if 
> we can avoid it as it will be expensive
> 3 - If we do need to normalize it every time, either have Normalize return 
> false and not touch the llvm::SmallString<64> argument (no mallocs) if it 
> doesn't need to be normalized to avoid expensive allocations if not required.
> 
> > On Jul 30, 2014, at 4:49 PM, Zachary Turner <[email protected]> wrote:
> >
> >>> ! In D4675#11, @deepak2427 wrote:
> >> Thanks for fixing this. I haven't had a chance to test this patch with our 
> >> cross-platform issues yet, have to look at some other issues first.
> >> Just a couple of doubts I had.
> >>
> >> - Are the paths always going to default as ePathSyntaxHostNative and then 
> >> converted based on the Host?
> >> - In 733       llvm::SmallString<64> denormalized; , is the length of 64 
> >> enough?
> >> - I'm not that familiar with LLVM data structures, however would it be 
> >> better to keep llvm::SmallString itself for the Normalize/Denormalize 
> >> functions to keep it consistent?
> >
> > Paths default to SyntaxHostNative because that makes the semantics 
> > post-patch equivalent to pre-patch.  Normalized form is basically unix 
> > form, so if the syntax ends up as ePathSyntaxPosix (either because it was 
> > explicitly specified or because host was specified and your host is posix), 
> > then normalization and de-normalization are both no-ops.
> >
> > SmallString<64> just means that it keeps 64 bytes on the stack, and if it 
> > grows longer than that, it will fallback to help allocation.  It still 
> > supports arbitrarily long strings, but paths tend to be small, so that's 
> > why a small number was chosen.
> >
> > http://reviews.llvm.org/D4675
> >
> >
> >
> > _______________________________________________
> > lldb-commits mailing list
> > [email protected]
> > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-commits
> 
> 

_______________________________________________
lldb-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/lldb-commits

Reply via email to