jingham added a comment.

In https://reviews.llvm.org/D26883#600683, @labath wrote:

> I don't know how deep you want this refactor to be, but there is one issue 
> I would like us to consider, if only to decide it is out of scope of this 
> change. I am talking about the `quote_char` thingy. The main problem for me 
> is that I don't think it's possible to sanely define the meaning of that 
> field. According to POSIX quoting rules (which our command line more-or-less 
> follows) a single argument can be quoted in a great many ways, using various 
> combinations of quote characters. For example, these are all valid ways to 
> represent the argument `asdf` in a POSIX shell:
>
>   asdf
>   "asdf"
>   'asdf'
>   a"sd"f
>   "as"df
>   "as""df"
>   "as"'df'
>   "a"s'd'"f"
>   ... (you get my point)
>
>
> I don't think there is a self-consistent way to define what the `quote_char` 
> field will be for each of these options. Moreover, I don't see why one would 
> ever need to use that field. It can only encourage someone to try to "quote" 
> the argument by doing `quote_char+value+quote_char`, which is absolutely 
> wrong if you ever want that result to be machine parsable.(*) For proper 
> quoting I think we should just have a free-standing `std::string 
> quote_for_posix_shell(llvm::StringRef)` function (and maybe 
> `quote_for_windows_cmd`, and whatever else quoting scheme we need), and then 
> the user can decide which one to use based on who is going to be consuming 
> it. Then we can just kill the `quote` field. The only thing is... I have no 
> idea how much work that will be (but I am ready to chip in to make it happen).
>
> So, yea, if we decide not to do that, then I think the interface looks great. 
> Otherwise, I think we can design a slightly simpler (and more consistent) one.
>
> (*) Bonus question: Try to start an executable under lldb, so that it enters 
> `main()` with `argc=2` and `argv[1]="'"`, i.e., as if it had been started 
> this way via bash:
>
>   $ /bin/cat \'
>
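
For illustration, the free-standing quoting function proposed in the quoted 
text might look roughly like the sketch below. It assumes the common POSIX 
idiom of wrapping the whole argument in single quotes and spelling each 
embedded single quote as `'\''`. `std::string_view` stands in for 
`llvm::StringRef` to keep the example self-contained. Nothing here is existing 
lldb code, and it targets a POSIX shell, not the lldb command parser.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper, not part of lldb: quote `arg` so that a POSIX shell
// parses the result back into exactly one argument equal to `arg`.
std::string quote_for_posix_shell(std::string_view arg) {
  std::string out;
  out.reserve(arg.size() + 2);
  out.push_back('\'');
  for (char c : arg) {
    if (c == '\'')
      out += "'\\''"; // close the quote, emit an escaped ', reopen the quote
    else
      out.push_back(c);
  }
  out.push_back('\'');
  return out;
}
```

With this sketch, the argument from the bonus question (a lone `'`) would be 
emitted as `''\'''`, which a POSIX shell reads back as a single `'`.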


The outermost quote character is syntactically significant in the lldb command 
language.  If you say:

  memory read -c `count_var` 0x123345

then lldb evaluates the expression in the backticks, replacing the argument 
with the result of the expression.  So you can't get rid of the quote character 
altogether.  The other use is in completion to figure out what an unterminated 
quoted string should be completed with.

Since only the outermost quoting character is important, I think the problem is 
more tractable than your examples would suggest.
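
As a rough sketch of what "only the outermost quoting character" could mean in 
code, the helper below reports the quote character a raw argument token opens 
with, if any. Both the function name and the reading of "outermost" as "the 
token's first character" are assumptions made for illustration. This is not 
the actual lldb implementation.

```cpp
#include <string_view>

// Hypothetical helper, not lldb code: return the quote character that opens
// the raw token (', " or `), or '\0' if the token does not start with one.
char outermost_quote_char(std::string_view raw_token) {
  if (raw_token.empty())
    return '\0';
  switch (raw_token.front()) {
  case '\'':
  case '"':
  case '`':
    return raw_token.front();
  default:
    return '\0';
  }
}
```

Under that assumption, `` `count_var` `` reports a backtick, so the caller 
knows to evaluate it as an expression, while a token such as `a"sd"f` reports 
no outermost quote at all.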


https://reviews.llvm.org/D26883



_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
