Hi!

On 2016-05-07 09:45, David Allsopp wrote:
> Andrey Repin wrote:
>> Greetings, David Allsopp!
>
> And greetings to you, too!
>
> <snip>
>
>>> I'm not using cmd, or any shell for that matter (that's actually the
>>> point) - I am in a native Win32 process invoking a Cygwin process
>>> directly using the Windows API's CreateProcess call. As it happens,
>>> the program I have already has the arguments for the Cygwin process
>>> in an array, but Windows internally requires a single command line
>>> string (which is not in any way related to Cmd).
>>
>> Then all you need is a rudimentary quoting.
>
> Yes, but the question still remains what that rudimentary quoting is - i.e.
> I can see how to quote spaces which appear in elements of argv, but I cannot
> see how to quote double quotes!
>
>> The rest will be handled by getopt when the command line is parsed.
>
> That's outside my required level - I'm interested in Cygwin's emulation
> handling the difference between an operating system which actually passes
> argc and argv when creating processes (Posix exec/spawn) and Windows (which
> only passes a single string command line). The Microsoft C Runtime and
> Windows have a "clear" (at least by MS standards) specification of how that
> single string gets converted to argv, I'm trying to determine Cygwin's -
> getopt definitely isn't part of that.
>
>>>> However, I've found Windows's interpretation to be inconsistent, so
>>>> often have to play with it to find what the "right combination" is
>>>> for a particular instance.
>>>>
>>>> I find echoing the parameters to a temporary text file and then
>>>> using the file as input to be more reliable and easier to
>>>> troubleshoot, and it breaks apart whether it is Windows cli
>>>> inconsistencies or receiving program issues very nicely with the
>>>> text file content as an intermediary
>>
>>> This is an OK tack, but I don't wish to do this by experimentation
>>> and get caught out later by a case I didn't think of, so what I'm
>>> trying to determine is *exactly* how the Cygwin DLL processes the
>>> command line via its source code so that I can present it with my
>>> argv array converted to a single command line and be certain that
>>> the Cygwin DLL will recover the same argv.
>>
>>> My reading of the relevant sources suggests that with globbing
>>> disabled, backslash escape sequences are *never* interpreted (since
>>> the quote function returns early - dcrt0.cc, line 171). If there is
>>> no way of encoding the double quote character, then perhaps I have
>>> to run with globbing enabled but ensure that the globify function
>>> will never actually expand anything - but as that's a lot of work, I
>>> was wondering if I was missing something with the simpler "noglob" case.
>>
>> The point being, when you bypass the shell and enter direct process
>> execution, you don't need much of shell magic at all.
>> Shell conventions are designed to ease interaction between system and
>> operator.
>> But you have a system talking to the system, you can be very literal.
>
> Indeed, which is why I'm trying to avoid the shell! But I can't be entirely
> literal, because Posix and Windows are not compatible, so I need to
> determine precisely how Cygwin's emulation works... and so far, it doesn't
> seem to be a terribly clearly defined animal!
>
> So, resorting to C files to try to demonstrate it further. spawn.cc seems to
> suggest that there should be some kind of escaping available, but I'm
> struggling to follow the code.
> Consider these two:
>
> callee.c
> #include <stdio.h>
> int main (int argc, char* argv[])
> {
>   int i;
>
>   printf("argc = %d\n", argc);
>   for (i = 0; i < argc; i++) {
>     printf("argv[%d] = %s\n", i, *argv++);
>   }
>   return 0;
> }
>
> caller.c
> #include <windows.h>
> #include <stdio.h>
>
> int main (void)
> {
>   LPTSTR commandLine;
>   STARTUPINFO startupInfo = {sizeof(STARTUPINFO), NULL, NULL, NULL, 0, 0,
>     0, 0, 0, 0, 0, 0, 0, 0, NULL, NULL, NULL, NULL};
>   PROCESS_INFORMATION process = {NULL, NULL, 0, 0};
>
>   commandLine = "callee.exe \"@\"te\"\n\"st fo@o bar\" \"baz baz *";
>   if (!CreateProcess("callee.exe", commandLine, NULL, NULL, FALSE, 0,
>       NULL, NULL, &startupInfo, &process)) {
>     printf("Error spawning process!\n");
>     return 1;
>   } else {
>     WaitForSingleObject(process.hProcess, INFINITE);
>     CloseHandle(process.hThread);
>     CloseHandle(process.hProcess);
>     return 0;
>   }
> }
>
> If you compile and run as follows:
>
> $ gcc -o callee callee.c
> $ i686-w64-mingw32-gcc -o caller caller.c
> $ export CYGWIN=noglob # Or the * will be expanded
> $ ./caller
>
> then the output is as required:
> argc = 6
> argv[0] = callee
> argv[1] = @te
> st
> argv[2] = fo@o
> argv[3] = bar baz
> argv[4] = baz
> argv[5] = *
>
> But if I want to embed an actual " character in any of those arguments, I
> cannot see any way to escape it which actually works at the moment. For
> example, if you change commandLine in caller.c to be
> "callee.exe test\\\" argument" then the erroneous output is:
>
> argc = 2
> argv[0] = callee
> argv[1] = test\ argument
>
> where the required output is
>
> argc = 3
> argv[0] = callee
> argv[1] = test"
> argv[2] = argument
>
> Any further clues appreciated. Is it actually even a bug?!

I think Cygwin emulates POSIX shell-style command line parsing when it is
invoked from a Win32 process (as you are doing). So, try single quotes:

  commandLine = "callee.exe \"@\"te\"\n\"st fo@o bar\" \"baz baz '*' '\"\\'\"'";

I get this (without noglob):

argc = 7
argv[0] = callee
argv[1] = @te
st
argv[2] = fo@o
argv[3] = bar baz
argv[4] = baz
argv[5] = *
argv[6] = "'"

Cheers,
Peter