Re: compiling C w/cygwin vs. -mno-cygwin; inconsistent C behavior

2008-03-17 Thread Brian Dessent
Linda Walsh wrote:

 or the nocyg, it will try to edit 3 files.  When I am invoking
 the redirector, I'm using 1 set of double quotes:
 gvim "file with space"
 in both versions w/cyg & w/o-cyg.  cmd.exe also requires quoting filenames
 with spaces the same as bash.

When you use plain unadorned double quotes, they tell the shell that
you're typing the command into how to group arguments, but they do not
exist past there.  In other words, bash uses the quotes to reconstruct
that you want argv[1] to equal 'file with space' but argv[1] itself does
not contain any quotes.  So if you then turn around and pass that
argv[1] on to the MSVCRT exec() which does no quoting, the grouping is
lost.
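
To make the loss of grouping concrete, here is a minimal C sketch of what happens when an argv is joined with plain spaces and no quoting. The helper name join_unquoted is hypothetical, and this is not MSVCRT's actual code; it only assumes, as described above, that no quoting is applied when the command line is built.

```c
#include <stdlib.h>
#include <string.h>

/* Join argv[0..argc-1] with single spaces and no quoting.
 * Returns a malloc'd string (caller frees), or NULL on allocation failure. */
char *join_unquoted(int argc, char *const argv[])
{
    size_t len = 1;                        /* room for the terminating NUL */
    for (int i = 0; i < argc; i++)
        len += strlen(argv[i]) + 1;        /* argument plus one separator */

    char *out = malloc(len);
    if (out == NULL)
        return NULL;

    out[0] = '\0';
    for (int i = 0; i < argc; i++) {
        if (i > 0)
            strcat(out, " ");
        strcat(out, argv[i]);
    }
    return out;
}
```

Given the two elements gvim and file with space, the result is a single string in which nothing marks where the second argument started or ended, so a child that splits on whitespace sees four words.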

 For some reason, the multiple args in the redirector are being merged -- I
 can get the quoting to work in the no-cyg case by using 2 sets of quotes:
 gvim '"file with space"'

When you quote quotes, that means the shell sees them as integral parts
of the command rather than metacharacters that are to be interpreted and
discarded.  So the invoked program will have them in its argv.  You
could also use "\"file with space\"" if you wanted to expand variables
inside the string.

 So cygwin pays attention, and my re-passing the args via execv
 preserves the quoting of filenames, but the no-cyg version appears
 to take the argv[1..#] arguments and merge them into 1 argument.
 To do the same in the no-cyg version, I'd have to peel each arg
 off, put quotes around it, then call execv with everything requoted.

Or just link your MinGW version with the MinGW -lexecwrap library.
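
For reference, the manual approach described in the quoted text (peel each argument off, put quotes around it, then call execv with everything requoted) might look roughly like the sketch below. requote_argv is a hypothetical helper, not part of MinGW or MSVCRT, and it deliberately ignores arguments that already contain double quotes or end in a backslash.

```c
#include <stdlib.h>
#include <string.h>

/* Return a new NULL-terminated argv in which every element that is empty
 * or contains a space or tab is wrapped in double quotes.
 * Returns NULL on allocation failure; the caller frees each element
 * and then the array itself. */
char **requote_argv(int argc, char *const argv[])
{
    char **out = malloc(((size_t)argc + 1) * sizeof *out);
    if (out == NULL)
        return NULL;

    for (int i = 0; i < argc; i++) {
        size_t len = strlen(argv[i]);
        int needs_quotes = (len == 0) || strpbrk(argv[i], " \t") != NULL;

        out[i] = malloc(len + 3);          /* two quotes plus the NUL */
        if (out[i] == NULL) {
            while (i-- > 0)                /* release what was built so far */
                free(out[i]);
            free(out);
            return NULL;
        }
        if (needs_quotes) {
            out[i][0] = '"';
            memcpy(out[i] + 1, argv[i], len);
            out[i][len + 1] = '"';
            out[i][len + 2] = '\0';
        } else {
            memcpy(out[i], argv[i], len + 1);
        }
    }
    out[argc] = NULL;
    return out;
}
```

In a -mno-cygwin build the returned array would presumably be passed to execv() in place of the original argv, so that the quotes end up in the command line the child receives.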

 So you are saying that when I call execv in cygwin, it unpacks
 my 'argv', and makes a new 'argv' with quotes around each string?
 and that is what gets passed to create process?  Does it use
 single or double quotes?

Not quite.  What Cygwin does depends on whether it's exec()ing a Cygwin
binary or a non-Cygwin binary.  In the case of a Cygwin binary this is
all moot because the argv is handed directly to the child through
internal communications, bypassing Windows, so there is no need for any
quoting at all.  When a Cygwin binary exec()s a non-Cygwin binary it
first constructs an appropriate command line that concatenates argv,
inserting quotes around any elements that contain whitespace.  Note
again that it does not create a new argv because there is no such thing
as an argv in Windows: what a child gets is a command line, and if it
wants it in the form of argv it has to synthesize it from that.
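
As an illustration of the concatenate-and-quote step, here is a sketch that builds one command line from an argv. It is not Cygwin's actual code. The names join_quoted, append_quoted and put_backslashes are hypothetical, and the backslash handling is an assumption based on the Microsoft C runtime's documented rules for parsing command lines, which go beyond the whitespace case mentioned above.

```c
#include <stdlib.h>
#include <string.h>

/* Write n backslashes at p and return the position after them. */
static char *put_backslashes(char *p, size_t n)
{
    while (n-- > 0)
        *p++ = '\\';
    return p;
}

/* Append one argument at p, quoting it only if it is empty or contains
 * a space, tab or double quote.  Returns the position after it. */
static char *append_quoted(char *p, const char *arg)
{
    if (arg[0] != '\0' && strpbrk(arg, " \t\"") == NULL) {
        size_t n = strlen(arg);
        memcpy(p, arg, n);
        return p + n;
    }

    *p++ = '"';
    for (;; arg++) {
        size_t run = 0;                    /* length of a backslash run */
        while (*arg == '\\') {
            arg++;
            run++;
        }
        if (*arg == '\0') {
            /* Double trailing backslashes so the closing quote stays a quote. */
            p = put_backslashes(p, 2 * run);
            break;
        }
        if (*arg == '"')
            p = put_backslashes(p, 2 * run + 1);   /* escape the quote itself */
        else
            p = put_backslashes(p, run);           /* backslashes are literal */
        *p++ = *arg;
    }
    *p++ = '"';
    return p;
}

/* Build a single command line from argv.  Caller frees the result. */
char *join_quoted(int argc, char *const argv[])
{
    size_t cap = 1;
    for (int i = 0; i < argc; i++)
        cap += 2 * strlen(argv[i]) + 3;    /* worst case per argument */

    char *out = malloc(cap);
    if (out == NULL)
        return NULL;

    char *p = out;
    for (int i = 0; i < argc; i++) {
        if (i > 0)
            *p++ = ' ';
        p = append_quoted(p, argv[i]);
    }
    *p = '\0';
    return out;
}
```

The output is still one flat string, which matches the point made above: the child never receives an array, only a command line from which an argv can be rebuilt.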

 But the executed program, with or without cygwin, already has the
 arguments parsed when main is invoked.
 I.e. when either the cygwin or no-cyg program is invoked, they both
 have the same view of the arguments in 'argv'.

main() is a fiction that is invented by the CRT startup to support the C
language.  It is by no means the actual entrypoint of the program.  One
of the functions of the CRT startup code is to retrieve the command line
from the operating system and parse it into words to populate argv. 
That doesn't mean argv was passed to the process, it just means that it
will be synthesized for code that expects to have one.  This is optional
by the way.  If you want to write a traditional Win32 app the entrypoint
is WinMain:

#include <windows.h>
#include <stdio.h>

int WINAPI
WinMain (HINSTANCE hInst, HINSTANCE hPrev, LPSTR lpCmd, int nShow)
{
  puts ("Hello world.");
  return 0;
}

You can compile this both with Cygwin and MinGW and it will work fine. 
Note that the parameters passed to the program are nothing like the C
argc/argv.  lpCmd is a pointer to a null terminated string containing
the command line, there is no array of arguments anywhere.
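
The synthesis of argv from a flat command line can be sketched in a few lines of C. This is a simplified illustration, not the real CRT startup code: split_cmdline is a hypothetical name, only spaces, tabs and double quotes are treated specially, and backslash escapes are ignored.

```c
#include <stddef.h>

/* Split cmdline into words, treating double quotes as grouping markers
 * that are not copied into the result.
 * buf must hold at least strlen(cmdline) + 1 bytes; argv must have room
 * for max_args pointers (max_args >= 1), including the terminating NULL.
 * Returns the number of words stored. */
int split_cmdline(const char *cmdline, char *buf, char **argv, int max_args)
{
    int argc = 0;
    const char *s = cmdline;
    char *d = buf;

    for (;;) {
        while (*s == ' ' || *s == '\t')    /* skip separators */
            s++;
        if (*s == '\0' || argc >= max_args - 1)
            break;

        argv[argc++] = d;
        int in_quotes = 0;
        while (*s != '\0' && (in_quotes || (*s != ' ' && *s != '\t'))) {
            if (*s == '"')
                in_quotes = !in_quotes;    /* toggle grouping, drop the quote */
            else
                *d++ = *s;
            s++;
        }
        *d++ = '\0';
    }
    argv[argc] = NULL;
    return argc;
}
```

Fed the line gvim "file with space" this yields two words, and fed the same line without quotes it yields four, which is the difference between the two behaviors discussed in this thread.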

 It's execv that's falling down, not doing its job.  My arguments
 are already parsed and separated, but the no-cyg version of execv
 is mushing them all back together, while cygwin invokes the
 next program, apparently with quotes of some sort, around the
 contents of each, separate, argv[] string.

It would not be the first time that someone found MSVCRT less than
adequate.  Again, the MinGW project has a convenient set of wrappers for
just this reason.

 Isn't MSVCRT the startup code?

No, it's just the opposite: it is the C library minus the startup code. 
The startup code is linked in with each binary, whereas MSVCRT.DLL is
the common library code.  When you use MinGW (= use -mno-cygwin) you are
using the MinGW project's startup code but everything else is MSVCRT. 
Including execv().

 I don't think it is a MS problem exactly -- it appears to be a
 broken implementation of execv.  When I call execv, the different

And whose implementation of execv() do you think that is exactly?  It's
not Cygwin's.  It's not MinGW's.  It's certainly not gcc's.  It's
Microsoft's.  Again, this is the whole point of MinGW, to use the
existing Microsoft C library of the operating system so that the program
can run without any accompanying libraries.

 I'd say that the no-cyg version of execv isn't maintaining the separation
 

Re: compiling C w/cygwin vs. -mno-cygwin; inconsistent C behavior

2008-03-16 Thread Linda Walsh

Brian Dessent wrote:

Linda Walsh wrote:


When I use the no-cygwin version, filenames with spaces in them get split into
separate arguments, but if I run the cygwin version, the file name isn't split
on space boundaries.

I'm 'guessing', but shouldn't the breaking apart of arguments behave
the same whether I compile with cygwin or -mno-cygwin?


No, what you're seeing is totally expected behavior.  In native windows
if you want to support filenames with spaces, you have to include
physical quote characters in the command line.


Well that's just special.  But something doesn't add up.  How
could the cygwin wrapper know what character strings to bind together as a
filename?  If I say gvim file with space, in either the cygwin version
or the nocyg, it will try to edit 3 files.  When I am invoking
the redirector, I'm using 1 set of double quotes:
gvim "file with space"
in both versions w/cyg & w/o-cyg.  cmd.exe also requires quoting filenames with
spaces the same as bash.


For some reason, the multiple args in the redirector are being merged -- I
can get the quoting to work in the no-cyg case by using 2 sets of quotes:
gvim '"file with space"'

So cygwin pays attention, and my re-passing the args via execv
preserves the quoting of filenames, but the no-cyg version appears
to take the argv[1..#] arguments and merge them into 1 argument.
To do the same in the no-cyg version, I'd have to peel each arg
off, put quotes around it, then call execv with everything requoted.

So 'cygwin' appears to requote arguments that are passed 'in' whereas
the no-cyg version does not.  Explaining another way:

I have main(int argc,char * const argv[]);
I execute the program like this:
gvim[+/-]cygwin.exe "Filename with space".
In both the +cyg and the -cyg case, I get an argc value of 2.
With both, I get some form of the program name in argv[0], and then
I get the "Filename with space" (without the quotes) in argv[1].

Then I call execv(cmd, argv);

So you are saying that when I call execv in cygwin, it unpacks
my 'argv', and makes a new 'argv' with quotes around each string?
and that is what gets passed to create process?  Does it use
single or double quotes?


That's because
CreateProcess does not actually have an argv, there is no such thing as
an argv in windows -- a process gets created with a monolithic command
line.  If it wants that command line in the form of individual
arguments, it has to parse it (or ask the system/CRT to parse it for
it.) 


But the executed program, with or without cygwin, already has the
arguments parsed when main is invoked.
I.e. when either the cygwin or no-cyg program is invoked, they both
have the same view of the arguments in 'argv'.

It's execv that's falling down, not doing its job.  My arguments
are already parsed and separated, but the no-cyg version of execv
is mushing them all back together, while cygwin invokes the
next program, apparently with quotes of some sort, around the
contents of each, separate, argv[] string.




That means that the only way to make arguments with spaces survive
intact is by quoting.


Right ...


And Cygwin does that quoting for you.  The native runtime MSVCRT does
not, which is what is executing when you're using -mno-cygwin. 


Isn't MSVCRT the startup code?


If you don't like its behavior then take it up with Microsoft, it's out of
our hands.  There is no guarantee of consistency whatsoever, because
-mno-cygwin literally means don't use any Cygwin, use the Microsoft
runtime (MinGW).


I don't think it is a MS problem exactly -- it appears to be a
broken implementation of execv.  When I call execv, the different
file arguments are already collected together, with 1 file name
per 'argv[]' array element.  Cygwin honors arguments that are
collected together in 1-string (pointer of which is passed in
1 argv element) and somehow calls the desired program maintaining
the grouping of the contents of each argv element.  You say that
the standard way to do that is to quote the contents of each full argv
string separately, 1 quoted string/argv element.

I'd say that the no-cyg version of execv isn't maintaining the separation
of the arguments.  It's just mushing them all together and not passing
the correct argument separation to createprocess.
Am I missing something?


For what it's worth the MinGW project has a set of exec() wrappers that
help to sanitize the situation a little.

---
I think that's where the problem is.

I've also run into another unexpected behavior difference.
The no-cyg version of _my_ gcc wrapper, is automatically detaching
from the terminal and going into the background (running under bash).
Is there some reason cygwin waits around until my wrapper is finished,
but the no-cyg version (same code) doesn't?  I suppose it is
easier when launching another program with createprocess -- to hmmm...
shouldn't the exec call end or terminate the execution of the wrapper?

Since 

Re: compiling C w/cygwin vs. -mno-cygwin; inconsistent C behavior

2008-03-16 Thread Greg Chicares
On 2008-03-16 23:00Z, Linda Walsh wrote:

   Isn't MSVCRT the startup code?

It's not the startup code; it's the C runtime library. When you
build your program with '-mno-cygwin', it links to the runtime
that MS provides along with the operating system, so it inherits
any shortcomings of that implementation.

