Re: [Tinycc-devel] Basic patch for passing W9X short DOS paths to TCC.

2009-04-14 Thread lostgallifreyan

Dave Dodge  wrote:
(14/04/2009 21:18)

>> Ok. Point taken about undefined behaviour. Is the "unsigned char *p"
>> declaration enough though?
>
>Yes.  Dereferencing a valid (unsigned char *) will produce an
>(unsigned char), which by definition is safe to pass to isupper.
>
>> One mail suggested using "unsigned" at every subsequent use of the
>> variable.
>
>That's because p was a (char *), and therefore *p was producing a
>possibly-signed value.  Casting the dereferenced value to (unsigned
>char) is another way of ensuring isupper gets a usable value, but I
>think simply changing p to an (unsigned char *) is cleaner.
>

So do I. Thank you.

>BTW it's worth noting that casting from a signed integer to an
>unsigned integer is a well-defined operation, but casting from
>unsigned to signed is implementation-defined.
>

Sounds familiar. I think I read there were at least three standard ways to 
encode negative numbers, though I think one of them strongly dominates the 
others: two's complement, which counts up from 0 to total/2-1 and down from 
total until you reach total/2, where 'total' is the full count of possible 
values available.
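
A minimal sketch of the signed-to-unsigned rule quoted above, assuming nothing beyond standard C; the helper name to_unsigned_char is made up for illustration. Converting a negative value to an unsigned type is defined as wrapping modulo one more than the type's maximum, whichever encoding the machine uses for negative numbers:

```c
#include <limits.h>

/* Hypothetical helper: convert any int to unsigned char.
 * Standard C defines this as reduction modulo UCHAR_MAX + 1,
 * so the result does not depend on how negatives are encoded. */
static unsigned char to_unsigned_char(int v)
{
    return (unsigned char)v;
}
```

So -1 always becomes UCHAR_MAX, and -20 becomes UCHAR_MAX - 19. The reverse direction, storing an out-of-range unsigned value in a signed type, is the implementation-defined case.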



___
Tinycc-devel mailing list
Tinycc-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/tinycc-devel


Re: [Tinycc-devel] Basic patch for passing W9X short DOS paths to TCC.

2009-04-14 Thread lostgallifreyan

grischka  wrote:
(14/04/2009 21:04)

>http://repo.or.cz/w/tinycc.git?a=shortlog;h=92c58361
>
>Links to snapshots are at the rightmost end of the lines.
>

Where are you getting stricmp from? The C reference PDFs I have mention only 
strcmp, and when I asked here I was told that to force case or do a 
case-insensitive compare I'd have to iterate over each character. You're doing 
something no-one mentioned, so what is it?
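
For context, stricmp is not part of standard C; it is an extension found in Windows C runtimes, and POSIX systems provide strcasecmp in <strings.h> instead. The per-character loop mentioned above can be sketched as follows. The name ci_compare is hypothetical and is not taken from tcc:

```c
#include <ctype.h>

/* Hypothetical helper: compare two strings ignoring case.
 * Returns <0, 0 or >0 like strcmp. Each char is converted to
 * unsigned char first so tolower never sees a negative value. */
static int ci_compare(const char *a, const char *b)
{
    for (;; a++, b++) {
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);
        if (ca != cb)
            return ca - cb;   /* first differing character decides */
        if (ca == 0)
            return 0;         /* both strings ended together */
    }
}
```

With this, ci_compare("FILE.C", "file.c") returns 0, while strcmp on the same pair would not.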





Re: [Tinycc-devel] Basic patch for passing W9X short DOS paths to TCC.

2009-04-14 Thread Dave Dodge
On Tue, Apr 14, 2009 at 06:37:49PM +0100, lostgallifreyan wrote:
> Dave Dodge  wrote:
> >If the filename string contains any high-valued characters, such as
> >accented letters, then accessing it with a char* might produce a
> >negative char value, and passing that to isupper/islower can be a
> >problem.
> 
> Ok. Point taken about undefined behaviour. Is the "unsigned char *p"
> declaration enough though?

Yes.  Dereferencing a valid (unsigned char *) will produce an
(unsigned char), which by definition is safe to pass to isupper.

> One mail suggested using "unsigned" at every subsequent use of the
> variable.

That's because p was a (char *), and therefore *p was producing a
possibly-signed value.  Casting the dereferenced value to (unsigned
char) is another way of ensuring isupper gets a usable value, but I
think simply changing p to an (unsigned char *) is cleaner.

BTW it's worth noting that casting from a signed integer to an
unsigned integer is a well-defined operation, but casting from
unsigned to signed is implementation-defined.

  -Dave Dodge




Re: [Tinycc-devel] Basic patch for passing W9X short DOS paths to TCC.

2009-04-14 Thread grischka

lostgallifreyan wrote:

>> However that wouldn't stop windows users from asking why they
>> can't type "TCC FILE.C", I'm afraid.
>
> Nope, probably not. But they wouldn't have to ask if it were GCC, because those 
> paths work there. This is why I kept it simple. This patch is not trying to do 
> clever things, just meant to reduce what you seem to want reduced. If you 
> decided to add it, that MIGHT stop those questions in future. No way to be sure 
> it won't cause others but they'd be fewer, and come from people who'd 
> progressed far enough to ask more interesting questions.



Okay, let's move on.
http://repo.or.cz/w/tinycc.git?a=shortlog;h=92c58361

Links to snapshots are at the rightmost end of the lines.

--- grischka





Re: [Tinycc-devel] Basic patch for passing W9X short DOS paths to TCC.

2009-04-14 Thread lostgallifreyan

grischka  wrote:
(14/04/2009 18:18)

>Ivo wrote:
>> Why don't you just implement gcc's -x command line option instead of this 
>> ifdeffery? It's also useful on other platforms and it solves the .s/.S 
>> problem, too:
>> 
>> -x c
>> -x cpp-output
>> -x assembler
>> -x assembler-with-cpp
>> 
>> etc...
>> 
>
>I wouldn't say no if we can get it.
>
>However that wouldn't stop windows users from asking why they
>can't type "TCC FILE.C", I'm afraid.
>

Nope, probably not. But they wouldn't have to ask if it were GCC, because those 
paths work there. This is why I kept it simple. This patch is not trying to do 
clever things, just meant to reduce what you seem to want reduced. If you 
decided to add it, that MIGHT stop those questions in future. No way to be sure 
it won't cause others but they'd be fewer, and come from people who'd 
progressed far enough to ask more interesting questions.





Re: [Tinycc-devel] Basic patch for passing W9X short DOS paths to TCC.

2009-04-14 Thread lostgallifreyan

Dave Dodge  wrote:
(14/04/2009 18:13)

>If the filename string contains any high-valued characters, such as
>accented letters, then accessing it with a char* might produce a
>negative char value, and passing that to isupper/islower can be a
>problem.
>

Ok. Point taken about undefined behaviour. Is the "unsigned char *p" 
declaration enough though? One mail suggested using "unsigned" at every 
subsequent use of the variable. Your mail suggested it was ok to do it once at 
declaration. I just forgot to include it while working on the patch, but the 
second posting has it there (as well as a more important fix).





Re: [Tinycc-devel] Basic patch for passing W9X short DOS paths to TCC.

2009-04-14 Thread grischka

Ivo wrote:
> Why don't you just implement gcc's -x command line option instead of this 
> ifdeffery? It's also useful on other platforms and it solves the .s/.S 
> problem, too:
>
> -x c
> -x cpp-output
> -x assembler
> -x assembler-with-cpp
>
> etc...
>
> --Ivo

I wouldn't say no if we can get it.

However that wouldn't stop windows users from asking why they
can't type "TCC FILE.C", I'm afraid.

--- grischka




Re: [Tinycc-devel] Basic patch for passing W9X short DOS paths to TCC.

2009-04-14 Thread Dave Dodge
On Tue, Apr 14, 2009 at 12:21:50PM +0100, lostgallifreyan wrote:
> "Charlie Gordon"  wrote:

> >Also more importantly, do not pass char arguments to
> >tolower/toupper and islower/isupper functions.  'char' may be
> >signed whereas these functions expect an int parameter with a value
> >among those of type unsigned char or EOF.

> As the variable isn't assigned any negative values it's not going to
> be an issue,

If the filename string contains any high-valued characters, such as
accented letters, then accessing it with a char* might produce a
negative char value, and passing that to isupper/islower can be a
problem.

> Actually I forgot the unsigned char bit but even so, it compiled
> without error or warning and it worked.

I think you've mentioned you're new to C, in which case you need to
understand that "compiled without error or warning and it worked"
doesn't mean very much.  C has the concept of "undefined behavior",
which basically means the compiler is allowed to quietly accept the
code but there are no constraints on what it actually does when you
run it.  It might even produce the expected result for some inputs.

Passing a value outside the range of unsigned char (or EOF) to a
function such as isupper is an example of something that produces
undefined behavior.  The program might implicitly cast it to unsigned
char, or change it to 0, or corrupt memory, or crash, or quietly erase
the hard drive and set the printer on fire.  While some of these
actions are perhaps more likely than others, all of them are a correct
result as far as C is concerned.

For example given a blatantly undefined bit of code such as:

  isupper(-20)

gcc does not give a warning, even with -pedantic, -Wall, and -Wextra.
Whatever it actually does when you run it, there is no guarantee that
it will do the same thing in any other C compiler (including other
versions of gcc).
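
A small sketch of the safe pattern, assuming a string that may hold bytes above 127; the helper names first_byte_raw, first_byte_safe and is_upper_byte are invented for this illustration:

```c
#include <ctype.h>

/* Hypothetical helper: the value a plain char read hands on.
 * Where char is signed this can be negative for bytes above 127. */
static int first_byte_raw(const char *s)
{
    return s[0];
}

/* Hypothetical helper: the same byte read through unsigned char,
 * always in 0..UCHAR_MAX and therefore valid for isupper(). */
static int first_byte_safe(const char *s)
{
    return (unsigned char)s[0];
}

/* Safe wrapper: never passes a negative value to isupper(). */
static int is_upper_byte(const char *s)
{
    return isupper(first_byte_safe(s)) != 0;
}
```

For the byte 0xE9 (an accented letter in several 8-bit code pages), first_byte_safe always yields 0xE9, whereas first_byte_raw yields either 0xE9 or a negative number depending on whether char is signed on that compiler.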

  -Dave Dodge




Re: [Tinycc-devel] Basic patch for passing W9X short DOS paths to TCC.

2009-04-14 Thread lostgallifreyan
Regarding the ifdeffery :)
I did that because while I explored tcc.c I noticed it was used a lot when 
something was meant to only get added to the executable file if it was compiled 
for Windows. It's there to assure people NOT running Windows that they won't 
ever have to wonder what I did. I assumed from what I saw in the source that 
this was appropriate practice for something specific to Win32.





Re: [Tinycc-devel] Basic patch for passing W9X short DOS paths to TCC.

2009-04-14 Thread lostgallifreyan

Ivo  wrote:
(14/04/2009 17:36)

>On Tuesday 14 April 2009 17:07, lostgallifreyan wrote:
>> What I have considered is using GetLongPathName() in Kernel32.dll to pass
>> long names as specified in the file system. It's a neat answer but only
>> for W98, no good in W95 (and I think redundant in W2K and WXP). Another
>> thing is to add some other option for assembler code files, such as .asm
>> (or .ASM, same thing) for those requiring preprocessing in Windows, and
>> .s (or .S) to be converted in DOS shortname paths so TCC makes the usual
>> interpretation of .s NOT being preprocessed. But I don't know ASM, it's
>> not my call what decision gets made there, and GCC doesn't recognise the
>> extension ASM anyway. For now, if I were to try assembler code I'd just
>> watch carefully what was put into my .s (or .S) files, and make batch
>> scripts with the right extension case for TCC.
>
>Why don't you just implement gcc's -x command line option instead of this 
>ifdeffery? It's also useful on other platforms and it solves the .s/.S 
>problem, too:
>
>-x c
>-x cpp-output
>-x assembler
>-x assembler-with-cpp
>
>etc...
>

I think TCC has that option. If not I'd have to do a lot more than I did. I was 
solving one singular issue only, I wanted a simple DOS shortname path to 
perform as expected in W9X, and I got it. The idea is to present a working TCC 
for Windows newcomers who haven't yet explored deeply. It's also better that 
something as simple as this does work as expected. It works in GCC, so it ought 
to work in TCC. With this patch it does. I deliberately limited it to the simplest 
task, because I don't want to break existing handling for more complex matters. 
It's meant to work with TCC, not to reinvent it.





Re: [Tinycc-devel] Basic patch for passing W9X short DOS paths to TCC.

2009-04-14 Thread Ivo
On Tuesday 14 April 2009 17:07, lostgallifreyan wrote:
> What I have considered is using GetLongPathName() in Kernel32.dll to pass
> long names as specified in the file system. It's a neat answer but only
> for W98, no good in W95 (and I think redundant in W2K and WXP). Another
> thing is to add some other option for assembler code files, such as .asm
> (or .ASM, same thing) for those requiring preprocessing in Windows, and
> .s (or .S) to be converted in DOS shortname paths so TCC makes the usual
> interpretation of .s NOT being preprocessed. But I don't know ASM, it's
> not my call what decision gets made there, and GCC doesn't recognise the
> extension ASM anyway. For now, if I were to try assembler code I'd just
> watch carefully what was put into my .s (or .S) files, and make batch
> scripts with the right extension case for TCC.

Why don't you just implement gcc's -x command line option instead of this 
ifdeffery? It's also useful on other platforms and it solves the .s/.S 
problem, too:

-x c
-x cpp-output
-x assembler
-x assembler-with-cpp

etc...

--Ivo
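
To make the suggestion concrete, here is a rough sketch of how the argument of -x might be mapped to a file type. The enum, the function name and the table are illustrative assumptions, not tcc's actual code; only the four language names come from the list above.

```c
#include <string.h>

/* Hypothetical file types; not taken from tcc's sources. */
enum src_type {
    SRC_UNKNOWN,
    SRC_C,       /* -x c: C source, run the preprocessor */
    SRC_C_PP,    /* -x cpp-output: C that is already preprocessed */
    SRC_ASM,     /* -x assembler: like .s, no preprocessing */
    SRC_ASM_PP   /* -x assembler-with-cpp: like .S, preprocess first */
};

/* Map the word following -x to a file type (exact, case-sensitive). */
static enum src_type parse_x_language(const char *name)
{
    static const struct { const char *name; enum src_type type; } table[] = {
        { "c", SRC_C },
        { "cpp-output", SRC_C_PP },
        { "assembler", SRC_ASM },
        { "assembler-with-cpp", SRC_ASM_PP },
    };
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(name, table[i].name) == 0)
            return table[i].type;
    return SRC_UNKNOWN;
}
```

With something like this, the type would come from the option rather than from the case of the extension, which is why it sidesteps the .s/.S question on case-insensitive file systems.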




Re: [Tinycc-devel] Basic patch for passing W9X short DOS paths to TCC.

2009-04-14 Thread lostgallifreyan
Actually I made a horrible howler that you didn't spot. :) I never noticed it 
till I tried passing two files to compile just now. TCC went into a silent 
CPU-consuming spin, I think because my test for lowercase called the 
case-forcing loop for *every character* found in that first loop until a 
lowercase one was found. I had it right before I posted that first attempt at a 
patch, but had tried to reduce it, and reduced it too far. The earlier one used 
a flag variable. It definitely works as advertised, even for multiple files.


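#ifdef _WIN32
/* Set W9X DOS 8.3 upper case paths to lower case. */
unsigned char *p;
int flag = 1;   /* stays 1 only while no lower case letter has been seen */
if (r[1] == ':' && r[2] == '\\') {
    /* casts assume r is a plain char pointer */
    for (p = (unsigned char *)r; *p; p++)
        if (flag && *p != toupper(*p))
            flag = 0;
    if (flag == 1)
        for (p = (unsigned char *)r; *p; p++)
            *p = tolower(*p);
}
#endif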
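
The same logic can be pulled out into a function so it can be tried outside tcc.c. This is only a sketch: the name lower_dos_path is invented, it uses islower() where the patch compares against toupper(), and like the patch it assumes an argument is a DOS short path if it starts with a drive letter, a colon and a backslash and contains no lower case letters.

```c
#include <ctype.h>

/* Hypothetical standalone version of the patch above.
 * Lowers an all-caps "X:\..." path in place; returns 1 if the
 * path was changed and 0 if it was left alone. */
static int lower_dos_path(char *r)
{
    unsigned char *p;

    /* Only act on "<drive>:\" paths; the first test guards short strings. */
    if (r[0] == '\0' || r[1] != ':' || r[2] != '\\')
        return 0;

    /* Any lower case letter means the path is taken as written. */
    for (p = (unsigned char *)r; *p; p++)
        if (islower(*p))
            return 0;

    for (p = (unsigned char *)r; *p; p++)
        *p = (unsigned char)tolower(*p);
    return 1;
}
```

Called on "C:\SRC\FILE.C" it rewrites the buffer to "c:\src\file.c"; called on "C:\Src\FILE.C" or "C:/SRC/FILE.C" it leaves the buffer untouched, which matches the two ways of bypassing the patch described in the thread.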



Also, when you mention the distinction in Windows for case sensitive 
extensions, you know it can't exist, at least for the same file name in the 
same location... So when it comes to .s and .S, I don't know what to do. That's 
why I won't override expected behaviour: if someone wants a command line 
interpreted as written, they just make sure (if using Windows) that the 
command line has a lower case letter on the path as written (or a forward slash 
after the colon). I described this clearly already.

What I have considered is using GetLongPathName() in Kernel32.dll to pass long 
names as specified in the file system. It's a neat answer but only for W98, no 
good in W95 (and I think redundant in W2K and WXP). Another thing is to add 
some other option for assembler code files, such as .asm (or .ASM, same thing) 
for those requiring preprocessing in Windows, and .s (or .S) to be converted in 
DOS shortname paths so TCC makes the usual interpretation of .s NOT being 
preprocessed. But I don't know ASM, it's not my call what decision gets made 
there, and GCC doesn't recognise the extension ASM anyway. For now, if I were 
to try assembler code I'd just watch carefully what was put into my .s (or .S) 
files, and make batch scripts with the right extension case for TCC.

To summarise, the various interpretations of case sensitive extensions are a 
minefield the moment you use a file system that does not recognise case 
sensitivity. I think the best that can be done is to make sure that DOS 
shortname paths do at least work most of the time, using a method that makes it 
easy to pass written arguments to TCC with case-based determinations that work 
regardless of what case they have in the file system. So that's why I wrote the 
patch as I did, to cleanly solve one small, specific problem.





Re: [Tinycc-devel] Basic patch for passing W9X short DOS paths to TCC.

2009-04-14 Thread lostgallifreyan

"Charlie Gordon"  wrote:
(14/04/2009 11:28)

>I don't think this is going in the right direction:
>- Command line arguments should not be changed
>- Why restrict this behaviour to full paths with a drive letter ?
>- instead of matching filename extensions in a case independent manner ?
>

You miss the point. Look at the GCC specs for commandlines, as I was told to 
do. How do you propose to handle .s and .S on Windows? The patch would bloat 
uncontrollably if any attempt was made to solve this that way, and code for 
that exists already, the written arguments' case is taken literally by TCC. As 
TCC tends to use lower case commands and filespecs, except where upper case is 
required, the simplest thing to do is make the DOS paths lower case, while 
testing for existing lowercase to disable the patch if any is found. I'm a 
newcomer to C, and TCC, yet was urged to jump in the deep end and suggest a 
solution! So that is what I did, and I had to figure out the best placement of 
the patch for myself. And it works. If you want to specify an exact case for an 
extension, you can. All this patch is meant to do is prevent DOS's all-caps 
habits from breaking filespecs as normally expected by TCC, while being 
irrelevant to non-windows users, and being easily bypassed by Windows users.

>Also more importantly, do not pass char arguments to tolower/toupper and
>islower/isupper functions.  'char' may be signed whereas these functions
>expect an int parameter with a value among those of type unsigned char or EOF.
>The argument to these functions should at least be cast to (unsigned char),
>and so should the value you compare their result to.
>
>>if (*p != toupper(*p))
>
>if ((unsigned char)*p != toupper((unsigned char)*p))
>
>>*p = tolower(*p);
>
>*p = tolower((unsigned char)*p);
>
>If you don't like the resulting code (quite ugly in my opinion),
>you should use a temporary variable:
>
>int c = (unsigned char)*p;
>if (toupper(c) != c) ...
>
>*p = tolower(c);
>

I went with advice given, as best I could. As the variable isn't assigned any 
negative values it's not going to be an issue, as I guess was why I was advised 
as I was. Actually I forgot the unsigned char bit but even so, it compiled 
without error or warning and it worked. Not sure why you put all those 
"unsigned char"s in there like that, or want an extra variable, I just tried 
"unsigned char *p;" as the initial declaration and it worked fine. (See Dave 
Dodge's mail in the earlier thread, he does the same).


>But all things considered, I still argue that the correct approach to the
>problem you are trying to fix is to perform case insensitive matching on
>filename extensions wherever these occur.  A simple way to do this is to
>identify all occurrences of such matching and call a function with a
>platform dependent implementation.
>

No, logically it's the same thing. Though while I could have forced case only 
on the TEST value, that made no sense in practice, as DOS already does force 
case in shortname paths! I would not have solved the problem by doing the 
same thing, or its inverse. And to do so goes right against the GCC specs that 
TCC is emulating. The correct way is to force lower case for this singular 
instance to reverse Windows' forced upper case. Keep it simple. If further 
refined exceptions must be made, add them. For example, as .c requires 
preprocessing, while .S as opposed to .s also requires it for assembler code, 
it might be worth adding a line to re-force the singular extension .s to upper 
case on Windows to invoke the right response from TCC. But in the interests of 
keeping code size down, it seems wise to limit the patch's behaviour to a 
simple easily predicted action that is extremely easy to override in the 
extremely limited situations where it can even act at all. :) This I have done.





Re: [Tinycc-devel] Basic patch for passing W9X short DOS paths to TCC.

2009-04-14 Thread Charlie Gordon
----- Original Message -----
From: "lostgallifreyan" 

To: 
Sent: Sunday, April 12, 2009 5:02 PM
Subject: [Tinycc-devel] Basic patch for passing W9X short DOS paths to TCC.

As promised, here's a basic patch for allowing files passed to TCC to work 
in W9X.

It can be easily overridden for usual case sensitive file extension
handling, but in Windows it's up to the user to see that the file
pointed to contains the right stuff, especially in instances of ASM
coding where .s and .S are important. You can't specify which in the
file name, but you can still do it in the written path, even with this 
patch.


Immediately below this line in tcc.c:
   /* add a new file */
I added this:
#ifdef _WIN32
   /* Set W9X DOS 8.3 upper case paths to lower case. */
   char *p;
   if (r[1] == ':' && r[2] == '\\') {
       for (p = r; *p; p++) {
           if (*p != toupper(*p))
               break;
           else
               for (p = r; *p; p++)
                   *p = tolower(*p);
       }
   }
#endif

I considered a command line switch as suggested by a couple of
people in the earlier thread, but I think this is neater, it only
operates in very limited circumstances. If you want to be sure it does
nothing, make a forward slash instead of that first backslash, or a
lower case letter on the path, any of these will stop it acting, it
ONLY does something if it looks like a standard all caps path, so be
aware that if you like all-caps long-name paths you need to think
about that if you try this patch...

My test for the colon and backslash in Windows paths might even be
redundant given the #ifdef _WIN32 bit, but I'm playing it safe in case
Windows Mobile is considered as Win32, because it doesn't use a single
drive letter at the start of a path. That backslash test is a useful
tool, it allows the change to a forward slash to disable the patch as
a command line switch might do.


I don't think this is going in the right direction:
- Command line arguments should not be changed
- Why restrict this behaviour to full paths with a drive letter ?
- instead of matching filename extensions in a case independent manner ?

Also more importantly, do not pass char arguments to tolower/toupper and
islower/isupper functions.  'char' may be signed whereas these functions
expect an int parameter with a value among those of type unsigned char or EOF.
The argument to these functions should at least be cast to (unsigned char),
and so should the value you compare their result to.


   if (*p != toupper(*p))


if ((unsigned char)*p != toupper((unsigned char)*p))


   *p = tolower(*p);


*p = tolower((unsigned char)*p);

If you don't like the resulting code (quite ugly in my opinion),
you should use a temporary variable:

int c = (unsigned char)*p;
if (toupper(c) != c) ...

*p = tolower(c);

But all things considered, I still argue that the correct approach to the
problem you are trying to fix is to perform case insensitive matching on
filename extensions wherever these occur.  A simple way to do this is to
identify all occurrences of such matching and call a function with a
platform dependent implementation.
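
One way to read this suggestion, as a sketch only: route every extension test through a single helper whose comparison folds case on Windows and is exact elsewhere. The names ext_equal and has_extension are hypothetical, and this deliberately leaves the .s/.S distinction raised elsewhere in the thread unresolved.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical: compare two extensions using the platform's rules. */
static int ext_equal(const char *a, const char *b)
{
#ifdef _WIN32
    /* File names are normally case-insensitive here: fold both sides. */
    for (; *a && *b; a++, b++)
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
    return *a == *b;   /* equal only if both ended together */
#else
    return strcmp(a, b) == 0;
#endif
}

/* Hypothetical: does path end in ".ext"? */
static int has_extension(const char *path, const char *ext)
{
    const char *dot = strrchr(path, '.');
    return dot != NULL && ext_equal(dot + 1, ext);
}
```

Under this scheme has_extension("FILE.C", "c") would be true in a Windows build and false elsewhere, and the command line argument itself is never modified.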

--
Chqrlie.


