Re: [Tinycc-devel] GAS symbols

2016-04-07 Thread Sergey Korshunoff
> PS: what is the need to have '.' in asm idents?
Looks like there is a need to have a switch like -fdots-in-asm-idents

___
Tinycc-devel mailing list
Tinycc-devel@nongnu.org
https://lists.nongnu.org/mailman/listinfo/tinycc-devel


Re: [Tinycc-devel] GAS symbols

2016-04-07 Thread Sergey Korshunoff
Just discovered: while the test passes, tccboot hangs. It boots OK with
the current mob when the patch for '.' in *.S is reverted.

PS: what is the need to have '.' in asm idents?



Re: [Tinycc-devel] GAS symbols

2016-04-05 Thread Sergey Korshunoff
> but I don't see where do you restore its prev value for the rest of the input?
No need to do this. There is an additional explicit check for '.':

    parse_num:
        for(;;) {
            t = c;
            cstr_ccat(, c);
            PEEKC(c, p);
            if (!((isidnum_table[c - CH_EOF] & (IS_ID|IS_NUM))
                  || c == '.'
                  || ((c == '+' || c == '-')
                      && (t == 'e' || t == 'E' || t == 'p' || t == 'P')
                      )))
                break;
        }

All other places check for IS_SPC.

PS: There is a much cleaner solution (but it is a bit slower): change
isidnum_table whenever we change parse_flags (in #define).
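
The "change the table when the flags change" idea can be sketched roughly as follows. This is only an illustration under stated assumptions: `cls_table`, `cls_init`, `cls_set_asm_dots` and `ident_len` are hypothetical names, not TCC's real tables or functions, and the table is indexed by plain byte value rather than by `c - CH_EOF`.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical class bits; the names only echo IS_ID / IS_NUM from the thread. */
enum { CLS_ID = 1, CLS_NUM = 2 };

static unsigned char cls_table[256];

/* Fill the table with plain C rules: letters and '_' are identifier chars. */
static void cls_init(void)
{
    for (int c = 0; c < 256; c++) {
        if (isalpha(c) || c == '_')
            cls_table[c] = CLS_ID;
        else if (isdigit(c))
            cls_table[c] = CLS_NUM;
        else
            cls_table[c] = 0;
    }
}

/* Call this at every point where the parse flags change. */
static void cls_set_asm_dots(bool on)
{
    if (on)
        cls_table['.'] |= CLS_ID;
    else
        cls_table['.'] &= (unsigned char)~CLS_ID;
}

/* Length of the identifier at the start of s, or 0 if there is none. */
static size_t ident_len(const char *s)
{
    size_t n = 0;
    if (!(cls_table[(unsigned char)s[0]] & CLS_ID))
        return 0;
    while (cls_table[(unsigned char)s[n]] & (CLS_ID | CLS_NUM))
        n++;
    return n;
}
```

With dots enabled, `.L1:` scans as a three-character identifier; with dots disabled, `y...` scans as just `y`, so the dots are left for the next token. The cost is one extra table write per flag change, and the scanning loop itself needs no extra comparison.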



Re: [Tinycc-devel] GAS symbols

2016-04-05 Thread Sergey Korshunoff
> This is weird. Could be the second table introduces some cache misses but
> probably someone else might chime in with better explanation and/or
> suggestion how to improve this.

The first version of the patch, with a speed optimization, has been pushed
to [mob]. Compilation speed is unchanged by this version of the patch.



Re: [Tinycc-devel] GAS symbols

2016-04-04 Thread Vladimir Vissoultchev
This is weird. It could be that the second table introduces some cache
misses, but perhaps someone else can chime in with a better explanation
and/or a suggestion for how to improve this.

cheers,




Re: [Tinycc-devel] GAS symbols

2016-04-04 Thread Sergey Korshunoff
> PS: (Nov 8, 2004) TCC version 0.9.22 is out (Changelog). Linux kernel
> compilation is 30% faster (10 seconds on a 2.4 GHz Pentium 4).

It looks like gcc-3.4.6 optimizes better than the compiler used in the
test above. A current tcc compiled by itself is about 2 times slower:

37891 idents, 6652803 lines, 193678862 bytes, 13.503 s, 492685 lines/s, 14.3 MB/s
37891 idents, 6652803 lines, 193678862 bytes, 13.507 s, 492531 lines/s, 14.3 MB/s



Re: [Tinycc-devel] GAS symbols

2016-04-04 Thread Sergey Korshunoff
> Thanks. I will test a speed of two versions on tccboot
It looks like the version of the patch with 2 tables is a bit slower.

2 tables, on an AMD v140 2.2 GHz with 2 GiB of RAM:
37918 idents, 6652803 lines, 193678862 bytes, 8.097 s, 821590 lines/s, 23.9 MB/s
37918 idents, 6652803 lines, 193678862 bytes, 8.098 s, 821534 lines/s, 23.9 MB/s
37918 idents, 6652803 lines, 193678862 bytes, 8.096 s, 821733 lines/s, 23.9 MB/s
37918 idents, 6652803 lines, 193678862 bytes, 8.096 s, 821705 lines/s, 23.9 MB/s

first version:
37918 idents, 6652803 lines, 193678862 bytes, 8.063 s, 825131 lines/s, 24.0 MB/s
37918 idents, 6652803 lines, 193678862 bytes, 8.068 s, 824558 lines/s, 24.0 MB/s
37918 idents, 6652803 lines, 193678862 bytes, 8.058 s, 825625 lines/s, 24.0 MB/s
37918 idents, 6652803 lines, 193678862 bytes, 8.069 s, 824485 lines/s, 24.0 MB/s
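
To put "a bit slower" into a number, the eight timings above can be averaged and compared. The small sketch below does only that arithmetic; the helper names are made up for illustration, and the result (a mean slowdown of roughly 0.4%) says nothing about run-to-run noise beyond what the four samples per variant show.

```c
#include <stddef.h>

/* Wall-clock times in seconds, copied from the runs above. */
static const double two_tables[]    = { 8.097, 8.098, 8.096, 8.096 };
static const double first_version[] = { 8.063, 8.068, 8.058, 8.069 };

/* Arithmetic mean of n samples. */
static double mean(const double *v, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

/* Relative slowdown of the two-table variant, in percent. */
static double slowdown_percent(void)
{
    double a = mean(two_tables, 4);      /* about 8.097 s */
    double b = mean(first_version, 4);   /* about 8.065 s */
    return (a / b - 1.0) * 100.0;
}
```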

PS: (Nov 8, 2004) TCC version 0.9.22 is out (Changelog). Linux kernel
compilation is 30% faster (10 seconds on a 2.4 GHz Pentium 4).



Re: [Tinycc-devel] GAS symbols

2016-04-04 Thread Vladimir Vissoultchev
> How to download only a diff from github (my browser is quite old and isn't
> supported by github)? Can you post your diffs?

 

I amended the original commit, so the original link is dead (it's not
your browser). Try the new one at
https://github.com/wqweto/tinycc/commit/bead113587c06f9670a045b1ea95373503f55f41

If you need a diff, just add .diff to the URL, like this:
`curl https://github.com/wqweto/tinycc/commit/bead113587c06f9670a045b1ea95373503f55f41.diff -o new.diff`

 

cheers,





Re: [Tinycc-devel] GAS symbols

2016-04-03 Thread Sergey Korshunoff
> Here is the fix that switches off '.' for identifiers when parsing #defines
> in .S files.
> https://github.com/wqweto/tinycc/commit/2ec8d7068aeed7b83c9d4f30aa686586182b1c34

How can I download only a diff from github (my browser is quite old and
isn't supported by github)? Can you post your diffs?



Re: [Tinycc-devel] GAS symbols

2016-04-03 Thread Vladimir Vissoultchev
https://github.com/wqweto/tinycc/commit/bead113587c06f9670a045b1ea95373503f55f41

OK, I just amended the last commit with the two-tables approach to rule out
the performance influence.

cheers,




Re: [Tinycc-devel] GAS symbols

2016-04-03 Thread Michael Matz

Hi,

On Sun, 3 Apr 2016, Vladimir Vissoultchev wrote:

> Here is the fix that switches off '.' for identifiers when parsing
> #defines in .S files.
>
> https://github.com/wqweto/tinycc/commit/2ec8d7068aeed7b83c9d4f30aa686586182b1c34
>
> This includes an originally failing test with your SRC(y...) macro for
> regression testing.
>
> I have reset your [2bf43b5] `reverse of the "Identifiers can start and/or
> contain '.'"` commit in my [dev] branch to be able to go forward (need to be
> able to parse '.' in asm labels).


Yeah, that looks like better progress than a simple revert.  One concern:
the tokenizer is already the slowest thing in tcc; if you can't measure a
timing difference (say on the three-pass ONE_SOURCE tcc self-compile) it's
fine, but if you can, one suggestion would be to actually have two
idnum_table tables and switch those back and forth, instead of the flag
and checking for the '.' case explicitly.  (Probably there's not much
difference, but I'd like to make sure.)
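
A rough sketch of the two-table suggestion, for comparison with the flag-plus-explicit-check variant. Everything here is an assumption made for illustration: `tab_c`, `tab_asm`, `cur_tab`, `tabs_init` and `use_asm_table` are invented names, and the tables are indexed by plain byte value.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical class bits, echoing IS_ID / IS_NUM from the thread. */
enum { TAB_ID = 1, TAB_NUM = 2 };

static unsigned char tab_c[256];    /* C rules: '.' is not an identifier char */
static unsigned char tab_asm[256];  /* asm rules: '.' is an identifier char */
static const unsigned char *cur_tab = tab_c;

/* Build both tables once, at startup. */
static void tabs_init(void)
{
    for (int c = 0; c < 256; c++) {
        unsigned char v = 0;
        if (isalpha(c) || c == '_')
            v = TAB_ID;
        else if (isdigit(c))
            v = TAB_NUM;
        tab_c[c] = v;
    }
    memcpy(tab_asm, tab_c, sizeof tab_c);
    tab_asm['.'] = TAB_ID;
}

/* Switching modes is a single pointer assignment. */
static void use_asm_table(int on)
{
    cur_tab = on ? tab_asm : tab_c;
}

/* The hot path stays a plain table lookup with no extra '.' comparison. */
static int is_id_start(int c)
{
    return (cur_tab[(unsigned char)c] & TAB_ID) != 0;
}
```

The trade-off is that the lookup goes through a pointer and the two tables together occupy twice the memory of one, which is the kind of effect that only a measurement can settle.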


Please push to mob either way after a measurement (without the win32/libtcc
hunks; those would be a separate commit if necessary).  You will have to
explicitly revert seyko2's revert then.



Ciao,
Michael.



Re: [Tinycc-devel] GAS symbols

2016-04-03 Thread Vladimir Vissoultchev
Hi,

Here is the fix that switches off '.' for identifiers when parsing #defines
in .S files.

https://github.com/wqweto/tinycc/commit/2ec8d7068aeed7b83c9d4f30aa686586182b1c34

This includes an originally failing test with your SRC(y...) macro for
regression testing.

I have reset your [2bf43b5] `reverse of the "Identifiers can start and/or
contain '.'"` commit in my [dev] branch to be able to go forward (need to be
able to parse '.' in asm labels).

cheers,




Re: [Tinycc-devel] GAS symbols

2016-03-31 Thread Vladimir Vissoultchev
> Another question: what is the purpose of the following change?
>
>     if (tok != TOK_STR)
>         expect("string constant");
>     next();
> +   if (tok == ',') {
> +       next();
> +       if (tok == '@' || tok == '%')
> +           next();
> +       next();
> +   }

For ELF targets, the .section directive is used like this:

.section name [, "flags"[, @type[,flag_specific_arguments]]]

I had a failing .S file with @type supplied on section directives. This change
just skips it (but not `flag_specific_arguments`, it seems).
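
For readers unfamiliar with the directive, here is a standalone sketch of parsing the operand list `name [, "flags" [, @type]]` from a string. It is not the TCC code: `struct section_args` and `parse_section` are hypothetical, the type is required to start with '@' or '%', and any flag-specific arguments are simply left unparsed.

```c
#include <string.h>

/* Hypothetical result record for the operands of a .section directive. */
struct section_args {
    char name[64];
    char flags[16];
    char type[16];
};

static const char *skip_ws(const char *p)
{
    while (*p == ' ' || *p == '\t')
        p++;
    return p;
}

/* Parse `name [, "flags" [, @type]]`. Returns 0 on success, -1 on error. */
static int parse_section(const char *p, struct section_args *out)
{
    size_t n;

    memset(out, 0, sizeof *out);
    p = skip_ws(p);
    n = strcspn(p, ", \t");              /* section name */
    if (n == 0 || n >= sizeof out->name)
        return -1;
    memcpy(out->name, p, n);
    p = skip_ws(p + n);
    if (*p != ',')
        return *p ? -1 : 0;              /* name only */
    p = skip_ws(p + 1);
    if (*p != '"')                       /* quoted flags */
        return -1;
    p++;
    n = strcspn(p, "\"");
    if (p[n] != '"' || n >= sizeof out->flags)
        return -1;
    memcpy(out->flags, p, n);
    p = skip_ws(p + n + 1);
    if (*p != ',')
        return *p ? -1 : 0;              /* name and flags only */
    p = skip_ws(p + 1);
    if (*p != '@' && *p != '%')          /* type marker */
        return -1;
    p++;
    n = strcspn(p, ", \t");
    if (n == 0 || n >= sizeof out->type)
        return -1;
    memcpy(out->type, p, n);
    return 0;                            /* further arguments are ignored */
}
```

For example, `parse_section(".text.hot, \"ax\", @progbits", &a)` fills in the name `.text.hot`, the flags `ax` and the type `progbits`, while `__ex_table, "a"` leaves the type empty.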

 

cheers,





Re: [Tinycc-devel] GAS symbols

2016-03-30 Thread Sergey Korshunoff
> The original problem with the '.' in GAS was that tcc couldn't deal with
> labels starting with '.'

Does this change work (with your patch reverted)?

    case '.':
        /* special dot handling because it can also start a number */
        PEEKC(c, p);
        if (isnum(c)) {
            cstr_reset();
            cstr_ccat(, '.');
            goto parse_num;
-       } else if ((isidnum_table['.' - CH_EOF] & IS_ID) != 0) { /* asm mode */
+       } else if (parse_flags & PARSE_FLAG_ASM_FILE) { /* asm mode */
            *--p = c = '.';
            goto parse_ident_fast;


Another question: what is the purpose of the following change?

     if (tok != TOK_STR)
         expect("string constant");
     next();
+    if (tok == ',') {
+        next();
+        if (tok == '@' || tok == '%')
+            next();
+        next();
+    }
 }
 last_text_section = cur_text_section;
 use_section(s1, sname);
 }
 break;

> If you do revert this change, please add a test for the SRC(y...) that
> passes on mob branch.

Not so simple: the file must be a *.S, and makefile changes are needed.





Re: [Tinycc-devel] GAS symbols

2016-03-30 Thread Vladimir Vissoultchev
The original problem with the '.' in GAS was that tcc couldn't deal with labels
starting with '.', like

.start:
  <>
.end:

... and gcc -S seems to produce lots of these.

Unfortunately, no failing test for this has been implemented yet, so your revert
seems to pass all the tests present.

If you do revert this change, please add a test for the SRC(y...) that passes
on the mob branch.

cheers,


-Original Message-
From: tinycc-devel-bounces+wqweto=gmail@nongnu.org 
[mailto:tinycc-devel-bounces+wqweto=gmail@nongnu.org] On Behalf Of Sergey 
Korshunoff
Sent: Wednesday, March 30, 2016 11:54 PM
To: tinycc-devel@nongnu.org
Subject: Re: [Tinycc-devel] GAS symbols

> First version of the patch is attached. Please test.
Doesn't work. I want to revert '.' in asm identifiers. A patch is attached.




Re: [Tinycc-devel] GAS symbols

2016-03-30 Thread Sergey Korshunoff
> or switching PARSE_FLAG_ASM_FILE off for preprocessor directives in .S files
This solution looks better (maybe only for #define), because the test
program preprocesses fine if we rename it to *.c



Re: [Tinycc-devel] GAS symbols

2016-03-29 Thread Vladimir Vissoultchev
Hi,

Unfortunately the offending fix of mine causes 'y...' to be treated as a
single identifier instead of y followed by TOK_DOTS, which is the gcc extension
that makes 'y' the __VA_ARGS__ replacement. Nasty!

If we go forward with the commit, the plan is to check in tccpp.c, around line
1337 (nice line :-)), whether an identifier ends with '...' and set is_vaargs,
instead of the current check for TOK_DOTS on the next token. That, or
switching PARSE_FLAG_ASM_FILE off for preprocessor directives in .S files.
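
For context, this is what the gcc named-variadic extension looks like in plain C. The sketch assumes a compiler that accepts GNU extensions (gcc or clang in their default modes); `SUM3`, `SUM3_STD` and `sum3` are made-up names used only for illustration.

```c
/* A helper taking three arguments, so the macro has something to forward to. */
static int sum3(int a, int b, int c)
{
    return a + b + c;
}

/* GNU extension: a named variadic parameter. Inside the body, `y` stands
   for all the arguments, commas included. */
#define SUM3(y...) sum3(y)

/* The ISO C99 spelling of the same idea. */
#define SUM3_STD(...) sum3(__VA_ARGS__)
```

If the preprocessor reads `y...` as one identifier, the macro has a single ordinary parameter, and any call whose argument contains a comma, such as `SRC(1: movw (%esi), %bx)`, is rejected with "too many args".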

Will have to write a failing test too.

cheers,




Re: [Tinycc-devel] GAS symbols

2016-03-29 Thread Sergey Korshunoff
Hi!

Patch  "Identifiers can start and/or contain '.' in PARSE_FLAG_ASM_FILE"
breaks preprocessing on *.S
A test program (from a linux-2.4.26)

#define SRC(y...)   \
: y;\
.section __ex_table, "a";   \
.long b, 6001f  ;   \
.previous
SRC(1:  movw (%esi), %bx)
// 029-test.S:7: error: macro 'SRC' used with too many args
// commit_bad=aa1ed616eb01efe353e7c5e829fffbed01b428bd
//Identifiers can start and/or contain '.' in PARSE_FLAG_ASM_FILE
//
// commit_good=17395ea5070bb05681f93ce7a8019c8c863a607b

This test compiles OK if the changes commit_good..commit_bad are reverted.



Re: [Tinycc-devel] GAS symbols

2016-03-24 Thread Michael Matz
Hi,

On Thu, 24 Mar 2016, Vladimir Vissoultchev wrote:

> Thanks for the effort! Apologies for not finding the time to rollback to
> prev impl, had this on my todo, I swear :-))

No matter; had found a couple of minutes under my desk just now myself ;)


Ciao,
Michael.



Re: [Tinycc-devel] GAS symbols

2016-03-24 Thread Vladimir Vissoultchev
Thanks for the effort! Apologies for not finding the time to rollback to
prev impl, had this on my todo, I swear :-))

cheers,




Re: [Tinycc-devel] GAS symbols

2016-03-24 Thread Michael Matz
Hi,

On Thu, 17 Mar 2016, Vladimir Vissoultchev wrote:

> About the TOK_DOTS saga -- yes, the original code was using PEEKC 
> canonically, I'm not sure about the unget hack it implemented though. 

It is, as the comment written next to it explained.

I fixed the parsing of ..\n. now in mob with a testcase.


Ciao,
Michael.



Re: [Tinycc-devel] GAS symbols

2016-03-20 Thread Michael Matz

Hi,

welcome to TCC development :)

On Mon, 14 Mar 2016, Vladimir Vissoultchev wrote:


> > A symbol is one or more characters chosen from the set of all letters
> > (both upper and lower case), digits and the three characters _.$. No
> > symbol may begin with a digit. Case is significant. There is no length
> > limit: all characters are significant. Symbols are delimited by
> > characters not in that set, or by the beginning of a file (since the
> > source program must end with a newline, the end of a file is not a
> > possible symbol delimiter). See Symbols.
>
> So it seems labels can indeed start with and contain dots. Am I correct in
> understanding this text?


Yes, GAS labels.  There's no universal convention for assemblers.  Being 
compatible with GAS does make sense (when easily possible), so yeah, such 
change seems appropriate.


> Also, what is the polite way to commit in mob branch? Do you practice
> sending patches to the list beforehand so that anyone can chime in with
> problems spotted?


No formal process exists, but if you're a new developer sending patches 
beforehand would be appreciated, especially so for new features, because 
the feature might not even be wanted (or in a different form).


> I'm sorry my first commits were out of nowhere and then had to revert
> some rogue changes that broke some tests. Now I have the tests working
> under MinGW.


Some comments on some of those patches (such comments are also easier when 
the patch was in a mail :) ):


 case TOK_CLDOUBLE:
cstr_cat(_buf, "");
+cstr_ccat(_buf, '\0');

You made it such that most cstr_cat calls (except two and those in 
i386-asm.c) are now followed by cstr_ccat(0).  Consider adding a 
cstr_catz() routine that does both.
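
A minimal sketch of what such a combined routine could look like. It uses a stand-in `struct strbuf` rather than TCC's actual CString type, and all of the `strbuf_*` names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; a stand-in, not TCC's CString. */
struct strbuf {
    char *data;
    size_t size;   /* bytes in use */
    size_t cap;    /* bytes allocated */
};

/* Make room for `extra` more bytes. Returns 0 on success, -1 on failure. */
static int strbuf_reserve(struct strbuf *b, size_t extra)
{
    size_t need = b->size + extra;
    if (need > b->cap) {
        size_t cap = b->cap ? b->cap : 16;
        char *p;
        while (cap < need)
            cap *= 2;
        p = realloc(b->data, cap);
        if (!p)
            return -1;
        b->data = p;
        b->cap = cap;
    }
    return 0;
}

/* Append a string WITHOUT a terminator (the behaviour that surprised people). */
static int strbuf_cat(struct strbuf *b, const char *s)
{
    size_t n = strlen(s);
    if (strbuf_reserve(b, n))
        return -1;
    memcpy(b->data + b->size, s, n);
    b->size += n;
    return 0;
}

/* Append a single character, which may be '\0'. */
static int strbuf_ccat(struct strbuf *b, int ch)
{
    if (strbuf_reserve(b, 1))
        return -1;
    b->data[b->size++] = (char)ch;
    return 0;
}

/* Append a string and then a terminating '\0' in one call. */
static int strbuf_catz(struct strbuf *b, const char *s)
{
    if (strbuf_cat(b, s))
        return -1;
    return strbuf_ccat(b, '\0');
}
```

In this sketch the terminator counts towards `size`, so appending "abc" with `strbuf_catz` leaves `size == 4`; a caller that wants to keep appending text afterwards has to account for that.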


+/* keep structs lvalue by transforming `(expr ? a : b)` to `*(expr~
+   that `(expr ? a : b).mem` does not error  with "lvalue expected~
+islv = (vtop->r & VT_LVAL) && (sv.r & VT_LVAL) && VT_STRUCT == (ty~
+

Please respect an 80 characters per line limit.  It's not always respected
currently, but we shouldn't introduce new overlong lines.


Also, this specific change could use a testcase to not regress in the 
future.


-} else if (c == '.') {
-PEEKC(c, p);
-if (c == '.') {
-p++;
-tok = TOK_DOTS;
-} else {
-*--p = '.'; /* may underflow into file->unget[] */
-tok = '.';
-}
+} else if ((isidnum_table['.' - CH_EOF] & IS_ID) != 0) { /* asm mode */
+*--p = c = '.';
+goto parse_ident_fast;
+} else if (c == '.' && p[1] == '.') {
+p += 2;
+tok = TOK_DOTS;

As you removed the PEEKC call you mishandle quoted line-endings.  For 
instance the following decl is conforming and hence the ..\\n. must be 
parsed as one token, TOK_DOTS:


int f (int a, ..\
.);

This feature could also do with a testcase.
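
The point about quoted line endings can be illustrated outside of tcc with a small matcher that treats backslash-newline as invisible while looking for `...`. The function names are invented for this sketch, and it assumes "\n" line endings only (no "\r\n").

```c
#include <stddef.h>

/* Skip any backslash-newline pairs (line splices) at p. */
static const char *skip_splices(const char *p)
{
    while (p[0] == '\\' && p[1] == '\n')
        p += 2;
    return p;
}

/* If the text at p is "...", possibly interrupted by line splices,
   return a pointer just past it; otherwise return NULL. */
static const char *match_ellipsis(const char *p)
{
    for (int i = 0; i < 3; i++) {
        p = skip_splices(p);
        if (*p != '.')
            return NULL;
        p++;
    }
    return p;
}
```

A check that looks at `p[1]` directly sees the backslash instead of the third dot and fails on the declaration above, which is why the lookahead has to go through something splice-aware such as PEEKC.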

One more remark about future git commit messages: please follow the usual 
git convention.  From git-commit(1):


   Though not required, it’s a good idea to begin the commit message with
   a single short (less than 50 character) line summarizing the change,
   followed by a blank line and then a more thorough description. The text
   up to the first blank line in a commit message is treated as the commit
   title, and that title is used throughout git.


Ciao,
Michael.


Re: [Tinycc-devel] GAS symbols

2016-03-19 Thread Vladimir Vissoultchev
Hi,

> You made it such that most cstr_cat calls (except two and those in
> i386-asm.c) are now followed by cstr_ccat(0).  Consider adding a
> cstr_catz() routine that does both.

It's not obvious that cstr_cat does *not* terminate the target buffer, so I
stepped on this landmine and out of curiosity checked all the places it was
used. That's when I spotted that someone else had missed it too (in cases that
seem to be left for debug purposes only).

I thought about implementing a benign null-termination just ahead of the last
character, but the implementation would be ugly, as it has to cstr_ccat(0) for
the auto-resize to kick in (if needed) and then unget the '\0'.

About the TOK_DOTS saga -- yes, the original code was using PEEKC
canonically; I'm not sure about the unget hack it implemented though. I will
have to write a testcase and fix it, as currently it's not OK.

As I see it, I will try to use github branches and diffs, along with the
commit messages, for list approval before pushing to mob.

Thanks for the feedback and your code-review!

cheers,




[Tinycc-devel] GAS symbols

2016-03-14 Thread Vladimir Vissoultchev
Hi,

 

I am trying to compile something very similar to `gcc -S` output (like in
this question) and noticed that TCC does not support labels starting with a
dot ('.').

Reading the GAS documentation, I noticed the following text on valid
identifiers:

> Symbols
>
> A symbol is one or more characters chosen from the set of all letters
> (both upper and lower case), digits and the three characters _.$. No
> symbol may begin with a digit. Case is significant. There is no length
> limit: all characters are significant. Symbols are delimited by
> characters not in that set, or by the beginning of a file (since the
> source program must end with a newline, the end of a file is not a
> possible symbol delimiter). See Symbols.

So it seems labels can indeed start with and contain dots. Am I correct in
understanding this text?
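
Read literally, the quoted rule can be turned into a small checker. This is a sketch of the rule as quoted, not of GAS or TCC code; `is_symbol_char` and `is_gas_symbol` are names made up here, and "letters" is taken to mean what `isalnum` accepts in the default C locale.

```c
#include <ctype.h>
#include <stdbool.h>

/* Letters, digits and the three characters '_', '.' and '$'. */
static bool is_symbol_char(int c)
{
    return isalnum((unsigned char)c) || c == '_' || c == '.' || c == '$';
}

/* A symbol is one or more symbol characters and may not begin with a digit. */
static bool is_gas_symbol(const char *s)
{
    if (*s == '\0' || isdigit((unsigned char)*s))
        return false;
    for (; *s; s++)
        if (!is_symbol_char(*s))
            return false;
    return true;
}
```

Under this reading `.L1` and `.start` are valid symbols, while `1abc` is not.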

 

Also, what is the polite way to commit in mob branch? Do you practice
sending patches to the list beforehand so that anyone can chime in with
problems spotted?

 

I'm sorry my first commits were out of nowhere and then had to revert some
rogue changes that broke some tests. Now I have the tests working under
MinGW.

 

cheers,


