Re: [Tinycc-devel] { .field = value } struct initialization alternative
Hello,

On 21 Jul 2019, at 21:37, Ivan Medoedov <ivan.medoe...@gmail.com> wrote:

> No, it's a compilation error (error: field expected), and I can't reproduce it in a small program. The original file has hundreds of thousands of loc.

Large inputs that cause unwarranted compilation errors are easy to make small by using C-Reduce: https://embed.cs.utah.edu/creduce/ (probably already a package of your reputable Linux distribution).

C-Reduce takes as argument a script that indicates that a variant of the initial input is still interesting. In your case, the input is interesting because it is accepted by (say) “gcc -std=c11 -pedantic” and when compiled with “tcc”, causes stderr to contain “error: field expected”.

It takes a bit of getting used to to wield C-Reduce efficiently; for instance not specifying “-std=c11 -pedantic” or not grepping specifically for “error: field expected” is likely to yield a different bug than the one that was initially bothering you, that C-Reduce will introduce as it reduces the input. John Regehr compares it to “every story where a genie is in a position to grant a wish”.

Even with the quirks that manifest themselves the first couple of times you use it, C-Reduce still helps getting bugs fixed faster than reducing inputs by hand or not reporting bugs properly because reducing them looks daunting.

Pascal

___
Tinycc-devel mailing list
Tinycc-devel@nongnu.org
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
Re: [Tinycc-devel] missing check after calling type_size in classify_x86_64_arg
Hello,

> On 24 Jun 2019, at 12:05, grischka wrote:
>
> Pascal Cuoq wrote:
> > int t[][3]; [ error: unknown type size ]
>
> Actually that is not wrong. It's the same as
>
>     extern int t[][3];

Ah yes. Anyway my patch did not change the behavior for this example. I wrote the test intending “int t[3][];”, I think, and got it wrong.
Re: [Tinycc-devel] missing check after calling type_size in classify_x86_64_arg
> On 24 Jun 2019, at 05:01, Michael Matz wrote:
>
> diff --git a/tccgen.c b/tccgen.c
> index c65cf76..84e0953 100644
> --- a/tccgen.c
> +++ b/tccgen.c
> @@ -4507,6 +4507,8 @@ static int post_type(CType *type, AttributeDef *ad, int storage, int td)
>          post_type(type, ad, storage, 0);
>          if (type->t == VT_FUNC)
>              tcc_error("declaration of an array of functions");
> +        else if ((type->t & VT_BTYPE) == VT_VOID || type_size(type, &align) < 0)
> +            tcc_error("wrong!");
>          t1 |= type->t & VT_VLA;
>
>          if (t1 & VT_VLA) {

As you wish, but I still intend to fix the “arrays of const functions” bug at the same time.

Also I declared a different variable align_unused because, if I used the existing variable align, I wouldn't want the next person reading this function, who might be me, to wonder whether the value from this call is used on purpose later.

Pushed to mob.

Pascal
Re: [Tinycc-devel] missing check after calling type_size in classify_x86_64_arg
Hello,

> On 23 Jun 2019, at 23:18, Michael Matz wrote:
>
> The patch definitely goes into the right direction, though it seem more
> verbose than necessary. I'd just test for functions or incomplete types (via
> type_size), and then you have the opportunity to retain the more precise
> error message for the former, ala
>
>   if (func)
>     tcc_error ...
>   else if (type_size < 0)
>     tcc_error ...
>   okay ...

I considered this but:

- the single generic error message seemed consistent with a compiler which advertises its small size and its speed, and which was accepting these programs until recently,
- but even producing a generic error message, functions would have to remain as a separate case because type_size returns 1 for them,
- and also type_size returns 1 for void, so that would have to be another separate case if we wish to reject arrays of void.

Two separate cases mean we have to compute type->t & VT_BTYPE. If we call type_size it will recompute it, and store an alignment that we do not require to a variable that we will have to declare. It doesn't look like it's shorter or faster.

If you think that this much code should be put in a function, that might be useful elsewhere, I can do that, but when compiling GCC's dialect of C where pointer arithmetic is legal for pointers to void and function pointers, “being a type with a size in the sense of pointer arithmetic” and “being a complete type” become different enough to require two different functions.
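The last point above, that "having a size for pointer arithmetic" and "being a complete type" are separate questions, can be sketched with a toy model. Everything in it (the enum toy_kind and the two functions) is hypothetical and much simpler than TCC's CType and type_size; it only illustrates why a single function cannot answer both questions in the GNU dialect.

```c
#include <stdbool.h>

/* Hypothetical, simplified type kinds; TCC's real CType is richer. */
enum toy_kind { TOY_INT, TOY_VOID, TOY_FUNC, TOY_INCOMPLETE_ENUM };

/* Size in the sense of GNU C pointer arithmetic: void and function
   types count as 1 so that p + 1 is meaningful in that dialect.
   Returns -1 when no size is known. */
int toy_arith_size(enum toy_kind k)
{
    switch (k) {
    case TOY_INT:
        return (int)sizeof(int);
    case TOY_VOID:
    case TOY_FUNC:
        return 1;
    default:
        return -1;
    }
}

/* Completeness in the sense required of an array element type:
   in this toy model only TOY_INT is a complete object type. */
bool toy_is_complete_object(enum toy_kind k)
{
    return k == TOY_INT;
}
```

In this model toy_arith_size(TOY_VOID) is 1 while toy_is_complete_object(TOY_VOID) is false, which is the situation where a "size < 0" test alone would let arrays of void through.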
Re: [Tinycc-devel] missing check after calling type_size in classify_x86_64_arg
On 22 Jun 2019, at 21:16, Michael Matz <matz@frakked.de> wrote:

> > struct s { char a; enum e b[]; } s;
> > struct t { int a[3]; void b[]; } t;
> > typedef void u;
> > struct v { int a[3]; u b[]; } v;
> > struct w { int a[3]; struct n b[]; } w;
>
> The thing to realize about all these invalid examples is, that it's not the struct decl which is wrong, i.e. you don't need to change anything within struct_decl or struct_layout. It's already the array declarator itself which is wrong: an array declarator requires a complete element type. So, what you want to change is post_type (which cares for array and function declarators, given a base type) so that the incoming type is complete if necessary.

So, pushing the previous stuff down the stack and investigating post_type: it currently makes a minimal effort to reject arrays of functions, and I tried to insert a better filter at the place where the check was (see attached patch).

The new version continues to reject this example:

    typedef void f(void);
    f t[3];

(with a more generic message)

The new version also rejects all my previous examples quoted above, and the following, which is currently accepted because type->t is not masked in the current implementation:

    const f t[3];

If this patch can be tweaked into something acceptable, I will also add tests for the new rejected constructs and validate the message change for the existing test.

Pascal

Attachment: arrays_of_incomplete.patch
Re: [Tinycc-devel] undefined sanitizer
On 22 Jun 2019, at 22:29, Vincent Lefevre <vinc...@vinc17.net> wrote:

> On 2019-06-22 20:59:57 +0200, Michael Matz wrote:
> > Indeed. The thing is that such "mis"-alignment isn't generically undefined behaviour (and hence shouldn't even be part of -fsanitize=undefined). It's implementation defined what it means for a pointer to an object type to be correctly aligned (e.g. one where the natural alignment of all types is 1 is fully conforming). Accessing something via an incorrectly aligned pointer is undefined, but what incorrectly aligned means is implementation defined.
>
> Yes, it's implementation defined, but I assume that -fsanitize=undefined warns only when the implementation has decided that this was incorrectly aligned.

Probably everyone has already seen this blog post about GCC generating code that crashes if pointers to uint32_t are not aligned to 4, but I will post the URL just in case:

http://pzemtsov.github.io/2016/11/06/bug-story-alignment-on-x86.html
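For readers who want to avoid the problem discussed above rather than debate it, here is a minimal sketch of the usual portable way to read a uint32_t from an address whose alignment is not guaranteed. The function name load_u32_unaligned is made up for this illustration; the only library call used is the standard memcpy.

```c
#include <stdint.h>
#include <string.h>

/* Reads a uint32_t from an address with no alignment guarantee.
   memcpy places no alignment requirement on its arguments, whereas
   dereferencing a uint32_t * that the implementation considers
   misaligned is undefined behavior. */
uint32_t load_u32_unaligned(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Compilers commonly turn this memcpy into a single load on targets that tolerate unaligned accesses, so the defined version need not be slower, though that is an optimization and not something the C standard promises.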
Re: [Tinycc-devel] missing check after calling type_size in classify_x86_64_arg
Hello Michael, and thanks for the guidance.

On 22 Jun 2019, at 01:17, Michael Matz <matz@frakked.de> wrote:

> Yes, there are generally two contexts, and in one of them (e.g. decls with initializers) incomplete types are temporarily valid. So you'd either need two modes of type_size (one complaining), or two functions (or, as now, checking the size sometimes). If you want to invest more work than just adding a check in classify_x86_64_arg, instead add a function ctype_size (c for complete) which complains on incomplete types, and use it in places where the code really needs complete types (and doesn't yet check on its own).

That is a big can of worms you have pointed me to. Here is another part of the code that seems wrong and continues to seem wrong even with the suggested change:

    static void struct_layout(CType *type, AttributeDef *ad)
    {
        int size, align, maxalign, offset, c, bit_pos, bit_size;
        int packed, a, bt, prevbt, prev_bit_size;
        int pcc = !tcc_state->ms_bitfields;
        int pragma_pack = *tcc_state->pack_stack_ptr;
        Sym *f;

        maxalign = 1;
        offset = 0;
        c = 0;
        bit_pos = 0;
        prevbt = VT_STRUCT; /* make it never match */
        prev_bit_size = 0;

    //#define BF_DEBUG

        for (f = type->ref->next; f; f = f->next) {
            if (f->type.t & VT_BITFIELD)
                bit_size = BIT_SIZE(f->type.t);
            else
                bit_size = -1;
            // call type_size here, because t->type can be incomplete
            // if it is a flexible array member
            size = type_size(&f->type, &align);
            a = f->a.aligned ? 1 << (f->a.aligned - 1) : 0;
            packed = 0;
            if (pcc && bit_size == 0) {
                /* in pcc mode, packing does not affect zero-width bitfields */
            } else {
                /* in pcc mode, attribute packed overrides if set. */
                if (pcc && (f->a.packed || ad->a.packed))
                    align = packed = 1;
                /* pragma pack overrides align if lesser and packs bitfields always */
                if (pragma_pack) {
                    packed = 1;
                    if (pragma_pack < align)
                        align = pragma_pack;

align starts its life as an automatic, uninitialized variable.
At each iteration, the call to type_size sets it unless the call fails and leaves align's previous value in it. My only change so far in this function is the comment “call type_size here, because t->type can be incomplete if it is a flexible array member”: I stand by this comment, because calling ctype_size here makes TCC abort while compiling pcctest.c.

Next, if pcc is false and pragma_pack is true, the value of align is used in

    if (pragma_pack < align)

For a flexible array member of a complete element type, this currently works out fine: type_size stores the element type's alignment even if the array is a FAM.

Now consider a FAM of an incomplete enum:

    struct s { char a; enum e b[]; } s;

Fortunately TCC rejects structs that do not have at least one member of a complete type before the FAM, but still, in the above example the value of align that is used in

    if (pragma_pack < align)

is the value from the previous iteration, that is, the alignment from the previous member, and that seems awfully wrong (the program should simply be rejected before reaching that point).
To make a long story short, as the most pressing improvement to clarify the behavior of TCC with incomplete types, I find myself trying to make type_size reject each of the following struct declarations:

    $ cat t.c
    struct s { char a; enum e b[]; } s;
    struct t { int a[3]; void b[]; } t;
    typedef void u;
    struct v { int a[3]; u b[]; } v;
    struct w { int a[3]; struct n b[]; } w;

    int printf(const char *, ...);

    int main(void)
    {
      printf("stringlit: %zu\n", sizeof "abcd");
      printf("%zu\n", sizeof s);
      printf("%p\n", );
      printf("%p\n", );
      printf("%zu\n", sizeof(u));
      printf("%zu\n", sizeof t);
      printf("%zu\n", sizeof v);
      printf("%zu\n", sizeof w);
    }
    pascal@TrustInSoft-Box-VII:~/tinycc_mob$ ./tcc -run t.c
    stringlit: 5
    1
    0xfd6778
    0xfd6779
    1
    12
    12
    0

And the changes I currently have for this purpose are the following, but they are not sufficient to make TCC reject the declaration of w of type struct w:

     ST_FUNC int type_size(CType *type, int *a)
     {
         Sym *s;
    @@ -2786,11 +2798,14 @@ ST_FUNC int type_size(CType *type, int *a)
             int ts;

             s = type->ref;
    +        if ((s->type.t & VT_BTYPE) == VT_VOID)
    +            tcc_error("array of void");
             ts = type_size(&s->type, a);
    -
    -        if (ts < 0 && s->c < 0)
    +        if (ts < 0 && s->c < 0) {
    +            if (IS_ENUM(s->type.t))
    +                tcc_error("array of incomplete enum");
                 ts = -ts;
    -
    +        }
             return ts * s->c;
         } else {
             *a = PTR_SIZE;
Re: [Tinycc-devel] match formats and arguments exactly
On 21 Jun 2019, at 16:54, ian <sibian0...@gmail.com> wrote:

> Anyway, I don't think it's desirable that kinds of pointers are checked.

Should I emphasize that I am not offering to implement a warning in TCC?

If you think that GCC should not emit the warning shown by Christian, you can either not enable this GCC warning when you use GCC, or file a bug with GCC developers so that they won't warn for this construct with -Wall, or not use GCC. But this GCC warning entered the discussion only because Christian replied without reading the patch, and it has nothing to do with the patch I sent.

Again: I do not want to warn about wrong printf arguments. I am not adding a warning about wrong printf arguments to TCC. I hope this clears things up.

Pascal
Re: [Tinycc-devel] match formats and arguments exactly
On 21 Jun 2019, at 16:10, ian <sibian0...@gmail.com> wrote:

> Hello,
> IMHO, considering that flexibility is what I love in C programming, and that this checking should be printf job (in that case),

Unfortunately, this is not how printf, or other variadic functions, work. The way they work is: the non-variadic arguments (in the case of printf, the format string) indicate what variadic arguments should be consumed with what type. If the types of the arguments actually passed do not match the types indicated by the non-variadic arguments, the behavior is undefined.

Not only do printf and other variadic functions have no obligation to warn you if you misuse them, but on every existing platform (including the exotic platforms where a pointer is not a pointer), they actually have no way to warn you that you are misusing them.
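The mechanism described above can be seen in a small variadic function written for illustration. The function sum_by_spec and its one-letter specification language ('i' for int, 'd' for double) are invented for this sketch; only va_start, va_arg and va_end from <stdarg.h> are real. Note that the callee never learns the actual argument types: it simply trusts the specification string.

```c
#include <stdarg.h>

/* Hypothetical mini-protocol: each 'i' in spec consumes an int and
   each 'd' consumes a double; other characters are ignored.
   The function has no way to verify that the caller really passed
   arguments of those types: if the caller did not, the va_arg call
   has undefined behavior and nothing here can detect it. */
double sum_by_spec(const char *spec, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, spec);
    for (; *spec; spec++) {
        if (*spec == 'i')
            total += va_arg(ap, int);
        else if (*spec == 'd')
            total += va_arg(ap, double);
    }
    va_end(ap);
    return total;
}
```

A call such as sum_by_spec("id", 2, 0.5) is correct; sum_by_spec("id", 2.0, 1) compiles just as quietly but reads a double as an int and an int as a double, which is exactly the kind of mismatch the thread is about.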
Re: [Tinycc-devel] match formats and arguments exactly
On 21 Jun 2019, at 15:33, Christian Jullien <eli...@orange.fr> wrote:

> If I read you correctly, you want to protest if type does not strictly match format directive.

I'm not protesting, I'm just offering a patch (in a series of patches) that makes the TCC source code free of easily avoided undefined behavior.

I admit I'm not very interested in knowing whether GCC warns or not for the things I am fixing, because GCC neither warns for everything that's undefined nor limits its warnings to things that are undefined. The C standards are my reference, in this case, https://port70.net/~nsz/c/c11/n1570.html#7.21.6.1p8 , which says:

    7.21.6.1:8
    o,u,x,X  The unsigned int argument is …
    s        If no l length modifier is present, the argument shall be a
             pointer to the initial element of an array of character type. …

    7.21.6.1:9
    … If any argument is not the correct type for the corresponding
    conversion specification, the behavior is undefined.

I was asking about the patch because I have encountered developers who prefer to keep their invalid pointer arithmetic, uses of uninitialized memory and misuse of printf, in which case I don't insist.

Pascal
Re: [Tinycc-devel] match formats and arguments exactly
> On 21 Jun 2019, at 14:56, Christian Jullien wrote:
>
> This is a valuable check but IMHO, it should be controlled by -Wformat (as
> GNU gcc) and set of false by default.
> Otherwise, I suspect tcc users will have a lot a new warnings.

I'm not implementing a new warning in TCC. I am only ensuring, by following C11 7.21.6.1:8 to the letter, that the TCC source code passes the strictest such checks that a C compiler could have, and also that a very exotic C compiler does not produce a non-functional binary when compiling TCC.

Considering the amount of code that good warnings represent, I think that a C compiler can either be tiny, or provide helpful warnings. The patches I have been sending, including the last one, only make TCC not exhibit undefined behavior, a more manageable goal that only requires small changes and does not make TCC significantly larger.

Pascal
[Tinycc-devel] match formats and arguments exactly
Hello,

If no-one objects, I will push in a few days the following patch, which ensures that the types of arguments to printing functions correspond exactly to their formatters (%x expects an unsigned int, %s expects a pointer to character type, etc.).

Pascal

Attachment: formatting.patch
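The kind of change such a patch makes can be illustrated as follows. This is not taken from formatting.patch, whose contents are not shown in this thread; the helper names format_hex and format_ptr are made up. The sketch only shows the general pattern of converting each argument to the exact type its conversion specification expects (unsigned int for %x, pointer to void for %p, per C11 7.21.6.1:8).

```c
#include <stdio.h>

/* %x expects an unsigned int, so an int argument is converted
   explicitly instead of being passed as is. */
int format_hex(char *buf, size_t size, int value)
{
    return snprintf(buf, size, "%x", (unsigned int)value);
}

/* %p expects a pointer to void, so any other object pointer type
   is converted explicitly. The printed text is implementation-defined. */
int format_ptr(char *buf, size_t size, const int *p)
{
    return snprintf(buf, size, "%p", (void *)p);
}
```

The casts generate no extra code on common platforms; their purpose is only to make the call well defined on every conforming implementation.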
[Tinycc-devel] missing check after calling type_size in classify_x86_64_arg
Hello,

the function type_size can fail and return -1 for an incomplete enum:

https://repo.or.cz/tinycc.git/blob/944fe7036c53613889deb66cb9d03da2407d6c85:/tccgen.c#l2800

In this case it leaves *a untouched. When this happens in a call from the function classify_x86_64_arg, it leads to using the automatic variable align uninitialized:

https://repo.or.cz/tinycc.git/blob/944fe7036c53613889deb66cb9d03da2407d6c85:/x86_64-gen.c#l1142

This scenario happens for some input files. I expect all input files that cause this to be invalid C programs, but a compiler that emits an error on invalid inputs is better than a compiler that displays undefined behavior on invalid inputs.

An example of an input file causing execution to go through classify_x86_64_arg with type_size returning -1 is the following:

    enum t f(int x)
    {
      while(1);
    }

I was thinking of inserting a check like “if (size < 0) tcc_error("incomplete enum");” after the call to type_size in classify_x86_64_arg. The function type_size is called from a lot of places so I didn't even consider making it abort directly instead, but if someone suggests it might be better I can look into it.

Pascal
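The hazard and the proposed check can be reproduced in miniature. In the sketch below, toy_type_size and toy_classify are hypothetical stand-ins, not TCC functions: the first mimics the reported contract (return -1 on failure and leave the out-parameter untouched), and the second shows a caller that tests the return value before reading the out-parameter, which is the shape of the check proposed above.

```c
/* Hypothetical stand-in for a size query that, like the type_size
   described above, returns -1 on failure and in that case does not
   write to *align. */
int toy_type_size(int type_is_known, int *align)
{
    if (!type_is_known)
        return -1;
    *align = 4;
    return 4;
}

/* Caller that never reads align unless the query succeeded.
   Returns 0 and stores the alignment on success, -1 on failure
   (where a compiler would report an error instead). */
int toy_classify(int type_is_known, int *out_align)
{
    int align; /* deliberately left uninitialized, as in the report */
    int size = toy_type_size(type_is_known, &align);

    if (size < 0)
        return -1; /* align is indeterminate here and must not be read */
    *out_align = align;
    return 0;
}
```

Without the size < 0 test, the failure path would copy an indeterminate value into *out_align, which is the undefined behavior the message describes.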
Re: [Tinycc-devel] Avoid allocating a VLA of size zero
Hello,

> On 16 Jun 2019, at 21:49, Michael Matz wrote:
>
> On Wed, 12 Jun 2019, Petr Skočík wrote:
>
>> I think it looks a bit better without the extra variable:
>>
>> char _onstack[nb_args?nb_args:1], *onstack = _onstack;
>
> Agreed, if you put spaces around the ? and : ;)

So I pushed that patch as my first commit. If the git is broken now it's Petr's fault for pointing out that anyone can push.

Pascal
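For context, declaring a variable-length array whose size evaluates to zero is undefined behavior in C (C11 6.7.6.2:5 requires the size to be greater than zero), which is why the declaration quoted above substitutes 1 when nb_args is 0. The following sketch uses the same idiom in a made-up function, count_set, that is unrelated to TCC's code generator apart from the declaration line.

```c
/* Counts the nonzero entries among the first nb_args bytes of flags,
   going through a VLA scratch copy. The size expression never
   evaluates to zero, so the declaration is valid even when
   nb_args == 0 (nb_args is assumed to be non-negative). */
int count_set(const char *flags, int nb_args)
{
    char _onstack[nb_args ? nb_args : 1], *onstack = _onstack;
    int i, n = 0;

    for (i = 0; i < nb_args; i++)
        onstack[i] = flags[i];
    for (i = 0; i < nb_args; i++)
        n += onstack[i] != 0;
    return n;
}
```

The one wasted byte in the nb_args == 0 case is the whole cost of the fix.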
Re: [Tinycc-devel] undefined sanitizer
> On 18 Jun 2019, at 09:54, Mike wrote:
>
> I keep having fun.
> In attach compile report under -fsanitize=undefined in gcc or clang.

I should warn you that your recent e-mails are about identical to a previous message from January: https://lists.nongnu.org/archive/html/tinycc-devel/2019-01/msg00093.html , which did not lead to any code change. Obviously what is lacking is the time to investigate in depth what should be done about these warnings.

Pascal
[Tinycc-devel] "internal compiler error: vstack leak" and crash with VLA of incomplete type
With TCC from git (commit 9382a3a), the two following inputs vla0.i and vla1.i each cause the message “error: internal compiler error: vstack leak” to be printed. In addition, the input vla1.i makes TCC crash:

    $ cat vla0.i
    int X=1;

    int main(void)
    {
      int t[][X];
    }
    $ ./tcc vla0.i
    about to pop: 1
    about to pop: 0
    about to pop: 0
    vla0.i:6: error: internal compiler error: vstack leak (-1)
    $ cat vla1.i
    int X=1;

    int main(void)
    {
      int t[][][X];
    }
    $ ./tcc vla1.i
    about to pop: 1
    about to pop: 0
    about to pop: -1
    about to pop: -1
    vla1.i:6: error: internal compiler error: vstack leak (104364)
    Segmentation fault
    $

The message “about to pop” is caused by the attached patch, which does not change the functional behavior of TCC.

If someone more knowledgeable than me about TCC's internals wants to continue from where I stopped, according to the tools I am using, the first undefined behavior to occur when TCC is processing either of these inputs is inside the function vpop, where the pointer vtop is made to point before the array it is supposed to point to.
The callstacks at the point of this first undefined behavior respectively look like (the lines may not correspond exactly):

    stack: vpop :: tccgen.c:4524 <-
           post_type :: tccgen.c:4608 <-
           type_decl :: tccgen.c:7512 <-
           decl0 :: tccgen.c:7697 <-
           decl :: tccgen.c:6197 <-
           block :: tccgen.c:7375 <-
           gen_function :: tccgen.c:7596 <-
           decl0 :: tccgen.c:7697 <-
           decl :: tccgen.c:298 <-
           tccgen_compile :: libtcc.c:652 <-
           tcc_compile :: libtcc.c:1068 <-
           tcc_add_file_internal :: libtcc.c:1094 <-
           tcc_add_file :: tcc.c:338 <-
           main

    stack: vpop :: tccgen.c:4524 <-
           post_type :: tccgen.c:4507 <-
           post_type :: tccgen.c:4608 <-
           type_decl :: tccgen.c:7512 <-
           decl0 :: tccgen.c:7697 <-
           decl :: tccgen.c:6197 <-
           block :: tccgen.c:7375 <-
           gen_function :: tccgen.c:7596 <-
           decl0 :: tccgen.c:7697 <-
           decl :: tccgen.c:298 <-
           tccgen_compile :: libtcc.c:652 <-
           tcc_compile :: libtcc.c:1068 <-
           tcc_add_file_internal :: libtcc.c:1094 <-
           tcc_add_file :: tcc.c:338 <-
           main

The attached patch, when applied, shows the undefined behavior occurring: the message “about to pop” is printed just before executing vtop--. When the message shows 0, vtop is about to go outside the array __vstack. When it shows -1, vtop is already outside the array. It does not make much sense to try to debug anything that happens after this, since vtop going outside of __vstack is already something that should not happen, and since it can mess up whatever data structure is stored next to it.

Attachment: pop.patch
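As a point of comparison, here is a minimal sketch of a value stack whose pop operation refuses to move below the bottom of its array. It is a toy, not a proposed change to TCC: the names toy_stack, toy_push and toy_pop are hypothetical, and it tracks an element count where TCC tracks the pointer vtop. The count-based representation is chosen because it never needs to form a pointer before the start of the array.

```c
#include <stddef.h>

#define TOY_STACK_SIZE 8

/* Hypothetical value stack; count is the number of live entries. */
struct toy_stack {
    int slots[TOY_STACK_SIZE];
    size_t count;
};

/* Returns 0 on success, -1 if the stack is full. */
int toy_push(struct toy_stack *st, int v)
{
    if (st->count == TOY_STACK_SIZE)
        return -1;
    st->slots[st->count++] = v;
    return 0;
}

/* Returns 0 and stores the popped value on success, -1 if the stack
   is empty. An unbalanced pop is reported to the caller instead of
   silently moving outside the array. */
int toy_pop(struct toy_stack *st, int *out)
{
    if (st->count == 0)
        return -1;
    *out = st->slots[--st->count];
    return 0;
}
```

With a check of this kind, the inputs above would surface as an immediate internal error at the first unbalanced pop instead of as a later "vstack leak" message or a crash.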
[Tinycc-devel] unbounded recursion and crash on invalid nested union declaration
Hello,

On the following (invalid) C program, tcc goes into an unbounded recursion, and crashes when it exhausts the space allowed for the stack. I tried to make a valid C program with the same property, which would be a more interesting bug report, but I couldn't find one.

Pascal

    $ cat cr.i
    union u {
      union u { };
    };
    $ gcc -c cr.i
    cr.i:2:9: error: nested redefinition of 'union u'
      union u { };
            ^
    cr.i:2:14: warning: declaration does not declare anything
      union u { };
                 ^
    $ clang -c cr.i
    cr.i:2:9: error: nested redefinition of 'u'
      union u { };
            ^
    cr.i:1:7: note: previous definition is here
    union u {
          ^
    1 error generated.
    $ ./tcc -c cr.i
    Segmentation fault
    $
Re: [Tinycc-devel] TCC crash with incorrect syntax in compound literal
Hello again,

On 17 Mar 2019, at 13:56, Pascal Cuoq <c...@trust-in-soft.com> wrote:

> 2) I noticed that a variant of the input I initially reported in this thread is still crashing TCC. The variant that still crashes TCC as of commit d72b877 is:
>
>     void f(char *);
>     void g(void) { f((char[]){1, ,}); }
>
> It is very similar to the previous input but this time there is a first element before the syntactically incorrect “, ,” in the compound literal.

The attached patch makes the program above rejected without crash, does not seem to break “make test” (I think I know what it looks like when the tests are broken because my first fix did break them), and it makes the following program accepted too (and compiled to a binary that behaves as intended):

    #include <stdio.h>

    int main(void)
    {
      printf("%zu %s\n", sizeof (char[]){'a', 0,}, (char[]){'a', 0,});
    }

It might be close to the right fix.

Pascal

Attachment: comma_comma.patch
Re: [Tinycc-devel] TCC crash with incorrect syntax in compound literal
Hello,

Thanks again Matthias, and also to Michael who fixed the other crashes I had posted.

I was going over the “interesting” inputs generated by fuzzing again, and:

1) I found a bug in the tool I was using TCC as a reference implementation for, so the bugs in TCC being fixed is really useful to me.

2) I noticed that a variant of the input I initially reported in this thread is still crashing TCC. The variant that still crashes TCC as of commit d72b877 is:

    void f(char *);
    void g(void) { f((char[]){1, ,}); }

It is very similar to the previous input but this time there is a first element before the syntactically incorrect “, ,” in the compound literal.

Pascal
Re: [Tinycc-devel] Use of uninitalized automatic variable in TCC when parsing “int f ( int ( )”
Finally, adding one more bit of instrumentation shows that TCC can crash because of the uninitialized variable being discussed in this thread.

On 08 Mar 2019, at 20:06, Pascal Cuoq <c...@trust-in-soft.com> wrote:

> the simplest way to make this problem visible is to instrument the functions type_decl and post_type:
>
> diff --git a/tccgen.c b/tccgen.c
> index 87ec798..7fa6c72 100644
> --- a/tccgen.c
> +++ b/tccgen.c
> @@ -4374,7 +4374,7 @@ static int post_type(CType *type, AttributeDef *ad, int storage, int td)
>      Sym **plast, *s, *first;
>      AttributeDef ad1;
>      CType pt;
> -
> +n = 0xf00f0011;
>      if (tok == '(') {
>          /* function type, or recursive declarator (return if so) */
>          next();
> @@ -4410,6 +4410,7 @@ static int post_type(CType *type, AttributeDef *ad, int storage, int td)
>                  }
>                  convert_parameter_type(&pt);
>                  arg_size += (type_size(&pt, &align) + PTR_SIZE - 1) / PTR_SIZE;
> +if (n == 0xf00f0011) printf("using n uninitialized\n");
>                  s = sym_push(n | SYM_FIELD, &pt, 0, 0);
>                  *plast = s;
>                  plast = &s->next;
> @@ -4583,7 +4584,7 @@ static CType *type_decl(CType *type, AttributeDef *ad, int *v, int td)
>              parse_attribute(ad);
>              post = type_decl(type, ad, v, td);
>              skip(')');
> -        }
> +        } else printf("*v left uninitialized\n");
>      } else if (tok >= TOK_IDENT && (td & TYPE_DIRECT)) {
>          /* type identifier */
>          *v = tok;
>
> The function post_type declares an automatic variable n and does not initialize it. Setting it to 0xf00f0011 allows to see that it has not been assigned when it is used later in this function (ored with SYM_FIELD and passed as argument to the function sym_push). When “using n uninitialized” is printed in the instrumented version of TCC, it means that n would have been used uninitialized in the uninstrumented version of the compiler.
To understand the crash, add the following patch to the instrumentation that was already discussed:

    diff --git a/tccgen.c b/tccgen.c
    index 87ec798..ee5a838 100644
    --- a/tccgen.c
    +++ b/tccgen.c
    @@ -588,6 +588,10 @@ ST_FUNC Sym *sym_push(int v, CType *type, int r, int c)
         /* XXX: simplify */
         if (!(v & SYM_FIELD) && (v & ~SYM_STRUCT) < SYM_FIRST_ANOM) {
             /* record symbol in token array */
    +if (v == 0xd00f0011) {
    +  printf("v < 0, this will not go well\n");
    +  fflush(stdout);
    +}
             ts = table_ident[(v & ~SYM_STRUCT) - TOK_IDENT];
             if (v & SYM_STRUCT)
                 ps = &ts->sym_struct;

This patch shows that the value 0xf00f0011, which was chosen as the value of the uninitialized variable n in the function post_type, can for some inputs be propagated until it is used to compute an address.

    $ cat c.i
    int f(int ()) { return 0; }
    $ ./tcc -c c.i
    *v left uninitialized
    using n uninitialized
    v < 0, this will not go well
    Segmentation fault

The crash may be more difficult to observe without the first part of the instrumentation, but regardless, the root cause for it, when it happens, is the fact that the automatic variable n from the function post_type is sometimes incorporated into computations without having been set beforehand.

Pascal
Re: [Tinycc-devel] Use of uninitalized automatic variable in TCC when parsing “int f ( int ( )”
Update: I have found an input that is accepted by GCC, accepted by Clang, and that makes TCC use the variable n uninitialized in function post_type.

On 08 Mar 2019, at 20:06, Pascal Cuoq <c...@trust-in-soft.com> wrote:

> the simplest way to make this problem visible is to instrument the functions type_decl and post_type:
>
> diff --git a/tccgen.c b/tccgen.c
> index 87ec798..7fa6c72 100644
> --- a/tccgen.c
> +++ b/tccgen.c
> @@ -4374,7 +4374,7 @@ static int post_type(CType *type, AttributeDef *ad, int storage, int td)
>      Sym **plast, *s, *first;
>      AttributeDef ad1;
>      CType pt;
> -
> +n = 0xf00f0011;
>      if (tok == '(') {
>          /* function type, or recursive declarator (return if so) */
>          next();
> @@ -4410,6 +4410,7 @@ static int post_type(CType *type, AttributeDef *ad, int storage, int td)
>                  }
>                  convert_parameter_type(&pt);
>                  arg_size += (type_size(&pt, &align) + PTR_SIZE - 1) / PTR_SIZE;
> +if (n == 0xf00f0011) printf("using n uninitialized\n");
>                  s = sym_push(n | SYM_FIELD, &pt, 0, 0);
>                  *plast = s;
>                  plast = &s->next;
> @@ -4583,7 +4584,7 @@ static CType *type_decl(CType *type, AttributeDef *ad, int *v, int td)
>              parse_attribute(ad);
>              post = type_decl(type, ad, v, td);
>              skip(')');
> -        }
> +        } else printf("*v left uninitialized\n");
>      } else if (tok >= TOK_IDENT && (td & TYPE_DIRECT)) {
>          /* type identifier */
>          *v = tok;
>
> The function post_type declares an automatic variable n and does not initialize it. Setting it to 0xf00f0011 allows to see that it has not been assigned when it is used later in this function (ored with SYM_FIELD and passed as argument to the function sym_push). When “using n uninitialized” is printed in the instrumented version of TCC, it means that n would have been used uninitialized in the uninstrumented version of the compiler.
> I have not found any syntactically correct input that caused “*v left uninitialized” to be printed but not “using n uninitialized”, so a solution *might* be to make TCC error out at the point where I made it print out “*v left uninitialized”, but this is for someone with better understanding of the code than me to decide.

A better input for demonstrating the problem (valid C compilation unit) is as follows:

    $ cat cr.i
    int f(const char *());
    $ clang -Wall -c cr.i
    $ gcc -Wall -c cr.i
    $ ./tcc -c cr.i
    *v left uninitialized
    using n uninitialized
[Tinycc-devel] Use of uninitalized automatic variable in TCC when parsing “int f ( int ( )”
Hello,

the simplest way to make this problem visible is to instrument the functions type_decl and post_type:

    diff --git a/tccgen.c b/tccgen.c
    index 87ec798..7fa6c72 100644
    --- a/tccgen.c
    +++ b/tccgen.c
    @@ -4374,7 +4374,7 @@ static int post_type(CType *type, AttributeDef *ad, int storage, int td)
         Sym **plast, *s, *first;
         AttributeDef ad1;
         CType pt;
    -
    +n = 0xf00f0011;
         if (tok == '(') {
             /* function type, or recursive declarator (return if so) */
             next();
    @@ -4410,6 +4410,7 @@ static int post_type(CType *type, AttributeDef *ad, int storage, int td)
                     }
                     convert_parameter_type(&pt);
                     arg_size += (type_size(&pt, &align) + PTR_SIZE - 1) / PTR_SIZE;
    +if (n == 0xf00f0011) printf("using n uninitialized\n");
                     s = sym_push(n | SYM_FIELD, &pt, 0, 0);
                     *plast = s;
                     plast = &s->next;
    @@ -4583,7 +4584,7 @@ static CType *type_decl(CType *type, AttributeDef *ad, int *v, int td)
                 parse_attribute(ad);
                 post = type_decl(type, ad, v, td);
                 skip(')');
    -        }
    +        } else printf("*v left uninitialized\n");
         } else if (tok >= TOK_IDENT && (td & TYPE_DIRECT)) {
             /* type identifier */
             *v = tok;

The function post_type declares an automatic variable n and does not initialize it. Setting it to 0xf00f0011 allows to see that it has not been assigned when it is used later in this function (ored with SYM_FIELD and passed as argument to the function sym_push). When “using n uninitialized” is printed in the instrumented version of TCC, it means that n would have been used uninitialized in the uninstrumented version of the compiler.
    $ cat cr.i
    int f ( int ( )
    $ ./tcc cr.i
    *v left uninitialized
    using n uninitialized
    cr.i:2: error: ',' expected (got "")

Some inputs cause “*v left uninitialized” to be printed but not “using n uninitialized”:

    $ cat cr.i
    int f(void) { ( int ( )
    $ ./tcc cr.i
    *v left uninitialized
    cr.i:2: error: ')' expected (got "")

I have not found any syntactically correct input that caused “*v left uninitialized” to be printed but not “using n uninitialized”, so a solution *might* be to make TCC error out at the point where I made it print out “*v left uninitialized”, but this is for someone with better understanding of the code than me to decide.

Pascal
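The instrumentation technique used in this thread (store a recognizable sentinel where a variable would otherwise be left uninitialized, then test for the sentinel at the point of use) can be shown in isolation. The functions toy_decl and toy_uses_unassigned below are hypothetical and only mimic the shape of the type_decl / post_type interaction, where the callee assigns through a pointer on some paths only. The sentinel value is the one from the patch; like any sentinel, it is assumed never to occur as a legitimate value.

```c
#define TOY_SENTINEL 0xf00f0011u

/* Hypothetical callee: writes to *v only when an identifier was
   seen, and otherwise leaves *v as it found it. */
void toy_decl(int have_ident, unsigned *v)
{
    if (have_ident)
        *v = 42u;
}

/* Hypothetical caller with the instrumentation applied: n is
   poisoned before the call, so finding the sentinel afterwards
   means the uninstrumented code would have used n uninitialized.
   Returns 1 in that case and 0 otherwise. */
int toy_uses_unassigned(int have_ident)
{
    unsigned n = TOY_SENTINEL;

    toy_decl(have_ident, &n);
    return n == TOY_SENTINEL;
}
```

The advantage over simply running the original code is determinism: an indeterminate value may happen to look harmless on a given run, whereas the sentinel makes the missing assignment visible every time.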
[Tinycc-devel] Segmentation fault when using “extern i;” to access homonym variable from inside “for (int i;...”
Hello,

The input below crashes TCC for me on Ubuntu 16.04 on x86-64. To be certain of observing the problem, it can help to temporarily add a debug printf call inside the function elfsym:

    $ git diff
    diff --git a/tccgen.c b/tccgen.c
    index 87ec798..cbc6b09 100644
    --- a/tccgen.c
    +++ b/tccgen.c
    @@ -308,6 +308,7 @@ ST_FUNC ElfSym *elfsym(Sym *s)
     {
       if (!s || !s->c)
         return NULL;
    +  printf("s->c %d is about to be used as an offset.\n", s->c);
       return &((ElfSym *)symtab_section->data)[s->c];
     }

The problematic input is as follows. Note that in this case this is a well-formed compilation unit:

    $ cat extern_local.i
    int main(void) { char a[50]; for (int i;;) { extern i; i++; } }

Compiling with the instrumented TCC prints:

    $ ./tcc extern_local.i
    s->c 26 is about to be used as an offset.
    s->c -56 is about to be used as an offset.
    s->c 26 is about to be used as an offset.

Since the value of s->c is used as an offset into the symbol table, it should never be negative. On my machine, -56 is not enough to cause a crash, but the value of s->c is linked to the size of the unused array a. I can make TCC crash if I use a larger size:

    $ cat extern_local.i
    int main(void) { char a[50]; for (int i;;) { extern i; i++; } }
    $ ./tcc extern_local.i
    s->c 26 is about to be used as an offset.
    s->c -54 is about to be used as an offset.
    Segmentation fault
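
To illustrate why a negative value is harmful here, the following sketch shows a table lookup that validates the index before using it as an offset. It is not a proposed patch for TCC: the type Entry and the function entry_at are hypothetical, and the sketch assumes that the number of entries in the table is known at the call site. A check of this kind would only turn the crash into a diagnosable failure; it would not explain why s->c ends up negative.

```c
#include <stddef.h>

/* Hypothetical stand-in for a symbol-table entry. */
typedef struct { int value; } Entry;

/* Returns a pointer to table[index], or NULL when index is not a valid
   position in a table of count entries. Index 0 is treated as "no entry",
   like the !s->c test in elfsym. */
const Entry *entry_at(const Entry *table, size_t count, int index)
{
    if (index <= 0 || (size_t)index >= count)
        return NULL;               /* rejects negative and too-large indices */
    return &table[index];
}
```

With such a guard, an index of -56 yields NULL instead of a read from before the start of the table.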
[Tinycc-devel] Assertion failed when compiling p + &[i]
Hello,

With both the most recent git commit (b082659) and version 0.9.27, on Ubuntu 16.04 on x86-64, TCC aborts with the message below when compiling the input addrof_addrof.i:

    $ cat addrof_addrof.i
    void f(void) { int i; char *p; p + &[i]; }
    $ ./tcc addrof_addrof.i
    tcc: x86_64-gen.c:441: load: Assertion `((ft & VT_BTYPE) == VT_INT) || ((ft & VT_BTYPE) == VT_LLONG) || ((ft & VT_BTYPE) == VT_PTR) || ((ft & VT_BTYPE) == VT_FUNC)' failed.
    Aborted

I believe that it is interpreting & as the address of a label, a GCC extension to ease the implementation of interpreters and, according to https://bellard.org/tcc/tcc-doc.html , supported by TCC.

Pascal
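
For readers unfamiliar with the extension mentioned above: in GCC, the address of a label is taken with a unary && and is used with a computed goto. The sketch below is an illustration only; the function dispatch and its labels are hypothetical, and the code needs a compiler that implements this GNU extension (GCC or Clang), since it is not ISO C.

```c
/* Uses the GNU "labels as values" extension (unary && and computed
   goto); this is not ISO C. */
int dispatch(int op)
{
    /* Table of label addresses; &&name has type void *. */
    static void *targets[] = { &&op_zero, &&op_one };

    if (op < 0 || op > 1)
        return -1;              /* no label for this opcode */

    goto *targets[op];          /* computed goto: jump to the chosen label */

op_zero:
    return 100;
op_one:
    return 200;
}
```

Because && followed by an identifier is parsed as a label address, an expression in which that identifier is actually a variable can be misinterpreted by a compiler that supports the extension.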
Re: [Tinycc-devel] TCC crash with incorrect syntax in compound literal
Hello,

> On 06 Mar 2019, at 18:56, uso ewin wrote:
>
> I've just push a patch that should fix this issue,

Many thanks for looking into this!

> I hope I didn't add a now bug

All the bugs I have found so far are in syntactically incorrect inputs, so it would be a valid choice to leave in the ones I report rather than risk introducing more harmful bugs. On the other hand, being more robust, if it isn't incompatible with TCC's other objectives, can only make it more useful.

I will report a failed assertion in another thread.

Pascal
[Tinycc-devel] Crash after "warning: storage mismatch for redefinition of 'L' "
In both version 0.9.25 and today's git version (commit d27ea5155f47cb0c29699ef959b52b087dfff41a), tcc crashes while compiling the following one-line invalid program:

    enum myenum { L = -1 } L;

tcc crashes after emitting a warning:

    pascal@TrustInSoft-Box-VII:~/tcc-2019-02-07$ bin/tcc -c enum.i
    enum.i:1: warning: storage mismatch for redefinition of 'L'
    Segmentation fault

GCC and Clang also consider the program invalid, and emit the following messages:

    pascal@TrustInSoft-Box-VII:~/tcc-2019-02-07$ clang -c enum.i
    enum.i:1:24: error: redefinition of 'L' as different kind of symbol
    enum myenum { L = -1 } L;
                           ^
    enum.i:1:15: note: previous definition is here
    enum myenum { L = -1 } L;
                  ^
    1 error generated.
    pascal@TrustInSoft-Box-VII:~/tcc-2019-02-07$ gcc -c enum.i
    enum.i:1:24: error: 'L' redeclared as different kind of symbol
    enum myenum { L = -1 } L;
                           ^
    enum.i:1:15: note: previous definition of 'L' was here
    enum myenum { L = -1 } L;
                  ^
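
The program is invalid because enumeration constants share the namespace of ordinary identifiers (variables, functions, typedef names), whereas enum tags live in a separate namespace. The sketch below shows a valid contrast case; the names shade, DARK, LIGHT and shade_demo are hypothetical and chosen only for illustration.

```c
/* The tag "shade" lives in the tag namespace, so it can coexist with an
   ordinary identifier that has the same name. */
enum shade { DARK = -1, LIGHT = 1 };

int shade_demo(void)
{
    enum shade shade = DARK;   /* variable named like the tag: valid */

    /* Declaring "int DARK;" at file scope, next to the enum, would clash
       with the enumerator DARK, which is the situation in enum.i. */
    return shade;
}
```

Here the variable reuses the tag name without any conflict, while reusing an enumerator name in the same scope is what GCC and Clang reject in the report.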
[Tinycc-devel] TCC crash with incorrect syntax in compound literal
Hello,

as a by-product of working on something else, I found that TCC 0.9.27 (x86_64 Linux) crashes for me on the following program:

    pascal@TrustInSoft-Box-VII:~/tcc-bin$ cat crash.i
    void f(char*); void g(void) { f((char[]){,}); }
    pascal@TrustInSoft-Box-VII:~/tcc-bin$ bin/tcc crash.i
    crash.i:4: warning: assignment makes integer from pointer without a cast
    crash.i:4: warning: nonportable conversion from pointer to char/short
    crash.i:4: warning: assignment from incompatible pointer type
    Segmentation fault

The program crash.i is of course syntactically incorrect:

    pascal@TrustInSoft-Box-VII:~/tcc-bin$ clang -c crash.i
    crash.i:4:14: error: expected expression
      f((char[]){,});
                 ^
    1 error generated.
    pascal@TrustInSoft-Box-VII:~/tcc-bin$ gcc -c crash.i
    crash.i: In function 'g':
    crash.i:4:14: error: expected expression before ',' token
      f((char[]){,});

However, crash.i is close enough to a program that someone might write by accident that TCC developers may be interested in making TCC reject it gracefully. The page https://bellard.org/tcc/tcc-doc.html#ISOC99-extensions lists compound literals as supported, so the invalid syntax occurs in an attempt to use a feature that TCC supports.

Should I report this crash, or any other TCC crash that I find in the future, to http://savannah.nongnu.org/bugs/?group=tinycc ?

I should emphasize that I have no interest in compiling this or future programs with TCC. There is no urgency to these reports, and I can also refrain from making them at all if there is no interest in ensuring that TCC does not crash on mechanically produced input files.
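
For comparison with crash.i, here is a minimal sketch of a syntactically valid call that passes a compound literal to a function expecting a char pointer. The functions f_len and g_valid are hypothetical and exist only for this illustration.

```c
#include <string.h>

/* Hypothetical callee: returns the length of the string it receives. */
static size_t f_len(const char *s)
{
    return strlen(s);
}

size_t g_valid(void)
{
    /* An unnamed char array built from an explicit initializer list;
       it decays to a pointer to its first element at the call. */
    return f_len((char[]){ 'o', 'k', '\0' });
}
```

The only difference from crash.i that matters is that the initializer list contains expressions; a list that starts with a comma, as in {,}, is not valid C.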