On 29/07/16 18:26, Warren D Smith wrote:
Booleans are very useful - they turn up all over the place in programming.

Nibbles, on the other hand, are almost totally useless.  There are very,
very few situations where you need to store a number that is within the
range 0 .. 15, and are so tightly constrained for space that you can't
use an 8-bit byte (or whatever size of byte your target uses).

--Why does any programmer want to have a large
good-quality-code-producing C compiler like gcc, as opposed to the
crudest possible (but working) translator?
It is for performance.  If they did not care about performance, and
only cared about correctness, they would not want gcc, they'd want the
simpler, smaller compiler, since it is more likely to be correct.

There are many reasons why people choose gcc. Personally, I choose it because (in no specific order):

1. Code correctness is top-class. It is very rare that gcc produces incorrect code, and correctness bugs get high priority from the developers. This has not always been my experience with commercial tools.

2. Portability - I work with dozens of different target processors, and having the same compiler for them all is a huge benefit.

3. Tool structure - I like the way the compiler and associated tools work, how well they fit together, and the kinds of files I can generate (map files, listing files, dependency files, a range of different final object files).

4. Features - gcc is the leading compiler in terms of support for C and C++ features, useful extensions, static error checking, and other features. I've used many compilers through the years - gcc's only rival is clang (and that rivalry is often cooperative, and seems to me to be a good thing for both projects).

5. Optimisation - because gcc has a lot of sophisticated optimisations, I can concentrate on writing code that is clear, correct, and maintainable, and rely on the compiler to turn it into efficient object code.

6. Freedom - it's not so much the cost (I have paid for gcc packages), but the freedom to install and use whatever versions I want for whatever targets I want on whatever host computers I want, without worrying about licenses, time limits, locking to particular hosts, etc.

7. Documentation - there is lots of documentation for gcc, and (at least in comparison to many other compilers) it is accurate and kept up-to-date.

8. Support - there is great support for the compiler from many places.


Notice that the quality and the efficiency of the generated code is one of these points - but only one of them.

Now why do
programmers care about performance? Two reasons:
memory space, and time.   If programs are not going to come anywhere near
memory capacity or time-capacity, then nobody needs gcc, and I have to wonder
why you work on it at all.

OK? So the sole rationale for the existence of large
good-quality-code-producing C compilers like gcc, is for space and/or
time performance.

Now, suppose you are a memory-performance programmer.  In that case,
you are pushing
memory capacity limits.  Therefore, you often want to use the smallest available
size for each information-packet in your program, and you want them packed.

It happens that power-2 sizes (1,2,4,8,16,... bits) are more convenient, fairly
small in number, and assure optimality to within a factor of 2.  So as
a first step, I would recommend any compiler that aims to satisfy
space-performance programmers, should provide power-2 sizes, with
packing.

No, that is simply unnecessary. C provides enough facilities to handle this in a C-like manner. Other languages (such as Pascal or C++) may have different support. But support for a packed uint4_t (or other sizes that are not multiples of CHAR_BIT) cannot be provided in C in the way it works for types that /are/ a multiple of CHAR_BIT. That's life with C. If you don't like it, use a different language.


(A more ambitious step would be all sizes, power-2 or not.)

Now suppose you are a time-performance programmer.  In that case, you
also care about memory because of the cache.  I have done tests on my
computer indicating that random memory accesses cost a factor of about
70 more time if the array is too large.
That means space performance is time performance.

OK?  Now it is arrogant and obnoxious for somebody to tell me certain
info-packet sizes (like 4) are useless while others (like 1 and 32) are
useful,

Yes - that's based on reality. It turns out that it is rare for types of sizes 2 to 7 bits to be useful in large enough quantity that it is worth thinking about packing them inside arrays.

And it is not that hard to make your own access functions to pack them and access them as needed - while it is impossible to implement them in C in a manner that is consistent with other types. See Jonathan's post for a very clear explanation.
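For illustration, a minimal sketch of such access functions (my own names, nothing standard), packing two 4-bit values per byte:

#include <stdint.h>
#include <stddef.h>

/* Each uint8_t holds two 4-bit values; even indices in the low nibble. */
static inline unsigned get_nibble(const uint8_t *buf, size_t i)
{
    return (i % 2) ? (buf[i / 2] >> 4) : (buf[i / 2] & 0x0F);
}

static inline void set_nibble(uint8_t *buf, size_t i, unsigned v)
{
    if (i % 2)
        buf[i / 2] = (uint8_t)((buf[i / 2] & 0x0F) | ((v & 0x0F) << 4));
    else
        buf[i / 2] = (uint8_t)((buf[i / 2] & 0xF0) | (v & 0x0F));
}

Note that &buf[i] still works on the underlying bytes, but there is no way to form a pointer to a single packed nibble - which is exactly the inconsistency described above.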

just since they feel that way, and the main reason they feel that way
is they designed gcc to make it hard for anybody to use size 4, thus
causing fewer people to use size 4, thus causing the mis-assessment that
4 is less useful.  The truth is, all sizes are useful, and power-2 sizes
provide access to all sizes (to within a factor of 2).

No one "designed gcc to make it hard to use 4 bit types".


But anyhow, since you have just agreed size 1 (bools) are useful, I
suggest the ability to pack them is useful.  E.g. the sieve of
Eratosthenes, an algorithm developed over 2000 years ago and found in a
large fraction of all programming textbooks, wants huge arrays of packed
bools.  It is possible all those authors were idiots, and then I suggest
you write letters to them saying so.


I agree that packing 1-bit bools in arrays can sometimes be useful. But I don't agree that making "bool a[24];" be packed in C would be a useful thing - it would break too many key features of C (such as the ability to take the address of an element of an array).

C++ allows more flexibility in the language - thus you /can/ have std::vector<bool> being a packed array. But in C, these things need to be cleaner and simpler.

And it is not hard to make packed boolean arrays in C. You can look at the Linux kernel's bitmap code for inspiration:

<http://lxr.free-electrons.com/source/include/linux/bitmap.h>
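As a sketch of the idea (a minimal version of my own, not the kernel's API):

#include <stdint.h>
#include <stddef.h>

#define BITS_PER_WORD 32

static inline void bit_set(uint32_t *map, size_t i)
{
    map[i / BITS_PER_WORD] |= (uint32_t)1 << (i % BITS_PER_WORD);
}

static inline int bit_test(const uint32_t *map, size_t i)
{
    return (map[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1;
}

/* The sieve of Eratosthenes over packed bits; 'composite' must hold
   n bits and start zeroed. */
void sieve(uint32_t *composite, size_t n)
{
    for (size_t p = 2; p * p < n; p++)
        if (!bit_test(composite, p))
            for (size_t q = p * p; q < n; q += p)
                bit_set(composite, q);
}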


Now, I am sure that the gcc developers have already thought about making some sort of attribute syntax for an extension such as:

bool a[24] __attribute__((packed_bitarray));

This would impose special rules, such as it being illegal to take the address of elements.

The developers would balance the benefits and uses of such a feature with the complications involved in implementing it, the disadvantages of introducing another non-standard extension to C, the ease with which programmers can work without having such an extension, and the development time and priorities compared to other gcc features and improvements. My guess is that it is simply not worth the effort - especially since C++ already has the necessary support. And while I don't know about the internals of gcc, my guess is, I think, better than yours.


SUMMARY:
Every programmer for whom gcc matters (as opposed to the simplest C
compiler) wants space performance.  That is provided by packing and by
power-2 datachunk sizes.

"Every" = Large enough demand for you?

A series of incorrect or exaggerated statements followed by a huge leap does not make a conclusive argument.



A boolean is a flag, and is a common basic type of data - just like a
character, or an integer.  A nibble is a particular constrained size of
integer that fits badly with most hardware, is supported by almost no
programming language, and has little real-world use.

--nybbles fit excellently with all hardware I've ever used.

You don't know how your hardware works. Give me some references or links to processors that can take the address of an individual nibble, or have instructions to load or store individual nibbles.

C already supports "enum" types, which in a large fraction of practical
uses are nybbles.  Either these are packed nybbles, or not.  If they
are, then you've already done the coding for it.  If not, providing
enums in this way was inappropriate for a performance language.

enum types are implemented as standard C types of the next size (i.e., signed/unsigned char, short, int, long or long long - whatever is the smallest type that has a big enough range). They are not packed.
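If you want to see what your own compiler does, here is a quick check (with gcc, the default is int-sized storage; the -fshort-enums option shrinks each enum to the smallest integer type with a big enough range):

#include <stdio.h>

enum small { A, B, C };          /* range fits in a char */
enum big   { X = 0x7FFFFFFF };   /* range needs at least an int */

int main(void)
{
    printf("sizeof(enum small) = %zu\n", sizeof(enum small));
    printf("sizeof(enum big)   = %zu\n", sizeof(enum big));
    return 0;
}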



On the very rare
occasion where you want to use 4 bits but can't use 8, you can use a
bitfield in C - or write appropriate access functions with shifts and
masks.

--Then your code will treat 4 vs 8 nonuniformly, and will be larger and
buggier, for no reason.

Code and the compiler /cannot/ treat packed 4-bit numbers in the same way as 8-bit values (or multiples of CHAR_BIT) - no hardware supports it. But if smaller data - such as _Bool - is stored as chars, then it /can/ be, and /is/, treated in a uniform way.
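(For reference, the bitfield route mentioned above looks something like this - the field layout is implementation-defined, so it suits compact storage, not portable data formats:)

#include <stdint.h>

struct two_nibbles {
    unsigned lo : 4;   /* 4 bits each; C forbids taking the address */
    unsigned hi : 4;   /* of a bitfield member                      */
};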



Multiplying you by 100,000, that proves that if you put it in GCC,
you'd save 200,000 programmer hours, which is equivalent to saving
over 2 lives.

That claim is absurd in every way imaginable.

--I was unaware that "multiplication" was absurd.

No, what is absurd is that you think that having a "mul" builtin would
save me time, and that you can pick a number out of the air to guess how
many people this applies to, and that you can get a total "programmer
hours saved" result.  And what is even more absurd is that you claim
this "saves over 2 lives"!

--ok, maybe I am an old fogy, but in my day, essentially every programmer
as a student had to implement some multiprecision arithmetic as homework.
Are there 100,000 such students in the world? Easily.
And multiprecision arithmetic was not an "incredibly rare" idea. It is
so common and important that textbooks (Knuth) devote a large fraction
of their pages to explaining how to do it. It is so common that many
languages, e.g. Scheme, provide it built in.  That is how enormous the
demand is for the notion of "integers."
That is why hardware provides NxN --> 2N-bit wide multiply instructions.

I wonder why I bother writing anything - it does not seem that you are reading what I write. But let me try to explain /again/.

Programmers need to use numbers in their code. Often these are integers. Usually these are quite small numbers - maybe up to a few thousand. Occasionally, they will be big numbers - perhaps millions or even a thousand million. And very, very occasionally they will be more, say up to a million, million, million. And then there are specialised programmers dealing with things like number theory, encryption systems, etc., that need integers with no limits on their sizes.

Those that need unlimited numbers need to have their multiprecision arithmetic routines. They already /have/ them. They will use libraries (like gmp) that use some C, and some target-specific assembly, in order to use all the features the hardware can provide to get the fastest speed. They are not interested in compiler library calls that do double-size multiplication - that would not be big enough. (They will, I expect, use existing gcc extensions such as the __builtin_XXX_overflow builtins.)
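Those overflow builtins do exist in current gcc; a minimal sketch of a checked 64-bit multiply:

#include <stdint.h>
#include <stdbool.h>

/* Returns true if a*b overflowed 64 bits; *result holds the wrapped
   product either way. */
bool mul_check(uint64_t a, uint64_t b, uint64_t *result)
{
    return __builtin_mul_overflow(a, b, result);
}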

For the great majority of uses, 32-bit integers are fine. For the great majority of those that need more than 32 bits, but don't need unlimited sizes, the 64-bit "long long" from C99 is sufficient. And if there are any left, then the __int128 support in gcc, while not technically a full extended integer type, is going to work fine.

So no one needs a "mul" library call because gcc already supports all the integer multiplication they will need - unless they need unlimited sized integers, in which case "mul" would not help.


Now it is simply not possible to implement multiprecision arithmetic
efficiently in strict C.  That is why your gnu pals who wrote the gmp
library had to write machine-language patches for different hardware,
thus essentially writing their own ad hoc crude mini-compiler to
overcome the limitations of gcc.  If people have to write their own
compilers to do basic every-student stuff, then that, I put to you, is
evidence that gcc, an actual compiler, should provide this.

OK? I am not pulling numbers out of thin air.  I am
making very conservative estimates of numbers, which you yourself
easily could have made (I just explained how).  My lives counts are
very conservative, not ridiculous.

No, the numbers /are/ pulled out of thin air. And even if they were completely accurate, the "saves lives" claim is absurd.


Again, you are completely wrong.  Double-length multiplies are very
rarely necessary,

--by "very rarely" you mean that they happen every day in a large fraction
of all scheme programs.

I can't answer for scheme - I don't use it. But any C99 compiler will let me multiply integers up to a result of about 18 million million million, which is fine for most people's daily use. And gcc will (on 64-bit platforms, which account for most modern PCs) happily deal with multiplies up to 340 million million million million million million.

If I want to use bigger numbers than that, but don't want to use gmp (or similar languages), then I will write the code I need in a half hour or so. Or perhaps I will use C++ - I am sure that there will be classes for types like "uint_t<256>" available easily.
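For instance, here is roughly what that half hour buys - a portable 64x64 -> 128-bit multiply built from 32-bit halves (a sketch of the standard schoolbook method, not gmp's code):

#include <stdint.h>

void mul64x64(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;
    uint64_t p1 = a_lo * b_hi;
    uint64_t p2 = a_hi * b_lo;
    uint64_t p3 = a_hi * b_hi;

    /* Sum the middle partial products, keeping track of the carry. */
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;

    *lo = (mid << 32) | (uint32_t)p0;
    *hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}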


--you guys keep saying stuff based on zero evidence.
And your evidence for the contrary is... nothing at all.

--sigh. I have just tried to explain such evidence quantitatively.  It
is hard for me to understand how you can be so ignorant that you could
not have come up with those numbers yourself and had to make me do it.

The gcc folks, on the other hand, have lots of information about what
users want.  They talk to a wide range of users, they have mailing
lists, forums, conventions, meetings, wish-lists, etc.  They know what
people want and need from the compiler.  And if there is a feature that
is easy to implement, and would be useful to lots of people, then they
implement it.  It is that simple.

--no, it is more like this.  People tell you what they want.  You tell
them they are idiots, ignore them, and discourage them from telling you
what they want.  Then you say "nobody is saying they want this, I can't
hear you!"  This has happened to me in absolutely every case I can think
of where I told a compiler writer what I wanted.  I have never seen an
example of somebody telling a compiler writer what they want, and the
writer agreeing and doing it.  In my entire life.  Now obviously such
examples exist.  They are just so rare they lie outside the experience
of almost every programmer.
And it pisses us off when we are treated this way 100% of the time.

You seem to be misunderstanding something here. I am a gcc /user/, not a gcc /developer/.

And as a /user/, I have asked compiler developers for features. I have discussed features with developers, and seen these features implemented. On mailing lists like this one, I have seen it happen many times. And I have seen people (including me) ask for features, and learn from the ensuing discussion why it is impossible or at least highly unlikely for those features to be implemented - and I have learned from that.

But I have also seen unpleasant and thoughtless people like you make demands and insults, wild claims about what "everyone" "obviously" wants and what is "clearly easy to implement". People such as that always go away disappointed - regardless of whether the idea was good or not.



Gcc added partial support for uint128's, as posters earlier on this
thread claimed, in order to prove I was an idiot.

No, gcc added partial support for 128-bit integers long ago, because it
was useful to some people.  It is very rare that it is useful - the
people who write code that has use of 128-bit integers are doing rather
specialised work.  The partial support along with the other builtin
functions in gcc do what they need.

--sigh.  If you are going to add u128 partial support, and you yourself
already said it is not very hard to program u128 using only u64's, and
you already said you yourself did write such code, then I have to wonder
why it is that you intentionally did an incomplete job with u128s in
gcc.

/I/ did not do anything with 128-bit integers in gcc.

But if you understood the C standards (have you ever even looked at them?) you would understand that there is quite a lot involved in making a /complete/ extended integer type in C. gcc implements the most useful and important aspects of 128 bit integers on targets that easily support it - but they have not made full support because it would be a lot of effort (also including work for C standard libraries, not just the compiler) for little useful return.

All you had to do was put that "not hard" code into gcc to handle the
architectures you did not want to actually code the machine code for,
and voila -- complete support on every architecture (slower on some than
it needs to be, but always works).  And then portable code could be
written, for a refreshing change.

I don't know off-hand if current versions of gcc support 128 bit integers on 32-bit platforms. Such implementation is somewhat target-dependent, so perhaps it supports some and not others. But I do know that 128-bit integers are not often needed, and that people writing the kinds of code that make use of such integers would be using 64-bit compilers. So I suspect it is simply not a big issue for anyone - compiler user or compiler developer.


If you read what people wrote, and perhaps applied a little thought and
common sense rather than jumping to arguments, ignorance, arrogance and
insults, then perhaps /you/ would learn something.

--sigh. I have already documented instances of the opposite.  In fact,
see the preceding paragraph.

--then it would be 3-10 times slower.

#include <stdint.h>   /* for uint64_t; unsigned __int128 is a gcc extension */

unsigned __int128 mult128(uint64_t a, uint64_t b) {
    return (unsigned __int128) a * b;
}

--this is illegal code as far as my machine and gcc compiler are concerned.

Then switch to a current version of gcc before complaining about missing features of gcc.


On 64-bit gcc, that compiles to:

mult128(unsigned long, unsigned long):
         movq    %rdi, %rax
         mulq    %rsi
         ret

Tell me again how that is so slow?

--that's fine.

I believe you haven't the faintest idea what effort is involved in
adding a feature to the compiler.  Even assuming that the basic
implementation of a "multiply with carry" function was simple, there is
testing, documentation, automatic test suites, validation on
combinations of some 30+ targets, 10+ hosts, and countless selections of
compiler options, and so on.  And that is after the discussions about
how it will be used, how it interacts with other functions, how it may
conflict with future standards, and coordinating with other projects
such as clang.  gcc is a /big/ project, and it is a high quality project
- it is not a one-man hobby program where you can just throw in a new
feature with a few lines of code.

--seems to me, there must be a big list of builtin functions, each
coming with machine code implementations.  You just add another item to
the list.

so your stance is, those previous library/standards developers were
idiots?

Eh, no.  My stance is that div() has existed since the earliest versions
of the C standards, so it must be supported in current compilers too.
It is usually unnecessary to use div() in code that you write /today/,
because modern compilers do a fine job of optimising arithmetic
functions.  And if one were to design a new C-like programming language
/today/, it is unlikely that a "div" function would be part of the
standard library.  But it was a different matter forty years ago or so
when C was first conceived.  So the "div" function is mainly historical.
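For the record, div() computes quotient and remainder in one call - and a modern compiler produces the same single division for the plain operators:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    div_t r = div(17, 5);
    printf("%d %d\n", r.quot, r.rem);    /* prints "3 2" */
    printf("%d %d\n", 17 / 5, 17 % 5);   /* same result, same code */
    return 0;
}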

--aha. So you admit historical errors were made?

Of course historical errors were made! And - more relevantly here - design decisions were made in the young days of C that do not match the uses of C now, forty years later. This should not come as a surprise - the surprise is in how successful C has been throughout that time, and how it remains successful.

(Like... arbitrary wordsize choices that got there purely by
historical accident...
and by accident got utterly stupid names like "unsigned long long"
which now have become somewhat better names like "uint64_t")

Well, I can't answer for what anyone may have said 15 years ago or
whatever.  It certainly seems a long time to hold a grudge from a
difference of opinions.

--it isn't so much a "grudge" as "acquired understanding of the plain
fact that every compiler writer I ever encountered is a total bullshit
artist, and therefore I am no longer going to be intimidated by their
assertions of their great expertise and my idiocy as an answer to
everything."

--I did not see this in the docs, sorry.  I did see a doc claiming
specifically that it was not available in gcc -- but there was some
experimental way to obtain it, which the gcc developers had
intentionally chosen not to put in the official code, for reasons they
refused to say.  And no directions whatever were given about how this
unsupported experimental thing was to be used.

The warning flag "-Warray-bounds" is enabled by the common warnings
"-Wall".

--I still am confused.  I have been using -Wall in my gcc compilations for
decades.  And yet, gcc has never compiled array bounds checking into
my C programs in all those decades.

-Warray-bounds is a relatively new feature (I can't remember the exact gcc version it appeared in). gcc makes steady progress towards better static error checking.

But as I said, this is for /static/ error checking - not run-time checking. That means it does not add extra check code into your run-time, but points out problems that can be identified at compile time.


And for any other use where array bounds checking would be costly at
run-time, you have to implement it yourself in C.  C arrays maximise
efficiency and simplicity at the cost of safety.

--And that is why I suggested this as a compiler OPTION:
gcc -ArrayBoundsChecksPlease progname.c


There are two options:

        -fsanitize=bounds

This is a "sanitizer", using an extra run-time library, working on a
range of systems, but with extra code size and speed costs.

        -fcheck-pointer-bounds

This is efficient, flexible and sophisticated, with a good deal of
user-level control - but it requires hardware support that is currently
only available on a few of Intel's cpus.
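A minimal demonstration of the first option (the file name is just an example):

/* demo.c - compile with:  gcc -fsanitize=bounds demo.c */
#include <stdio.h>

int main(void)
{
    int a[4];
    int i = 4;       /* one past the end of the array */
    a[i] = 42;       /* the sanitizer reports the bad index at run time */
    printf("%d\n", a[0]);
    return 0;
}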

--sounds exciting.  None of that works on my machine.

Again, you need a newer gcc to use the newer features.

Googling "gcc -fsanitize" finds nothing.
I did find
https://gcc.gnu.org/onlinedocs/gcc-5.1.0/gcc/Pointer-Bounds-Checker-builtins.html#Pointer-Bounds-Checker-builtins
which seems ultra-ugly syntax, but for those who can
get past that, plausibly VERY useful.
Then in
https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc.pdf
I finally found discussion of fsanitize=address and fsanitize=otherstuff, and
that seems EXCELLENT.  I think this is going to change my life.

Well, that's marvellous.

And perhaps next time, you will read the documentation for the compiler before ranting about all the features the compiler does not have.



Also profiling.
gcc has supported profiling for decades, I believe.

--not really, far as I can tell.  If a compiler were to insert counters
in each "basic block" of the code, and after the run the counters were
output, that would be genuine profiling, accomplished by the compiler,
not by some external tool.

gcc does this, as far as I know (I haven't read the details for a long
time), but at a function level.  There is no clear definition of a
"basic block" to use (since C99), and any attempt to add profiling code
to something like every loop or conditional would cripple the optimiser
and totally dominate the run time - you'd get profile data that bore
virtually no relation to the real timings of the normal optimised code.

--well, it quite plausibly would heavily damage timing data, but it
would still get COUNT data for basic blocks 100% correct.  Such counting
is a different kind of profiling than time measuring, but both are
useful.  Further, the damage to timing data would be partially
correctable by some simple data post-processing.

Back in the old days, when compilers were little more than C-to-assembly
translators, this might have made sense.  Basic block profiling was
possible with the old Turbo Pascal tools, for example - helped by Pascal
being a simpler language with a clearer definition of a basic block.

--Jon Bentley, in one of his Programming Pearls books, advocated that
every programmer write their own profiler.  This can be accomplished in
C, more or less, by basically #defining a large set of ugly macros that
redefine the C language as your personal language.  E.g.  if(a){ b }
is replaced by some macro like  MyIf(a,b).
Then you write the program in your personal language, which is just like
C, only uglier.  Then, you set a flag which activates or turns off
profiling-code generation by the macros.
Voila.
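A minimal sketch of what such macros might look like (names and details are my own, not Bentley's):

#include <stdio.h>

#define PROFILE 1                /* flip to 0 to turn instrumentation off */

#if PROFILE
static long counts[2];
#define COUNT(n) ((void)counts[n]++)
#else
#define COUNT(n) ((void)0)
#endif

/* Counts every arrival at the if-site, then tests the condition. */
#define MyIf(n, cond) if (COUNT(n), (cond))

int main(void)
{
    for (int i = 0; i < 10; i++)
        MyIf(0, i % 2 == 0) { COUNT(1); }
    printf("site reached %ld times, branch taken %ld times\n",
           counts[0], counts[1]);
    return 0;
}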

OK, now let us think.  Every programmer should do this, says luminary
Jon Bentley.  Even though it is a disgusting pain.

If so, I think it is obvious that compilers should be able to do it for
you, so it isn't such a huge pain.

I can't say I am much impressed with this idea. I certainly don't think this feature is something compilers should do just because Jon Bentley says so.


There is an additional tool gprof that is made by the gcc folks, that
analyses the profile file and can be used for feedback-directed
optimisation (i.e., run the program to see which functions are used a
lot and which are the bottlenecks, and put more emphasis on optimising
them in the next run).

(Counters could count how many times each basic block executed, and/or
time elapsed within each, under control via user compiler flags and/or
pragmas allowing more precise control, e.g. only do it for this part of
the code, not that part.)
This would be very very very easy for a compiler to do.

Again, you don't have the slightest concept of what would be "easy for a
compiler to do".

--well, I just explained how to do it without a compiler, so...

If something can be done within the language as currently supported by the compiler, or it can be done by changing the language and the compiler, then it is almost always /far/ easier to do it in the language rather than in the compiler.


Instrumenting /function/ entry and exit is relatively straightforward,
and gcc has many options and function attributes to support that.

However, since modern optimisation involves a good deal of
re-arrangement of functions (inlining, cloning, combining, partial
inlining, link-time optimisation, run-time selection of functions,
etc.), even function-grain profiling is of limited value.
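For example, gcc's -finstrument-functions option calls a pair of user-supplied hooks on every function entry and exit (the option and the hook names are real; the bodies here are only a sketch):

/* Compile with:  gcc -finstrument-functions demo.c */
#include <stdio.h>

void __cyg_profile_func_enter(void *fn, void *site)
        __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *fn, void *site)
        __attribute__((no_instrument_function));

void __cyg_profile_func_enter(void *fn, void *site)
{
    fprintf(stderr, "enter %p (called from %p)\n", fn, site);
}

void __cyg_profile_func_exit(void *fn, void *site)
{
    fprintf(stderr, "exit  %p\n", fn, site);
}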

If a COMPILER puts stuff like that (or array bounds checks) into the
code, that is way easier and better than if, instead, some external
tool, fighting against the compiler to overcome its limitations, tries
to do it, or partially do it.
If, nevertheless, people are so desperate to get some functionality
that they develop such external tools ANYWAY, then that to me is clear
proof there was a large enough need that it ought to have been offered
by the compiler as an option.  That way that functionality would be
available better and easier, with no need for that external tool.

I'd like elsif.

#define elsif else if


I thought it was clear that this was a joke solution to a non-existent
problem.

--sigh. A well-known flaw in C, making bugs more likely for no good
reason, is the "dangling else" ambiguity.  It is mentioned in lots of
basic programming texts.  There is even a wikipedia page on it:
    https://en.wikipedia.org/wiki/Dangling_else#C
As it says, this has been recognized since 1960.
A well-known cure for it is to have an elsif construct.  Your "fix"
fails to fix this problem.

No, the cure for "dangling else" is the same as the cure for most of the
problems with "if" (and "while", "for", and "do") - always use brackets.

if (a) {
        ...
} else if (b) {
        ...
} else {
        ...
}

See?  No problem.

--Sigh.  We'd diminish such problems by providing elsif.
Or, we could just say "you programmers are idiots who need to teach yourself
good habits."

Yes, that is /exactly/ what we should say. C is a dangerous language - there is no end to the number of mistakes you can make if you don't have good programming habits.

Sure that'd work, except for all the beginner
programmers who tripped on this same banana peel over and over,
totally avoidably.

This is one banana peel that is totally avoidable. And C programmers should be taught to avoid it.

Why not help eliminate such banana peels?  And note, C already provides
#elif in the preprocessor.  Why is that?
It is just historical accident. The guy who wrote the preprocessor was
aware of and cared about this problem; the different guy who wrote the
main C language wasn't, or didn't.
Again, whenever you contradict yourself, we know you were wrong somewhere.

The preprocessor language and the C language are two very different types of language, for very different purposes. There is no contradiction.


In my opinion, the C language designers made a mistake in allowing "if"
statements without the brackets.

--I agree with that opinion.
But even if brackets were demanded, it'd still be less convenient than
if elsif were provided, because of avoidable multiple-nesting needs.

Brackets work fine.


  I'd like more sophisticated compile-time stuff; like right now they
have #if, #elif, #endif.  Ok, why not #for?  That way we could unroll
loops by "doing the loop at compile time" not runtime.  (Need to make a
language subset intentionally weakened to not be Turing complete, i.e.
we want to know for sure the compile always will terminate, but still
the precompiler language could be a good deal more powerful than now.)
I could discuss that.
I'd like a compile-time language giving you a fair amount of power, but
below Turing-power, and acting as though it were sensibly designed in
from the start with similar syntax (within reason) to the actual runtime
language -- not intentionally different syntax for no reason aside from
trying to annoy people, and not an obvious crude add-on.

You'll find C++ next door.  It is a marvellous language, and includes
all of the features you ask for here.  And with a little practice (or
googling for libraries) you can also get your 128 bit integers, safe
strings, and everything else you are keen on.

--Sigh.
Actually, C++ is fundamentally flawed in this precise respect, namely,
its "precompiler language" is Turing complete, making it Turing
undecidable whether any given C++ program will even compile, and, e.g.
there are C++ programs which will compile if and only if the Riemann
hypothesis is false.

So what?  Just write C++ programs that /do/ complete their compilation.
Or if you accidentally write an infinite loop, break off the compilation
and fix the code.

--oh god.  Well, this attitude is just  beyond the pale for me.

If you write crap code, you get crap results. If you write good code, you get good results. So the trick to getting good results, is to write good code. And if it turns out that you accidentally wrote crap, then stop what you are doing and fix it.

That seems like a good general attitude to many things, not just programming.



Do you know anything much about computability, and the halting problem?

--yes.

  It is impossible to make a program that can confirm that any given
program in a Turing complete system will halt.  This means you cannot
write a software tester that will take any program and determine whether
or not it will stop.  And since C++ templates are Turing complete, there
is no way to write a compiler that will take a C++ program and
determine if it is valid or not.

But is this a problem?

No, of course it is not a problem.  It is totally irrelevant to software
development.  (Long compile times, which sometimes plague C++ programs,
are a problem - one that compiler developers work hard to improve.)  It
is also solvable by simply adding limits to template recursion depth
(compilers either have a fixed limit, or command-line switches).

--that'd still make it at least superexponential.

See my paragraph above.

There are /lots/ of systems that are Turing complete, or which can result in unreasonably long times or infinite loops.

A cookbook could contain the instructions:

        1. Put the spaghetti in a pot.
        2. Add some cold water.
        3. Wait until it boils.

Step 3 is an infinite loop - without putting the pot on the cooker, you are stuck. Does that mean the whole idea of publishing cookery books is fundamentally flawed?


If you are a mathematician, you probably use LaTeX. It is Turing complete - it is not hard to write a document that will never finish parsing with LaTeX. Does that mean LaTeX is broken?


But I'm confused about whether this is really offered... my confusion
is because I am reading this doc dated 2011
    https://gcc.gnu.org/wiki/FunctionSpecificOpt
where it says "Stage1: Syntax for optimization option using attributes"
and it seems to give the impression its author thinks this would be a
good idea, but that it hasn't happened inside gcc yet; and
    https://gcc.gnu.org/onlinedocs/gcc/Pragmas.html
does not seem to mention this.  I would suggest editing this latter page
to add "optimization control" as a new section, if this exists.
And re the earlier page, I fully agree something of this ilk would be a
fine idea.

Function-specific optimisation is possible, and has its uses - but it
also has its complications and limitations.  If you declare that one
function should be unoptimised (say, for debugging purposes) with -O0,
and another function should be optimised for top speed with -O3, what
should the compiler do if the -O3 function calls the -O0 function inside
a loop?  What if it is inlining the function?  Or what if you change
options that may affect correctness of code (such as strict aliasing
options)?  What happens if you are doing link-time optimisation, and
have several identical definitions of the functions (this is legal in
C++ and is a common result of templates), but have different options in
different files?

The general advice is therefore that you /can/ do function specific
optimisation, but it is rare that you /should/ do it.  And you /can/ do
it with pragmas (for people who like pragmas), but it is better to use
function attributes.  It is better to either let the compiler figure out
which functions need extra speed (it's getting quite good at it), or use
more generic function attributes like "hot" and "cold".
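For reference, those attributes look like this (the attribute names are real gcc ones; the function names are mine):

/* Hints that a function is a hot spot, or rarely executed: */
void fast_path(void) __attribute__((hot));
void error_path(void) __attribute__((cold));

/* A per-function optimisation override - possible, but rarely wise: */
void debug_me(void) __attribute__((optimize("O0")));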

--sounds sensible.

Incidentally, another annoyance in C which it'd be nice to fix, and not
hard to fix at all, would be this:
    int foo( int a, int b, int c, int d, double e ){
       code
    }
is silly.  It would be better if we could write it as
    int foo( int a,b,c,d; double e ){
    }
getting rid of the extra type names, which just waste space and serve
no useful function.

I can't say that this bothers me at all.  And it just opens new ways to
cause confusion like:

        int foo(char* p, q);

Is "q" a "char" or a "char*" ?

--a char* (or rather, it should be, in my view).

And that would be consistent with an obvious visual parsing of the line - and /inconsistent/ with the way C parses "char* p, q;".

Allowing "char* p, q;" for variable declarations is a horrible idea in the first place, IMHO - and allowing it inside function declarations just makes things worse.

Anyhow, I think this particular ambiguity is already present in C for
variable declarations, no?
I mean, my gripe is that variable declarations can be concise, but in
function argument lists they cannot be, which is annoyingly inconsistent
and also hits you where it hurts.

And I think conciseness is important.  Not so much for saving
keystrokes but rather for reading + comprehending + checking code
faster.

Making reading, comprehending and checking code more /accurate/ is important. Making it faster would be nice, but it is not nearly as useful. The above feature shows clearly that parsing "foo(char* p, q)" has a high chance of a mistake - it does not matter how quickly that mistake is made.

Sometimes accuracy is helped by conciseness - but often conciseness is a hindrance.


One feature of function declaration or calling that could be added to C
to improve it (i.e., help people write clear, correct, safe, robust code
- rather than helping people save a keystroke or two) would be
designated parameters.  I.e., after the declaration:

int foo(int a, int b, int c, int d, double e);

you could call foo like this:

        foo(1, 2, 3, .d = 5, .e = 2.5);
        foo(.d = 5, .a = 1, .c = 3, .e = 2.5, .b = 2);

        // Alternative syntax
        foo(d: 5, a: 1, c: 3, e: 2.5, b: 2);



and get the same effect.

--yes, that sort of feature would be nice, ada has some stuff like that.

Many languages do - it is a good idea.
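Until then, a similar effect is available in C99 with a parameter struct and designated initialisers (a sketch; the struct and the names are mine):

#include <stdio.h>

struct foo_args { int a, b, c, d; double e; };

static int foo(struct foo_args args)
{
    return args.a + args.b + args.c + args.d + (int)args.e;
}

int main(void)
{
    /* Named, order-free arguments via a designated initialiser: */
    int r = foo((struct foo_args){ .d = 5, .a = 1, .c = 3, .e = 2.5, .b = 2 });
    printf("%d\n", r);   /* prints 13 */
    return 0;
}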


--well, I have some understanding of these things.

Every object in C needs to have a unique address.  It is impossible to
have the unique address of an object smaller than a "char", and the
minimum size of a char is 8 bits.  (A "_Bool" typically has 8 bits too -
1 value bit, and 7 padding bits.)  This is fundamental to C.

--if a char has an address, then the address of a bit inside that char
could be formed by appending 3 more bits.

Then it is no longer an address as far as the hardware is concerned.

Also, even if C were to implement packed bools but refused to allow
pointers to them, that still would be a big improvement.


It is conceivable, but I doubt if it would be considered a "big" improvement. Perhaps when C was young, it might have been a nice idea - but not now.

