> Booleans are very useful - they turn up all over the place in
> programming.
>
> Nibbles, on the other hand, are almost totally useless. There are
> very, very few situations where you need to store a number that is
> within the range 0 .. 15, and are so tightly constrained for space
> that you can't use an 8-bit byte (or whatever size of byte your
> target uses).
--Why does any programmer want a large, good-quality-code-producing C compiler like gcc, as opposed to the crudest possible (but working) translator? Performance. If they did not care about performance, and only cared about correctness, they would not want gcc; they would want the simpler, smaller compiler, since it is more likely to be correct. Now why do programmers care about performance? Two reasons: memory space, and time. If programs are never going to come anywhere near memory capacity or time capacity, then nobody needs gcc, and I have to wonder why you work on it at all. OK? So the sole rationale for the existence of large good-quality-code-producing C compilers like gcc is space and/or time performance.

Now, suppose you are a memory-performance programmer. In that case, you are pushing memory capacity limits. Therefore you often want to use the smallest available size for each information-packet in your program, and you want them packed. It happens that power-of-2 sizes (1, 2, 4, 8, 16, ... bits) are more convenient, fairly small in number, and assure optimality to within a factor of 2. So as a first step, I would recommend that any compiler aiming to satisfy space-performance programmers provide power-of-2 sizes, with packing. (A more ambitious step would be all sizes, power-of-2 or not.)

Now suppose you are a time-performance programmer. In that case, you also care about memory, because of the cache. I have done tests on my computer indicating that random memory accesses cost a factor of about 70 more time if the array is too large. That means space performance IS time performance. OK?

Now it is arrogant and obnoxious for somebody to tell me certain info-packet sizes (like 4) are useless while others (like 1 and 32) are useful, just because they feel that way; and the main reason they feel that way is that they designed gcc to make it hard for anybody to use size 4, thus causing fewer people to use size 4, thus causing the mis-assessment that 4 is less useful.
The truth is, all sizes are useful, and power-of-2 sizes provide access to all sizes (to within a factor of 2). But anyhow, since you have just agreed size 1 (bools) is useful, I suggest the ability to pack them is useful. E.g. the sieve of Eratosthenes, an algorithm developed over 2000 years ago and found in a large fraction of all programming textbooks, wants huge arrays of packed bools. It is possible all those authors were idiots, in which case I suggest you write letters telling them so. SUMMARY: every programmer for whom gcc matters (as opposed to the simplest C compiler) wants space performance. That is provided by packing and by power-of-2 data-chunk sizes. "Every" = large enough demand for you?

> A boolean is a flag, and is a common basic type of data - just like a
> character, or an integer. A nibble is a particular constrained size of
> integer that fits badly with most hardware, is unsupported by almost
> any programming language, and has little real-world use.

--Nybbles fit excellently with all hardware I have ever used. C already supports "enum" types, which in a large fraction of practical uses are nybbles. Either these are packed nybbles, or not. If they are, then you have already done the coding for it. If not, then providing enums in this way was inappropriate for a performance language.

> On the very rare
> occasion where you want to use 4 bits but can't use 8, you can use a
> bitfield in C - or write appropriate access functions with shifts and
> masks.

--Then your code will treat 4 and 8 nonuniformly, and will be larger and buggier, for no reason.

>>>> Multiplying you by 100,000 that proves if you put it in GCC,
>>>> you'd save 200,000 programmer hours, which is equivalent to saving
>>>> over 2 lives.
>>>
>>> That claim is absurd in every way imaginable.
>>
>> --I was unaware that "multiplication" was absurd.
> No, what is absurd is that you think that having a "mul" builtin would
> save me time, and that you can pick a number out of the air to guess
> how many people this applies to, and that you can get a total
> "programmer hours saved" result. And what is even more absurd is that
> you claim this "saves over 2 lives"!

--OK, maybe I am an old fogy, but in my day essentially every programmer as a student had to implement some multiprecision arithmetic as homework. Are there 100,000 such students in the world? Easily. And multiprecision arithmetic was not an "incredibly rare" idea. It is so common and important that textbooks (Knuth) devote a large fraction of their pages to explaining how to do it. It is so common that many languages, e.g. Scheme, provide it built in. That is how enormous the demand is for the notion of "integers." That is why hardware provides NxN-->2N-bit wide multiply instructions. Now it is simply not possible to implement multiprecision arithmetic efficiently in strict C. That is why your gnu pals who wrote the gmp library had to write machine-language patches for different hardware, thus essentially writing their own ad hoc crude mini-compiler to overcome the limitations of gcc. If people have to write their own compilers to do basic every-student stuff, then that, I put to you, is evidence that gcc, an actual compiler, should provide this. OK? I am not pulling numbers out of thin air. I am making very conservative estimates, which you yourself easily could have made (I just explained how). My lives-saved counts are very conservative, not ridiculous.

>>> Again, you are completely wrong. Double-length multiplies are very
>>> rarely necessary,

--By "very rarely" you mean that they happen every day in a large fraction of all Scheme programs.

>> --you guys keep saying stuff based on zero evidence.

> And your evidence for the contrary is... nothing at all.

--Sigh. I have just tried to explain such evidence quantitatively.
It is hard for me to understand how you can be so ignorant that you could not have come up with those numbers yourself and had to make me do it.

> The gcc folks, on the other hand, have lots of information about what
> users want. They talk to a wide range of users, they have mailing
> lists, forums, conventions, meetings, wish-lists, etc. They know what
> people want and need from the compiler. And if there is a feature
> that is easy to implement, and would be useful to lots of people, then
> they implement it. It is that simple.

--No, it is more like this. People tell you what they want. You tell them they are idiots, ignore them, and discourage them from telling you what they want. Then you say "nobody is saying they want this, I can't hear you!" This has happened to me in absolutely every case I can think of where I told any compiler writer what I wanted. I have never, in my entire life, seen an example of somebody telling compiler writers what they want and the writers agreeing and doing it. Now obviously such examples exist. They are just so rare they lie outside the experience of almost every programmer. And it pisses us off when we are treated this way 100% of the time.

>> Gcc added partial support for uint128's as posters earlier on this
>> thread claimed in order to prove I was an idiot.

> No, gcc added partial support for 128-bit integers long ago, because
> it was useful to some people. It is very rare that it is useful - the
> people who write code that has use of 128-bit integers are doing
> rather specialised work. The partial support along with the other
> builtin functions in gcc do what they need.

--Sigh. If you are going to add partial u128 support, and you yourself already said it is not very hard to program u128 using only u64's, and you already said you yourself did write such code, then I have to wonder why you intentionally did an incomplete job with u128s in gcc.
All you had to do was put that "not hard" code into gcc to handle the architectures you did not want to actually code the machine code for, and voila -- complete support on every architecture (slower on some than it needs to be, but always works). And then portable code could be written, for a refreshing change.

> If you read what people wrote, and perhaps applied a little thought
> and common sense rather than jumping to arguments, ignorance,
> arrogance and insults, then perhaps /you/ would learn something.

--sigh. I have already documented instances of the opposite. In fact, see the preceding paragraph.

>> --then it would be 3-10 times slower.
>
> unsigned __int128 mult128(uint64_t a, uint64_t b) {
>     return (unsigned __int128) a * b;
> }

--this is illegal code as far as my machine and gcc compiler are concerned.

> On 64-bit gcc, that compiles to:
>
> mult128(unsigned long, unsigned long):
>     movq %rdi, %rax
>     mulq %rsi
>     ret
>
> Tell me again how that is so slow?

--that's fine.

> I believe you haven't the faintest idea what effort is involved in
> adding a feature to the compiler. Even assuming that the basic
> implementation of a "multiply with carry" function was simple, there
> is testing, documentation, automatic test suites, validation on
> combinations of some 30+ targets, 10+ hosts, and countless selections
> of compiler options, and so on. And that is after the discussions
> about how it will be used, how it interacts with other functions, how
> it may conflict with future standards, and coordinating with other
> projects such as clang. gcc is a /big/ project, and it is a high
> quality project - it is not a one-man hobby program where you can just
> throw in a new feature with a few lines of code.

--seems to me, there must be a big list of builtin functions, each coming with machine code implementations. You just add another item to the list.

>> so your stance is, those previous library/standards developers were
>> idiots?
>
> Eh, no.
> My stance is that div() has existed since the earliest versions of the
> C standards, so it must be supported in current compilers too. It is
> usually unnecessary to use div() in code that you write /today/,
> because modern compilers do a fine job of optimising arithmetic
> functions. And if one were to design a new C-like programming
> language /today/, it is unlikely that a "div" function would be part
> of the standard library. But it was a different matter forty years
> ago or so when C was first conceived. So the "div" function is mainly
> historical.

--Aha. So you admit historical errors were made? (Like... arbitrary word-size choices that got there purely by historical accident, and by accident got utterly stupid names like "unsigned long long", which have now become the somewhat better names like "uint64_t".)

> Well, I can't answer for what anyone may have said 15 years ago or
> whatever. It certainly seems a long time to hold a grudge from a
> difference of opinions.

--It isn't so much a "grudge" as an acquired understanding of the plain fact that every compiler writer I have ever encountered is a total bullshit artist, and that therefore I am no longer going to be intimidated by their assertions of their great expertise and my idiocy as an answer to everything.

>> --I did not see this in the docs, sorry. I did see a doc claiming
>> specifically that it was not available in gcc -- but there was some
>> experimental way to obtain it, which the gcc developers had
>> intentionally chosen not to put in the official code, for reasons
>> they refused to say. And no directions whatever were given about how
>> this unsupported experimental thing was to be used.
>
> The warning flag "-Warray-bounds" is enabled by the common warnings
> "-Wall".

--I am still confused. I have been using -Wall in my gcc compilations for decades, and yet gcc has never compiled array bounds checking into my C programs in all those decades. (-Warray-bounds is a compile-time warning; it is not run-time bounds checking.)
>>> And for any other use where array bounds checking would be costly
>>> at run-time, you have to implement it yourself in C. C arrays
>>> maximise efficiency and simplicity at the cost of safety.
>>
>> --And that is why I suggested this as a compiler OPTION:
>> gcc -ArrayBoundsChecksPlease progname.c
>
> There are two options:
>
> -fsanitize=bounds
>
> This is a "sanitizer", using an extra run-time library, working on a
> range of systems, but with extra code size and speed costs.
>
> -fcheck-pointer-bounds
>
> This is efficient, flexible and sophisticated, with a good deal of
> user-level control - but it requires hardware support that is
> currently only available on a few of Intel's cpus.

--Sounds exciting. None of that works on my machine. Googling "gcc -fsanitize" finds nothing. I did find https://gcc.gnu.org/onlinedocs/gcc-5.1.0/gcc/Pointer-Bounds-Checker-builtins.html#Pointer-Bounds-Checker-builtins which has ultra-ugly syntax, but for those who can get past that, it is plausibly VERY useful. Then in https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc.pdf I finally found discussion of -fsanitize=address and -fsanitize=otherstuff, and that seems EXCELLENT. I think this is going to change my life.

>>>> Also profiling.
>>> gcc has supported profiling for decades, I believe.
>>
>> --Not really, as far as I can tell. If a compiler were to insert
>> counters in each "basic block" of the code, and then output those
>> counters after the run, that would be genuine profiling, accomplished
>> by the compiler, not by some external tool.
>
> gcc does this, as far as I know (I haven't read the details for a
> long time), but at a function level.
> There is no clear definition of a "basic block" to use (since C99),
> and any attempt to add profiling code to something like every loop or
> conditional would cripple the optimiser and totally dominate the run
> time - you'd get profile data that bore virtually no relation to the
> real timings of the normal optimised code.

--Well, it quite plausibly would heavily damage timing data, but it would still get COUNT data for basic blocks 100% correct. Such counting is a different kind of profiling from time measurement, but both are useful. Further, the damage to timing data would be partially correctable by some simple data post-processing.

> Back in the old days, when compilers were little more than
> C-to-assembly translators, this might have made sense. Basic block
> profiling was possible with the old Turbo Pascal tools, for example -
> helped by Pascal being a simpler language with a clearer definition of
> a basic block.

--Jon Bentley, in one of his Programming Pearls works, advocated that every programmer write their own profiler. This can be accomplished in C, more or less, by #defining a large set of ugly macros that redefine the C language as your personal language. E.g. if(a){ b } is replaced by some macro like MyIf(a,b). Then you write your program in your personal language, which is just like C, only uglier. Then you set a flag which activates or turns off profiling-code generation by the macros. Voila. OK, now let us think. Every programmer should do this, says luminary Jon Bentley, even though it is a disgusting pain. If so, I think it is obvious that compilers should be able to do it for you, so that it is not such a huge pain.

> There is an additional tool gprof that is made by the gcc folks, that
> analyses the profile file and can be used for feedback-directed
> optimisation (i.e., run the program to see which functions are used a
> lot and which are the bottlenecks, and put more emphasis on optimising
> them in the next run).
>> (Counters could count how many times each basic block executed,
>> and/or time elapsed within each, under control via user compiler
>> flags and/or pragmas allowing more precise control, e.g. only do it
>> for this part of the code, not that part.)
>> This would be very very very easy for a compiler to do.
>
> Again, you don't have the slightest concept of what would be "easy
> for a compiler to do".

--well, I just explained how to do it without a compiler, so...

> Instrumenting /function/ entry and exit is relatively
> straightforward, and gcc has many options and function attributes to
> support that.
>
> However, since modern optimisation involves a good deal of
> re-arrangement of functions (inlining, cloning, combining, partial
> inlining, link-time optimisation, run-time selection of functions,
> etc.), even function-grain profiling is of limited value.

>> If a COMPILER puts stuff like that (or array bounds checks) into
>> the code, that is way easier and better, than if, instead, some
>> external tool, fighting against the compiler to overcome its
>> limitations, tries to do it, or partially do it.
>> If, nevertheless, people are so desperate to get some functionality
>> that they develop such external tools ANYWAY, then that to me is
>> clear proof there was a large enough need, that it ought to have
>> been offered by the compiler as an option. That way that
>> functionality would be available better and easier with no need for
>> that external tool.
>>
>>>> I'd like elsif.
>>>
>>> #define elsif else if
>
> I thought it was clear that this was a joke solution to a
> non-existent problem.

>> --sigh. A well known flaw in C, making bugs more likely for no good
>> reason, is the "else if ambiguity." It is mentioned in lots of
>> basic programming texts. Here there is even a wikipedia page on it:
>> https://en.wikipedia.org/wiki/Dangling_else#C
>> As it says this has been recognized since 1960.
>> A well known cure for that is to have an elsif
>> instruction. Your "fix" fails to fix this problem.
>
> No, the cure for "dangling else" is the same as the cure for most of
> the problems with "if" (and "while", "for", and "do") - always use
> brackets.
>
> if (a) {
>     ...
> } else if (b) {
>     ...
> } else {
>     ...
> }
>
> See? No problem.

--Sigh. We would diminish such problems by providing elsif. Or, we could just say "you programmers are idiots who need to teach yourselves good habits." Sure, that would work, except for all the beginner programmers who trip on this same banana peel over and over, totally avoidably. Why not help eliminate such banana peels? And note, C's preprocessor already provides #elif, which is exactly this cure. Why is that? Historical accident. The guy who wrote the preprocessor was aware of and cared about this problem; the different guy who wrote the main language wasn't, or didn't. Again, whenever you contradict yourself, we know you were wrong somewhere.

> In my opinion, the C language designers made a mistake in allowing
> "if" statements without the brackets.

--I agree with that opinion. But even if brackets were demanded, it would still be less convenient than if elsif were provided, because of avoidable multiple-nesting needs.

>>>> I'd like more sophisticated compile time stuff, like right now
>>>> they have #if, #elif, #endif. OK, why not #for? That way we could
>>>> unroll loops by "doing the loop at compile time" not runtime.
>>>> (Need to make a language subset intentionally weakened to not be
>>>> Turing complete, i.e. we want to know for sure the compile always
>>>> will terminate, but still the precompiler language could be a
>>>> good deal more powerful than now.) I could discuss that.
>>>> I'd like a compile-time language giving you a fair amount of
>>>> power, but below Turing-power, and acting as though it were
>>>> sensibly designed in from the start with similar syntax (within
>>>> reason) to the actual runtime language -- not intentionally
>>>> different syntax for no reason aside from trying to annoy people,
>>>> and not an obvious crude add-on.
>>>
>>> You'll find C++ next door. It is a marvellous language, and
>>> includes all of the features you ask for here. And with a little
>>> practice (or googling for libraries) you can also get your 128 bit
>>> integers, safe strings, and everything else you are keen on.
>>
>> --Sigh. Actually, C++ is fundamentally flawed in this precise
>> respect: its "precompiler language" is Turing complete, making it
>> Turing-undecidable whether any given C++ program will even compile.
>> E.g. there are C++ programs which will compile if and only if the
>> Riemann hypothesis is false.
>
> So what? Just write C++ programs that /do/ complete their
> compilation. Or if you accidentally write an infinite loop, break the
> compiler and fix the code.

--Oh god. Well, this attitude is just beyond the pale for me.

> Do you know anything much about computability, and the halting
> problem?

--Yes.

> It is impossible to make a program that can confirm that any given
> program in a Turing complete system will halt. This means you cannot
> write a software tester that will take any program and determine
> whether or not it will stop. And since C++ templates are Turing
> complete, there is no way to write a compiler that will take a C++
> program and determine if it is valid or not.
>
> But is this a problem?
>
> No, of course it is not a problem. It is totally irrelevant to
> software development. (Long compile times, which sometimes plague C++
> programs, is a problem - one that compiler developers work hard to
> improve.)
> It is also solvable by simply adding limits to template recursion
> depth (compilers either have a fixed limit, or command-line
> switches).

--That would still make compilation at least superexponential in the worst case.

>> --But I'm confused about whether this is really offered... my
>> confusion is because I am reading this doc dated 2011
>> https://gcc.gnu.org/wiki/FunctionSpecificOpt
>> where it says "Stage1: Syntax for optimization option using
>> attributes" and it seems to give the impression its author thinks
>> this would be a good idea, but that it hasn't happened inside gcc
>> yet; and
>> https://gcc.gnu.org/onlinedocs/gcc/Pragmas.html
>> does not seem to mention this. I would suggest editing this latter
>> page to add "optimization control" as a new section, if this exists.
>> And re the earlier page, I fully agree something of this ilk would
>> be a fine idea.
>
> Function-specific optimisation is possible, and has its uses - but it
> also has its complications and limitations. If you declare that one
> function should be unoptimised (say, for debugging purposes) with
> -O0, and another function should be optimised for top speed with -O3,
> what should the compiler do if the -O3 function calls the -O0
> function inside a loop? What if it is inlining the function? Or what
> if you change options that may affect correctness of code (such as
> strict aliasing options)? What happens if you are doing link-time
> optimisation, and have several identical definitions of the functions
> (this is legal in C++ and is a common result of templates), but have
> different options in different files?
>
> The general advice is therefore that you /can/ do function specific
> optimisation, but it is rare that you /should/ do it. And you /can/
> do it with pragmas (for people who like pragmas), but it is better to
> use function attributes.
> It is better to either let the compiler figure out which functions
> need extra speed (it's getting quite good at it), or use more generic
> function attributes like "hot" and "cold".

--Sounds sensible.

>> Incidentally, another annoyance in C which it would be nice to fix,
>> and not hard to fix at all, would be this:
>> int foo( int a, int b, int c, int d, double e ){
>>     code
>> }
>> is silly. It would be better if we could write it as
>> int foo( int a,b,c,d; double e ){
>> }
>> getting rid of the extra type names, which just waste space and
>> serve no useful function.
>
> I can't say that this bothers me at all. And it just opens new ways
> to cause confusion like:
>
> int foo(char* p, q);
>
> Is "q" a "char" or a "char*" ?

--A char* (or rather, it should be, in my view). Anyhow, I think this particular ambiguity is already present in C for variable declarations, no? My gripe is that variable declarations can be concise, but in function argument lists they cannot be, which is annoyingly inconsistent and also hits you where it hurts. And I think conciseness is important: not so much for saving keystrokes as for reading + comprehending + checking code faster.

> One feature of function declaration or calling that could be added to
> C to improve it (i.e., help people write clear, correct, safe, robust
> code - rather than helping people save a keystroke or two) would be
> designated parameters. I.e., after the declaration:
>
> int foo(int a, int b, int c, int d, double e);
>
> you could call foo like this:
>
> foo(1, 2, 3, .d = 5, .e = 2.5);
> foo(.d = 5, .a = 1, .c = 3, .e = 2.5, .b = 2);
>
> // Alternative syntax
> foo(d: 5, a: 1, c: 3, e: 2.5, b: 2);
>
> and get the same effect.

--Yes, that sort of feature would be nice; Ada has some stuff like that.

>> --well, I have some understanding of these things.
>
> Every object in C needs to have a unique address.
> It is impossible to have the unique address of an object smaller than
> a "char", and the minimum size of a char is 8 bits. (A "_Bool"
> typically has 8 bits too - 1 value bit, and 7 padding bits.) This is
> fundamental to C.

--If a char has an address, then the address of a bit inside that char could be formed by appending 3 more bits. Also, even if C were to implement packed bools but refuse to allow pointers to them, that would still be a big improvement.

-- 
Warren D. Smith
http://RangeVoting.org  <-- add your endorsement (by clicking "endorse" as 1st step)