Sorry to be contrarian... but you did invite this with this subject line :-)
On Sun, Sep 7, 2014 at 4:21 AM, Jonathan S. Shapiro <[email protected]> wrote:

> On Fri, Sep 5, 2014 at 6:31 PM, Ben Kloosterman <[email protected]> wrote:
>
>> IMHO a better test would be to write it ASAP, get the language out, then
>> see how hard it is to re-factor when you self-host. I bet that will
>> change the whole way it's structured and force language changes to make
>> it easier.
>
> How would you go about that? Remember that we've already done a
> C++-hosted version of this compiler using YACC.

That is a very out-of-date version compared to some of the things that have
been discussed. I was thinking: get it self-hosting first using yacc, and
get yacc to create BitC code instead of C. If that proves problematic, then
keep improving.

>> "*ad hoc* parse rules and/or parse ambiguities" are common in nearly all
>> languages - with good reason: they have taken short cuts to get
>> something out the door.
>
> For many languages that is so. *None* of those languages are languages
> that anyone has attempted to do regular formal reasoning about
> successfully.

I can turn that around too :-) Are there any successful languages that have
done formal reasoning?

>> Javascript and Linux are great examples - our world is filled with
>> second-rate products because the better ones never get finished.
>
> Agreed. And if their approach had led to sufficiently survivable systems,
> we wouldn't be attempting to do BitC at all. But it didn't, did it?

Javascript continues to grow and improve. C# and Java are very survivable;
they just haven't done the lower-level stuff - C# could have replaced a lot
of C if they had gone just a bit further in that direction. There are other
languages like Rust which may close the window (though I think Rust is too
hard to use for the average dev), and new ones are coming. C# (like most)
has had massive overhauls, and safe set-based operations via LINQ have
probably doubled the productivity of devs in the last few years.
>> The whole purpose of modern languages / techniques is to be able to
>> re-factor and improve it...
>
> That's just not true. But it *is* true that incremental re-factoring and
> improvement in place of disciplined design has been the fad in computing
> for the last ten years. That has happened for reasons of economic and
> competitive pressure rather than technical merit.

No, it has happened for many reasons. As you explore more, alternatives and
design changes are forced on you, and business (and the world) can change
very quickly these days. A flexible, loose design, rather than a specific
but fixed one, will be better able to handle inevitable change as we learn
more (person, team, business, community) or as the preconditions change.
Most projects have fixed budgets; when there is too much over-engineering
(which is a bigger problem on many systems), or unexpected things come up,
you need to make short cuts. Most projects I see with great / fancy designs
NEVER get finished.

I saw a good one go belly-up recently: CSIRO, Australia's government
research organization, blew a good 10M building an underground automated
mining truck to put explosives in the ground. The main reason it failed was
that they were not happy with off-the-shelf components, so they built their
own robot arm, operating system, etc. By the time they got to the core
work, most of the budget was gone, and they couldn't overcome some serious
real-world issues with what was left.

This means to me that the more ambitious the goal, the more I want to see
the core system working with the key high-risk features, which is then
improved. Once you have a working system, the priority list becomes far
more accurate. For BitC this means things like mutable types, copy, regions
/ memory management, the new type system & interfaces, multiple code units,
and type classes.
After the initial version you were confident you had a good handle on all
of this, which I agree with, but I would like to see these in place (and
yes, I'm willing to help, or more). You may say a new parser is just a
nice-to-have that stops you working with some dirty things. To me: use yacc
or whatever to get it working; if that is too hard, then change.

> It has left us with systems that are not just insecure, but *not
> securable in principle*. BitC is a step on the path to changing that.
> Its goals can't be met by making it up as we go along.

I don't think the development approach has anything to do with security
(though I appreciate how hacks with tokens make it difficult to verify this
part). Security mainly depends on having a security model and sticking to
it. We are left with poor security models because when most of these things
were designed it was less of an issue, and we have learned since. The new
Windows 8 API has a great security model, which is better than Android's,
which is better than C#'s, which is better than Javascript's, which is
better than Java's, which is better than Modula's, which is better than
C's. (Some of these are arguable, but it's basically a timeline, and we
have learned better techniques.)

>> (which is the main reason the parser and tokeniser are split - while
>> it's worse / less efficient than a combined custom implementation, it
>> does allow better / simpler code).
>
> Can you cite a source for that (fairly implausible) assertion?

I don't see how this is implausible; it's just common sense - strange that
you should question it. Layering / staging is always less efficient but
allows simplifications. A single layer / stage allows tricks that serve
both concerns, tricks which require hacks once you split into layers /
stages.

> Historically, both parts were done by hand, and they *were* fused. Parser
> generators came about because maintaining parsers by hand is too hard to
> allow reasonable maintenance or refactoring.
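To make the layering point concrete, here is a toy sketch (names and grammar are mine, nothing to do with the actual BitC sources): a standalone tokenizer stage whose token stream feeds a separate parser stage. Each side stays simple because neither reaches into the other, even though a fused scanner could cheat and be faster.

```python
import re

# Toy token spec for a sum language; purely illustrative.
TOKEN_SPEC = [("NUM", r"\d+"), ("PLUS", r"\+"), ("SKIP", r"\s+")]
TOKEN_RE = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def tokenize(src):
    """Stage 1: characters in, flat token stream out."""
    for m in TOKEN_RE.finditer(src):
        if m.lastgroup != "SKIP":
            yield (m.lastgroup, m.group())

def parse(tokens):
    """Stage 2: consumes tokens only, never raw characters.
    Grammar: sum -> NUM (PLUS NUM)*  (evaluated on the fly)."""
    toks = list(tokens)
    total = int(toks[0][1])
    i = 1
    while i < len(toks):
        assert toks[i][0] == "PLUS"
        total += int(toks[i + 1][1])
        i += 2
    return total

print(parse(tokenize("1 + 2 + 39")))  # prints 42
```

The phase boundary is just the token tuple: swap in a faster tokenizer (or a table-driven one) and the parser never notices.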
> Lexer generators came along with the same idea, but didn't prove to add
> enough value in practice to justify themselves.

I don't think we are disagreeing; that's basically what I was saying (or
intended to): layers / separate components simplify and improve
maintenance (and can sometimes be heavily optimized for their domain). You
can write a faster single stage with fewer dirty hacks, but it's harder to
maintain / more complex / more costly.

> The usual evolution of these things is that languages use a parser
> generator until they petrify, and then shift to a recursive descent
> hand-written parser for the sake of better error handling. This is
> actually why I'm doing an LL grammar rather than an LR grammar; the LL
> grammar can emit a recursive descent parser directly.

I agree with this, but the standard languages may have gotten the core
components out first, and then, with more people, had the resources to
dedicate someone to improving the parser / token generator. Even if you do
get large independent funding, that will be more likely with the core
concepts in place and partially proven.

> One of the features that I initially resisted in BitC was layout. It
> turns out that layout is useful, but its implementation in most languages
> is a *mess*. Most of the popular layout schemes violate the
> parser/tokenizer phase boundary. The most notorious of these is the
> Haskell reliance on parse-error(t) in the L function (L, incidentally,
> may be the most opaque specification I've ever seen).

:-)

> Michael Adams actually makes the point very well with regard to parsing
> of layout:
>
> The lack of a standard formalism for expressing these layout rules and of
> parser generators for such a formalism increases the complexity of
> writing parsers for these languages. Often, practical parsers for these
> languages have significant structural differences from the language
> specification.
> [T]he structural differences between the implementation and the
> specification make it difficult to determine if one accurately reflects
> the other.
>
> That statement could equally well be made of most modern language grammar
> specifications. C++ grammars, for example, are plagued with ambiguities
> and phase boundary violations that *commonly* lead to subtle
> disagreements between implementations, and it is not really possible to
> know from the standard which implementation is correct.
>
> Perhaps you have the impression that the parser generator is what has
> been slowing me down. Actually, that's not true. There are a lot of other
> things going on here, and I'm not getting to spend a whole lot of time on
> BitC. I make progress as I can.

Yes, I did have that impression. I know you want a great design, and you
have that; I just want to see it in reality (and work on the GC once it's
self-hosted), and not be caught in design paralysis.

Ben
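On the LL-grammar point earlier in the thread, here is a minimal sketch of why an LL(1) grammar "emits" a recursive descent parser directly: each production becomes one function, and a single token of lookahead picks the branch. The grammar and names here are a hypothetical toy, not BitC's.

```python
import re

def tokenize(src):
    # Trivial scanner for the toy grammar below.
    return re.findall(r"\d+|[()+\-*/]", src)

class Parser:
    """One function per LL(1) production:
         Expr   -> Term (('+'|'-') Term)*
         Term   -> Factor (('*'|'/') Factor)*
         Factor -> NUM | '(' Expr ')'
    """
    def __init__(self, toks):
        self.toks, self.pos = toks, 0

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def eat(self, tok):
        assert self.peek() == tok, f"expected {tok}, got {self.peek()}"
        self.pos += 1

    def expr(self):                       # Expr -> Term (('+'|'-') Term)*
        v = self.term()
        while self.peek() in ("+", "-"):
            op = self.toks[self.pos]; self.pos += 1
            t = self.term()
            v = v + t if op == "+" else v - t
        return v

    def term(self):                       # Term -> Factor (('*'|'/') Factor)*
        v = self.factor()
        while self.peek() in ("*", "/"):
            op = self.toks[self.pos]; self.pos += 1
            f = self.factor()
            v = v * f if op == "*" else v // f   # integer division for the toy
        return v

    def factor(self):                     # Factor -> NUM | '(' Expr ')'
        if self.peek() == "(":
            self.eat("(")
            v = self.expr()
            self.eat(")")
            return v
        v = int(self.peek()); self.pos += 1
        return v

print(Parser(tokenize("2 * (3 + 4)")).expr())  # prints 14
```

Because the code mirrors the productions one-to-one, error messages can name the production being parsed - which is exactly the error-handling advantage Jonathan mentions for hand-written recursive descent.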
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
