Re: My baby text to HTML paragraph converter
Hi All,

My two cents on this: Bison can be used with C++ as well :) See:
https://www.gnu.org/software/bison/manual/bison.html#A-Simple-C_002b_002b-Example

BR,
r0ller

-------- Original message --------
From: Aryeh Friedman
Date: 17 December 2023, 00:39:13
Subject: Re: My baby text to HTML paragraph converter
To: Steve Litt

[...]
Re: My baby text to HTML paragraph converter
On Sat, Dec 16, 2023 at 5:47 PM Steve Litt wrote:
>
> Piotr Siupa said on Sat, 16 Dec 2023 02:47:52 +0100
>
> >Hi Steve,
> >
> >Sorry if I'm reading too far into it but from the fact you're using a
> >shell script for building this, I'm assuming that you're pretty new to
> >work with bigger programming projects.
> >Because of that, I'm going to give you some homework. Specifically, I
> >think you should research two things:
> >- Build automation tools - They (similarly to your script) build the
>
> I use make on big projects that take more than 30 seconds to compile
> via a "compile and link everything" script, but at compile times below
> that, I just use "compile and link everything". I make lots of
> mistakes, and find "compile and link everything" makes me less
> mistake prone than make.

If you have problems with Make often repeating steps unnecessarily, it
is because it does not view the entire build process as one big DAG
(directed acyclic graph). For more information, see the following older
but still very relevant paper: https://aegis.sourceforge.net/auug97.pdf
("Recursive Make Considered Harmful", Peter Miller, 1997). If you find
the paper helpful, you might want to look into the build system Peter
made in response to his own paper: cook
(https://petermiller.work/pmiller/software/cook/). Disclaimer: I wrote
the official tutorial for it.

> I've never authored anything requiring autoconf, and hope I never have
> to. One of my fundamental beliefs is that I should use as few
> dependencies (especially Other People's Code) as possible, because every
> layer of abstraction complexifies the code and makes troubleshooting
> more difficult. My low use of dependencies lessens the need for
> autoconf. As far as portability, I'd rather #ifdef that into my code
> than use autoconf.
>
> Another thing about me: I try very hard to write my C code such that
> gcc -Wall is silent. Even "harmless" warnings are harmful because they
> disguise the genuinely harmful ones. I also make sure my HTML5 is well
> formed XML and passes an XML parser, and validates via the W3C
> validator. The Troubleshooters.Com web pages I've written in the past 3
> years look identical on all reasonably standards compliant browsers
> that allow Javascript.

Have you considered a language that is more conducive to good software
engineering, like Java? I switched from C to Java about 10 years ago,
after using C for 20 years and one too many bugs caused by stray
pointers. Also, C is very hard to unit test except in the most trivial
cases. If you do switch, the equivalent of Bison in Java is ANTLR. As
an added bonus, there is no need for #ifdefs and other weirdness to make
it portable, since all implementations of the JVM follow the same
language/standard library standard (unlike C and the various OS combos).
As far as testing goes, see below.

> >- Version control - It tracks all changes you do to your project,
>
> I don't know how you found out I don't use version control, but you're
> right, I'm lousy at git and that has to change. I'm OK with git until I
> have to deal with branches, and then I go to pieces. This, and the fact
> that the only human language I speak is English are two of my worst
> flaws.
>
> I'll re-remind myself to try learning more about git. Thanks for your
> reminder.

One of the weakest aspects of traditional mainstream version control
(git, cvs, svn, etc.) is that it does not force you to prove that your
code works before it enters the baseline (i.e., that it passes all of
its own tests). Peter Miller's aegis (https://aegis.sourceforge.net/)
does this, and thus the combo of cook and aegis is the only thing I
trust in my larger projects, like a soft life-critical web portal for
medical IoT (remote cardiac monitoring). This is because any other
combo either breaks or makes it easy to break one of Peter Miller's laws
of software construction
(https://en.wikipedia.org/wiki/Peter_Miller_(software_engineer)), which
are:

1. The number of interactions within a development team is O(n!)
   without controlled access to the baseline. If the development team
   does have controlled access to the baseline, interactions can be
   reduced to near O(n), where n is the number of developers and/or
   files in the source tree, whichever is larger.
2. The baseline MUST always be in working order.
3. The software build/construction process can be reduced to a
   directed acyclic graph (DAG).
4. It is necessary to build a rigid framework of selected components
   (aka the top level aegis design).
5. The framework should not do any real work, and should instead
   delegate everything to external components. The external components
   should be as interchangeable as possible.
6. The framework should use the Strategy pattern for most complex
   tasks.

> SteveT
>
> Steve Litt
>
> Autumn 2023 featured book: Rapid Learning for the 21st Century
> http://www.troubleshooters.com/rl21

--
Aryeh M. Friedman, Lead Developer,
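The "one big DAG" view of a build can be illustrated with the files from the build script in this thread. What follows is only a minimal sketch in Python (3.9 or later, for graphlib), not how cook or make is implemented. The dependency table is an assumption about how these files relate (for instance, that paragraphs.lex.c depends on paragraphs.tab.h because paragraphs.l includes it), and build_order is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# Each target maps to the set of files it is built from (its prerequisites).
# File names come from the build script in this thread; the edges are assumed.
deps = {
    "paragraphs.tab.c": {"paragraphs.y"},
    "paragraphs.tab.h": {"paragraphs.y"},
    "paragraphs.lex.c": {"paragraphs.l", "paragraphs.tab.h"},
    "paragraphs.exb": {"paragraphs.tab.c", "paragraphs.lex.c"},
}


def build_order(graph):
    """Return every node once, with prerequisites before the targets that need them.

    Raises graphlib.CycleError if the graph is not acyclic.
    """
    return list(TopologicalSorter(graph).static_order())


order = build_order(deps)
print(order)
```

Because the whole graph is known up front, each file appears exactly once in the order and nothing has to be rebuilt twice. That is the property lost when a build is split into separate, mutually unaware pieces.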
Re: My baby text to HTML paragraph converter
Piotr Siupa said on Sat, 16 Dec 2023 02:47:52 +0100

>Hi Steve,
>
>Sorry if I'm reading too far into it but from the fact you're using a
>shell script for building this, I'm assuming that you're pretty new to
>work with bigger programming projects.
>Because of that, I'm going to give you some homework. Specifically, I
>think you should research two things:
>- Build automation tools - They (similarly to your script) build the

I use make on big projects that take more than 30 seconds to compile
via a "compile and link everything" script, but at compile times below
that, I just use "compile and link everything". I make lots of
mistakes, and find "compile and link everything" makes me less
mistake prone than make.

I've never authored anything requiring autoconf, and hope I never have
to. One of my fundamental beliefs is that I should use as few
dependencies (especially Other People's Code) as possible, because every
layer of abstraction complexifies the code and makes troubleshooting
more difficult. My low use of dependencies lessens the need for
autoconf. As far as portability, I'd rather #ifdef that into my code
than use autoconf.

Another thing about me: I try very hard to write my C code such that
gcc -Wall is silent. Even "harmless" warnings are harmful because they
disguise the genuinely harmful ones. I also make sure my HTML5 is well
formed XML and passes an XML parser, and validates via the W3C
validator. The Troubleshooters.Com web pages I've written in the past 3
years look identical on all reasonably standards compliant browsers
that allow Javascript.

>- Version control - It tracks all changes you do to your project,

I don't know how you found out I don't use version control, but you're
right, I'm lousy at git and that has to change. I'm OK with git until I
have to deal with branches, and then I go to pieces. This, and the fact
that the only human language I speak is English are two of my worst
flaws.

I'll re-remind myself to try learning more about git. Thanks for your
reminder.

SteveT

Steve Litt

Autumn 2023 featured book: Rapid Learning for the 21st Century
http://www.troubleshooters.com/rl21
Re: My baby text to HTML paragraph converter
Hi Steve,

Sorry if I'm reading too far into it, but from the fact you're using a
shell script for building this, I'm assuming that you're pretty new to
working on bigger programming projects.
Because of that, I'm going to give you some homework. Specifically, I
think you should research two things:

- Build automation tools - They (similarly to your script) build the
  program for you. However, they are a little smarter about it and keep
  track of which files are already up to date with their sources, which
  decreases the time needed to finish the build. (You'll quickly learn
  to appreciate that as your project gets bigger.) You can also order
  them to create only a specific file, or you can define some aliases
  for common operations. You could start with "make". It's not the tool
  I use, but it's very simple and popular, so you won't have any trouble
  finding tutorials. It should already be preinstalled on your system.
- Version control - It tracks all changes you make to your project,
  letting you compare or revert files to previous versions. That was the
  one-sentence summary, but there is a whole lot more to it. It's
  generally a must-have for any serious project. I don't think I'm in
  any way controversial by recommending that you choose "Git". This
  topic is much more complicated than "make", but the basics should be
  simple enough to learn in a few hours.

regards,
Piotr

On Thu, Dec 14, 2023 at 8:21 AM Steve Litt wrote:

> Hi all,
>
> After over a week of trying and asking voluminous questions on this
> mailing list (and getting voluminous help, thank you), I finally made a
> text to HTML converter that took blank line separated paragraphs and
> installed <p> and </p> to surround them. All the relevant files are in
> this message body...
>
> [...]
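Piotr's point that build tools keep track of which files are already up to date with their sources comes down, in the simplest case, to comparing file modification times. Here is a minimal sketch of that check in Python. The function name needs_rebuild is hypothetical and this is not make's actual implementation, only the basic rule it is commonly described as applying.

```python
import os


def needs_rebuild(target, sources):
    """Return True if target is missing or older than any of its sources.

    Assumes every file in sources exists; a missing source raises
    FileNotFoundError, which a real build tool would report as an error.
    """
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_mtime for src in sources)
```

With the files from this thread, a check such as needs_rebuild("paragraphs.tab.c", ["paragraphs.y"]) would decide whether bison has to run again. A "compile and link everything" script simply skips the check and always rebuilds.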
My baby text to HTML paragraph converter
Hi all,

After over a week of trying and asking voluminous questions on this
mailing list (and getting voluminous help, thank you), I finally made a
text to HTML converter that took blank line separated paragraphs and
installed <p> and </p> to surround them. All the relevant files are in
this message body...


=== Preprocessor ===
#!/usr/bin/python
# preproc.py
import sys
lines = sys.stdin.readlines()
toptrash = True
for line in lines:
    line = line.strip()
    if toptrash:
        # Skip blank lines at the top of the input
        if line == "":
            continue
        else:
            toptrash = False
    print(line)
# Make sure the input ends with a paragraph separator
print("\n\n")


=== Compilation shellscript ===
#!/bin/ksh

rm paragraphs.tab.c
rm paragraphs.tab.h
rm paragraphs.exb
rm paragraphs.lex.c
rm a.out

bison --html -d paragraphs.y
flex -o paragraphs.lex.c paragraphs.l
gcc -Wall -o paragraphs.exb paragraphs.tab.c paragraphs.lex.c -lfl


# Following line runs the converter
cat dataparagraphs.txt | ./preproc.py | ./paragraphs.exb


=== Input File ===
Steve was here,
and now is gone,
but left his name
to carry on.

Flex and Bison:
Use as one.
When you need to parse,
they get it done.


=== Output ===
<p>Steve was here,
and now is gone,
but left his name
to carry on.</p>

<p>Flex and Bison:
Use as one.
When you need to parse,
they get it done.</p>


=== paragraphs.l ===
%option noinput nounput
%{
#include "paragraphs.tab.h"
%}

%%
(\n){2,}    {return SEP;}
.           {strcpy (yylval.y_char, yytext); return CHARACTER; };

%%


int yywrap(void)
{
    return 1;
}

int yyerror(char *errormsg)
{
    fprintf(stderr, "%s\n", errormsg);
    exit(1);
}


=== paragraphs.y ===
%{

#include <stdio.h>
#include <stdlib.h>
int yylex(void);
int yyerror (char *errmsg);
#define EOF_ 0

int prevtok = 0;
%}

%union {
    char y_char [2];    /* one character plus the terminating NUL */
}

%token SEP
%token <y_char> CHARACTER

%%

wholefile : thing
    | wholefile thing
    ;

thing : character | sep;

character : CHARACTER {
        if(prevtok == SEP){
            printf("\n\n<p>");
        }
        printf("%s", $1);    /* never use input text as the format string */
        prevtok = CHARACTER;
    }

sep: SEP {
        printf("</p>");
        prevtok = SEP;
    }

%%

int main(int argc, char *argv[]){
    printf("<p>");
    yyparse();
    if(prevtok == CHARACTER)
        printf("</p>");
    printf("\n");
}


Thanks,

SteveT

Steve Litt

Autumn 2023 featured book: Rapid Learning for the 21st Century
http://www.troubleshooters.com/rl21
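For cross-checking the converter's output, the same transformation can be sketched in a few lines of Python. This is not a replacement for the flex/bison version, only a reference under stated assumptions: the function name paragraphs_to_html is hypothetical, the surrounding tags are taken to be <p> and </p>, and, like the grammar above, it does no HTML escaping of the text.

```python
import re


def paragraphs_to_html(text):
    """Wrap each blank-line-separated paragraph in <p>...</p>.

    Assumption: paragraphs are separated by runs of two or more newlines,
    mirroring the (\\n){2,} rule in paragraphs.l. No HTML escaping is done.
    """
    # Drop leading and trailing newlines so no empty paragraphs appear.
    chunks = re.split(r"\n{2,}", text.strip("\n"))
    return "\n\n".join(f"<p>{chunk}</p>" for chunk in chunks if chunk)


sample = "Steve was here,\nand now is gone.\n\nFlex and Bison:\nUse as one.\n"
print(paragraphs_to_html(sample))
```

Feeding both programs the same input file and diffing the results gives a quick regression test for changes to the grammar. If the input may contain characters such as & or <, the standard library's html.escape could be applied to each chunk first.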