I also wrote a renumberer and packer in bash (actually in awk too before that, still in the repo in the "attic")

https://github.com/bkw777/BA_stuff


The interface could be improved to take normal command-line arguments, instead of relying on env variables and only piping stdin and stdout.

The renumberer in particular actually gets some odd but legal constructs right that some others fail to handle.

I was trying to use https://github.com/LivingM100SIG/Living_M100SIG/blob/main/M100SIG/Lib-08-TECH-PROGRAMMING/RESEQ.100 to renumber https://ftp.whtech.com/club100/drv/sector.ba, and it was getting one thing wrong. sector.ba includes a command with a comma-separated list of line numbers, with some of the values empty:

ONAGOSUB310,311,312,,,,316,317,318,319

and reseq didn't handle that right. I don't know what other renumberers do, because I decided that, at least for what I was working on, it would be more convenient to renumber on the host than on the 100. But if you're writing on the 100, it would obviously be handy to have the ability on the 100. I think one of the option ROMs like Cleuseau has a renumberer in ROM? I would probably try to use that if I were writing on the 100, simply to avoid consuming RAM.
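
For what it's worth, the empty-entry case can be sketched roughly like this. This is a Python illustration, not the bash code from the repo. The function name renumber_targets, the regex, and the old-to-new mapping are all made up for the example, and it skips a lot that a real renumberer has to deal with (string literals, REM text, THEN/ELSE/RESTORE and the other places line numbers can appear).

```python
import re

# GOTO or GOSUB followed by a comma-separated list of line numbers.
# Spaces are optional in this BASIC, so "ONAGOSUB310,,312" must match too.
# Assumption: string literals, REM text and other line-number contexts
# (THEN, ELSE, RESTORE, RUN, ...) are ignored to keep the sketch short.
TARGET_LIST = re.compile(r"(GOTO|GOSUB)(\s*)([\d\s,]*\d)", re.IGNORECASE)

def renumber_targets(statement, mapping):
    """Rewrite line numbers after GOTO/GOSUB, keeping empty list slots."""
    def fix_list(match):
        keyword, gap, targets = match.groups()
        new_items = []
        for item in targets.split(","):
            number = item.strip()
            if number:
                # Unknown targets are left as they were.
                new_items.append(str(mapping.get(int(number), number)))
            else:
                # An empty slot stays empty so later entries keep their position.
                new_items.append("")
        return keyword + gap + ",".join(new_items)
    return TARGET_LIST.sub(fix_list, statement)

# Hypothetical old -> new numbering, just for the demo.
mapping = {310: 10, 311: 20, 312: 30, 316: 40, 317: 50, 318: 60, 319: 70}
print(renumber_targets("ONAGOSUB310,311,312,,,,316,317,318,319", mapping))
```

The point is just that the list gets split on every comma and the empty strings are passed through untouched, rather than collapsing or stopping at the first empty value.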


The packer might be slightly wrong in one aspect. It converts all prints to ?s and rems to 's, which does make the ascii file smaller, but I have since read somewhere that one or both of those may actually result in slightly more ram usage on the 100? I haven't performed a test to find out. So the packer could maybe use an option for that.
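
Such an option might look something like the sketch below. Again this is Python rather than the bash in the repo, and pack_keywords and its two flags are hypothetical names. It only covers the keyword-shortening step: it leaves string literals alone, leaves LPRINT alone, and stops at the first REM or ' since everything after that is comment text. It does not know about DATA statements. Whether either substitution costs RAM on the 100 is still untested, so the flags just make each one optional.

```python
import re

def pack_keywords(line, short_print=True, short_rem=True):
    """Optionally shorten PRINT to ? and REM to ' outside string literals."""
    # Split into quoted strings (kept verbatim) and the code between them.
    # An unterminated string at the end of the line is also kept verbatim.
    parts = re.split(r'("[^"]*"?)', line)
    out = []
    for i, part in enumerate(parts):
        if part.startswith('"'):
            out.append(part)
            continue
        # Everything from REM or ' to the end of the line is comment text.
        rem = re.search(r"REM|'", part, flags=re.IGNORECASE)
        code = part[:rem.start()] if rem else part
        if short_print:
            # The lookbehind keeps LPRINT from turning into "L?".
            code = re.sub(r"(?<!L)PRINT", "?", code, flags=re.IGNORECASE)
        out.append(code)
        if rem:
            marker = "'" if short_rem else rem.group(0)
            out.append(marker + part[rem.end():] + "".join(parts[i + 1:]))
            break
    return "".join(out)

print(pack_keywords('10 PRINT "PRINT REM":REM say PRINT'))
print(pack_keywords('10 PRINT "PRINT REM":REM say PRINT', short_rem=False))
```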

I might try to add line-unwrapping to the packer next. Some unwrapping would be too involved to attempt, like keeping track of broken but recombinable prints and other literals. But it should be no problem to at least do some combining: if the end of one line is not something like a THEN branch, and the next line number is not a jump target anywhere else in the file, and the new combined line would be under either the 127 or 254 threshold (whichever you want), it's ok to combine.
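
A rough sketch of that combining rule, in Python just to show the logic (nothing like this is in the repo yet). The names jump_targets and combine_lines are made up, the keyword list is not exhaustive, and the length check counts the line number, a space, and the text, which is my assumption about how the threshold would be measured. The scans don't skip string literals, so a keyword inside quotes only makes it more cautious, never less.

```python
import re

# Keywords that can be followed by line numbers. Not exhaustive, and string
# literals are not skipped, so the scan errs toward finding extra targets.
JUMP = re.compile(r"(?:GOTO|GOSUB|THEN|ELSE|RESTORE|RESUME|RUN)\s*([\d\s,]*\d)",
                  re.IGNORECASE)
# If the earlier line contains one of these, appended code would change
# meaning: it would join an IF branch or be swallowed by a comment.
BLOCKER = re.compile(r"IF|REM|'", re.IGNORECASE)

def jump_targets(lines):
    """Collect every line number that appears after a jump keyword."""
    targets = set()
    for _, text in lines:
        for numbers in JUMP.findall(text):
            targets.update(int(n) for n in re.findall(r"\d+", numbers))
    return targets

def combine_lines(lines, limit=254):
    """Merge a line into the previous one when that looks safe.

    lines is a list of (line_number, text) pairs in program order.
    """
    targets = jump_targets(lines)
    packed = []
    for number, text in lines:
        if packed:
            prev_number, prev_text = packed[-1]
            merged = prev_text + ":" + text
            # The combined line must stay under the chosen threshold.
            fits = len(str(prev_number)) + 1 + len(merged) < limit
            if number not in targets and not BLOCKER.search(prev_text) and fits:
                packed[-1] = (prev_number, merged)
                continue
        packed.append((number, text))
    return packed

program = [(10, "A=1"), (20, "B=2"), (30, "IFA>0THEN60"),
           (40, "PRINT A"), (50, "GOTO 10"), (60, "END")]
for number, text in combine_lines(program):
    print(number, text)
```

In the demo, line 40 is not merged into the line holding the IF, and line 60 stays separate because it is a jump target. Pass limit=127 for the shorter threshold.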

--
bkw

On Tue, Feb 28, 2023, 10:33 PM grima...@gmail.com <grima...@gmail.com> wrote:

   I used Renum1 from Club100 library.

   I have inspected the tokenized BA in a hex editor. As far as I can
   tell, line numbers aren’t really compressed in any way. So in my
   original program, most of my line numbers were between 1000-30000,
   and each reference to them was 4-5 bytes.

   Now most of my line number references are 1-3 bytes after
   renumbering from 1.

   I also do use Packer.BA from Club100. This removes comments, and
   combines lines that aren’t referenced by GOTO, GOSUB, etc.

   Best,
   George

   On Tue, Feb 28, 2023 at 9:49 PM B 9 <hacke...@gmail.com> wrote:



       On Tue, Feb 28, 2023 at 4:55 PM grima...@gmail.com
       <grima...@gmail.com> wrote:

           Thanks all!

           At some point I’ll look into adding Tokenization directly
           into Github.


       Awesome. It looks like compiling and running a C program may be
       trivial in the yaml file:

       - uses: actions/checkout@v3
       - run: |
           make
           ./tokenize FOO.DO


       By the way, you may be able to use a Python lexer, such as ply
       <https://www.dabeaz.com/ply/ply.html>, to create a Python
       program from my flex source code. However, I suspect that will
       be more work than it's worth.

           I also used a line renumberer which brought down the .BA
           file to 76% of the previous version.


       Wow. What renumberer did you use? And why did renumbering reduce
       the file size?

       By the way, a tokenizer should be able to reduce the file size
       dramatically by simply omitting the string after REM statements.
       Having it remove vestigial lines completely would be slightly
       trickier and probably require a second pass as it'd have to make
       sure the line was not a target of GOTO (or any of the other
       varied ways of referring to line numbers).

       —b9
