I also wrote a renumberer and packer in bash (and in awk before that;
the awk versions are still in the repo under "attic"):
https://github.com/bkw777/BA_stuff
The interface could be improved to operate on normal commandline
arguments instead of env variables and only piping stdin and stdout.
The renumberer in particular gets some odd but legal constructs right
that some others fail to.
I was trying to use
https://github.com/LivingM100SIG/Living_M100SIG/blob/main/M100SIG/Lib-08-TECH-PROGRAMMING/RESEQ.100
to renumber https://ftp.whtech.com/club100/drv/sector.ba, and it was
getting one thing wrong. sector.ba includes a command with a comma
separated list of line numbers, with some of the values empty:
ONAGOSUB310,311,312,,,,316,317,318,319
and reseq didn't handle that right. I don't know what other renumberers
do because I decided at least for what I was working on it would be more
convenient to renumber on the host than on the 100. But if you're writing
on the 100 it obviously would be handy to have the ability on the 100. I
think one of the option roms like Cleuseau has a renumberer in rom? I
would probably try to use that if I were writing on the 100 simply to
avoid consuming ram.
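A minimal sketch of how a host-side renumberer can keep the empty
entries in a list like the one above. This is not the code from the
repo; the function name and the old-to-new mapping are made up for
illustration. It assumes targets only follow GOTO or GOSUB, and it does
not skip string literals or REM text, which a real renumberer would
have to do. Because only the digits are rewritten and the commas are
never touched, the empty slots survive as-is.

```python
import re

# A GOTO/GOSUB keyword followed by its target list. The list may hold
# digits, spaces and commas, so empty entries (",,,") are matched too.
TARGET_LIST = re.compile(r"(GOTO|GOSUB)([\d ,]*)", re.IGNORECASE)


def renumber_targets(statement, mapping):
    """Rewrite line-number targets in one statement (hypothetical helper).

    mapping: dict of old line number -> new line number.
    Numbers missing from the mapping are left unchanged.
    """
    def fix_list(match):
        keyword, targets = match.group(1), match.group(2)
        # Replace each run of digits; commas and spaces stay where they are.
        new_targets = re.sub(
            r"\d+",
            lambda n: str(mapping.get(int(n.group()), n.group())),
            targets,
        )
        return keyword + new_targets

    return TARGET_LIST.sub(fix_list, statement)


old_to_new = {310: 10, 311: 20, 312: 30, 316: 40, 317: 50, 318: 60, 319: 70}
print(renumber_targets("ONAGOSUB310,311,312,,,,316,317,318,319", old_to_new))
```

The statement keeps its ten list positions, so the ON index still
selects the same subroutine after renumbering.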
The packer might be slightly wrong in one aspect. It converts all prints
to ?s and rems to 's, which does make the ascii file smaller, but I have
since read somewhere that one or both of those may actually result in
slightly more ram usage on the 100? I haven't performed a test to find
out. So the packer could maybe use an option for that.
I might try to add line-unwrapping to the packer next. There is some
unwrapping that would be too involved to try to do, like keeping track
of broken but recombinable prints and other literals. But it should be
no problem to at least do some combining where if the end of one line is
not something like a THEN branch, and the next line number is not a jump
target anywhere else in the file, and the total new combined line would
be under either the 127 or 254 threshold (whichever you want) it's ok to
combine.
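The combining test described above could look roughly like this. It is
only a sketch: the function name, the (number, body) tuples and the
way the length is measured (ASCII text including the line number) are
all assumptions, and the keyword check is deliberately conservative, so
it will refuse some lines that would be safe to merge.

```python
def can_combine(first, second, jump_targets, max_len=254):
    """Decide whether two adjacent lines may be merged (hypothetical helper).

    first, second: (line_number, body) tuples for adjacent lines.
    jump_targets:  set of line numbers referenced anywhere in the file.
    max_len:       127 or 254, whichever limit you want to stay within.
    """
    num1, body1 = first
    num2, body2 = second

    # The second line must not be reachable by GOTO, GOSUB, etc.
    if num2 in jump_targets:
        return False

    # Anything appended after an IF branch or a comment would change
    # meaning, so refuse whenever the first line contains one of these.
    # This is a plain substring check and may reject harmless lines.
    upper = body1.upper()
    if "THEN" in upper or "ELSE" in upper or "REM" in upper or "'" in body1:
        return False

    # Assumption: the limit applies to the ASCII form of the merged line.
    combined = f"{num1} {body1}:{body2}"
    return len(combined) <= max_len


print(can_combine((10, "A=1"), (20, "B=2"), jump_targets={100}))
print(can_combine((10, "A=1"), (20, "B=2"), jump_targets={20}))
```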
--
bkw
On Tue, Feb 28, 2023, 10:33 PM grima...@gmail.com <grima...@gmail.com>
wrote:
I used Renum1 from Club100 library.
I have inspected the tokenized BA in a hex editor. As far as I can
tell, line numbers aren’t really compressed in any way. So in my
original program, most of my line numbers were between 1000-30000,
and each reference to them was 4-5 bytes.
Now most of my line number references are 1-3 bytes after renumbering
from 1.
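To put numbers on that: if a reference to a line number is stored as
its decimal digits, one byte per digit (which is what the hex dump
described above suggests, not something checked against a spec), the
cost of each reference is just the digit count. The function name and
the sample targets below are made up for illustration.

```python
def reference_bytes(line_number):
    """Bytes used by one reference, assuming one byte per decimal digit."""
    return len(str(line_number))


# Hypothetical jump targets before and after renumbering from 1.
old_targets = [1000, 2500, 12000, 30000]
new_targets = [1, 25, 120, 300]

old_cost = sum(reference_bytes(n) for n in old_targets)
new_cost = sum(reference_bytes(n) for n in new_targets)
print(old_cost, new_cost)
```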
I also do use Packer.BA from Club100. This removes comments, and
combines lines that aren’t referenced by GOTO, GOSUB, etc.
Best,
George
On Tue, Feb 28, 2023 at 9:49 PM B 9 <hacke...@gmail.com> wrote:
On Tue, Feb 28, 2023 at 4:55 PM grima...@gmail.com
<grima...@gmail.com> wrote:
Thanks all!
At some point I’ll look into adding Tokenization directly
into Github.
Awesome. It looks like compiling and running a C program may be
trivial in the yaml file:
    - uses: actions/checkout@v3
    - run: |
        make
        ./tokenize FOO.DO
By the way, you may be able to use a Python lexer, such as ply
<https://www.dabeaz.com/ply/ply.html>, to create a Python
program from my flex source code. However, I suspect that will
be more work than it's worth.
I also used a line renumberer which brought down the .BA
file to 76% of the previous version.
Wow. What renumberer did you use? And why did renumbering reduce
the file size?
By the way, a tokenizer should be able to reduce the file size
dramatically by simply omitting the string after REM statements.
Having it remove vestigial lines completely would be slightly
trickier and probably require a second pass as it'd have to make
sure the line was not a target of GOTO (or any of the other
varied ways of referring to line numbers).
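A rough sketch of that two-pass idea, not taken from the tokenizer
itself. The function name and the (number, body) representation are
assumptions, only GOTO, GOSUB, THEN and ELSE are treated as ways of
referring to a line, and only whole-line comments are handled. It also
does not skip string literals, so a real version would need more care.

```python
import re

# Keywords that may be followed by one or more line numbers.
JUMP = re.compile(r"(?:GOTO|GOSUB|THEN|ELSE)([\d ,]*)", re.IGNORECASE)


def strip_comments(lines):
    """Drop comment text and unreferenced comment lines (hypothetical helper).

    lines: list of (line_number, body) tuples in program order.
    """
    # Pass 1: collect every line number that some statement refers to.
    targets = set()
    for _, body in lines:
        for match in JUMP.finditer(body):
            targets.update(int(n) for n in re.findall(r"\d+", match.group(1)))

    # Pass 2: rewrite or remove lines that consist only of a comment.
    result = []
    for number, body in lines:
        text = body.lstrip()
        is_comment = text.upper().startswith("REM") or text.startswith("'")
        if not is_comment:
            result.append((number, body))
        elif number in targets:
            # Still a jump target: keep the line, omit the comment string.
            result.append((number, "REM"))
        # Otherwise the line is vestigial and is dropped entirely.
    return result


program = [
    (10, "REM title"),
    (20, "GOSUB 100"),
    (30, "END"),
    (100, "REM subroutine"),
    (110, "RETURN"),
]
print(strip_comments(program))
```

Line 10 disappears, while line 100 stays as a bare REM because the
GOSUB on line 20 still needs somewhere to land.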
—b9