Re: [Toybox] POSIX 2024 changes

2024-07-27 Thread Rob Landley
On 7/23/24 09:13, Ray Gardner wrote:
> I modified an old HTML diff program by Ian Bicking to run with
> Python3, and wrote a little shell script to compare the 2018 version of
> a utility spec to the 2024 version. Run like:
> 
> ./diffposixutil.sh sed

Useful, thanks. (Modulo I'm changing it to work with my already downloaded
version of posix-2008 back from 2008, since I wanna see _all_ the changes since
then in case I've missed some drip-feed...)

> This downloads the old and new versions of the sed.html spec from
> https://pubs.opengroup.org/onlinepubs/9699919799/utilities/ and
> https://pubs.opengroup.org/onlinepubs/9799919799/utilities/ and runs
> htmldiff.py, putting the result in dif/sed.html. I've attached the
> shell script, htmldiff.py, and the opengroup style.css from their
> utilities folder.
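
The workflow described above might look roughly like the sketch below. This is
not the attached diffposixutil.sh: the names spec_urls and fetch_pair and the
old/ new/ dif/ layout are made up for illustration, curl is assumed to be
installed, and the htmldiff.py step is left as a comment because its command
line isn't shown in this thread.

```shell
#!/bin/sh
# Illustrative sketch only, not the attached script.
OLD=https://pubs.opengroup.org/onlinepubs/9699919799/utilities
NEW=https://pubs.opengroup.org/onlinepubs/9799919799/utilities

# Print the old and new URL for one utility, one per line.
spec_urls()
{
  printf '%s/%s.html\n' "$OLD" "$1" "$NEW" "$1"
}

# Download both versions into old/ and new/ (hypothetical layout).
fetch_pair()
{
  mkdir -p old new dif &&
  curl -fsS -o "old/$1.html" "$OLD/$1.html" &&
  curl -fsS -o "new/$1.html" "$NEW/$1.html"
  # Next step: run htmldiff.py on old/$1.html and new/$1.html and
  # write the result to dif/$1.html.
}

spec_urls sed
```

Calling "fetch_pair sed" would do the two downloads; pointing OLD at a local
directory of an earlier spec would need a copy instead of curl.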

I miss the ability to download posix versions as tarballs instead of having to
screen scrape them. This was explicitly supported before:

susv4: https://pubs.opengroup.org/onlinepubs/9699919799/download/
susv3: https://pubs.opengroup.org/onlinepubs/009695399/download/

But for susv5:

https://pubs.opengroup.org/onlinepubs/9799919799/download

404 error. And no link from the main index page either, instead there's a
"participants".

"You can't have it, this is ours." That's how you know it's a standard...

> I ran this on the awk spec and got quite a few differences I'll have
> to review, including some that are obvious editorial errors. I'll try
> to bring the errors to the attention of someone at the open group.

While you're at it, ask them why there's an awk.html.orig in the new directory?

> Ray

Thanks,

Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] tar.test: don't test non-`-p` behavior as root.

2024-07-27 Thread Rob Landley
On 7/23/24 14:55, enh via Toybox wrote:
> Fixes https://github.com/landley/toybox/issues/512.

Applied, but could you email me the tarball for the tar "ownership" test that
produces the 2d7b hash? I'm getting a d9e7 hash here, and just confirmed the
previous laptop doing TEST_HOST with devuan brochitis was also producing d9e7.
(Which I didn't notice because passing all root tests is under the mkroot todo
item...)

The downside of testing hashes instead of hd output is when it differs and you
can't reproduce the old one, it doesn't say why. You added this hash in commit
43d398ad5d7b and it's possible it never worked for me, but if --owner --group
--mtime aren't squashing all the variables I'd like to know what still varies.

Both toybox tar and host tar are producing the same output, so it's presumably a
filesystem thing, possibly different passwd uids for "nobody"...?
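
One way to see what still varies when a hash doesn't match, sketched under the
assumption that both archives are available as files: cmp -l lists the
differing byte offsets, which a hash can't. The first_diff helper and the two
sample files are made up for illustration.

```shell
#!/bin/sh
# Print the 1-based offset of the first differing byte between two files,
# or nothing if cmp reports no differing bytes. (Hypothetical helper.)
first_diff()
{
  cmp -l "$1" "$2" 2>/dev/null | awk '{ print $1; exit }'
}

dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
printf 'abcd' > "$dir/one"
printf 'abXd' > "$dir/two"
echo "first difference at byte: $(first_diff "$dir/one" "$dir/two")"
```

For a real tarball, running both files through a hex dump and diffing the text
gives the same information in a more readable form.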

Thanks,

Rob


Re: [Toybox] [PATCH] devmem: add -f FILE, arbitrary amounts of data.

2024-07-26 Thread Rob Landley
On 7/26/24 07:13, enh via Toybox wrote:
>> Don't know Rob cares about this, but ?: is not ISO C. It's a GNU
>> extension.

Specifically https://gcc.gnu.org/onlinedocs/gcc-14.1.0/gcc/Conditionals.html

> with 80 existing uses of https://en.wikipedia.org/wiki/Elvis_operator
> in toybox, i'm pretty sure rob knows this already :-)

Yeah, and it's been discussed multiple times over the years, starting with
http://lists.landley.net/pipermail/toybox-landley.net/2012-April/013359.html
where we were already looking into the portability of it.

I should try to dig something up to put in code.html but the decision way back
when was "if the 2.6 kernel needed it to build, toybox is probably ok relying on
it being there" because we exist in an ecosystem. (The fact I maintained a
tinycc fork for 3 years trying to extend it to make tccboot work with current
vanilla source may have influenced this. :)

I _thought_ this was recorded somewhere in code.html or design.html but it's not
the easiest thing to grep for. There's several links at
https://landley.net/toybox/cleanup.html#advice but this isn't one of them...

There's a couple other things like this, "int x=x;" to shut up the stupid gcc
"is never used uninitialized" warnings comes to mind (although that's migrated
to a QUIET macro that does conditional zero initialization)...

Rob


Re: [Toybox] toybox tar test failure

2024-07-18 Thread Rob Landley
On 7/17/24 07:56, enh wrote:
>> > FAIL: tar honor umask
>> > echo -ne '' | umask 0022 && rm -rf dir && mkdir dir && tar xf
>> > $FILES/tar/dir.tar && stat -c%A dir dir/file
>> > --- expected 2024-07-15 16:20:47.217287424 +0000
>> > +++ actual 2024-07-15 16:20:47.257287423 +0000
>> > @@ -1,2 +1,2 @@
>> > -drwxr-xr-x
>> > --rwxr-xr-x
>> > +drwxrwxrwx
>> > +-rwxrwxrwx
>>
>> I can't reproduce it. I just did a fresh clone on the old machine and "make
>> clean defconfig tests" completed, including running the new tar tests.
>>
>> The _symptom_ is that it ran the new tests against old toybox tar from before
>> commit 93718452b9f6. That's the behavior from before the fix. Is your test
>> setup calling an older toybox tar out of the host $PATH maybe?
> 
> no, this was on a device, so it'll be the /system/bin/toybox from the build.
...
> yeah, that seems to be a mksh compatibility issue.

I suspect the other issue might also be mksh, or your slightly different
test setup. It could be that the shell umask command isn't setting umask
properly, or maybe something's violating the assumption that the test command
line is in a new process so state changes a test makes to umask, cd, or
environment variables don't persist in the parent context...

I'll see if I can reproduce a mksh run of the test here.
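
The assumption in question can be checked in isolation. A minimal sketch, using
"sh -c" as a stand-in for however the test harness actually launches a command
line (that detail is an assumption here):

```shell
#!/bin/sh
# Record the parent's umask and directory, let a child process change its
# own umask and working directory, then check the parent is unaffected.
before=$(umask)
before_dir=$PWD
child=$(sh -c 'umask 0777; cd /; umask')
after=$(umask)
echo "parent before=$before after=$after child=$child"
```

Running the same script under mksh instead of sh would show whether that shell
behaves differently.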

Rob


Re: [Toybox] toybox tar test failure

2024-07-17 Thread Rob Landley
On 7/15/24 12:11, enh via Toybox wrote:
> tried to sync this morning, but i'm getting a new tar test failure:

Saw this yesterday but didn't have the email laptop with me. (I need to move
that over...)

> FAIL: tar honor umask
> echo -ne '' | umask 0022 && rm -rf dir && mkdir dir && tar xf
> $FILES/tar/dir.tar && stat -c%A dir dir/file
> --- expected 2024-07-15 16:20:47.217287424 +0000
> +++ actual 2024-07-15 16:20:47.257287423 +0000
> @@ -1,2 +1,2 @@
> -drwxr-xr-x
> --rwxr-xr-x
> +drwxrwxrwx
> +-rwxrwxrwx

I can't reproduce it. I just did a fresh clone on the old machine and "make
clean defconfig tests" completed, including running the new tar tests.

The _symptom_ is that it ran the new tests against old toybox tar from before
commit 93718452b9f6. That's the behavior from before the fix. Is your test setup
calling an older toybox tar out of the host $PATH maybe?

> also (since possibly WAI, and not a blocker because we ignore toybox
> failures in tests if we're not using the toybox implementation), one
> of the awk tests fails against Android's "one true awk":

Ray replied to that, I haven't opened the can of worms of awk cleanup yet (wanna
cut a release with it in pending first) so I'll happily apply a fix from him if
he posts one and otherwise punt for now...

Rob


Re: [Toybox] gcc vs llvm -static-libasan

2024-07-09 Thread Rob Landley
On 7/9/24 12:35, enh wrote:
> On Tue, Jul 9, 2024 at 10:02 AM enh  wrote:
>> On Tue, Jul 9, 2024 at 8:47 AM Rob Landley  wrote:
>>
>> i'll forward this to the llvm asan folks anyway, in particular the
>> "gcc and llvm use different flags" is the kind of papercut that i
>> think both gcc and llvm have been trying harder to avoid lately.
> 
> https://github.com/llvm/llvm-project/pull/98194 for that.

Wunderbar. (I think it's related to the stroopwafel.)

When does your AOSP toolchain update in the main build so I can commit/push a
change using it for ASAN=1 builds without inconveniencing you?

> apparently your "static asan" request is even harder than i thought...

Last time I tried setting up a container we wound up debugging stdin handling in
bionic's _start code (although that _was_ because the laptop I was using at the
time had an old processor throwing illegal instruction errors, so I needed to
build a kernel and run it through qemu and used mkroot, which triggered some sad
initramfs behavior in the kernel.)

The current laptop seems to run the binaries so far. (I'm grinding through make
tests even, just static build without ASAN enabled...)

> the folks who actually know what they're talking about (because they
> implemented all of these things), fully static asan isn't a thing ---

It is for gcc/glibc, although I expect they're only implementing half the tests
the LLVM guys have...

> "bionic's hwasan support is a special well-integrated combination that
> works in static binaries, but everything else relies on symbol
> interposition and the presence of the dynamic loader".
> 
> so my default knee-jerk "have you tried hwasan?" was closer to the
> mark than i realized at the time :-)

Does qemu-system-aarrcchh6644 emulate hwasan? If so, how do I get a dynamic
bionic chroot of a reasonable size to boot under qemu to test binaries? (If the
NDK doesn't have an extractable sysroot dir, does AOSP have some sort of "lunch
shellprompt"?)

Or should I stick with the glibc one once I can check that in? (Dynamic linked
bionic would allow more of the utf-8 plumbing to work, and thus get tested by
me, but I have to finish replacing -lcrypt with lib/crypt.c before "make
defconfig" builds under bionic, so I'm busy here for a while yet...)

Rob


Re: [Toybox] gcc vs llvm -static-libasan

2024-07-09 Thread Rob Landley
On 7/9/24 09:02, enh wrote:
> On Tue, Jul 9, 2024 at 8:47 AM Rob Landley  wrote:
>> Any suggestions?
> 
> none you're going to like, but...

I don't have to like it, I just have to make it work. :)

> you're basically checking all the "not supported" boxes here:

Welcome to my life.

> x86-64 (as opposed to arm64),

The android-ndk-r26d/toolchains/llvm/prebuilt directory only has linux-x86_64
binaries. (The download page has Platform "Linux 64-bit (x86)" as the only
Linux option.) It can produce arm output, but doesn't provide a toolchain that
would run on an arm system, and therefore no native compile-and-test cycle...

> asan (as opposed to hwasan),

Which wasn't available on x86-64 last I checked.

I could whip up a raspberry-pi alike if pushed, but if you don't publish an ndk
that runs there... (I could try building it from source again?)

Adding qemu system emulation into the compile and test cycle is on the todo
list. Adding qemu APPLICATION emulation is a recipe for weird corner cases in
system call translation and something I've generally tried to avoid because
system call translation is a categorically harder problem than emulating
hardware interfaces. (Containers leverage the existing kernel, based on work
done by OpenVZ since 1999, running the same host/guest binaries with all the
endianness and page size and magic constants and structure packing issues
guaranteed identical, and it STILL took 10 years to get it load bearing.)

> static (as opposed to dynamic).

Again, running on a debian host. When they come out with debian-bionic I'll
happily debootstrap a chroot for testing.

Getting an android chroot to work as a development environment is the goal,
hence chicken-and-egg problem.

> i believe that arm64 hwasan static binaries "just work",
> and if they don't, that's something we'd actually be able to find the
> time to look at. until there's x86-64 hardware with whatever they're
> calling "x86-64 top byte ignore", though, you're stuck with asan, and
> that's been known to be mostly/somewhat broken on Android for years
> (with no-one to work on it).
> 
> why can't you run the dynamic version? i thought you'd upgraded your
> laptop?

I did a fresh OS install on a spare laptop of the same model, but it's still a
glibc system.

> is the new one also too old to run ~2012-era x86-64 binaries?

It runs the static binaries just fine, but it's a glibc system: "./toybox:
cannot execute: required file not found". (And then "ldd toybox" said
"/usr/lib/x86_64-linux-gnu/libm.so: invalid ELF header" which isn't _entirely_
surprising given that gnu/ldd is a bash script calling glibc but was another
"wha...?" moment I did not pursue further.)

I vaguely remembered you saying you could create a "/system" symlink into the
ndk somewhere, so I ran readelf on the binary which is requesting
/system/bin/linker64 and did a "find android-ndk-r26d -name linker64" and there
were no hits, so even setting up a dynamic chroot for it remains a todo list item.

> alternatively, build arm64 hwasan binaries and run them on your phone
> or orangepi?
> 
> i'll forward this to the llvm asan folks anyway, in particular the
> "gcc and llvm use different flags" is the kind of papercut that i
> think both gcc and llvm have been trying harder to avoid lately.

Thanks.

As long as ASAN works for me _somewhere_ I don't really care where (should be
the same bugs), and I mostly want all the tests to pass with ASAN before cutting
a release (which is due again: time flies like an arrow, fruit flies like an
apple, of the two time flies eating all your arrows sounds like the bigger
problem...).

But if I can't check in the "make it work with gcc on debian" flag without
breaking the Android build, I has a sad. :(

Rob


[Toybox] gcc vs llvm -static-libasan

2024-07-09 Thread Rob Landley
On 6/12/24 16:08, Rob Landley wrote:
> The new debian toolchain also broke gcc/glibc ASAN, complaining (at runtime)
> "ASan runtime does not come first in initial library list; you should
> either link runtime to your application or manually preload it with
> LD_PRELOAD." which is that library ordering
> nonsense back to rear its ugly head again and I refuse to humor
> these INSANE ASSHOLES. If LLVM/bionic works without this, then it's
> NOT REQUIRED, they're just really bad at it. Notice how the error message
> doesn't say which library to LD_PRELOAD if I _did_ want to fix it,
> it just refuses to work where the previous version worked. A clear regression.
> Which I'm late enough in reporting it's a fait accompli, and I'm in the wrong
> for not noticing their fuck-up in a timely manner. Far too late to start
> making a fuss about it now...
> 
> (Is a required library not installed? I used "build-essential" instead
> of manually installing gcc and make precisely so it would scoop up that kind
> of nonsense... And it's complaining about library ORDERING, which is not
> supposed to be a thing when dynamic linking.)

So the local fix I've had for this is to add -static-libasan to CFLAGS in the
ASAN portion of scripts/portability.sh which fixes gcc. Note that I need this to
do a NON-STATIC ASAN=1 build, because otherwise I get the "runtime does not come
first" error above. And no, this is not a toybox issue:

$ gcc hello.c -fsanitize=address -O1 -g -fno-omit-frame-pointer \
  -fno-optimize-sibling-calls
$ ./a.out
==5942==ASan runtime does not come first in initial library list; you should
either link runtime to your application or manually preload it with LD_PRELOAD.

But I wanted to regression test -static-libasan against the NDK before checking
it in, and I just installed ndk-r26d on the new laptop to do that.

The first reason it doesn't work is that llvm chose -static-libsan instead. I
note that gcc does not accept "-static-libsan" and llvm does not accept
"-static-libasan".

The SECOND reason it doesn't work is that a --static linked NDK toybox binary
with -static-libsan says "error: undefined symbol: _DYNAMIC" referenced by
sanitizer_linux.cpp called from libclang_rt.asan-x86_64-android.a

To reproduce that, I extracted the ndk zip file into my home directory and did:

$ ln -s ~/android-ndk-r26d/toolchains/llvm/prebuilt/linux-x86_64/bin ~/llvm
$ echo -e '#!/bin/bash\n"$(dirname "$0")"/clang --target=x86_64-linux-android34 "$@"' > ~/llvm/llvm-cc
$ chmod +x ~/llvm/llvm-cc
$ cd ~/toybox/toybox
$ CROSS_COMPILE=~/llvm/llvm- LDFLAGS=--static make clean defconfig toybox

And then switched off SU and LOGIN in .config because of crypt (it's on the todo
list), but the rest built! But then adding ASAN=1, the NDK build broke, although
-static-libsan doesn't seem to affect that (it builds dynamic with it but I
can't run the result, and the build breaks static without it because the OTHER
asan libraries inappropriately assume dynamic linking).

I'd like to check in the -static-libasan at the end of the CFLAGS string for
ASAN to fix the debian build (on the theory ASAN doesn't work for me in the NDK
_anyway_), but I don't want to break the (dynamic) android build (which
complains it's an unknown flag with "do you mean libsan").

Any suggestions?
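
One possible direction, sketched under assumptions rather than taken from
scripts/portability.sh: probe the compiler for whichever spelling of the flag
it accepts and only add that one. The probe_flag helper and the STATIC_ASAN
variable are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper: print the first flag (from the remaining arguments)
# that compiler "$1" accepts when building a trivial ASAN program.
probe_flag()
{
  cc=$1; shift
  tmp=$(mktemp -d) || return 1
  printf 'int main(void) { return 0; }\n' > "$tmp/probe.c"
  found=
  for flag in "$@"; do
    if "$cc" -fsanitize=address "$flag" "$tmp/probe.c" -o "$tmp/probe" \
        >/dev/null 2>&1
    then
      found=$flag
      break
    fi
  done
  rm -rf "$tmp"
  [ -n "$found" ] && echo "$found"
}

STATIC_ASAN=$(probe_flag "${CC:-cc}" -static-libasan -static-libsan)
echo "static sanitizer flag: ${STATIC_ASAN:-none found}"
```

The result could then be appended to the ASAN CFLAGS only when non-empty, so a
compiler that rejects both spellings is left alone.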

Rob


Re: [Toybox] [PATCH] test -ef -ot -nt (POSIX 2024)

2024-07-03 Thread Rob Landley
On 7/3/24 00:37, Oliver Webb via Toybox wrote:
> Looking over the new stuff in POSIX 2024, toybox already has most of the 
> stuff it specifies

I've been waiting for the html to go up on opengroup.org before considering it
"real", in part due to PDF being more awkward to deal with (eyestrain) and in
part due to a members-only spec behind a paywall with samizdat copies in the
wild not being something open source should bother with. Andrew Josey's 6/21
email https://www.mail-archive.com/austin-group-l@opengroup.org/msg12710.html
said it would take "up to a month" to publish the html.

That said, happy to get a jump on things if some subset of recent debian
features is now blessed by a third party...

> Excluding things like make which toybox doesn't have,

But should. I wonder if a posix-2024 make would actually build Linux From
Scratch packages?

> and gettext/msgfmt which from all
> the design documentation I've read Rob doesn't wanna add to toybox,

I consider them out of scope. When Aboriginal Linux built Linux From Scratch it
used https://landley.net/aboriginal/mirror/gettext-stub-1.tar.gz which NOPs out
the functions.

> These are the POSIX 2024 features toybox doesn't have:
> 
> dd iflag=fullblock

ok

> rm -d

Heh. Ok.

> tail -r (which from my checking coreutils doesn't have)

Which sadly means debian's man page doesn't either.

> test -ef -nt -ot
> 
> The attached patch adds in the test -ef -nt -ot
> 
> As for symbolic links:
> 
> $ test /bin/bash -ef /bin/sh && echo yes
> yes
> $ test /bin/bash -ot /bin/sh && echo yes
> yes
> 
> test is meant to follow symlinks _only_ when checking inodes with -ef

Applied locally, but you didn't update the help text. I can do that here if -nt
-ef and -ot have the same meanings as the 2019 debian "man 1 test" page...

Hmmm, -nt = newer than, -ot = older than, -ef is... how did "same" become e?
Equal? Equivalent? That sounds like contents, not "these are hardlinks to the
same dev/inode". Merriam Webster's thesaurus for "same" just gives those two E
words.

Expected, extracted, extruded, ectopic, electric, endemic, eloquent...

Bash "help test" says "True if file1 is a hardlink to file2". Can't argue...
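
A small sketch of the three operators on throwaway files, assuming the shell's
built-in test already supports them. It shows that -ef is about device and
inode rather than contents (a copy doesn't count), and that -nt and -ot compare
modification times. The check helper and file names are made up for
illustration.

```shell
#!/bin/sh
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
echo hello > "$dir/file"
ln "$dir/file" "$dir/hardlink"        # same dev/inode as file
cp "$dir/file" "$dir/copy"            # same contents, different inode
touch -t 200001010000 "$dir/old"      # mtime set to Jan 1 2000

# Print yes or no depending on whether the test expression holds.
check() { if test "$@"; then echo yes; else echo no; fi; }

echo "hardlink -ef file: $(check "$dir/hardlink" -ef "$dir/file")"
echo "copy -ef file:     $(check "$dir/copy" -ef "$dir/file")"
echo "file -nt old:      $(check "$dir/file" -nt "$dir/old")"
echo "old -ot file:      $(check "$dir/old" -ot "$dir/file")"
```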

Thanks,

Rob


[Toybox] Heads up, posix 2024 dropped.

2024-07-02 Thread Rob Landley
I've been waiting for the HTML edition (which says "to follow soon") so I can
recombobulate the roadmap. (Not doing it with the PDF, although several people
have written up their own difference lists.)

https://www.opengroup.org/austin/

Previous web version does not have the forward link yet either:

https://pubs.opengroup.org/onlinepubs/9699919799/

But just so you know... :)

Rob


Re: [Toybox] [PATCH] lspci fixes for multi-controller hosts

2024-06-16 Thread Rob Landley
On 6/14/24 08:51, Dmitry Buzdyk wrote:
> Couple of patches for the systems with multiple PCIe controllers.
> In my case it has 2 controllers and output of the lspci (with patches applied)
> looks like this:
> 
> 0000:00:00.0  [060400]:   [17cb:0113]
> 0000:01:00.0  [0c0330]:   [1912:0014] (rev 03) xhci_hcd
> 0001:00:00.0  [060400]:   [17cb:0113]
> 0001:01:00.0  [060400]:   [1179:0623]
> 0001:02:01.0  [060400]:   [1179:0623]
> 0001:02:02.0  [060400]:   [1179:0623]
> 0001:02:03.0  [060400]:   [1179:0623]
> 0001:03:00.0  [040000]:   [17cd:0202]
> 0001:04:00.0  [0c0330]:   [1912:0014] (rev 03) xhci_hcd
> 0001:05:00.0  [020000]:   [1179:0220] tc956x_pci-eth
> 0001:05:00.1  [020000]:   [1179:0220] tc956x_pci-eth
> 
> 
> 1. Fix "lspci -x" taking into account full PCI_PATH_NAME
> 2. Print full PCI_PATH_NAME to the output.

I applied the first one, but #2 changes the output to not match what debian's
lspci is doing. Is there a rule about when to print the extra info here? (If
there's more than one controller?)

Rob


Re: [Toybox] awk (was: strlower() bug)

2024-06-16 Thread Rob Landley
On 6/15/24 17:22, Ray Gardner wrote:
> On Wed, Jun 12, 2024 at 2:57 PM Rob Landley  wrote:
>>
>> On 6/11/24 16:56, Ray Gardner wrote:
>> > Elliot, thanks for the positive feedback on the docs, but I really
>> > wish you and Rob would try the program. I waited a while to see what
>> > Rob would have to say. He doesn't seem the sort to be at a loss for
>> > words, but ... nothing. Any idea why he's had nothing to say about an
>> > awk for toybox?
>>
>> Why are you asking Elliott?
> 
> He responded to my post; you didn't. I know you've had a lot on your plate
> lately with your move, selling the house, working on toysh. But you
> responded to most posts here since mine on 5/14.

I had the window open, but hadn't yet done the reading. :)

> After all you've written about awk, I was puzzled by your non-response,
> and inferred wrongly that it was intentional, so I thought asking you why
> you didn't respond to a post you intended to ignore would be ... not well
> received. I thought Elliot might have some insight there.

A reasonable amount of follow-up is fine. I _do_ drop the ball a lot.

>> Remember the "poke me a week later if I forget"?
> 
> No. But I dug into the archive, and find that you said that to Oliver in a
> post about toysh in March. But never about awk, or to me.
> (http://lists.landley.net/pipermail/toybox-landley.net/2024-March/030146.html)

Sigh, I keep thinking it's in the FAQ. (I should update the FAQ, but have like 8
half-finished updates to it already...)

>> I consider myself poked, somewhat passive-aggressively. :)
> 
> No passive aggression intended.

My fault for not documenting the expected procedure better.

> But you've been looking for an awk for at least 8 years, so I really
> thought you'd welcome one that's complete and written for toybox, with
> some tests and documentation.

I am very interested, yes.

I downloaded your repo, copied toybox/awk.c to toys/pending/awk, and built it.
It compiled. I grabbed awk.test and ran that, and it passed.

Didn't QUITE pass test_host:

awk: cmd. line:1: warning: escape sequence `\u' treated as plain `u'
FAIL: awk \u
echo -ne '' | "/usr/bin/awk" 'BEGIN{print "\u20\u255"}' < /dev/null
--- expected 2024-06-16 08:36:12.147722288 -0500
+++ actual  2024-06-16 08:36:12.155722288 -0500
@@ -1 +1 @@
- ɕ
+u20u255

But eh, passed all the others (with VERBOSE=all), close enough. (Adding
"utf8locale" to the test file didn't fix it, dunno what it's trying to do...)

*shrug* I'm happy to check it into pending as is, if you don't mind discarding
commit history. (Um, URL to the github commit I got it from maybe? The trees
haven't got the same base so there isn't an obvious "pull" option here...)

  $ git log toybox/awk.c toybox_awk_test/awk.test | grep '^commit ' | wc -l
  30

It's a _bit_ granular but toysh is way worse, and for that matter:

  $ git log --follow toys/*/sed.c | grep '^commit ' | wc -l
  110

Hmmm, maybe I can do something clever fiddly with fishing out
git format-patch entries, trimming them a bit and adjusting the paths, and "git
am" in the other tree...

>> I have the tab open, the reason I haven't looked at it yet is A) it's 4500
>> lines, B) in a thing I have WAY insufficient existing domain expertise in
>> (but multiple bookmarked tutorials and an entire book on somewhere).
> 
> It's really 3523 lines of non-blank non-comment code, measured with:
> 
> toybox awk -f cnt_sloc.awk awk.c
> 
> where cnt_sloc.awk is:
> /^[ \t]*\/\*/ , /\*\/[ \t]*$/ { next } # Skip /* ... */ comments
> /^ *$/ || /^ *\/\// { next } # Skip empty and //comment-only lines
> { sloc++ }
> END { print sloc }

Hmmm...

$ ./awk -f <(echo '/^[ \t]*\/\*/ , /\*\/[ \t]*$/ { next }'$'\n''/^ *$/ || /^ *\/\// { next }'$'\n''{ sloc++ }'$'\n''END { print sloc }') toys/*/awk.c
3523

> BTW regarding not getting an SSD at Target: there's a MicroCenter in the
> Minneapolis metro area; might be worth the drive. The one where I am is good.

I took the green line to the A line to Best Buy, which still had a few of the
right kind of ssd locked in a misc old parts filing cabinet. New(er) laptop is
up and running, with non-EOL os version installed on it. (Hence the list of
things the new environment/compiler broke.)

Part of my slow/quiet here is the old machine is still the "master" for email
and blog pushing, so I write notes-to-self and then have to copy them over when
that's the one I took out with me for the day. I'm slowly getting used to
firefox (dunno if it really scales yet, "pkill -f renderer" at chrome hasn't got
an obvious equivalent that leaves the tab open and reloadable). I also have to
decide if it's gonna be Thunderbird again or something else, which is presumably
bundled 

Re: [Toybox] awk (was: strlower() bug)

2024-06-16 Thread Rob Landley
Sigh, composed a reply on the other laptop but still can't send it from there...

On 6/12/24 16:09, enh wrote:
> On Wed, Jun 12, 2024 at 4:57 PM Rob Landley  wrote:
> yeah, i make a lot of noise, but i don't have any real power :-)

Nah, you just have bigger fish to fry. :)

> (fwiw, there's a second edition just come out. from one-true-awk it
> seems like csv support is the main new feature. amusingly one of the
> errata for the new edition on https://awk.dev/ is a behavioral
> difference between one-true-awk and gawk.)

I tend to drill down fairly deeply into things to be comfortable, and people
have implemented fairly large things in awk as a programming language.

>> The new debian toolchain is hallucinating a warning when I build toybox
>> with it, toys/posix/grep.c:211:24: warning: 'regexec0' accessing 8 bytes
>> in a region of size 4 [-Wstringop-overflow=]
>> and further note: referencing argument 5 of type 'regmatch_t[0]'.

1) This only happens for ASAN=1 by the way. (Yes, at compile time.)

2) Changing the prototype from regmatch_t pmatch[] to regmatch_t *pmatch made
the warning go away.

*jazzhands*

Rob


Re: [Toybox] awk (was: strlower() bug)

2024-06-12 Thread Rob Landley
On 6/11/24 16:56, Ray Gardner wrote:
> Elliot, thanks for the positive feedback on the docs, but I really
> wish you and Rob would try the program. I waited a while to see what
> Rob would have to say. He doesn't seem the sort to be at a loss for
> words, but ... nothing. Any idea why he's had nothing to say about an
> awk for toybox?

Why are you asking Elliott? He's in California, I'm currently in Minneapolis, we
haven't spoken in person since before the pandemic. (I mean, he has my cell
phone number and can text me, but it doesn't come up much?)

Remember the "poke me a week later if I forget"? I consider myself poked,
somewhat passive-aggressively. :)

I have the tab open, the reason I haven't looked at it yet is A) it's 4500
lines, B) in a thing I have WAY insufficient existing domain expertise in (but
multiple bookmarked tutorials and an entire book on somewhere).

I'm currently re-familiarizing myself with the toysh redirection code to fix a
nasty bug there, which has been a todo item I haven't finished for 3 weeks, and
in the past week I installed a new laptop with a non-EOL version of devuan, and
setting that up I found multiple things that the world moving on without me broke.

Sigh, my blog's way behind, just updated it to the 21st, but here's blog
spoilers which probably do not render properly as html yet and doesn't have
links to things I mention (the patch mentioned at the end is
https://landley.net/bin/mkroot/latest/linux-patches/0008-elfcrap.patch and no I
haven't fixed it yet):

June 7

Stuff's a bit chopped up since I'm straddling two laptops. Still blogging
from the old one, and the old one has the reasonable battery (I should order
another battery) so I can't take the new one out to random coffee shop
yet but only use it plugged in at the desk. So I'm blogging about what I did
based on a notes.txt file I scp'd over to the old machine.

Package dependencies remain out of control: for some reason "apt-get install
git" wanted "libperl-error" which is just sad. I'm vaguely annoyed that
build-essential installed fakeroot and three *-perl packages and so on,
but that's the cost of using a meta-package somebody else curates.
(Saying "the following additional packages will be
installed" and then "the following NEW packages will be installed" with
the only difference being the second one includes the package I requested...
that seems non-optimal, especially when the list is 37 packages long).

The new debian toolchain is hallucinating a warning when I build toybox
with it, toys/posix/grep.c:211:24: warning: 'regexec0' accessing 8 bytes
in a region of size 4 [-Wstringop-overflow=]
and further note: referencing argument 5 of type 'regmatch_t[0]'.
This warning is wrong in multiple ways.

First the code's been run under ASAN
a lot without complaint, and no other toolchain produces this warning: not llvm,
not gcc, and musl-cross-make has been building the same gcc 12.0 version which
does NOT produce the warning. Something debian locally patched into its
"gcc 12.0-14" is producing a warning that vanilla gcc does not produce.
That makes it a bit suspicious to begin with.

I inspected the code anyway, and argument 5 of the call to regexec0() in
do_grep() is an 8 byte pointer to a 16 byte structure. There's no "region
of size 4" to be found. The argument shoe->m is a pointer
to an entry of type regmatch_t, and that struct contains two entries of
regoff_t which is ssize_t which is long, thus 16 bytes on a 64-bit system.
Even on a 32 bit system, the two of them would still add up to 8 bytes.
The structure is allocated to its full size. There's nothing wrong with the
code that I've been able to spot.

I _think_ what might be happening is shoe->m lives in "shoe" which is
most recently assigned to in the enclosing for() loop via
shoe = (void *)TT.reg; and TT.reg in the GLOBALS() block is declared as
struct double_list *reg; because at that level we only care that it's
a doubly linked list, not what members each list entry has in
the command-local "struct reg". Except even THAT theory is funky because
double_list has three pointers: next, prev, and data, each of which is 8 bytes,
where is it getting size 4? If it was comparing sizeof(*TT.reg) with
sizeof(*shoe) then shoe->m starts off the end of the smaller struct.
If the compiler can't keep the types straight then it's not a size 4 issue,
it's an out of bounds access.

The type of the "shoe" pointer is "struct reg", which has 5 members.
The argument it's complaining about is a pointer to the 5th member, which
is indeed a regmatch_t. (And the error is SAYING it's a regmatch_t, which
is neither 4 nor 8 bytes long, it's 16. Neither the pointer, nor the struct,
nor any member OF that struct, match the constraint it's insisting was
violated.)
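
One way to settle size questions like these is to ask each toolchain directly instead of reasoning from typedefs. A minimal sketch, assuming stand-in structs with the layout described above (a three pointer list node, and a larger struct with two pointers, an int, a regex_t, and a trailing regmatch_t). The names `double_list_sketch` and `reg_sketch` and their member names are hypothetical, not copied from grep.c. Note that the width of regoff_t is chosen by the libc and is not necessarily ssize_t everywhere, so the numbers can differ between toolchains:

```c
#include <regex.h>
#include <stddef.h>

// Generic list node as described in the text: three pointers.
struct double_list_sketch {
  struct double_list_sketch *next, *prev;
  char *data;
};

// Hypothetical stand-in for the command-local struct: the same first two
// pointers, then an int, a compiled regex, and the match slot last.
struct reg_sketch {
  struct reg_sketch *next, *prev;
  int rc;
  regex_t r;
  regmatch_t m;
};

// Size of one match entry (it holds at least two regoff_t offsets).
size_t match_size(void)
{
  return sizeof(regmatch_t);
}

// Where the match slot starts inside the larger struct.
size_t match_offset(void)
{
  return offsetof(struct reg_sketch, m);
}

// Size of the generic node type the pointer was originally declared as.
size_t list_node_size(void)
{
  return sizeof(struct double_list_sketch);
}
```

Printing match_size(), match_offset(), and list_node_size() with %zu under each toolchain would show which of them, if any, comes out as 4 or 8.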

The only place there's a member of size 4 is "int rc", the third member
of struct reg. And struct double_list only HAS 3 members, and "m" is
the last member of struct reg, so maybe somehow the compiler is confusing
(struct reg 

Re: [Toybox] [PATCH] vi: rename `-s` flag to `-c`

2024-06-12 Thread Rob Landley
On 6/11/24 07:50, enh wrote:
>> Looking further at this, what is the behavioral difference between -c and -s?
>> The patch does nothing but change one into the other, with no other behavior
>> change I've spotted?
...
>> I thought for a moment that -c was jumping straight into esc-colon mode with the
>> command line at the bottom of the screen, but the -c example above does not
>> provide -c on the command line and I am just CONFUSED.
...
> i think the objection is quite simple, no? -c takes a _command_
> whereas -s takes a _filename_ (and that file is full of commands).
> what's currently implemented _is_ -s, and it would be wrong to rename
> it to -c.

Ok, so -c _does_ jump to esc-colon mode and -s doesn't, and the patch isn't
changing the behavior. Got it.

(Typed that, then didn't send it and opened vi.c instead, rewrote most
of the help text, gave up on making sense of the vim man page and instead
started reading the posix vi page, which delegates to the ex page...)

CAN OF WORMS.

No idea how to cleanly explain that "a is like i but moves the cursor one
character to the right first, for no obvious reason". I mean, does that come up
a lot? And "A is END then i". Design question of whether to try to explain to
someone how to use this (assuming they have cursor keys and thus don't need
weird combos like that), or is there a duty to list what we've implemented?
(Which in general has _not_ been toybox policy: the help text doesn't mention
things we included for compatibility with old scripts/habits that are never the
obvious answer to "how do I do X", there should be one obvious way to do it and
was before Guido retired... But that's a heck of an editorial judgement, isn't it?)

Sigh, over the years I've had various vi cheatsheets (and once taught an intro
to unix course at the local community college which had a week on vi), but if I
still have any they're packed in a box in storage after the move. This command
is the poster child for "each user uses a subset of the options, but no two
quite agree on WHICH subset..."

Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] vi: rename `-s` flag to `-c`

2024-06-11 Thread Rob Landley
On 6/7/24 03:41, Rob Landley wrote:
> On 6/5/24 00:46, Jarno Mäkipää wrote:
>> You cannot test against other vi clones with -c after this patch. But
>> you could have test against vim with -s {script} implementation. I
>> used vim as reference for testing with original test case files.
> 
> If I'm cloning bash specifically for toysh, I don't have an objection to
> targeting vim specifically in a toybox vi implementation.
> 
>> Ex command only switch -c could be added as  addition to -s if you
>> wanna achieve something with ex commands, but maybe dont delete -s
>> implementation, unless you have better way to test vi mode motions.
> 
> Right now the regression test contexts I'm paying attention to are busybox
> defconfig and debian's default install. How does this impact testing against those?

Looking further at this, what is the behavioral difference between -c and -s?
The patch does nothing but change one into the other, with no other behavior
change I've spotted?

The vim man page says:

   -c {command}
   {command} will be executed after the first  file  has  been
   read.   {command}  is interpreted as an Ex command.  If the
   {command} contains spaces it must  be  enclosed  in  double
   quotes  (this depends on the shell that is used).  Example:
   Vim "+set si" main.c
   Note: You can use up to 10 "+" or "-c" commands.

   -s {scriptin}
   The script file {scriptin} is read.  The characters in  the
   file  are  interpreted  as if you had typed them.  The same
   can be done with the command ":source! {scriptin}".  If the
   end of the file is reached before the editor exits, further
   characters are read from the keyboard.

I thought for a moment that -c was jumping straight into esc-colon mode with the
command line at the bottom of the screen, but the -c example above does not
provide -c on the command line and I am just CONFUSED.

Jarno: what "other clones" were you referring to? Posix has -c, and does not
have -s. Are we supporting non-posix clones other than vim? (Is there a default
freebsd or macos version that ignores posix, maybe?)

  https://pubs.opengroup.org/onlinepubs/9699919799/utilities/vi.html

I do not have the domain expertise to understand the objection here.

> Rob

Still Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] vi: rename `-s` flag to `-c`

2024-06-07 Thread Rob Landley
Sorry I'm a bit slow, I'm setting up a new laptop and haven't got email moved
over yet. (Cutting the gordian knot of closing all the windows on the old laptop
by just not waiting for that, since I _do_ have a spare laptop and bought a new
hard drive for the new install anyway...)

On 6/5/24 00:46, Jarno Mäkipää wrote:
> You cannot test against other vi clones with -c after this patch. But
> you could have test against vim with -s {script} implementation. I
> used vim as reference for testing with original test case files.

If I'm cloning bash specifically for toysh, I don't have an objection to
targeting vim specifically in a toybox vi implementation.

> Ex command only switch -c could be added as  addition to -s if you
> wanna achieve something with ex commands, but maybe dont delete -s
> implementation, unless you have better way to test vi mode motions.

Right now the regression test contexts I'm paying attention to are busybox
defconfig and debian's default install. How does this impact testing against those?

Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] Fix ionice's return value for getting process IO priority

2024-06-01 Thread Rob Landley



On 5/30/24 07:23, enh via Toybox wrote:
> [note: this isn't my patch; it was
> https://android-review.googlesource.com/c/platform/external/toybox/+/3106282,
> and i'm just forwarding it. the attachment is the original patch with
> the original author's details, and i've cc:ed them. lgtm to me,
> though, and matches the next function in the same file.]
> 
> In the user version, if you use ionice to get the process IO
> priority without permission, -1 will be returned, but Idle: prio 7
> will be printed at this time. This is an incorrect priority and
> should return permission denied.

Applied, and then I did 70a5259261ea cleanup on top of it (move the error
handling into the syscall wrappers, renamed with x prefix, and clean up a couple
printfs that didn't need to assign back to the global variables while I was
there), which I don't THINK broke anything but I don't have a particularly
thorough way to test it here?

I set the ioprio of a backgrounded "sleep" command, and it read back what I'd
sent, but debian's and toybox's commands varied a bit:

  $ sleep 1000 &
  [1] 18361
  $ ./ionice -p 18361
  unknown: prio 0
  $ ionice -p 18361
  none: prio 0
  $ ./ionice -p 18361 -c 3 -n 2
  $ ./ionice -p 18361
  Idle: prio 2
  $ ionice -p 18361
  idle

Looks ok to me?
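
For reference, the value that the ioprio_get()/ioprio_set() syscalls exchange packs the scheduling class and the level into one integer. A minimal sketch, assuming the Linux encoding (class above a 13 bit shift, level in the low bits). The helper names are made up for illustration and are not toybox's:

```c
// Linux packs the I/O priority as (class << 13) | level.
#define IOPRIO_SHIFT_SKETCH 13

// Scheduling class: 0 none, 1 realtime, 2 best-effort, 3 idle.
int ioprio_class(int ioprio)
{
  return ioprio >> IOPRIO_SHIFT_SKETCH;
}

// Priority level within the class.
int ioprio_level(int ioprio)
{
  return ioprio & ((1 << IOPRIO_SHIFT_SKETCH) - 1);
}

// Build the value handed to ioprio_set().
int ioprio_pack(int cls, int level)
{
  return (cls << IOPRIO_SHIFT_SKETCH) | level;
}
```

With this encoding, "-c 3 -n 2" becomes ioprio_pack(3, 2), which reads back as class 3 (idle) level 2, matching the "Idle: prio 2" line in the transcript above.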

Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] strlower() bug

2024-06-01 Thread Rob Landley
On 5/31/24 12:53, enh wrote:
>> Let's see... Ah:
>>
>> https://www.unicode.org/L2/L1999/UnicodeData.html
>>
>> That's a bit long. My suggestion had 9 decimal numbers, this has "IDEOGRAPHIC
>> TELEGRAPH SYMBOL FOR JANUARY" as one of fifteen fields, with "<compat> 0031
>> 6708" being another single field. How nice. (And still extensive warnings that
>> this doesn't cover everything. I think "too much is never enough" was an MTV
>> slogan back in the 1980s? Ah, it's from "The Marriage of Figaro" in 1784.)
> 
> citation needed? (or if you want me to keep trying to think of where
> that or something similar occurs in the libretto, at least tell me
> whether it's an aria or recitative :-) )

Sorry, not the Mozart one. And not the Italian one Mozart based his version on,
but the original french version the Italian one was based on:

https://en.wikipedia.org/wiki/The_Marriage_of_Figaro_(play)

The quote gets translated a few ways out of the 300 year old french:

https://www.oxfordreference.com/display/10.1093/acref/9780191826719.001.0001/q-oro-ed4-0807

And to clarify again, I mean Wolfgang, not his equally (if not more) talented
sister Maria who toured together with her sibling as child prodigies but was
sidelined as soon as she reached "marriageable age" and had to teach piano for a
living:

https://en.wikipedia.org/wiki/Maria_Anna_Mozart

Some letters from Wolfgang praising her compositions have survived, but her
parents destroyed all her actual sheet music because it had cooties. Next time
people talk about the "great men of history"... Don't get me started about
Einstein's first wife.

>> In ascii, wcwidth() is basically isprint() plus "tab is weird".
>>
>> For unicode, wcwidth() comes into play. The unicode bureaucracy committee being
>> too microsofted to competently provide one is irrelevant to wcwidth() not being
>> needed for ascii.
>>
>> (I also note the assumption of monospaced fonts in all this. Java's
>> fontmetrics() was about measuring pixel counts in non-monospaced fonts, which
>> this doesn't even contemplate.)
> 
> this is why i keep telling you that wcwidth() only really makes sense
> for tty-based stuff. and even there ...

I need to figure out where to wrap lines in command line editing and text
editors and so on. (I have been relieved of duty on vi, but I still need to make
shell command line editing work. Plus fold and so on. And screen, and watch.
Might do a nano-alike at some point. This is already sort of in top...)

> i'm curious whether the
> different terminal emulators actually behave the same in any of the
> interesting cases. (_especially_ when you get to the "that can't
> happen in well-formed text in the language that uses that script"
> cases.)

I have an ANSI probe sequence to ask where the cursor is, but even if I wanted
to be that chatty (and didn't mind that the amount of time it takes to get a
response is arbitrary and variable, with no response actually guaranteed to come
anyway, and other input surrounding the response), if the output's already
wrapped and scrolled the screen since the last time I asked it's bad. And if I
_disable_ screen wrap then A) I dunno if it's truncated the output, B) lots of
other stuff breaks (it's like leaving the screen in raw mode, only SUBTLY wrong,
and yes QEMU does this from time to time and drives bash line editing NUTS,
that's why run-qemu.sh echoes the relevant stop doing that sequence AND mkroot's
init also outputs it)...
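
For reference, the probe in question is the "ESC [ 6 n" request, and a terminal that answers sends back "ESC [ row ; col R". A minimal sketch of parsing that reply, assuming it has already been separated from any surrounding input (which, as noted above, is the hard part). The function name is made up for illustration:

```c
#include <stdio.h>

// Parse a cursor position report of the form ESC [ row ; col R.
// Returns 0 and fills in *row and *col on success, -1 otherwise.
int parse_cursor_report(const char *s, int *row, int *col)
{
  int used = 0;

  // %n records how many bytes were consumed so the trailing R can be checked.
  if (sscanf(s, "\033[%d;%d%n", row, col, &used) != 2) return -1;

  return s[used] == 'R' ? 0 : -1;
}
```

Both values are 1-based, so a reply of row 1 column 1 means the top left corner.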

Which means I need a wcwidth() to know how many columns the next character will
advance the cursor in the terminal before outputting it.
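
A minimal sketch of that measurement, assuming the libc's mbrtowc() and wcwidth() are trusted for the current locale (which is exactly the assumption this thread keeps questioning). The function name is made up for illustration:

```c
#define _XOPEN_SOURCE 700
#include <string.h>
#include <wchar.h>

// Count terminal columns for a multibyte string in the current locale.
// Returns -1 on an invalid sequence or a character wcwidth() can't size.
int str_columns(const char *s)
{
  size_t len = strlen(s), n;
  mbstate_t st;
  wchar_t wc;
  int cols = 0, w;

  memset(&st, 0, sizeof(st));
  while (len) {
    n = mbrtowc(&wc, s, len, &st);
    if (n == (size_t)-1 || n == (size_t)-2) return -1;
    if (!n) break;
    if ((w = wcwidth(wc)) < 0) return -1;
    cols += w;
    s += n;
    len -= n;
  }

  return cols;
}
```

A caller wanting UTF-8 input handled would need setlocale(LC_ALL, "") first. Control characters such as tab have no width as far as wcwidth() is concerned, so they have to be special cased before calling this.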

>> Not that I particularly want to ship a large ascii table either. When I dug into
>> musl's take on this, I was mostly reverse engineering their compression format
>> and then going "huh, yeah you probably do want to compress this".
>>
>> I could generate the table I listed with a C program that runs ispunct() and
>> similar on every unicode code point and outputs the result. I could then compare
>> what musl, glibc, and bionic produce for their output. The problem is it's not
>> authoritative, it's downwind of the "macos is still using 2002 data" issue that
>> keeps provoking this. :(
> 
> i'm really confused that you keep mentioning ascii. if you really mean
> ispunct() here, say, and not iswpunct(),

The difference between them is that ispunct() has always taken an int but the C
committee was cowardly and refused to make it actually respond to the whole
range, so they created a new function to do the same thing.
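
The narrow range is also why the classic functions need care with plain char input. A minimal sketch (the helper name is made up for illustration):

```c
#include <ctype.h>

// Count ASCII punctuation in a byte string. The cast matters: ispunct()
// is only defined for EOF and values representable as unsigned char, so a
// negative char (any byte >= 0x80 where char is signed) must not be passed
// through unconverted.
int count_punct(const char *s)
{
  int n = 0;

  for (; *s; s++) if (ispunct((unsigned char)*s)) n++;

  return n;
}
```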

At least fseeko() can blame LP64 for long and pointer being the same size having
splash damage. (Moore's Law didn't advance the components in a coordinated
manner, we hit the need for >2 gig files ten years before we hit the need
for >4 gig system RAM and thus 64 bit registers...)

(I suppose the C committee was fighting IBM and Microsoft for 10 years before
utf8 happened, and then the unicode committee had Microsoft on it and thus
combining 

Re: [Toybox] strlower() bug

2024-05-31 Thread Rob Landley
On 5/30/24 16:12, enh wrote:
>> > hmm... looking at Apple's online FreeBSD code, it looks like they have
>> > very different (presumably older) FreeBSD code
>> > [https://opensource.apple.com/source/Libc/Libc-320.1.3/locale/FreeBSD/tolower.c.auto.html],
>> > and the footer of the file that reads implies that they're using data
>> > from Unicode 3.2 (released in 2002, which would make sense given the
>> > 2002 BSD copyright date in the tolower.c source):
>>
>> Sigh, can't they just ship machine consumable bitmaps or something?
> 
> because everyone wants different formats. even the same library has
> changed over time. (and not just because characters went from 16 bits
> to 21 bits!)

Conversion from a simple format seems straightforward to me.

Part of my frame of reference here is Tim Berners-Lee inventing the 404 error.
That was Tim's big advance that made HTML work where Ted Nelson's overdesigned
hyper-cyber-iText didn't. Tim 80/20'd the problem by just handling the easy
cases (we have the data) and punting the hard cases (updating links when they
moved) to humans.

Ted published his hyper-hype paper in 1965 and then failed to interest anyone in
it for a quarter century before Tim made something actually useful (beating
Gopher by about 6 months). Crediting Ted as the inventor of html is like
crediting Jules Verne as the inventor of the submarine, or H.G. Wells as the
(eventual) inventor of the time machine. (Lazerpig had a rant about this in his
video on stealth planes: the inventor is the person who made it WORK, not who
came up with the idea of humans flying or a knob on the wall that controls the
air temperature.)

So to me, the question is "how much can we put in a simple format", and then
have a list of broken characters you need an exception handler function for. How
do we 80/20 this?

>> I can have
>> my test plumbing pull "standards" files, ala:
>>
>> https://github.com/landley/toybox/blob/master/mkroot/packages/tests
>>
>> But an organization shipping a PDF or 9 interlocking JSON files with a turing
>> complete stylesheet doesn't help much.
> 
> (not really the point, but the one you want for the stuff you're
> talking about here is actually just a text file.

Let's see... Ah:

https://www.unicode.org/L2/L1999/UnicodeData.html

That's a bit long. My suggestion had 9 decimal numbers, this has "IDEOGRAPHIC
TELEGRAPH SYMBOL FOR JANUARY" as one of fifteen fields, with "<compat> 0031
6708" being another single field. How nice. (And still extensive warnings that
this doesn't cover everything. I think "too much is never enough" was an MTV
slogan back in the 1980s? Ah, it's from "The Marriage of Figaro" in 1784.)

aosp/external/icu/icu4j/main/tests/core/src/com/ibm/icu/dev/data/unicode/UnicodeData.txt
aosp/external/icu/android_icu4j/src/main/tests/android/icu/dev/data/unicode/UnicodeData.txt
aosp/external/icu/icu4c/source/data/unidata/UnicodeData.txt
aosp/external/pcre/maint/Unicode.tables/UnicodeData.txt
aosp/external/cronet/third_party/icu/source/data/unidata/UnicodeData.txt
aosp/out/soong/workspace/external/cronet/third_party/icu/source/data/unidata/UnicodeData.txt

Android seems to have checked in multiple copies of this file.

$ for i in $THAT; do [ -n "$OLD" ] && diff -u $OLD $i; OLD=$i; done | grep +++
+++ aosp/external/pcre/maint/Unicode.tables/UnicodeData.txt 2023-08-18
15:16:31.239657629 -0500
+++ aosp/external/cronet/third_party/icu/source/data/unidata/UnicodeData.txt
2023-08-18 15:14:44.351661450 -0500

And I need to re-pull my tree for them to match.

> i've repeatedly been
> tempted to teach unicode(1) to read it, since it's always installed on
> macOS and debian anyway [for values of "always" that include "all my
> machines, anyway"], to be able to show far more information about any
> given character.)

I've thrown a note on the todo heap...

>> Which is _sad_ because there's only a dozen ispunct() variants that read a bit
>> out of a bitmap (and haven't significantly changed since K&R: neither isblank()
>> nor isascii() is worth the wrapper), plus a toupper/tolower pair that map
>> integers with "no change" being the common case.
> 
> (one of the things you'll learn from parsing the file is that that's
> not how toupper()/tolower() works for all characters. plus there's
> titlecase. plus case folding.)

"For all characters". I'm just looking for low hanging fruit and a list of
exceptions to punt to a function.

>> Plus unicode has wcwidth().
> 
> no, it doesn't. (i wouldn't be maintaining my own if it did!)

In ascii, wcwidth() is basically isprint() plus "tab is weird".

For unicode, wcwidth() comes into play. The unicode bureaucracy committee being
too microsofted to competently provide one is irrelevant to wcwidth() not being
needed for ascii.

(I also note the assumption of monospaced fonts in all this. Java's
fontmetrics() was about measuring pixel counts in non-monospaced fonts, which
this doesn't even contemplate.)

>> So code, alpha, cntrl, digit, punct, space, width, upper, 

Re: [Toybox] this week in weird coreutils stuff: chmod

2024-05-31 Thread Rob Landley
On 5/30/24 14:59, enh wrote:
>> *shrug* Removing all uses of mode_t and using "unsigned" instead consistently
>> should work fine. Only "struct stat" should really care, and even then they
>> could just use the actual primitive type in the struct definition...
>>
>> (I'm not a fan of data hiding without some _reason_ for it. I used to humor
>> it a lot more, but now I want to know what/why it's doing.)
>
> funnily enough, i'm having exactly this argument with the person who
> asked for this chmod functionality, since they own the ABI checker,
> and i'm claiming that telling me that i've "changed" u_int32_t to
> uint32_t is not helpful, and that when talking about ABI i always want
> the underlying type :-)

LP64 remains a good idea. Pity the ANSI C committee for C89 didn't have a spine.

(Yeah they had to navigate the 16->32 bit transition, but LP32 for 32-bit ANSI C
systems would have made OBVIOUS SENSE. "There are older systems that aren't
LP32" was a given. The 68k came out in 1980 and the 386 in 1985, they weren't
exactly taken by surprise by 32 bit registers and a flat memory model...)

Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] microcom, stty: Use TCSADRAIN to set tty device attribute

2024-05-31 Thread Rob Landley
On 5/20/24 23:43, Yi-Yo Chiang via Toybox wrote:
> Don't flush the tty device input queue when setting device attribute.
> 
> Rationale:
>   * microcom: The tty device might already have a _good enough_ termios
>     to read data from. Flushing the input queue just to set the terminal
>     attribute would result in data loss in this case.
>   * stty: This command doesn't read nor write data to the tty device, so
>     flushing the input queue doesn't make sense here. The program
>     actually talking to the tty should decide if it wants the tty
>     flushed or not.
>     Note: other implementations of stty also uses TCSANOW (bsd) or
>     TCSADRAIN (coreutils), so I think this assumption is fine.

Was commit 2043855a4bd5 sufficient or do you still have a use case that needs
this? (Email dated the 20th containing a patch dated the 21st vs a git commit
applied the 23rd...)

Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] this week in weird coreutils stuff: chmod

2024-05-30 Thread Rob Landley



On 5/29/24 14:20, enh wrote:
> seems to have broken the macOS build?
> ```
> lib/lib.c:953:10: error: conflicting types for 'string_to_mode'
> unsigned string_to_mode(char *modestr, unsigned mode)
>  ^
> ./lib/lib.h:413:10: note: previous declaration is here
> unsigned string_to_mode(char *mode_str, mode_t base);
>  ^
> ```

Oops, missed one. Try commit 3c276ac106a4.

So what _is_ mac using... Sigh:

/Library/Developer/CommandLineTools/SDKs/MacOSX13.1.sdk/usr/include/sys/_types/_mode_t.h:typedef
__darwin_mode_t mode_t;
/Library/Developer/CommandLineTools/SDKs/MacOSX13.1.sdk/usr/include/sys/_types.h:typedef
__uint16_t  __darwin_mode_t;/* [???] Some file attributes */

They typedef it to unsigned short instead of unsigned int. Even though type
promotion will pass an int on the stack for anything smaller than an int, and
use an int register to do the math...
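
The promotion part is easy to confirm. A minimal sketch (function names made up for illustration):

```c
#include <stddef.h>

// An unsigned short operand is promoted before arithmetic, so the result
// of the expression is int-sized whatever the variable's declared type is.
size_t promoted_size(void)
{
  unsigned short mode = 0755;

  return sizeof(mode + 0);
}

// The variable itself still only occupies a short's worth of storage.
size_t stored_size(void)
{
  unsigned short mode = 0755;

  return sizeof(mode);
}
```

So the narrower typedef only saves space inside structs. It still counts as a different type for prototype matching, which is what the "conflicting types" error above is complaining about.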

I guess back in 1974 "int" was a 16 bit type, and they stuck with that in the
move to 32 and then 64 bit processors because SUGO times 3 bits each is only
using 12 of those 16 bits, leaving 4 for file types and we've only used 7 of
those 16 combinations for IFDIR and IFBLK and so on (well, 8 on mac but the
header says IFWHT is obsolete), clearly that will never run out...

*shrug* Removing all uses of mode_t and using "unsigned" instead consistently
should work fine. Only "struct stat" should really care, and even then they
could just use the actual primitive type in the struct definition...

(I'm not a fan of data hiding without some _reason_ for it. I used to humor it a
lot more, but now I want to know what/why it's doing.)

Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] this week in weird coreutils stuff: chmod

2024-05-29 Thread Rob Landley
On 5/28/24 08:00, enh via Toybox wrote:
> apparently chmod allows something like
> 
>   chmod u+rwX-s,g+rX-ws,o+rX-wt
> 
> as a (far less readable!) synonym for
> 
>   chmod u+rwX,u-s,g+rX,g-ws,o+rX,o-wt
> 
> i'm told that toybox silently accepts the former too, but does not
> interpret it as if it means the latter?

Try commit a2c4a53e155c.

(Needed to zero a variable inside the loop rather than just once at the
beginning. Random cleanups while I was there, plus tests.)
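
To illustrate that class of bug, here is a minimal sketch of a symbolic mode parser that handles chained operators. This is not toybox's string_to_mode(): it is a cut-down illustration that assumes only who=[ugoa], op=[+-=], and perm=[rwxst] (no X, no umask, no copying from u/g/o), and the function name is made up. The point is that the accumulated permission bits start from zero for each operator, not once per clause:

```c
#include <string.h>

// Apply a symbolic mode string such as "u+rwx-s,g-w" to *mode.
// Returns 0 on success, -1 on a parse error (leaving *mode untouched).
int apply_mode_sketch(const char *s, unsigned *mode)
{
  unsigned m = *mode;

  while (*s) {
    unsigned who = 0;

    // "who": which bits this clause is allowed to touch.
    for (;; s++) {
      if (*s == 'u') who |= 04700;
      else if (*s == 'g') who |= 02070;
      else if (*s == 'o') who |= 01007;
      else if (*s == 'a') who |= 07777;
      else break;
    }
    if (!who) who = 07777;
    if (!*s || !strchr("+-=", *s)) return -1;

    // One clause may chain several operators, as in u+rwx-s.
    while (*s && strchr("+-=", *s)) {
      char op = *s++;
      unsigned bits = 0;   // reset per operator, or "-s" also removes rwx

      for (;; s++) {
        if (*s == 'r') bits |= 0444;
        else if (*s == 'w') bits |= 0222;
        else if (*s == 'x') bits |= 0111;
        else if (*s == 's') bits |= 06000;
        else if (*s == 't') bits |= 01000;
        else break;
      }
      bits &= who;
      if (op == '+') m |= bits;
      else if (op == '-') m &= ~bits;
      else m = (m & ~who) | bits;
    }
    if (*s == ',') s++;
    else if (*s) return -1;
  }
  *mode = m;

  return 0;
}
```

If "bits" were declared outside the inner loop and never cleared, the second operator in "u+rwx-s" would subtract rwx along with s, which is the kind of silent misinterpretation described above.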

Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] strlower() bug

2024-05-29 Thread Rob Landley
On 5/22/24 09:30, enh wrote:
> On Tue, May 14, 2024 at 2:58 PM Rob Landley wrote:
>> It looks like macos towlower() refuses to return expanding unicode characters.
>> Possibly to avoid exactly the kind of bug this fixed, in exchange for corrupting
>> the data.
> 
> yeah, i don't know whether it's on purpose or a bug, but that does
> seem to be the case... i tested with another Latin Extended-B
> character whose uppercase and lowercase forms are both in the same
> block (and thus have the same utf8 encoding length), and macOS
> towlower() does work for that.
> 
> hmm, actually maybe it's just that their Unicode data is out of date?
> it looks like they don't know about Latin Extended-C at all? a code
> point like U+2c62 that gets _smaller_ (because it's in the IPA
> Extensions block) doesn't work either.
> 
> i did try looking in FreeBSD, but i've never understood how this stuff
> works there.

FreeBSD questions go to Ed Maste, who is theoretically
subscribed here but keeps getting unsubscribed by gmail bounces.

> i'm guessing from the fact i've never found them that the
> implementations are all generated at build time, subtly enough that my
> attempts to grep for the generators fail.
> 
> hmm... looking at Apple's online FreeBSD code, it looks like they have
> very different (presumably older) FreeBSD code
> [https://opensource.apple.com/source/Libc/Libc-320.1.3/locale/FreeBSD/tolower.c.auto.html],
> and the footer of the file that reads implies that they're using data
> from Unicode 3.2 (released in 2002, which would make sense given the
> 2002 BSD copyright date in the tolower.c source):

Sigh, can't they just ship machine consumable bitmaps or something? I can have
my test plumbing pull "standards" files, ala:

https://github.com/landley/toybox/blob/master/mkroot/packages/tests

But an organization shipping a PDF or 9 interlocking JSON files with a turing
complete stylesheet doesn't help much.

> so, yeah, i don't think there was anything clever or mysterious going
> on here --- macOS is just using Unicode data from 22 years ago. (which
> is an amusing real-world example of why i keep saying "you probably
> don't want to get into the business of redistributing Unicode data; it
> changes every year" :-) )

A youtuber named Ryan McBeth is fond of explaining the difference between a
"problem" and a "dilemma". A problem has an obvious solution, which may be
painful or expensive but there's not a lot of disagreement on what success looks
like. A dilemma has multiple ways to address it, each of which has something
uniquely wrong with it. Problems don't lead to indecision, dilemmas do (and thus
accumulate).

In this case, the dilemma is "trusting libc to get it wrong differently in each
new environment" vs "taking a large expense onboard with borderline xkcd
violation". (If there is an xkcd strip explaining why not to do something, you
probably shouldn't do it. In this case https://xkcd.com/927/ )

Which is _sad_ because there's only a dozen ispunct() variants that read a bit
out of a bitmap (and haven't significantly changed since K&R: neither isblank()
nor isascii() is worth the wrapper), plus a toupper/tolower pair that map
integers with "no change" being the common case. Plus unicode has wcwidth().
Yes, it's over a (sparse!) table with space for a million entries, but CSV
encoding all that data in human+machine readable ASCII should gzip down to what,
500k?

Let's see, the bits seem to be alpha, cntrl, digit, punct, and space, and then
width (mostly 0, 1, or 2 but we've talked about exceptions), and two translation
codepoints for toupper and tolower.

You can easily derive isalnum() and isxdigit(), and isascii() and isblank() are
trivial according to the man page. If the table has upper and lower mappings
(I.E. what character this turns into, zero if it doesn't) then you don't need
isupper() or islower() bits unless there's cases where "this isn't upper case
but can be converted to lower case" (which aren't covered by having BOTH
toupper() and tolower() mappings for the same character).

I'm honestly unclear on what "isgraph" does, "any printable character except
space"... if isprint() means "not width 0" then that's just adding && !isspace()
so doesn't need to be in the table.
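
The derivations in the last few paragraphs can be sketched as follows, assuming one record per code point holding just the stored properties. The struct and function names are made up for illustration, and "printable means width is not 0" is the guess from the text, not a checked definition:

```c
// One table entry's stored properties, following the list above.
struct cp_row {
  unsigned code;
  unsigned alpha:1, cntrl:1, digit:1, punct:1, space:1;
  int width;
  unsigned upper, lower;   // mapped code point, 0 for "no change"
};

int row_isalnum(const struct cp_row *r)
{
  return r->alpha || r->digit;
}

// Something with a lower case mapping is taken to be upper case itself.
int row_isupper(const struct cp_row *r)
{
  return r->lower != 0;
}

// And the other way around.
int row_islower(const struct cp_row *r)
{
  return r->upper != 0;
}

int row_isprint(const struct cp_row *r)
{
  return r->width != 0;
}

int row_isgraph(const struct cp_row *r)
{
  return row_isprint(r) && !r->space;
}
```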

So code, alpha, cntrl, digit, punct, space, width, upper, lower. Something like:

0,0,0,0,0,0,0,0,0
13,0,1,0,0,1,0,0,0
32,0,0,0,0,1,1,0,0
57,0,0,1,0,0,1,0,0
58,0,0,0,1,0,1,0,0
65,1,0,0,0,0,1,0,97
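
Rows in that shape could be produced by probing the local libc, one code point at a time. A minimal sketch, assuming the wide character classification functions and wcwidth() are the data source, with negative widths folded to 0. The struct and function names are made up for illustration, and the output is only as good as the libc it runs against:

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

// Columns: code,alpha,cntrl,digit,punct,space,width,upper,lower
struct cp_props {
  unsigned code, alpha, cntrl, digit, punct, space, width, upper, lower;
};

// Ask the libc about one code point in the current locale.
struct cp_props probe_codepoint(wint_t c)
{
  struct cp_props p;
  int width = wcwidth(c);
  wint_t up = towupper(c), low = towlower(c);

  p.code = c;
  p.alpha = !!iswalpha(c);
  p.cntrl = !!iswcntrl(c);
  p.digit = !!iswdigit(c);
  p.punct = !!iswpunct(c);
  p.space = !!iswspace(c);
  p.width = width < 0 ? 0 : width;   // sketch: treat "no width" as 0
  p.upper = up == c ? 0 : up;        // 0 means "no change"
  p.lower = low == c ? 0 : low;

  return p;
}

// Format one row as comma separated decimal, returning snprintf()'s result.
int format_row(char *buf, size_t len, const struct cp_props *p)
{
  return snprintf(buf, len, "%u,%u,%u,%u,%u,%u,%u,%u,%u", p->code, p->alpha,
    p->cntrl, p->digit, p->punct, p->space, p->width, p->upper, p->lower);
}
```

Looping probe_codepoint() over every code point and diffing the output between musl, glibc, and bionic would show where the libcs disagree.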

No, that doesn't cover weird stuff like the right-to-left gearshift or the
excluded mapping ranges or even the low ascii characters having special effects
like newline and tab, but those aren't really "characters" are they? Special
case the special cases, don't try to represent them 

Re: [Toybox] microcom.c discarding data due to TCSAFLUSH

2024-05-23 Thread Rob Landley
On 5/20/24 09:42, Yi-Yo Chiang via Toybox wrote:
> Is there any particular reason to use TCSAFLUSH here?

Partly because it's what strace said busybox and minicom were doing, and partly
because I've had serial hardware that produced initial static on more than one
occasion.

In this case, it looks like Elliott also put it in his initial contribution
(commit 12fcf08b5c96).

> If not, can we change to TCSADRAIN or TCSANOW. I don't think there is good
> reason to _discard received data_ just to set the terminal mode...? Is there
> really a real world case that the device termios is so dirty that all data, from
> before setting raw mode, must be discarded?

I've seen multiple instances where there was initial noise from the port going
live before the speed stabilized, or static from a physical connection plugging
in or powering up, or truncated bootloader messages that filled up the input
buffer then abruptly cut off.

> I also tried to modify the microcom code to skip tcsetattr() if the device
> termios is already equal to the mode we are setting it.
> `if (old_termios != new_termios) tcsetattr(new_termios, TCSAFLUSH)`
> However this doesn't work because microcom always tries to set the device baud.

Hmmm, you're right, it shouldn't mess with that unless we specify -s. I could
also make TCSAFLUSH only happen when we do -s (because otherwise it's an
existing connection and we're not messing with it, but I still need to make sure
it's in raw mode)...

Note: FLAG(s)*TCSAFLUSH becomes 0 (TCSANOW) in the absence of -s.
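
A minimal sketch of the idea, not the actual commit: flush pending input only when the speed is being changed, and otherwise just make sure the port is in raw mode. The function names are made up for illustration, the raw mode flag list is a plausible minimum and not microcom's exact set, and using 0 to mean "leave the speed alone" is this sketch's own convention:

```c
#include <termios.h>

// Pick the tcsetattr() action. Written as a conditional instead of a
// multiply so it does not depend on TCSANOW being 0.
int pick_action(int changing_speed)
{
  return changing_speed ? TCSAFLUSH : TCSANOW;
}

// Put fd into a raw-ish mode, optionally setting the speed first.
// speed is a Bnnn constant, or 0 to leave the existing speed alone.
// Returns 0 on success, -1 on error.
int set_raw(int fd, speed_t speed)
{
  struct termios t;

  if (tcgetattr(fd, &t)) return -1;
  t.c_iflag &= ~(ICRNL|INLCR|IXON);
  t.c_oflag &= ~OPOST;
  t.c_lflag &= ~(ICANON|ECHO|ISIG);
  t.c_cc[VMIN] = 1;
  t.c_cc[VTIME] = 0;
  if (speed && (cfsetispeed(&t, speed) || cfsetospeed(&t, speed))) return -1;

  return tcsetattr(fd, pick_action(speed != 0), &t);
}
```

On a pty or a port the bootloader already configured, calling set_raw(fd, 0) leaves the baud rate untouched and discards nothing.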

> For example a pty device might be configured to use baud 38400,

Why set the baud at all on a pty? A pseudo-terminal doesn't have a baud rate,
leave it alone. (You can also inherit a serial port that was set up by the
bootloader and should again just be left alone...)

> but microcom
> would want it to be 115200, thus flushing its data. but pty doesn't really care
> about the baud most of the time AFAIK, so flushing data in this case just seems
> disruptive to the user experience.

Setting baud rate and flushing are two different switches in the interface, but
in this case flushing only when setting the baud rate seems a good use of the
existing controls.

Try commit 2043855a4bd5

Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] netcat: clarify documentation.

2024-05-22 Thread Rob Landley
On 5/20/24 07:06, enh via Toybox wrote:
> "Collate" means "sort", but -O is like -o other than buffering.

It means "group". (The dictionary says "gather or arrange in the proper
sequence".) I can see the confusion, but the collate button on the copier in
high school stapled pages together (the pages came out the same order either
way, the question was should the groups be attached). I also had a data entry
work-study job in college where I had to "collate" reports (basically doing the
same thing by hand, except using transparent file folders and this little
plastic strip that slid along the edge to hold the pages in).

We just had a thread about "buffering", and I find that _less_ illuminating in
context.

Sigh, "grouped", "streamed", "together", "terse", "what I thought it was doing
until I actually compared the output side by side", "showing the actual data
instead of the transaction boundaries that survived the nagle algorithm",
"assembled", "congregated", "collected", "packaged", "decluttered", "thesaurus"...

How does "packed" sound?

Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] xputs: Do flush

2024-05-22 Thread Rob Landley
On 5/20/24 08:32, Yi-Yo Chiang wrote:
> Thanks! Adding TOYFLAG_NOBUF worked.
> 
> I feel the same way about "manual flushing of the output buffer is a terrible
> interface". I asked myself the question "Why am I manually flushing so much?
> There must be a better way..." multiple times when I wrote the other patch that
> does s/xprintf/dprintf/, s/xputs/xputsn/

It's an annoying design quandary.

> > Your other patch changes a bunch of xprintf() to dprintf() which is even _more_
> > fun because dprintf() writes directly to the file descriptor (bypassing the
> > buffer in the libc global FILE * instance "stdout"), which means in the absence
> > of manual flushing the dprintf() data will go out first and then the stdio data
> > will go out sometime later, in the wrong order. Mixing the two output formats
> > tends to invert the order that way, unless you disable output buffering.
> 
> Which is the reason I replaced those all with the "flush" functions (xputsn)
> or direct fd-write functions (dprintf), so that their order won't shuffle.
> Anyway the problem is moot now that we have TOYFLAG_NOBUF.

Eh, not moot. Shifted. Currently there's one command using TOYFLAG_NOBUF, and a
lot of recent buffering fixes:

ea119151ccc5
59b041d14aec
afeed2d46a9a
a57e42a386b0
ca6bde9e1c43

I should probably audit all the commands and figure out which buffering type to
use for each. (grep currently finds manual fflush() in hexedit, login, watch,
and ps).

But not today...

> > But that hasn't been popular, and it's a pain to implement in userspace
> > (because we don't have access to multiple cheap timers like the kernel
> > does, we need to take a signal and there's a limited number of signals).
> 
> do you run on anything that doesn't have real-time signals? i was
> going to say that -- since toybox is a closed world -- you could just
> use SIGUSR2, but i see that dhcp is already using that! but if you can
> assume real-time signals, there are plenty of them...

Within toybox I could probably come up with something, true. Although fflush()
locking is still a bit problematic if I'm not depending on thread
infrastructure. (Either I don't use FILE * and do it myself, or I require libc
to be thread aware.)

Rob


Re: [Toybox] [PATCH] xputs: Do flush

2024-05-22 Thread Rob Landley
On 5/20/24 07:36, enh wrote:
>> Adding flushing to xputs() is an Elliott question because he's the one who
>> can presumably keep stdio buffer semantics in his head. They make zero sense
>> to me. I added a flush to xputsn() because in line buffering mode it was the
>> "dangerous" one that had no newline thus no flush, but then when we go to
>> block buffering mode xputs() needs a flush just like xputsn() would, and
>> MAYBE it's good to have the flush because in line buffer mode it would be a
>> NOP? Except the command selected block buffering mode precisely BECAUSE it
>> didn't want to flush each line, so why should xputs() do it when the command
>> asked not to? And if xputs() is doing it, why is xprintf() _not_ doing it?
>> And if xprintf() _is_ doing it, then we're back to basically not buffering
>> the output...
> 
> this to me was exactly why it should be "everything flushes" or
> "nothing flushes". not "some subset of calls for some subset of
> inputs", because no-one can keep all the special cases in their heads.
> and "everything flushes" is problematic from a performance
> perspective, so "nothing flushes" wins by default. (but, yes, when we
> have our own kernel, have a time-based component to buffering layer's
> flushing is an interesting idea :-) )

Eh, now Yi-Yo's pointed me back at timer_create() and reminded me of realtime
signals, it seems like the plumbing is there to make FILE * output use nagle.

The problem is a userspace wrapper trying to fflush() from signal context
assumes everything's written in an async-safe way (and of course that everyone
else will SA_RESTART when interrupted), which either means spray it down with
thread locking or use cmpxchg() within the flush() implementation, and either
way involves me trusting libc in a way I currently don't. (And I can't do it
"right" myself due to FILE * internals being opaque, I have to wrap an unknown
implementation...)

But it sounds quite feasible for _libc_ to do setvbuf(_IONAG) these days. :)

(Sigh, or threads with signalfd(). Grrr.)

>> I like systematic solutions that mean the problem never comes up again.
>> Elliott has been advocating the whack-a-mole "fix them as we find them"
>> approach here which I am not comfortable with. I've been leaning towards
>> adding a TOYFLAG_NOBUF so we can select a third buffering type, and go back
>> to "do not buffer output at all, flush after every ANSI write function" for
>> at least some commands I'd like to be able to reason about. Especially for
>> commands like this where output performance really doesn't seem like it
>> should be an issue.
> 
> +1 --- an inherently interactive program like this seems reasonable to
> be unbuffered. (except for file transfer?)

Isn't file transfer sending 4k blocks?

Buffering gets really weird with this kind of program anyway: when you're
sending data across a serial port that's breaking it up into individually
transmitted bytes, and depending on what your 16550a-or-similar threshold is set
to the recipient's probably getting notified of each group of 8 bytes. (And yes,
the hardware uses SOMETHING LIKE NAGLE internally to enforce the input and
output notification thresholds.)

Then you layer ppp over it, which breaks your 4k into something like 1.5k
chunks, then does nagle on that trying to fill out the last packet, and that's
assuming there were no pipes in there which do their own re-collating on the
data. And then of course files, once there's filesystem involvement... but of
course the VFS is in there before that marshalling data into and out of page
cache...

The point of the output buffer is to deal with chunks of data "big enough" to
amortize the transaction overhead. Zerocopy of the data has always been somewhat
aspirational, and handing off buffers between page table contexts is often more
expensive than copying it. (Or not! It changes between hardware generations and
I don't even PRETEND to be current on how the "mitigations for cache
speculation side channel attacks" differ between different kernel versions
running on different arm processors...)

There comes a point where "locality within process good, launch largeish buffer
out into the operating system, wave bye-bye as it goes off into the machinery"
is the best I can do. Although the definition of largeish still has the dregs of
moore's law clinging to it with the recent 4k->256k push. (xmodem had 128 byte
packets, I suppose it's roughly the same jump...)

>> https://lists.gnu.org/archive/html/coreutils/2024-03/msg00016.html
> 
> (fwiw, i think that was just some internet rando asking for it, no?
> and they didn't actually implement it?)

Padraig's reply was "this does seem like useful functionality" and a pointer to
the libc people, and then there were over a dozen additional replies in the
thread, so I wouldn't call it a clear no...

> do you run on anything that doesn't have real-time signals? i was
> going to say that -- since toybox is a closed world -- you 

Re: [Toybox] [PATCH] xputs: Do flush

2024-05-19 Thread Rob Landley
On 5/18/24 21:53, Yi-Yo Chiang wrote:
> What I wanted to address with this patch are:
> 
> 1. Fix this line of
> xputs() https://github.com/landley/toybox/blob/master/toys/net/microcom.c#L113
> The prompt text is not flushed immediately, so it is not shown to the user
> until the escape char is entered (which defeats the purpose of the prompt,
> that is to

I agree you've identified two problems (unflushed prompt, comment not matching
code) that both need to be fixed. I'm just unhappy with the solutions, and am
concerned about a larger design problem.

I implemented TOYFLAG_NOBUF and applied it to this command. The result compiles
but I'm not near serial hardware at the moment, does it fix the problem for you?

Trying to fix it via micromanagement (adding more flushing and switching some
but not all output contexts in the same command between FILE * and file
descriptor) makes my head hurt...

Adding flushing to xputs() is an Elliott question because he's the one who
can presumably keep stdio buffer semantics in his head. They make zero sense to
me. I added a flush to xputsn() because in line buffering mode it was the
"dangerous" one that had no newline thus no flush, but then when we go to block
buffering mode xputs() needs a flush just like xputsn() would, and MAYBE it's
good to have the flush because in line buffer mode it would be a NOP? Except the
command selected block buffering mode precisely BECAUSE it didn't want to flush
each line, so why should xputs() do it when the command asked not to? And if
xputs() is doing it, why is xprintf() _not_ doing it? And if xprintf() _is_
doing it, then we're back to basically not buffering the output...

> tell the user what the escape char is) and stdout is flushed by handle_esc.
> To fix this we either make xputs() flush automatically, or we just add a
> single line of manual flush() after xputs() in microcom.c.
> Either is fine with me.

When I searched for the first xputs in microcom I got:

  xputsn("\r\n[b]reak, [p]aste file, [q]uit: ");
  if (read(0, &input, 1)<1 || input == CTRL('D') || input == 'q') {

Which is a separate function (the n version is no newline, it does not add the
newline the way libc puts() traditionally does), with its own flushing
semantics: xputsn() doesn't call xputs(), and neither calls or is called by
xprintf(). "Some functions flush, some functions don't" is a bit of a design
sharp edge.

The larger problem is manual flushing of the output buffer is a terrible
interface, and leads to missing error checking without which a command won't
reliably exit when its output terminal closes because the whole SIGPIPE thing
was its own can of worms that even bionic used to manually mess with. Which is
why I originally made toybox not ever do that (systemic fix) but I got
complaints about performance.

Your other patch changes a bunch of xprintf() to dprintf() which is even _more_
fun because dprintf() writes directly to the file descriptor (bypassing the
buffer in the libc global FILE * instance "stdout"), which means in the absence
of manual flushing the dprintf() data will go out first and then the stdio data
will go out sometime later, in the wrong order. Mixing the two output formats
tends to invert the order that way, unless you disable output buffering.

I like systematic solutions that mean the problem never comes up again. Elliott
has been advocating the whack-a-mole "fix them as we find them" approach here
which I am not comfortable with. I've been leaning towards adding a
TOYFLAG_NOBUF so we can select a third buffering type, and go back to "do not
buffer output at all, flush after every ANSI write function" for at least some
commands I'd like to be able to reason about. Especially for commands like this
where output performance really doesn't seem like it should be an issue.

And OTHER implementations can't consistently get this right, which is why 
whether:

  for i in {1..100}; do echo -n .; sleep .1; done | less

Produces any output before 10 seconds have elapsed is potluck, and varies from
release to release of the same distro.

Oh, and the gnu/crazies just came up with a fourth category of write as a
gnu/extension: flush after NUL byte.

https://lists.gnu.org/archive/html/coreutils/2024-03/msg00016.html

It's very gnu to fix "this is too complicated to be reliable" by adding MORE
complication. Note how the problem WE hit here was 1) we didn't ask for LINEBUF
mode, 2) \r doesn't count as a line for buffer flushing purposes anyway, 3) the
new feature making it trigger on NUL instead _still_ wouldn't make \r count as a
line for buffer flushing purposes.

My suggestion for a "proper fix" to the problem _category_ of small writes being
too expensive was to have either libc or the kernel use nagle's algorithm for
writes from userspace, like it does for network connections. (There was a fix to
this category of issue decades ago, it just never got applied HERE.)

But that hasn't been popular, and it's a pain to implement in 

Re: [Toybox] [PATCH] xputs: Do flush

2024-05-18 Thread Rob Landley
On 5/16/24 06:46, Yi-Yo Chiang via Toybox wrote:
> The comment string claims xputs() to write, flush and check error.
> However the 'flush' operation is actually missing due to 3e0e8c6
> changing the default buffering mode from 'line' to 'block'.

That's sort of an Elliott question?

Originally, xprintf() and friends all flushed (which is necessary to detect
output errors and xexit() if so), but Elliott complained that was too slow, so
the flushes got removed, and then we changed the default stdout buffering type,
and...
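
To illustrate why the flush matters for error detection, here is a minimal
sketch. checked_puts() is a hypothetical stand-in, not the real xputs(): with a
buffered FILE * the fputs() call usually just copies into the buffer and
succeeds, and a failed write only shows up once the buffer is flushed.

```c
#include <stdio.h>

// Hypothetical helper (not toybox's xputs()): write a line, then flush so a
// failed write is noticed here rather than at exit.
// Returns 0 on success, -1 if the data could not be delivered.
int checked_puts(FILE *fp, const char *s)
{
  if (fputs(s, fp) == EOF || fputc('\n', fp) == EOF) return -1;
  if (fflush(fp) == EOF || ferror(fp)) return -1;

  return 0;
}
```

On Linux, pointing this at /dev/full is an easy way to see the difference: a
bare fputs() of a short string reports success, while checked_puts() reports
the error.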

Alas, it was a whole multi-year thing. Elliott has volunteered to put manual
flushes everywhere it's a problem. I've seriously thought about going
exclusively to file descriptor output (dprintf() is in posix now) and leaving
FILE * for input only.

Personally, I honestly believe the _proper_ fix is to upgrade the kernel to use
vdso to implement nagle's algorithm on file descriptor 1:

https://landley.net/notes-2024.html#28-04-2024

But I'm not holding my breath.

Rob

P.S. I should post some subset of
https://landley.net/bin/mkroot/latest/linux-patches/ to linux-kernel again. So
they can be ignored again.


Re: [Toybox] netcat -f bug

2024-05-15 Thread Rob Landley
On 5/11/24 02:11, Yi-Yo Chiang wrote:
> On Sat, May 11, 2024 at 1:30 AM Rob Landley wrote:
> 
> What's your use case triggering this patch? Because without that, I go off
> on various design tangents, as seen below:
> 
> I just wanted some tool to communicate with a pty or socket node on android.
> Wanted a program to be able to send/recv towards a duplex data stream. (more
> precisely I want a command that does exactly what pollinate() does)
> Since neither socat nor minicom is available on Android, I'm just using
> `stty raw -echo && nc -f` to "talk" to my pty.
> 
> Why didn't I use <> redirector? Because I wasn't aware of that feature before
> reading this mail...
> Let me fiddle with it a bit:
> 
> cat <>/dev/pts/0
>> Shows the pts output, but my input doesn't get passed back

Sorry for sitting on this, my confusion here is I don't know what /dev/pts/0
means in your test, and the pts man page isn't illuminating. It doesn't seem to
be special, it just seems to be the first one allocated? (So who allocated it on
android?)

According to "tty" in a random command line tab that one's using /dev/pts/17,
and ps ax | grep pts/0 says it's PID 14597 a random bash instance, so I don't
think the test lines up on a debian+xfce laptop.

What is your test trying to _do_? (What process are you talking to?)

> yeah like you said it should have fallen through and been like -l.
> However digging through the git history, the fall through line got removed
> here
> https://github.com/landley/toybox/commit/67bf48c1cb3ed55249c27e6f02f5c938b20e027d
> which was unintentional I think?

Yeah, lack of automated regression testing for this, which is why I want to
understand and fix the test...

Rob


Re: [Toybox] strlower() bug

2024-05-14 Thread Rob Landley


On 5/14/24 12:12, enh wrote:
> On Tue, May 14, 2024 at 1:04 PM Rob Landley  wrote:
>>
>> On 5/14/24 07:10, enh wrote:
>> > macOS tests seem to be broken since this commit?
>> >
>> > FAIL: find strlower edge case
>> > echo -ne '' | touch aⱥ; find . -iname aȺ
>> > --- expected 2024-05-10 17:32:56.0 +
>> > +++ actual 2024-05-10 17:32:56.0 +
>> > @@ -1 +0,0 @@
>> > -./aⱥ
>>
>> Sigh. Apple's handling of utf8/unicode continues to be... "a challenge".
>>
>> When I run "make test_find" standalone, it gives me:
>>
>> scripts/runtest.sh: line 219: syntax error near unexpected token `;'
>> scripts/runtest.sh: line 219: `  R) LEN=0; B=1; ;&'
>>
>> Because bash 3.2 from 2007 doesn't understand ;&
> 
> yeah, nor does mksh. it hasn't caused me any problems though; i've
> been ignoring it for years now.
> 
>> And THEN it goes:
>>
>> touch: out of range or illegal time specification: YYYY-MM-DDThh:mm:SS[.frac][tz]
>> touch: out of range or illegal time specification: YYYY-MM-DDThh:mm:SS[.frac][tz]
>> FAIL: find newerat
>> echo -ne '' | find dir -type f -newerat @12345
>> --- expected2024-05-14 11:16:40.0 -0500
>> +++ actual  2024-05-14 11:16:40.0 -0500
>> @@ -1 +0,0 @@
>> -dir/two
>>
>> Which is a different error that DOESN'T happen with the global tests, because
>> those are using toybox touch rather than homebrew's $TOUCH. But it works on
>> debian. Let's see:
>>
>> $ touch --version
>> touch: illegal option -- -
>> usage: touch [-A [-][[hh]mm]SS] [-achm] [-r file] [-t [[CC]YY]MMDDhhmm[.SS]]
>>        [-d YYYY-MM-DDThh:mm:SS[.frac][tz]] file ...
>>
>> Thank you, gnu project. I'm gonna assume this is _also_ from 2007. (I made
>> scripts/prereq/build.sh for a REASON...)
> 
> no, i think this is a BSD touch.
> 
> yeah, that looks very like the FreeBSD touch's usage:
> 
> static void
> usage(const char *myname)
> {
> fprintf(stderr, "usage: %s [-A [-][[hh]mm]SS] [-achm] [-r file] "
> "[-t [[CC]YY]MMDDhhmm[.SS]]\n"
> "   [-d YYYY-MM-DDThh:mm:SS[.frac][tz]] "
> "file ...\n", myname);
> exit(1);
> }
> 
> 
>> Then when I run "make clean macos_defconfig tests" I get:
>>
>> Undefined symbols for architecture arm64:
>>   "_iconv", referenced from:
>>   _do_iconv in iconv.o
>>  (maybe you meant: _iconv_main)
>>   "_iconv_open", referenced from:
>>   _iconv_main in iconv.o
>> ld: symbol(s) not found for architecture arm64
>>
>> Because the Makefile has:
>>
>> tests: ASAN=1
>> tests: toybox
>> scripts/test.sh
>>
>> And ASAN apparently breaks on homebrew's toolchain but not debian's toolchain.
>> Why does it break there but not on Linux...
>>
>> probe cc -Wall -Wundef -Werror=implicit-function-declaration
>> -Wno-char-subscripts -Wno-pointer-sign -funsigned-char
>> -Wno-deprecated-declarations -Wno-string-plus-int -Wno-invalid-source-encoding
>> -fsanitize=address -O1 -g -fno-omit-frame-pointer -fno-optimize-sibling-calls
>> -xc -o /dev/null -
>> error: cannot parse the debug map for '/dev/null': The file was not recognized
>> as a valid object file
>> clang: error: dsymutil command failed with exit code 1 (use -v to see invocation)
>>
>> Because it tries to read back the -o output we discarded, and fails when it
>> can't do so, thus all library probes fail and it tries to build with no
>> libraries. But only when ASAN is enabled, because ASAN uses -o as INPUT. Bravo.
>>
>> None of this is the actual unicode failure, this is just ambient macos...

FAIL: find strlower edge case
echo -ne '' | touch aⱥ; find . -iname aȺ
--- expected2024-05-14 13:32:19.0 -0500
+++ actual  2024-05-14 13:32:19.0 -0500
@@ -1 +0,0 @@
-./aⱥ
make: *** [tests] Error 1
cfarm104 (homebrew):toybox landley$ ls generated/testdir/testdir/
a?
$ LC_ALL=en_US.UTF-8 ls generated/testdir/testdir
a?
$ generated/testdir/ls generated/testdir/testdir
a\342\261\245\342\261\245\342\261\245\342\261\245\342\261\245\342\261\245\342\261\245\342\261\245\342\261\245
$ echo -./aⱥ
-./aⱥ
$ generated/testdir/ls -N generated/testdir/testdir
aⱥ
cfarm104 (homebrew):toybox landley$ generated/testdir/ls -N
g

Re: [Toybox] strlower() bug

2024-05-14 Thread Rob Landley
On 5/14/24 07:10, enh wrote:
> macOS tests seem to be broken since this commit?
> 
> FAIL: find strlower edge case
> echo -ne '' | touch aⱥ; find . -iname aȺ
> --- expected 2024-05-10 17:32:56.0 +
> +++ actual 2024-05-10 17:32:56.0 +
> @@ -1 +0,0 @@
> -./aⱥ

Sigh. Apple's handling of utf8/unicode continues to be... "a challenge".

When I run "make test_find" standalone, it gives me:

scripts/runtest.sh: line 219: syntax error near unexpected token `;'
scripts/runtest.sh: line 219: `  R) LEN=0; B=1; ;&'

Because bash 3.2 from 2007 doesn't understand ;&

And THEN it goes:

touch: out of range or illegal time specification: YYYY-MM-DDThh:mm:SS[.frac][tz]
touch: out of range or illegal time specification: YYYY-MM-DDThh:mm:SS[.frac][tz]
FAIL: find newerat
echo -ne '' | find dir -type f -newerat @12345
--- expected2024-05-14 11:16:40.0 -0500
+++ actual  2024-05-14 11:16:40.0 -0500
@@ -1 +0,0 @@
-dir/two

Which is a different error that DOESN'T happen with the global tests, because
those are using toybox touch rather than homebrew's $TOUCH. But it works on
debian. Let's see:

$ touch --version
touch: illegal option -- -
usage: touch [-A [-][[hh]mm]SS] [-achm] [-r file] [-t [[CC]YY]MMDDhhmm[.SS]]
       [-d YYYY-MM-DDThh:mm:SS[.frac][tz]] file ...

Thank you, gnu project. I'm gonna assume this is _also_ from 2007. (I made
scripts/prereq/build.sh for a REASON...)

Then when I run "make clean macos_defconfig tests" I get:

Undefined symbols for architecture arm64:
  "_iconv", referenced from:
  _do_iconv in iconv.o
 (maybe you meant: _iconv_main)
  "_iconv_open", referenced from:
  _iconv_main in iconv.o
ld: symbol(s) not found for architecture arm64

Because the Makefile has:

tests: ASAN=1
tests: toybox
scripts/test.sh

And ASAN apparently breaks on homebrew's toolchain but not debian's toolchain.
Why does it break there but not on Linux...

probe cc -Wall -Wundef -Werror=implicit-function-declaration
-Wno-char-subscripts -Wno-pointer-sign -funsigned-char
-Wno-deprecated-declarations -Wno-string-plus-int -Wno-invalid-source-encoding
-fsanitize=address -O1 -g -fno-omit-frame-pointer -fno-optimize-sibling-calls
-xc -o /dev/null -
error: cannot parse the debug map for '/dev/null': The file was not recognized
as a valid object file
clang: error: dsymutil command failed with exit code 1 (use -v to see invocation)

Because it tries to read back the -o output we discarded, and fails when it
can't do so, thus all library probes fail and it tries to build with no
libraries. But only when ASAN is enabled, because ASAN uses -o as INPUT. Bravo.

None of this is the actual unicode failure, this is just ambient macos...

Rob


[Toybox] I'm aware landley.net is saying "site.not found".

2024-05-13 Thread Rob Landley
That dreamhost server migration they did? (The recent "only 2 years old" version
thread?) Does not seem to have correctly updated the DNS record. Of the domain
they manage for me.

Dreamhost continues to provide nine fives of uptime. I've pinged support
already, they'll probably get back to me in the morning.

Rob


Re: [Toybox] today in "shut up, gnu!"

2024-05-12 Thread Rob Landley
On 4/12/24 13:24, enh via Toybox wrote:
> ~/aosp-main-with-phones$ find external/ -name NOTICE -type l -maxdepth 2
> find: warning: you have specified the global option -maxdepth after
> the argument -name, but global options are not positional, i.e.,
> -maxdepth affects tests specified before it as well as those specified
> after it.  Please specify global options before other arguments.

Looking back at this (ok, closing tabs), I think I implemented this the same as
any other option, so you can "-type l -o -maxdepth 2" and friends. The thing is,
when maxdepth triggers it returns without recursing on the path being evaluated,
so you'd have to "-type d -o -maxdepth 2" for the difference to matter.
(Recursing into lower entries but not triggering on them.)

But it's not "global" in any magic way. It's just... another option? Which
doesn't seem WRONG... And of course
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/find.html hasn't got
maxdepth.

Meanwhile, busybox has:

//config:config FEATURE_FIND_MAXDEPTH
//config:   bool "Enable -mindepth N and -maxdepth N"
//config:   default y
//config:   depends on FIND

And:

#define INIT_G() do { \
setup_common_bufsiz(); \
BUILD_BUG_ON(sizeof(G) > COMMON_BUFSIZE); \
/* we have to zero it out because of NOEXEC */ \
memset(&G, 0, sizeof(G)); \
IF_FEATURE_FIND_MAXDEPTH(G.minmaxdepth[1] = INT_MAX;) \
IF_FEATURE_FIND_EXEC_PLUS(G.max_argv_len = bb_arg_max() - 2048;) \
G.need_print = 1; \
G.recurse_flags = ACTION_RECURSE; \
} while (0)

And I miss the days when I worked on that project and it was SIMPLE. I liked
simple. That's what attracted me to it in the first place...

https://git.busybox.net/busybox/commit/?id=053c12e0de30

Yeah, I'm not even trying to understand that right now. I'll take my 730 lines
over their 1750 lines any day, I don't CARE who has the smaller binary size
after stripping specific ELF table entries...

Anyway, I should come up with a test for maxdepth acting as a normal option vs
acting as a global option...

Rob


Re: [Toybox] nproc(1)

2024-05-11 Thread Rob Landley
Relevant blog entry is https://landley.net/notes-2022.html#26-07-2022

> Meanwhile, I found out that musl has a bug! The nproc command has two
> modes, the default shows available processors (as modified by taskset),
> and nproc --all shows installed processors (whether or not the current process
> can schedule on them). One codepath is _SC_NPROCESSORS_CONF and the other
> is _SC_NPROCESSORS_ONLN. Except musl does ONLN for both, it hasn't got the
> second codepath, which according to strace is checking /sys/devices/system/cpu
> in glibc, and the bionic source has a comment saying that /proc/cpuinfo
> works fine on x86 but arm is broken because arm filters out the
> taskset-unavailable processors from that, so you have to look at the sysfs
> one to work around the arm bug.

And then me ruminating that mkroot is all single processor emulations so testing
this is once again a design issue...

Pretty sure I poked Rich about it at the time, but I just confirmed that musl
still has the bug in today's git. And the above bionic note is apparently why my
code is looking at sysfs to get the data, and "strace nproc --all" on debian
says that's what they're doing too, and ltrace says it's doing the getconf() so
yes glibc is also doing it.

Musl will apparently allow itself to read data out of /proc, or at least there's
13 hits in the current codebase, but has zero instances of reading out of /sys.
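
For reference, the sysconf() version of the two modes is tiny. This is a
minimal sketch with a hypothetical cpu_count() helper, not the toybox nproc
code. _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF are common extensions
(glibc, bionic, musl, the BSDs) rather than POSIX, and per the above, on musl
both calls currently give the same answer.

```c
#include <unistd.h>

// Hypothetical helper (not the toybox implementation): ask libc how many
// processors there are. all=0 asks for online processors, all=1 asks for
// configured (installed) ones, as nproc --all would.
long cpu_count(int all)
{
  return sysconf(all ? _SC_NPROCESSORS_CONF : _SC_NPROCESSORS_ONLN);
}
```

Whether the two numbers actually differ depends on the libc and the kernel, so
a test for this needs a system with an offline or unavailable processor to show
anything interesting.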

Rob

On 5/2/24 11:20, enh wrote:
> (to be fair, i was shocked the first time i had to deal with an
> Android device where these weren't both the same...)
> 
> On Thu, May 2, 2024 at 9:18 AM enh  wrote:
>>
>> /facepalm
>>
>> maybe move your hand-written version into portability just for musl,
>> and everyone with a working libc just uses sysconf()?
>>
>> On Tue, Apr 30, 2024 at 8:26 PM Rob Landley  wrote:
>> >
>> > On 4/29/24 16:56, enh via Toybox wrote:
>> > > isn't nproc(1) just a call to sysconf(3) with either
>> > > _SC_NPROCESSORS_ONLN for regular behavior, or _SC_NPROCESSORS_CONF for
>> > > --all?
>> >
>> > From musl src/conf/sysconf.c:
>> >
>> > case JT_NPROCESSORS_CONF & 255:
>> > case JT_NPROCESSORS_ONLN & 255: ;
>> > unsigned char set[128] = {1};
>> > int i, cnt;
>> > __syscall(SYS_sched_getaffinity, 0, sizeof set, set);
>> > for (i=cnt=0; i<sizeof set; i++)
>> > for (; set[i]; set[i]&=set[i]-1, cnt++);
>> > return cnt;
>> >
>> > Musl returns the same thing for "conf" and "online".
>> >
>> > Rob


Re: [Toybox] stty bug

2024-05-11 Thread Rob Landley
On 5/10/24 06:15, Yi-Yo Chiang via Toybox wrote:
> The _negate & combination_ type of settings are bugged.
> 
> `stty cooked` and `stty raw` works fine, but the negated options:
> 
> $ stty -raw
> stty: unknown option: cooked
> $ stty -cooked
> stty: unknown option: raw

Ack, added to the notes for that command. (It's in pending for a reason...)

Thanks,

Rob


Re: [Toybox] unshare/nsenter and flags

2024-05-11 Thread Rob Landley
On 5/10/24 18:46, Yifan Hong wrote:
> I am running all commands as a non-root user. Here are the two commands I run:
> 
> strace ./toybox unshare --mount --map-root-user --user /bin/bash -c 'echo' 2>&1
> | tee /tmp/user.txt
> strace ./toybox unshare --mount --map-root-user /bin/bash -c 'echo' 2>&1 | tee
> /tmp/no_user.txt
> strace unshare --mount --map-root-user /bin/bash -c 'echo' 2>&1 | tee
> /tmp/no_user_linux.txt

$ unshare --mount --map-root-user --user /bin/bash -c echo
unshare: unshare failed: Operation not permitted

That's on my host devuan. Let's see about newer...

Ah, booting a daedalus ISO under KVM, the command works. Looks like they added
(enabled?) new kernel plumbing between 3.0 and 5.0.

> Got about half my laptop tabs closed so far! Working towards a reboot...

Ok, time to bite the bullet and finish that, if I need the upgrade to test a 
fix...

Rob


Re: [Toybox] strlower() bug

2024-05-10 Thread Rob Landley
On 5/8/24 16:27, Ray Gardner wrote:
> BTW I was a bit surprised that mentioning my awk for toybox got no reaction.

Oh I'm interested, but somebody (probably you) mentioned they were looking into
it before, and I'll wait to see some code first. :)

(The problem with asking to see code early is pending/git.c isn't useful and
that's as far as the original developer had time/energy for, and I haven't
personally opened that can of worms yet. The problem with waiting until it's
done is pending/bc.c was several times larger than I expected and I'm uncertain
if I even want to personally open that can of worms.)

That said, if you're actively working on it and wanted to do a brief design
infodump here, consider it solicited. :)

Rob

P.S. Also, my old Austin house finally went on the market last weekend and we
got a lowball bid two days later that the realtor was doing the GO GO GO DON'T
STOP TO THINK ACT NOW SUPPLIES RUNNING OUT pressure thing you see in most scams,
because apparently houses and bananas have a similar lifespan and if it's on the
market for longer than it takes anyone outside the realtor's immediate friend
circle to find out about it the world will end. So now there's all the paperwork
in the world. Fade and I had to get a printout notarized on wednesday. They're
asking when we last had all the plumbing and wiring in the walls replaced. (Is
that a thing people regularly do after houses are built?) I had to docusign a
leaded paint affidavit addendum. The old bank that holds the mortgage we're
paying off called me to try to upsell me on a NEW mortgage (we're renting for
now), and the person wanting to "transfer me to an agent" wouldn't get off the
phone for 20 minutes. And Wells Fargo got our addresses updated so it says our
checking account type is changing to one that charges us a $10 fee/month for
existing. Anyway, if I seem a bit distracted right now it's because I am.


Re: [Toybox] netcat -f bug

2024-05-10 Thread Rob Landley
What's your use case triggering this patch? Because without that, I go off on
various design tangents, as seen below:

On 5/10/24 06:09, Yi-Yo Chiang via Toybox wrote:
> Hi,
> The -f option for netcat doesn't seem to be doing anything right now.

I should have a test for that, but to be honest I came up with netcat -f back in
busybox (commit 1cca9484db69 says 2006) before I knew about bash's <> redirector
to open a file for both reading _and_ writing (or had bash not added it yet?),
meaning the example in that commit probably _should_ have been stty 115200 -F
/dev/ttyS0 && stty raw -echo -ctlecho && cat <>/dev/ttyS0 >&0 2>&0

(I should NOT ask Chet for "{0-2}<>/dev/ttyS0" syntax operating on a filehandle
range. I should not do it. That would be... I dunno, rude? I mean in theory I'd
just want him to fix the existing {1..2} syntax to do one open() and then dup()
redirects instead of opening the device multiple times, which was the initial
problem because reopening the /dev node instead of dup() an existing filehandle
to it either gave -EBUSY or hardware reset the UART depending on the underlying
driver, and the reason Chet would give me a LOOK if I asked is {brace,expansion}
is resolved _before_ variable expansion and redirection, so it literally turns
INTO 3 arguments with different numbers and thus three separate open() calls to
the char device, and making it do something else is basically a layering
violation...)
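
A minimal Python sketch of the "one open(), then dup()" idea, using a scratch
file as a stand-in for /dev/ttyS0. The helper name open_once_and_dup is made up
for illustration; it only shows that dup()ed descriptors share a single open
file description while a second open() creates a new one:

```python
import os
import tempfile

def open_once_and_dup(path, count=3):
    """Open path read/write one time, then dup() the descriptor.

    Returns `count` descriptors that all refer to the same open file
    description, so the node itself is only opened once.
    """
    first = os.open(path, os.O_RDWR)
    return [first] + [os.dup(first) for _ in range(count - 1)]

# Demonstration on a scratch file (hypothetical stand-in for a tty device).
with tempfile.NamedTemporaryFile() as scratch:
    fds = open_once_and_dup(scratch.name)
    os.write(fds[0], b"abc")
    # dup()ed descriptors share one file offset...
    shared_offset = os.lseek(fds[1], 0, os.SEEK_CUR)
    # ...while a second open() gets an independent one.
    reopened = os.open(scratch.name, os.O_RDWR)
    separate_offset = os.lseek(reopened, 0, os.SEEK_CUR)
    for fd in fds + [reopened]:
        os.close(fd)
```

On a real UART the second open() is where the -EBUSY or hardware reset would
show up, which a regular file can't reproduce.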

Ahem. Sorry. Tangent.

It's possible netcat -ft makes it still useful, but A) that implies there should
be some sort of tty wrapper in the nice/taskset/time/chroot/nohup mold, B) I
think -t is currently broken because I needed to rewrite it to add nommu support
(decompose forkpty() into the underlying openpty() and login_tty() calls around
the vfork() instead of fork()) and just commented it out and put it on the todo
list...

The original theory was -f should fall through to the "else" case on line 191,
and thus naturally inherit any other applicable options. Which is hard to see in
my current tree because of a bunch of half-finished work in it:

$ git diff toys/*/netcat.c | diffstat
 netcat.c |   62 +-
 1 file changed, 49 insertions(+), 13 deletions(-)

Sorry for falling behind...

> It is
> missing a call to pollinate() after opening the specified device file.
> The patch adds back that line of pollinate().

Which makes it not work with running commands (ala -f should work like -l).

> Also make sure that the timeout handler is not armed for -f mode as -f
> shouldn't timeout. File open() should just succeed or fail immediately.

Why shouldn't -f timeout? Various /dev nodes take a while to open, automount
behind the scenes... Is there a downside to leaving that part as is? (Other than
the new case you added not alarm(0) disarming it?)

Rob


Re: [Toybox] unshare/nsenter and flags

2024-05-10 Thread Rob Landley
Ok, cycling back to this...

On 5/2/24 21:51, enh wrote:
>> > it seems like -r _doesn't_ actually imply -U in practice (and they
>> > seemed to have strace output to prove it).
>>
>> So... should it?
> 
> i think so? i have no idea about any of this, but
> https://man7.org/linux/man-pages/man1/unshare.1.html says
> 
>-r, --map-root-user
>Run the program only after the current effective user and
>group IDs have been mapped to the superuser UID and GID in
>the newly created user namespace. This makes it possible to
>conveniently gain capabilities needed to manage various
>aspects of the newly created namespaces (such as configuring
>interfaces in the network namespace or mounting filesystems
>in the mount namespace) even when run unprivileged. As a mere
>convenience feature, it does not support more sophisticated
>use cases, such as mapping multiple ranges of UIDs and GIDs.
>This option implies --setgroups=deny and --user. This option
>is equivalent to --map-user=0 --map-group=0.
> 
> which sounds like it supports the toybox documentation rather than the
> toybox source?
> 
>> What did they try to do, and what did they _want_ to happen?
> 
> unshare --mount --map-root-user /bin/sh -c "mount --bind $A $B"

Running that as my normal user gave EPERM on the unshare(CLONE_NEWNS) which is
the reason I haven't poked at this more. (To be useful, it seems like it
probably needs to be setuid and then drop permissions after unsharing stuff, and
I need to come up to speed on the security implications of that and possibly
write a "contain" command with as little novelty as possible. Which is not a can
of worms I want to open without a clear desk...)

Running it under sudo I got:

openat(AT_FDCWD, "/proc/self/setgroups", O_WRONLY) = 3
write(3, "deny", 4) = -1 EPERM (Operation not permitted)

> they looked at strace for toybox and saw
> 
> unshare(CLONE_NEWNS)= -1 EPERM (Operation not permitted)
> 
> but for the util-linux one they saw
> 
> unshare(CLONE_NEWNS|CLONE_NEWUSER)  = 0

Are they root or a normal user? Because adding -U to the above command line I 
got:

geteuid()   = 1000
getegid()   = 1000
unshare(CLONE_NEWNS|CLONE_NEWUSER)  = -1 EPERM (Operation not permitted)

But with sudo, that succeeded, and adding an ls -l to the bash command showed
that yes, it did the bind mount, which is gone again when it exits.

>> The "22.04" means it came out two years and one month ago, and that's what
>> they're migrating me TO. So, you know, I can presumably feel less bad about 
>> my
>> laptop...
> 
> (to be fair, until _last week_ that was the current LTS release :-)
> but, yeah, odd timing unless they deliberately like to be on the
> previous LTS release! i'll throw no stones as long as i'm living so
> close to the Android build server glass house though...)

Got about half my laptop tabs closed so far! Working towards a reboot...

Rob


Re: [Toybox] strlower() bug

2024-05-08 Thread Rob Landley
On 5/6/24 17:12, Ray Gardner wrote:
> While working on an awk implementation for toybox, I found a bug in
> strlower(), which is used only in find.c. I've attached some tests to
> put in find.test to reveal it. I can't put them here directly because
> I don't think the UTF-8 names will come through. (I modelled my awk
> tolower()/toupper() code on your strlower().)

Your test doesn't create the files you're finding, so find is supposed to fail?
Your first test doesn't barf under ASAN, and then the second one's going to fail
because echo -n | wc says it's 258 bytes and the VFS file length limit is 255
bytes, so there CAN'T be a file named that on Linux. (Path length != path
component length, there's no slashes in there.)

> The problem is in the test if the output string needs to be enlarged
> to take an expanded lowercase:
> // Case conversion can expand utf8 representation, but with extra mlen
> // space above we should basically never need to realloc
> if (mlen+4 > (len = new-try)) continue;
> 
> The mlen+4 needs to be mlen-4 to leave at least 4 bytes for the next 
> character.

Hmmm, possibly. I still don't understand what your test case is testing. (Just
trying to trigger an ASAN violation with an otherwise nonsense test?)

> As the comment indicates, it should "never" need to realloc;

No, the first comment is "never" because triggering probably indicates a libc
bug (we converted it from valid utf8 to a unicode code point, ran it through
libc's towlower(), and are now trying to convert the result _back_ to utf8, an
encoding hiccup at this point seems unlikely? But I don't trust locale plumbing
ever, so...)

The second is "basically never" because it requires an insane input string, but
that's user controlled and users do crazy things, sometimes even intentionally.

> it takes
> a very long name of uppercase characters that do expand when made
> lowercase. But the code is there to handle that very case.

The first malloc rounds the allocation up to next 8 byte boundary _after_ what
it's actually using, so 9-16 bytes of zeroes at the end, and assuming the
conversion only ever grows 1 byte (I don't remember the pathological expansion
case, it's in my blog somewhere, but your test is turning c8 ba into e2 b1 a5
which is 1 byte of expansion) then you need at least 8 expanding unicode code
points to burn through the padding, so your first test string is too short to
trigger a problem. And your second is too long to produce a valid filename, so
the test can't _succeed_...
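
For reference, a small Python sketch of the byte growth being discussed. It
relies on Python's own Unicode tables rather than libc's towlower(), so treat
it as an illustration of the encoding arithmetic only (utf8_growth is a made-up
helper name):

```python
def utf8_growth(text):
    """Return (bytes before, bytes after) lowercasing, measured in UTF-8."""
    return len(text.encode("utf-8")), len(text.lower().encode("utf-8"))

upper = "\u023a"          # U+023A, encoded as c8 ba
lower = upper.lower()     # U+2C65, encoded as e2 b1 a5
# Eight expanding code points grow the string by eight bytes in total.
before, after = utf8_growth(upper * 8)
```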

Sigh, lemme come up with a test that demonstrates the fix working... the minimal
one seems to be ./find . -iname aȺ

And then, of course, TEST_HOST fails because I need to enable a utf8 locale, but
I made plumbing for that recent-ish-ly...

commit 6800a95ef328

> BTW, when I run those tests, they "PASS", but show as aborted:
> corrupted size vs. prev_size
> scripts/runtest.sh: line 137: 265983 Aborted find .
> -iname 
> AC
> PASS: find utf8 uppercase long name

Odd.

> The test echos and checks the $? return code and the abort apparently
> leaves that as 0.

That could be anything from a bash issue to your distro's libc. The only trap in
tests.sh is for SIGINT, and that handler isn't inherited by child processes. The
return code of a process killed by a signal should be 128+signum, which the test
plumbing would notice if it was the actual exit code of your shell snippet.

I checked in a test that should actually succeed, but would fail with ASAN
enabled before the bug was fixed.

> Is there a way to fix the test system so it can
> force the exit code to be something else?

Not if the signal/exit isn't allowed to propagate back to it by the test. You
ran a child process and then unconditionally did an ;echo $? meaning test.sh
doesn't get notified of the child process getting killed by a signal, it
unconditionally (because ;) went on to run a second command, "echo" which is
returning whatever your bash recorded.
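
A rough Python sketch of both halves of that, assuming a POSIX system with an
sh on the $PATH. The helper shell_status is hypothetical; it converts Python's
"negative signal number" convention into the 128+signum a shell puts in $?:

```python
import signal
import subprocess
import sys

def shell_status(returncode):
    """Translate a subprocess returncode into what a shell's $? would show."""
    # subprocess reports death-by-signal as -signum; shells report 128+signum.
    return 128 - returncode if returncode < 0 else returncode

# Child that kills itself with SIGABRT, standing in for a crashing command.
crash = [sys.executable, "-c",
         "import os, signal; os.kill(os.getpid(), signal.SIGABRT)"]
direct = subprocess.run(crash).returncode

# Appending "; echo $?" means the snippet's own exit status is echo's,
# so the caller only ever sees the signal as text on stdout.
masked = subprocess.run(["sh", "-c", "sh -c 'kill -ABRT $$'; echo $?"],
                        capture_output=True, text=True)
```

Here direct carries the signal back to the caller, while masked exits 0 and
just prints the number.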

Some distros have horrible fault interceptors that log crap into syslog or dmesg
or some such, AND THEN RETURN SUCCESS. (Which is doubly insane: A) a program
faulting does not need to be globally logged on a development system, B)
returning success when that happens is very sad, but their "logic" was that some
scripts would otherwise misbehave.)

> When I run the test from a
> command line directly in bash, it gets a code of 134 (SIGABRT).

Without ASAN I'm getting 139 (128+11 = SIGSEGV). There would appear to be a
difference in our environments.

Rob


Re: [Toybox] [mkroot] Cannot Overwrite non-directory "$ROOT/bin/" with directory "[Path to overlay]"

2024-05-07 Thread Rob Landley
On 5/7/24 15:50, Rob Landley wrote:
> And THAT was based on the old environment setup I used to do in Firmware Linux
> to give User Mode Linux a mostly writeable chroot despite starting with
> https://user-mode-linux.sourceforge.net/hostfs.html but that was back before
> git was invented so I just have a bunch of tarball snapshots over the years
> (at https://landley.net/aboriginal/downloads/old/) rather than

A) Sorry, forgot to explain,

B) That's not even the old one I'm talking about,
https://landley.net/aboriginal/old/download/snapshots probably is.

User Mode Linux is a port of Linux to userspace, I.E. making the "vmlinux" ELF
file built at the top of the tree an actual runnable Linux program, which boots
its own little VM and runs processes inside it. This predated QEMU or KVM by a
decade, and was one of the first ways to run a virtual Linux system without
requiring root access on the host. Firmware Linux was built around it,
Aboriginal Linux was the relaunch targeting QEMU instead (and doing cross
compiling, because UML only ever properly supported x86 for some reason).

UML had the "hostfs" filesystem, which acted like a network filesystem making a
directory from the host appear in a directory of the virtual system. (Again,
decades before virtfs and friends, although NFS and Samba were around.)

The problem was, a hostfs file belonging to root (UID 0) wasn't writeable to
root within the VM. The mount point was SORT of writeable, but it was getting
translated on the host to reads/writes/renames/deletes as the host user running
UML, and then the translated syscall would fail and failures that shouldn't
happen were getting returned on the client. And this included fixups you needed
to do like replacing /etc/mtab with a symlink to /proc/mounts (because mount
points became a per-process attribute in Linux 2.5, so a single global mount
table as a file maintained by the userspace mount tool didn't cut it
anymore).

So I made a script that created a new directory in the host user's fully
writeable space and populated it with symlinks to host resources before
chrooting into it (all within UML), so I had access to the host stuff I needed
but could also replace it all as needed. And that's what I did my emulated Linux
From Scratch build in, back around 2004.
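
A minimal Python sketch of that symlink trick, not the original script (which
isn't shown here). The function name, the skip parameter, and the directory
names are all made up for illustration:

```python
import os
import tempfile

def build_symlink_farm(host_root, new_root, skip=()):
    """Populate new_root with symlinks to each top-level entry of host_root.

    Names listed in `skip` are left out so they can be replaced with
    writable local copies instead.
    """
    os.makedirs(new_root, exist_ok=True)
    for name in sorted(os.listdir(host_root)):
        if name in skip:
            continue
        os.symlink(os.path.join(host_root, name), os.path.join(new_root, name))
    return new_root

# Demonstration in a scratch directory with a fake "host" tree.
with tempfile.TemporaryDirectory() as tmp:
    host = os.path.join(tmp, "host")
    for sub in ("bin", "etc", "usr"):
        os.makedirs(os.path.join(host, sub))
    farm = build_symlink_farm(host, os.path.join(tmp, "root"), skip=("etc",))
    os.mkdir(os.path.join(farm, "etc"))   # private, writable replacement
    linked = sorted(n for n in os.listdir(farm)
                    if os.path.islink(os.path.join(farm, n)))
    etc_is_local = not os.path.islink(os.path.join(farm, "etc"))
```

The chroot step itself needs privileges inside the VM and is left out.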

Anyway, "here's a thing that needs to be spliced into the $PATH, you may want to
use symlinks" sometimes goes "whoosh" over my head as "hard for people who
haven't done it before" because to ME it's a 20 year old trick. Sorry 'bout 
that...

Rob


Re: [Toybox] [mkroot] Cannot Overwrite non-directory "$ROOT/bin/" with directory "[Path to overlay]"

2024-05-07 Thread Rob Landley
On 5/6/24 23:33, Oliver Webb wrote:
> On Sunday, May 5th, 2024 at 21:21, Rob Landley  wrote:
> 
>> Oh, the other todo item here is "multiple overlays". The current overlay
>> package was a quick hack, never did the design work to figure out what more
>> complication should look like. Partly waiting for people to complain to me
>> that they need more than it does...
> 
> Maybe making the OVERLAY variable a delimiter separated list, looping over
> it each time the overlay package is specified. Then indexing the OVERLAY
> variable like an array with that counter (I don't really know how bash arrays
> work, I think this is easy with them though from my vague knowledge, although
> I don't know)?

Various things are easy to implement, the question is what user interface works
best. The rest of mkroot uses CSV internally so having CSV in an option isn't
that heavy a lift. (Although it hasn't been presented as external UI before, and
relative vs absolute paths in a comma separated list is a bit tricksy, and we
never DID address "what happens if you define the same variable twice on the
command line?" Right now it overwrites...)

The other sharp edge is "when files conflict between overlays, do you overwrite
or leave the old one or what".

And of course there's the "following symlinks out of tree" problem, which I
added a tar option to address. What I've vaguely thought of doing here is having
toybox tar handle it, doing the tar c | tar xv trick with --restrict to just
leverage the existing stuff.

And I have a note about sparse file handling, which is at least 3 todo items
combined into one note:

1) cpio sparse handling is part of the periodic https://lwn.net/Articles/789228/
threads that never resolved last I checked, the last attempt wound up diverging
into https://lkml.iu.edu/hypermail/linux/kernel/2207.3/06939.html which
eventually went upstream (after it got completely rewritten to not smell like me
so Greg KH would tolerate it) but the cpio extension part didn't get brought
back up that I've been cc'd on...

2) tar sparse handling should have both modes (SEEK_HOLE and detect runs of
zeroes), and then the tar.test stuff updated to mostly use the runs of zeroes
because there are some TERRIBLE FILESYSTEM implementations out there and none of
them seem to agree on span granularity. (How big IS the run of zeroes? Where are
the edges? Just seek past 4k aligned blocks isn't good enough, and it doesn't
look like 64k is either. Don't get me started on "ecryptfs"...)

3) add sparse support to cp.c. (Grumble grumble --sparse longopt without short
opt, and should --sparse=auto be the default behavior? If the filesystem doesn't
support sparse files then presumably seek-and-write will zero fill anyway and we
don't have to do anything. Or seek would fail, which I guess we should
gracefully handle but sendfile_pad() already has plumbing for that?)
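
A rough Python sketch of the "detect runs of zeroes" mode from item 2, as
opposed to asking the filesystem via SEEK_HOLE. The 512 byte block size and the
name zero_runs are assumptions for illustration, not what toybox tar does:

```python
def zero_runs(data, block=512):
    """Yield (offset, length) for each run of all-zero blocks in data."""
    start = None
    for off in range(0, len(data), block):
        chunk = data[off:off + block]
        if chunk.count(0) == len(chunk):
            if start is None:
                start = off          # a new run of zero blocks begins here
        elif start is not None:
            yield start, off - start
            start = None
    if start is not None:
        yield start, len(data) - start

# 512 bytes of data, 1024 zero bytes, then a block that is only partly zero.
sample = b"x" * 512 + bytes(1024) + b"y" * 100 + bytes(412)
runs = list(zero_runs(sample))
```

The trailing 412 zero bytes share a block with data and so are not reported,
which is the same granularity question in miniature.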

>> It hasn't got "make". Kind of limiting factor not to have a make command on
>> the target.
> 
> gmake has a "./build.sh" that you can use to bootstrap it up on a system
> without make. My first step in this after I hacked together an overlay was
> to get a gmake tarball and try to build it with
> "./configure && ./build.sh && ./make install", which configure (on host bash,
> not toysh) goes into an infinite loop without expr, and putting that in will
> fail because "host compiler does not produce run-able executable" (Which might
> be true because I have to manually hack together an overlay each time and I
> throw out quick "hello world" tests mostly).

Good to know.

Way back when, I had a script that would splice the toolchain.sqf into the host
filesystem with a bunch of symlinks, ala
https://github.com/landley/aboriginal/blob/master/system-image.sh#L65 splicing
together
https://github.com/landley/aboriginal/blob/master/sources/toys/dev-environment.sh
and https://github.com/landley/aboriginal/blob/master/sources/toys/make-hdb.sh
although the interesting part was probably
https://github.com/landley/aboriginal/blob/master/sources/toys/dev-environment.sh#L72

And THAT was based on the old environment setup I used to do in Firmware Linux
to give User Mode Linux a mostly writeable chroot despite starting with
https://user-mode-linux.sourceforge.net/hostfs.html but that was back before git
was invented so I just have a bunch of tarball snapshots over the years (at
https://landley.net/aboriginal/downloads/old/) rather than

Pretty sure I have old blog entries at https://landley.livejournal.com
explaining what I was doing and why, but ever since the servers moved around I
haven't wanted to fish in them, archive.org is slow and has terrible UI, and my
backup disks from that period are... somewhere. Everything's still packed from
the move, I could 

Re: [Toybox] [mkroot] Cannot Overwrite non-directory "$ROOT/bin/" with directory "[Path to overlay]"

2024-05-05 Thread Rob Landley
On 4/27/24 20:44, Oliver Webb via Toybox wrote:
> Doing minimal linux system setup with mkroot and trying to create a minimal
> environment with a native toolchain to run autoconf in. This would mean
> getting the native static toolchain for my architecture from
> https://landley.net/toybox/downloads/binaries/toolchains/latest/.
> Mounting the image (Why are cross compilers tarballs while native compilers
> are fs images?

Copying the native compiler into the initramfs takes more space than initramfs
can comfortably hold. The run-qemu.sh in mkroot defaults to -m 256 (I.E. 256
megabytes system memory), and some board emulations (like mips) _can't_ map more
than that. (Making the boards consistent is good, it's enough to run a single
threaded compile, and it's nice for running lots of instances in parallel on the
host ala mkroot/testroot.sh.)

Even ignoring that, the kernel's cpio extractor generally has its own size
limits. The initial physical memory layout only leaves so large a gap between
"where we loaded the cpio.gz" and "where we extract it to", and when you fill up
that gap at a certain point the extract overwrites the data it's reading,
because initramfs isn't _expected_ to be multiple gigabytes in size. Again, how
much you've got varies by target but adding a quarter gigabyte of toolchain
didn't work on multiple boards when I tried it.

Shrinking the toolchain down has some hard limits: even way back in the
aboriginal linux days when I was trying to set up a tinycc compiler on target,
just the extracted /usr/include headers took up quite a bit of space:

$ cd ccc
$ du -s i686-*cross/*/include
23148   i686-linux-musl-cross/i686-linux-musl/include

Currently 23 megabytes (and another couple megabytes for the compiler includes).
Keeping them in a squashfs was more memory efficient.

> Wouldn't making them tarballs mean that you could extract their contents
> without running losetup and dealing with mounting devices and needing root
> permissions?

Squashfs is an archive format, there's an unsquashfs command to extract it if
you want to fiddle with it on the host, although mount-and-copy in mkroot works 
too.

The problem with (read-only) mounting a compressed archive is seekability: on
normal block devices the kernel can jump around and grab chunks of directory
information and file contents into dcache and page cache, and be free to discard
them again under memory pressure so they should be cheap to get back. That's the
design expectation for filesystems.

The problem with a tarball is you need to extract the whole thing starting at
the beginning to find where anything _is_. You can fix that by building an index
at mount time (extract the whole thing, examine the contents, and make notes)
but that makes mount really slow and also means you have a data tree you can't
discard so you've more or less pinned your directory cache if you want to know
where all the files start.
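
A minimal Python sketch of what "building an index at mount time" means for a
tarball, using the standard tarfile module. The helper names are made up; the
point is that every header has to be walked once before any offset is known:

```python
import io
import tarfile

def build_tar_index(fileobj):
    """Scan a whole tar stream once, recording where each member's data starts."""
    index = {}
    with tarfile.open(fileobj=fileobj, mode="r:") as tar:
        for member in tar:                      # walks every header in order
            if member.isfile():
                index[member.name] = (member.offset_data, member.size)
    return index

def make_sample_tar(files):
    """Build a small uncompressed tarball in memory for the demonstration."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()

raw = make_sample_tar({"a.txt": b"hello", "dir/b.txt": b"world!"})
index = build_tar_index(io.BytesIO(raw))
offset, size = index["dir/b.txt"]
payload = raw[offset:offset + size]      # direct read, no rescan needed
```

The index dictionary is the "data tree you can't discard" part: it has to stay
in memory for the direct reads to keep working.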

Zip file format addresses the dentry part because it was designed to let you
extract individual files, but it doesn't address seekability _within_ a file. If
you try to seek 10 megs into a file (or mmap from that point) it has to extract
and discard 10 megs of data. (The main downsides of zip files: A) individually
compressing each file is less efficient than compressing the whole archive, so
they tend to be larger, B) zip puts all its metadata at the _end_ of the file,
so if the file is truncated at all you've lost ALL the contents because it
doesn't know what any of the rest means anymore. Incomplete zip file transfers
were worthless because it has to start reading at the end to find anything. The
reason it did that was so amending existing zip files in place was quick,
because it can remove and rewrite the metadata easily. If the metadata wasn't at
the end and needed to be expanded, it would either need to move all the file
contents to make room, or break the metadata into chunks and parse together
scattered overlays. Of course replacing a file in the archive wasted space
because unless the old file had coincidentally been at the very end of the
archive, it left the old one in there and just added the new copy and updated
the index to point at it.)
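
The truncation problem is easy to demonstrate with Python's zipfile module. A
small sketch (make_zip and list_names are made-up helper names):

```python
import io
import zipfile

def make_zip(files):
    """Build a zip archive in memory from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, payload in files.items():
            archive.writestr(name, payload)
    return buf.getvalue()

def list_names(raw):
    """Return member names, or None if the central directory can't be found."""
    try:
        with zipfile.ZipFile(io.BytesIO(raw)) as archive:
            return archive.namelist()
    except zipfile.BadZipFile:
        return None

whole = make_zip({"one.txt": b"1" * 1000, "two.txt": b"2" * 1000})
complete = list_names(whole)
# Chop off the back half: the file data for one.txt is still in there,
# but the directory at the end that says where it lives is gone.
truncated = list_names(whole[:len(whole) // 2])
```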

Most compression formats handle files in chunks: bzip2 does 900k blocks, gzip
does periodic dictionary resets, etc. Using a compression format with a
reasonable chunk size and tracking where each chunk starts lets you handle seeks
reasonably well, and that's what squashfs does. I haven't looked up the actual
file format, but conceptually it's a zip file plus chunk indexes within files.
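
A conceptual Python sketch of "chunks plus an index of where each chunk
starts". This is NOT the squashfs on-disk format; the 4096 byte chunk size,
zlib, and the function names are assumptions picked for illustration:

```python
import zlib

CHUNK = 4096  # illustrative chunk size, not squashfs's actual block size

def compress_chunked(data, chunk=CHUNK):
    """Compress each fixed-size chunk independently; return (blob, index).

    index[i] is the (offset, length) of compressed chunk i within blob.
    """
    blob, index = bytearray(), []
    for off in range(0, len(data), chunk):
        packed = zlib.compress(data[off:off + chunk])
        index.append((len(blob), len(packed)))
        blob += packed
    return bytes(blob), index

def read_at(blob, index, offset, length, chunk=CHUNK):
    """Read `length` bytes at uncompressed `offset`, inflating only needed chunks."""
    out = bytearray()
    first, last = offset // chunk, (offset + length - 1) // chunk
    for i in range(first, min(last, len(index) - 1) + 1):
        start, size = index[i]
        out += zlib.decompress(blob[start:start + size])
    skip = offset - first * chunk
    return bytes(out[skip:skip + length])

data = bytes(range(256)) * 100          # 25600 bytes of sample data
blob, index = compress_chunked(data)
piece = read_at(blob, index, 10000, 50)  # touches one chunk, not the first 10000 bytes
```

Smaller chunks make seeks cheaper and compression worse, which is the same
tradeoff as the zip "compress each file individually" one.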

> I trust they were
> made fs images for a good reason, but... _why_).

Within mkroot, squashfs is easier to deal with because I don't need to reserve
destination space to extract everything into to poke at the contents. Outside of
mkroot, squashfs isn't that much harder to play with, mostly just less familiar.

> And ideally running a mkroot overlay on
> it because that's what the overlays seem to be 

Re: [Toybox] Fw: Re: Dude.

2024-05-05 Thread Rob Landley
On 5/4/24 11:34, Oliver Webb via Toybox wrote:
> (Rob wants this on the list anyways, and he hasn't CC:-ed it.

If I want to send a message to the list, I'm capable of doing so.

As I said in the postscript I don't _object_ to it being on the list (in an "I
say the same things in public as in private" way), and I did lament that I'd
spent half a work day composing several thousand words to just one person rather
than as a general resource I could refer back to in future or maybe get a FAQ
entry out of.

But thinking about it after the fact (when I got your reply), I honestly didn't
expect other people on the list to be interested beyond maybe closure. It's
potentially useful to know that the guy who wrote about half of all messages to
the list last month (35/76 in the web archive) might stop.

>  I want it on the list for multiple reasons.

Or might not.

Reading lots of text is _work_. I reference "pascal's apology" a lot (him being
sorry for writing a long letter because he didn't have time to write a short
one), because people try to _read_ this stuff. (Or worse, they stop trying.) I
try to keep the signal to noise ratio up, and that means editing it DOWN. Which
takes time and energy. (This reply is uncomfortably rambly, but I've already
spent a day away from the keyboard going "I have to coherently reply to this"
and not wanting to.)

> (I gave him permission
> to cc it in a reply email I intend to forward to the list))

And I didn't, so you put words in my mouth again about what I "want".

This thread doesn't advance the project, and I doubt this exchange offers much
insight into _my_ behavior. I've been posting publicly for a quarter century on
linux-kernel and busybox and uclibc and toybox and j-core and elsewhere. I've
maintained _this_ project for 17 years. I've made policy statements about it in
design.html and the faq and on the list and in my blog (and twitter and mastodon
and livejournal and talks on youtube and mp3 recorded panels from penguicon and
linucon and heck, you can pull my old comments out of slashdot and lwn.net if
you try). There's even a code of conduct which HIGHLY IRONICALLY was originally
copied from twitter's. (No really:
https://github.com/landley/toybox/commit/bc308973ffb6) People already have
_plenty_ of rope to hang me with if they decide they need a reason.

I care about the _code_, and what's best for the _project_. I also care about
documentation, but the problem is usually "too much" and needing to boil it down
and put it somewhere obvious where it's indexed and people can easily find it.

I'm trying NOT to make it about me. I'm very fiddly about the work, sometimes
trying (and failing) to do the programming equivalent of Faberge Eggs, and the
perfect can be the enemy of the good. But that's what distinguishes this project
from the half-dozen other implementations of the same stuff already out there.

That said, you pointed me at a message where you'd asked an actual question:

> > > because I've been trying to run gcc under mkroot and a response to
> > > http://lists.landley.net/pipermail/toybox-landley.net/2024-April/030334.html
> > > would've been helpful.
>
> Hadn't seen it. It got, quite literally, lost in the noise.

And I'd started replying to that (before you sent this to the list), and stopped
because it was too long and I needed to edit it down. I should just press "send"
on the ramble and move on to the next todo item. (Top of stack is fixing unshare
I think...)

Rob


Re: [Toybox] unshare/nsenter and flags

2024-05-02 Thread Rob Landley
On 5/2/24 13:14, enh via Toybox wrote:
> another googler wanted a host unshare(1) for some testing... i added
> that, and they complained that although the docs say
> 
> -r Become root (map current euid/egid to 0/0, implies -U) (--map-root-user)
> 
> it seems like -r _doesn't_ actually imply -U in practice (and they
> seemed to have strace output to prove it).

So... should it?

What did they try to do, and what did they _want_ to happen?

I'd compare with my debian unshare command but my install is a bit out of date.
(According to https://endoflife.date/devuan I've still got 4 weeks of support.)

Coincidentally, I just got an email yesterday morning from "The Happy Dreamhost
Upgrade Robot" (yes really) that they're updating landley.net's web container:

> We have great news! As part of our mission to support you with your digital
> presence, we are always looking to improve your products and provide you with
> the most advanced and powerful hardware.
> 
> On Wednesday, May 8th we will be migrating you to a newer shared server. As
> part of this maintenance, the operating system will be upgraded from Ubuntu
> Bionic to Ubuntu Jammy Jellyfish 22.04.2.
> 
> In most cases, no action is required on your part, but we've prepared some
> documentation that will help you prepare for the upgrade to Ubuntu Jammy:
> https://help.dreamhost.com/hc/en-us/articles/15506945971220

The "22.04" means it came out two years and one month ago, and that's what
they're migrating me TO. So, you know, I can presumably feel less bad about my
laptop...

> i was assuming the code was just missing, but when i looked, i found:
> 
> // unshare -U does not imply -r, so we cannot use [+rU]
> if (test_r()) toys.optflags |= FLAG_U;

Let's see, git annotate says that comment comes from commit 3c0be8a473c0:

Author: Samuel Holland 
Date:   Sun Apr 12 16:00:16 2015 -0500

unshare: fix -r

Calling unshare(2) immediately puts us in the new namespace
with the "overflow" user and group ID. By calling geteuid()
and getegid() in handle_r() after calling unshare(), we try
to map that to root, which Linux refuses to let us do.

What we really want to map to root is the caller's uid/gid
in the original namespace. So we have to save them before
calling unshare().

Meanwhile the "implies" in the help text comes from commit fb4a241f35cf two
months earlier:

Author: Rob Landley 
Date:   Wed Feb 18 15:19:15 2015 -0600

Patch from Isaac Dunham to add -r, fixed up so it doesn't
try to include two flag contexts simultaneously.

So it looks like Isaac made -r imply -U and Samuel made it _not_ do so, without
changing the help text, and I didn't notice because I'd really like to build
domain expertise here but haven't got it. (Largely because doing container stuff
tends to require root access, and if I'm requiring root access anyway I tend to
just chroot, or launch a qemu instance that does NOT require root access on the
host. It's on the todo list...)
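
For what it's worth, here's a Python sketch of the sequence Samuel's commit
message describes, assuming -r is supposed to imply -U. It is not the toybox
code: the function names are made up, os.unshare() needs Python 3.12 or newer,
and the namespace call is defined but deliberately never run here:

```python
import os

def id_map_line(inside_id, outside_id, count=1):
    """Format one line of /proc/PID/uid_map or gid_map."""
    return f"{inside_id} {outside_id} {count}\n"

def become_root_in_new_userns():
    """Sketch of -r: unshare a user namespace, then map the caller to 0/0.

    Needs Python 3.12+ for os.unshare() and a kernel that allows
    unprivileged user namespaces; not called in this example.
    """
    uid, gid = os.geteuid(), os.getegid()     # save BEFORE unshare()
    os.unshare(os.CLONE_NEWUSER)
    with open("/proc/self/setgroups", "w") as f:
        f.write("deny")                       # must precede the gid_map write
    with open("/proc/self/uid_map", "w") as f:
        f.write(id_map_line(0, uid))
    with open("/proc/self/gid_map", "w") as f:
        f.write(id_map_line(0, gid))

# What would get written for a caller with euid 1000:
example = id_map_line(0, 1000)
```

Reading the IDs after the unshare() would return the overflow IDs instead,
which is the bug that commit fixed.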

I've used toybox's unshare command a bunch of times, but not the UID remapping
parts...

> but note the unshare/nsenter sharing there --- is it a problem that i
> have unshare enabled but not nsenter? is that expected to work?

I'm happy to implement proper semantics here if I know what they _are_. What
_should_ it do?

I recently blogged (https://landley.net/notes.html#13-04-2024) about attending
yet another container talk at txlf, but if I really want a "contain" command
what I should probably do is dig through:

  https://github.com/p8952/bocker
  https://github.com/Fewbytes/rubber-docker
  https://blog.lizzie.io/linux-containers-in-500-loc.html

And "come up with something". It would be really nice if there was a simple
existing syntax I could be compatible with, which is why I was vaguely looking
at what minijail does, and https://github.com/rkt/rkt and
https://github.com/opencontainers/runc and https://github.com/containers/crun
and https://github.com/containerd/containerd and so on.

But that's a fresh can of worms to open after I close a couple of existing ones,
and to get to 1.0 the LFS build needs "awk" more than container support...

Rob


Re: [Toybox] nproc(1)

2024-04-30 Thread Rob Landley
On 4/29/24 16:56, enh via Toybox wrote:
> isn't nproc(1) just a call to sysconf(3) with either
> _SC_NPROCESSORS_ONLN for regular behavior, or _SC_NPROCESSORS_CONF for
> --all?

From musl src/conf/sysconf.c:

case JT_NPROCESSORS_CONF & 255:
case JT_NPROCESSORS_ONLN & 255: ;
unsigned char set[128] = {1};
int i, cnt;
__syscall(SYS_sched_getaffinity, 0, sizeof set, set);
for (i=cnt=0; i


Re: [Toybox] DreamHost Security Alert

2024-04-25 Thread Rob Landley
On 4/24/24 13:10, Rob Landley wrote:
> Alas, my website's likely to be down for a bit while I explain to them that
> "the compiler that got used to build an exploit" and "the exploit" can share
> strings because gnu is incompetent and leaks the path where things got built
> into the resulting binaries, but that does not mean that the compiler the
> strings came from in the first place is actually infected.

And it's back. Human saw the email thread at 9am and took reasonable action.

I was a little annoyed it was down all day, but eh: nine fives. Close enough.
They're cheap and I don't have to do it.

Rob

(Before them I had a server with a static IP where I ran all my own servers,
which meant I had one DNS server pointing to all the other services, and a
number of sites went "but DNS says you need TWO authoritative servers" and I
went "I'm not paying for a second static IP and all the records would point to
the first static IP so if it goes down what does being able to look up the name
of the services that aren't currently THERE accomplish?" And that's before DNS
required cryptographic signatures, and then "sender permitted from" showed up in
email around then and NONE of those checkers would work without 2 DNS servers so
I _couldn't_ set it up... So yes I _could_ get one of my orange pi boards sent
to one of the raspberry pi hosting sites that give a static ipv4 as part of the
hosting package, but... I really don't want to?)


Re: [Toybox] DreamHost Security Alert

2024-04-24 Thread Rob Landley
Alas, my website's likely to be down for a bit while I explain to them that "the
compiler that got used to build an exploit" and "the exploit" can share strings
because gnu is incompetent and leaks the path where things got built into the
resulting binaries, but that does not mean that the compiler the strings came
from in the first place is actually infected.

I mean, here's an article from 2018:

https://www.bleepingcomputer.com/news/security/mirai-iot-malware-uses-aboriginal-linux-to-target-multiple-platforms/

Rob

(I'd point to old blog entries where I went "huh, my compilers got used to build
random russian malware" ten years ago, but my blog was on my site so you
wouldn't see it unless I fish it out of archive.org...)

-------- Forwarded Message --------
Subject: DreamHost Security Alert - Malware on landley.net
Date: Wed, 24 Apr 2024 09:53:09 -0700 (PDT)
From: DreamHost Abuse Team 
To: r...@landley.net

Hello Rob Landley,

We have received a report of malware at the following location:

hXXps://landley.net/aboriginal/downloads/old/binaries/1.2.6/cross-compiler-armv7l.tar.bz2

This means that your site has likely been compromised. We have taken the site
offline by renaming its directory (appended _DISABLED_BY_DREAMHOST). Please do
not re-enable it until you can address the problem.

In general, the three most common entry points for a compromised website are:

1. Vulnerable, typically out-of-date software (such as blogs, forums, CMS,
associated themes and plugins, etc.)
2. A cracked/brute-forced admin login for a web application like WordPress,
Joomla, Drupal etc.
3. A compromised FTP/SFTP/SSH user password.

1. All software you have installed under your domain should always be kept
up-to-date with the most recent version available from the vendors' website, as
these often contain security patches for known issues. Older versions of
well-known and popular web software (including Wordpress, Drupal, Joomla, etc.)
are known to have vulnerabilities that can allow injection and execution of
arbitrary code.

2. If you utilize a web application with a script-based administrative backend
(like WordPress, Joomla, or Drupal), make sure that you're not using a generic
username like "admin" or "webmaster" for the user with administrative
privileges. Hackers will slowly brute-force common usernames in order to get
access to a script's backend and whatever tools exist there that allow file
uploads, alterations, or execution of code.

3. FTP/SFTP/SSH passwords can be compromised and used to modify files. The most
important part of securing your account in this case is to change your FTP
user's password via the (USERS > MANAGE USERS) -> "Edit" area of the control
panel. Passwords should not contain dictionary words and should be a string of
at least 8 mixed-case alpha characters, numbers, and symbols. It is also
recommended to always use Secure FTP (SFTP) or SSH rather than regular FTP,
which sends passwords over the internet in plaintext. You can disable FTP for
your user(s) within the DreamHost panel (USERS > MANAGE USERS) section.

At this point, we recommend logging into your DreamHost server and removing the
content we listed. (Note: You may first need to reset the permissions). You
should also look for any other files/directories you did not upload yourself and
update all your website components where applicable. As for determining which
entry point is the cause of this incident, for 1 and 2, you can review the
Apache logs for suspicious activity and requests to suspicious files. Keep in
mind that we typically only keep around 5 days worth of Apache logs. For 3, you
can refer to this article to find recent logins to your user:
https://help.dreamhost.com/hc/en-us/articles/214915728-Determining-how-your-site-was-hacked

For further help on this topic, you can refer to our Knowledge Base:

https://help.dreamhost.com/hc/en-us/articles/215604737-Hacked-sites-overview
https://help.dreamhost.com/hc/en-us/sections/203242117-Logs

Lastly, we have scheduled an automated malware scan and if anything is found, we
will send you a separate email with those results.

If you need further assistance, please respond directly to this email.

Thank you for your cooperation!
-DreamHost Abuse Team


Re: [Toybox] [PATCH] xxd: -d Decimal Lables flag, Don't cap at one file

2024-04-22 Thread Rob Landley
On 4/22/24 17:17, enh via Toybox wrote:
> ah, yeah, the _include_ path uses the full buffer and -r uses stdio
> buffering, but "regular" xxd was doing neither. i've sent out the
> trivial patch to switch to stdio.

Ah, performance tweak.

*shrug* Applied...

Rob


Re: [Toybox] [PATCH] xxd: buffer input via stdio.

2024-04-22 Thread Rob Landley
On 4/22/24 17:17, enh via Toybox wrote:
> ---
>  toys/other/xxd.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)

What's the issue this fixes? It's not:

  for i in $(seq 1 100); do echo $i; sleep 1; done | ./xxd

Because that won't produce output for a couple minutes...

Rob


Re: [Toybox] shuf works on FreeBSD

2024-04-20 Thread Rob Landley
On 4/20/24 03:42, Vidar Karlsen via Toybox wrote:
> toys/other/shuf.c builds and runs on FreeBSD and can be enabled in
> freebsd_miniconfig with CONFIG_SHUF=y.
> 
> I can't think of a use case for it, but I'm sure there are some.

I thought it was enabled in commit 93c8ea40a back in November?

Rob


Re: [Toybox] Microsoft github took down the xz repo.

2024-04-16 Thread Rob Landley
On 4/15/24 03:53, Jarno Mäkipää wrote:
> On Sun, Apr 14, 2024 at 9:14 AM Oliver Webb via Toybox
>  wrote:
>>
>> To revive a old thread with new technical info I stumbled upon:
>>
>> On Saturday, March 30th, 2024 at 15:58, Rob Landley  wrote:
>>
>> > I set up gitea for Jeff on a j-core internal server, and it was fine except it
>> > used a BUNCH of memory and cpu for very few users. Running cgi on dreamhost's
>> > servers is a bother at the best of times (I don't want to worry about exploits),
>> > and the available memory/CPU there is wind-up toy levels.
>> >
>> > My website is a bunch of static pages rsynced into place, some of which use
>> > xbithack to enable a crude #include syntax, and that's about what the server can
>> > handle.
>>
>> Going through the list of "minimal tools" on https://suckless.org/rocks/,

Not really a fan of that site. I did a roadmap section on them long ago
(https://landley.net/toybox/roadmap.html#sbase), but I'm trying to implement
mostly compatible versions of things that already exist, and they're trying to
invent new things that didn't previously exist because https://xkcd.com/927/
which I mostly consider fragmentation rather than helping, and I try not to
encourage them.

>> I stumbled
>> upon a git frontend called stagit (https://git.codemadness.org/stagit/file/README.html)
>> which the suckless project uses as its git frontend.

When microsoft bought github I mirrored my repo on my website so you could pull
it from there, but doing that doesn't have any web interface so I did a quick
and dirty bash script to upload the "git format-patch" of each commit, with
symlinks from the 12 character hash to the full hash (because doing _each_ one
was an insanely slow exercise in inode exhaustion).

You're once again telling me what I did was not good enough for you, and that I
am wrong, and must change to suit you.

>> But to have a solution, you must have a problem. The 2 main issues I have with
>> the current git management are the fact

I'm very tired.

>> there doesn't seem to be a way to clone the current repo directly from
>> landley.net (Making Microsoft GitHub the middleman).


$ git annotate www/header.html | grep -w git
fb47b0120   (Rob Landley   2021-09-12 14:33:36 -0500   30)  https://landley.net/toybox/git>local
$ git show fb47b0120
commit fb47b0120f7aa73c0821a8c55e15540d83baed01
Author: Rob Landley 
Date:   Sun Sep 12 14:33:36 2021 -0500

Add a local git mirror (todo item since github was acquired)...

diff --git a/www/git/index.html b/www/git/index.html
new file mode 100644
index 00000000..bade8d1b
--- /dev/null
+++ b/www/git/index.html
@@ -0,0 +1 @@
+Not browseable: git clone https://landley.net/toybox/git

$ git log scripts/git-static-index.sh
commit 990e0e7a40e4509c7987a190febe5d867f412af6
Author: Rob Landley 
Date:   Sat Dec 24 06:34:11 2022 -0600

Script to put something browseable in https://landley.net/toybox/git

https://landley.net/notes-2022.html#22-12-2022

>> And the fact I can't browse the source code without github or android code
>> search acting as the middleman

I do not have source tree snapshots up. Kinda hard to do in a static manner
without uploading rather a LOT of files (and even if you upload each version of
"git log" for each file and create an index file for each commit with the ls -lR
of the whole tree linking to the relevant version, the URLs to the files are
ugly. I can do it, but don't really want to? Linking to individual lines of the
file while also having the raw text kinda implies uploading two versions and I
just dowanna. Oh, and dreamhost's server config doesn't have sane file
associations for all the types so if I put up a .c file it wants to DOWNLOAD it
instead of displaying it as text and trying to .htaccess that is more of a pain
than I'm up for, so I would wind up having blah.c.txt and blah.c.html files and
that's just ugly...)

Plus, syntax highlighting: you'd THINK there would be some nice linux syntax
highlighting packages out there but not counting "use vi" (which doesn't work
for me anyway, :syntax = "E319: Sorry, the command is not available in this
version")...

Searching around I found https://github.com/alecthomas/chroma which is very
proud that it's written in "pure go"... except it's a wrapper for a python
library, and python's runtime is written in C, so DEFINE PURE...

Digging into the aforementioned python (don't get me started) library, the
"python-pigmentize" package installs the man page for a command "pygmentize",
and the bash completion for the command pygmentize, but does not install the
actual command in the $PATH (or anywhere

Re: [Toybox] df not working on FreeBSD

2024-04-16 Thread Rob Landley
On 4/15/24 04:37, Vidar Karlsen via Toybox wrote:
> Hello,
> 
> df throws the following error on FreeBSD:
> 
> root@140amd64_noopts-usrports:/usr/local/toybox/bin # ./df /
> df: getmntinfo: Invalid argument
> 
> A little bit of poking around shows that getmntinfo expects the second
> argument (the mode) to be one of these, and not 0:

Presumably it worked at one point, but I didn't write that bit...

> sys/sys/mount.h:
> #define MNT_WAIT    1   /* synchronously wait for I/O to complete */
> #define MNT_NOWAIT  2   /* start all I/O, but do not wait for it */
> #define MNT_LAZY    3   /* push data not written by filesystem syncer */
> #define MNT_SUSPEND 4   /* Suspend file system after sync */
> 
> Changing 0 to MNT_NOWAIT in portability.c makes df happy again.

And doesn't break macos, so I'm not adding the #ifdef in your patch. (I don't
have an openbsd test environment lying around, but
https://man.openbsd.org/getmntinfo.3 links to
https://man.openbsd.org/getfsstat.2 which says the options are MNT_WAIT and
MNT_NOWAIT so presumably they're happy too.)

Commit 7d9ee89d3cf8.

Rob


Re: [Toybox] httpd: How is this supposed to be _used_?

2024-04-14 Thread Rob Landley
On 4/13/24 14:09, Oliver Webb via Toybox wrote:
> The first thing I ran into is that httpd doesn't do that by default,
> running "toybox httpd dist/" won't actually host those pages
> on localhost.

It's an inetd client:

  https://en.wikipedia.org/wiki/Inetd

  toybox netcat -s 127.0.0.1 -p 8 -L httpd .

I've been meaning to come up with an actual inetd, and possibly lib/*.c plumbing
to do standalone servers, but nommu support and rate limiting incoming
connections and so on all go in a layer I haven't implemented yet and am not
interested in reproducing in multiple commands.

Genericizing the plumbing I've already got in netcat, but making it available
from individual commands, implies having a standard set of command line
utilities that get exposed in commands to specify address to bind to and port to
listen on and max simultaneous connections (including max per source IP) and
output inactivity timeout... Possibly some sort of IPSERVER macro flung into
the option string, with a corresponding structure in TT and then a function I
call? Or maybe just stick with inetd so it's somebody else's problem...

I explain this here periodically, by the way:
http://lists.landley.net/pipermail/toybox-landley.net/2024-January/03.html

In theory a tcpsvd was contributed to pending long ago, which kind of has the
od/hexdump/xxd problem of multiple implementations not sharing code (as I
periodically mention here, ala
http://lists.landley.net/pipermail/toybox-landley.net/2023-January/029410.html).
It's in the todo heap...

> "Why?": Looking at the source code and typing
> input into httpd, it wants input from stdin and seemingly outputs to
> stdout like a normal unix tool (which httpd is usually not).

As all inetd clients do, yes: nbd_server.c is another one. Lots of other things
(like the tftpd in pending, or dropbear) can work in inetd mode.
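
To make the inetd model concrete: the service never touches a socket. It reads one request on stdin and writes the reply to stdout, and whatever accepted the connection (inetd, or the netcat -L invocation above) wires those two to the network. A minimal sketch in shell, assuming a made-up handler name (handle_request) and a toy response that is not what toybox httpd actually sends:

```shell
#!/bin/sh
# Hypothetical inetd-style handler: one request on stdin, one reply on stdout.
# The listener (inetd, netcat -L, ...) owns the socket; this code never sees it.
handle_request() {
  cr=$(printf '\r')
  # Read the request line, e.g. "GET /index.html HTTP/1.0"
  IFS=' ' read -r method path _rest || return 1
  # HTTP lines end in CRLF, so drop a trailing carriage return if present.
  path=${path%"$cr"}
  body="you asked for $path"
  printf 'HTTP/1.0 200 OK\r\n'
  printf 'Content-Type: text/plain\r\n'
  printf 'Content-Length: %s\r\n\r\n' "${#body}"
  printf '%s' "$body"
}
```

Because the handler only talks to stdin and stdout, it can be exercised with a plain pipe (printf a request line into it) before any listener is involved, which is the same property that makes typing input at httpd on a terminal behave the way it does.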

> Forgive me, but I'm going to compare this to busybox httpd.

You do you.

Rob


Re: [Toybox] today in "shut up, gnu!"

2024-04-13 Thread Rob Landley
On 4/12/24 13:24, enh via Toybox wrote:
> ~/aosp-main-with-phones$ find external/ -name NOTICE -type l -maxdepth 2
> find: warning: you have specified the global option -maxdepth after
> the argument -name, but global options are not positional, i.e.,
> -maxdepth affects tests specified before it as well as those specified
> after it.  Please specify global options before other arguments.
> 
> (it does do the right thing, but insists on whining first.)

I've hit that too, and am big into Not Doing That. Thought I'd blogged about it,
but it could have been irc, or twitter (which I deleted when twitler bought it
but have an archive I should probably post somewhere), or... probably too old
for mastodon?
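
For anyone who just wants the noise gone in a script: the warning is about argument order only, so moving the global option ahead of the tests gives the same result with nothing on stderr. A minimal sketch, assuming a throwaway scratch tree (the directory layout here is hypothetical, not the AOSP checkout from the report):

```shell
#!/bin/sh
# Hypothetical scratch tree: one symlink named NOTICE two levels down.
tmp=$(mktemp -d)
mkdir -p "$tmp/external/pkg"
ln -s /dev/null "$tmp/external/pkg/NOTICE"

# Global option (-maxdepth) first, tests (-name, -type) after it.
# stderr is folded into the capture so any warning would show up in $found.
found=$(cd "$tmp" && find external -maxdepth 2 -name NOTICE -type l 2>&1)
echo "$found"
rm -rf "$tmp"
```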

There's a reason I get so exasperated about each new gnu/nag I stub my toe on.
It's gone beyond isolated incident into "pattern of looking down on everyone
else and sneering".

Unix has always been a silent protagonist, without which shell scripts are a
pain to do. If it doesn't work, they'll figure out why. Just behave consistently
(according to SOME kind of understandable logic) and let them keep the pieces.
Sometimes there's a -v flag to activate printfs() stuck into the code, but don't
express opinions when they didn't ASK. (Put them in the man page or --help if
it's that important.)

This has ALWAYS been the unix way. There are ALWAYS corner cases, and
deterministic behavior is not difficult to debug. The gnu/FSF never got that.
Stallman only decamped to unix under protest, a refugee from the Jupiter
project's collapse orphaning ITS, and he never really understood it.

RMS did not INVENT the idea of cloning unix with his big announcement in 1983.
Unix was a diverse community starting from the 1974 ACM article, let alone the
Berkeley Software Distribution in 1975. The first full from-scratch Unix clone
(writing their own kernel, compiler, and command line) was Coherent, which
shipped in 1980. Paul Allen copied subdirectories and file descriptors from unix
into DOS 2.0 not long after. Minix started in 1983 and shipped in 1986, and
Linux is 100% a descendant of Minix (developed on minix, its first filesystem
was minix, the development discussion on comp.os.minix, he inherited 80% of the
minix community because he took patches and Tanenbaum didn't...) There's a
famous tanenbaum-torvalds debate preserved for posterity, there is NOT a
stallman-torvalds debate because nobody cared what stallman had to say.

Nor did he invent freeware, which was the universal norm before the Apple vs
Franklin decision in 1983 because you couldn't copyright binaries before Steve
Jobs got the appeals court to change the law. Byte and Compute magazines had
BASIC listings in the back of each issue for you to type in, DECUS and CP/M
northwest had software libraries, the commodore 64 came bundled with a disk of
Jim Butterfield's software but he didn't WORK for them: he founded the Toronto
Pet User's Group (TPUG) and published free software with source code.

But Stallman mansplained at everyone else at the top of his lungs nonstop from
the moment he showed up, and there are all sorts of topics that can't NOT have
an "as opposed to what stallman's saying, the truth is" section today...

  https://en.wikipedia.org/wiki/Freeware

Sigh, watching https://youtu.be/2gOGHdZDmEk and https://youtu.be/WWfsz5R6irs and
https://youtu.be/9RO5ZAmzjvI every time the narration talks about Pierre Sprey I
get Stallman vibes. There's a broadcast version of Dunning-Kruger where you
plausibly preach to an audience who doesn't know better, and become The Expert
that everybody must get a quote from every time something happens in that area,
while the people actually doing the work facepalm at every third word.

Rob


Re: [Toybox] uname no longer broken on FreeBSD?

2024-04-13 Thread Rob Landley
On 4/13/24 03:00, Vidar Karlsen via Toybox wrote:
> Hello,
> 
> toys/posix/uname.c builds and runs on FreeBSD now. I have tested it on
> 13.2-amd64, 14.0-amd64, 13.2-arm64 and 14.0-arm64. I think it's safe to
> put CONFIG_UNAME=y back into kconfig/freebsd_miniconfig.

Ah, commit d2bada0e42e6 fixed it but I only remembered to add it to
macos_defconfig, forgot the other one.

Thanks,

Rob


Re: [Toybox] prereq build, what is the motivation behind building od?

2024-04-12 Thread Rob Landley
On 4/8/24 14:20, Oliver Webb via Toybox wrote:
> Although I may be wrong, "od" doesn’t seem to be in the build infrastructure.
> What’s the reason for it being a "prereq" command?

$ vi scripts/recreate-prereq.sh
...
$ grep '^od ' log.txt
od "-Anone" "-vtx1"

https://github.com/landley/toybox/blob/0.8.11/scripts/make.sh#L230

> Also, have you thought about specifying FILES through
> the command line to reduce build time by only building what we need to.

Have I thought about micromanaging the build in a way that may not link in
combination with a given set of generated/*.h files? Probably at some point.

Keep in mind I've been doing this stuff on and off since... depending on how you
want to look at it, 1999.

> Scanning
> for commands with “which”

"which" looks at what's installed on the host out of the $PATH. What does that
have to do with what's configured in toybox? (If I supplied an airlock I
specified the $PATH...)

> and maybe uname for stuff like gsed

You mean use uname to figure out if we're running on MacOS or FreeBSD like the
code already does in scripts/portability.sh?

> and putting them
> in FILES if we don’t have a good enough version.

I built defconfig under record-commands, and then did the standard "awk '{print
$1}' | sort -u | xargs" trick from literally _decades_ ago:

https://github.com/landley/aboriginal/blob/dbd0349d8ae6/sources/toys/report_recorded_commands.sh#L10

https://landley.net/aboriginal/FAQ.html#:~:text=logging%20wrapper

to get a list of the commands used by that, and used that to generate a toybox
.config file enabling those commands.
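
The pipeline itself is small enough to show in isolation. A minimal sketch, assuming a stand-in log in the same shape as the "od" line quoted earlier in this message (command name first, quoted arguments after); the log contents here are made up for illustration:

```shell
#!/bin/sh
# Hypothetical stand-in for the log a command-recording wrapper writes:
# one line per invocation, command name first, then its quoted arguments.
log=$(mktemp)
cat > "$log" << 'EOF'
sed "-n" "s/x/y/p"
od "-Anone" "-vtx1"
sed "-e" "s/a/b/"
grep "-q" "foo"
od "-Anone" "-vtx1"
EOF

# First field of each line, de-duplicated, collapsed onto a single line.
commands=$(awk '{print $1}' "$log" | sort -u | xargs)
echo "$commands"
rm -f "$log"
```

The resulting one-line list of command names is what gets turned into config symbols; how that mapping is done is in the script linked below, not in this sketch.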

I then made a SHELL SCRIPT that DID ALL THAT so you could SEE HOW/WHY IT WAS
BUILT (and also so I could automate updating it, yes I should probably add it to
release.txt):

https://github.com/landley/toybox/blob/master/scripts/recreate-prereq.sh

And tried to explain that I'd done so:

https://github.com/landley/toybox/commit/d1acc6e88be5

And how to use the result:

https://github.com/landley/toybox/commit/3bbc31c78b41

> Then generating generated/
> files based off of that?

No.

Rob


Re: [Toybox] [PATCH] timeout.test: reduce flake.

2024-04-12 Thread Rob Landley
Catching up. (I let stuff pile up preparing for the release and then took a
couple days off, and now I'm at texas linuxfest doing sleep deprived talk prep
for tomorrow...)

On 4/8/24 15:28, enh via Toybox wrote:
> A (presumably overloaded) CI server saw the `exit 0` test time out.
> Given that several of these tests should just fail immediately,
> having a huge timeout isn't even a bad thing --- if we had a bug
> that caused us to report the correct status, but not until the
> timeout had _also_ expired, this would make that failure glaringly
> obvious.
> 
> Aren't the other tests with 0.1s timeouts potentially flaky? Yes,
> obviously, but I'll worry about those if/when we see them in real
> life? (Because increasing those timeouts _would_ increase overall
> test time.)

Yes it should never happen, but 11 minutes seems like a footgun.

I bumped it up to 1 second (10 times as long as before). If you see it again I
can bump it to 5 seconds, but much beyond 1 second and the "timeout -v .1 sleep
3" test later on gets flaky, as does:

toyonly testcmd "-i" \
  "-i 1 sh -c 'for i in .25 .50 2; do sleep \$i; echo hello; done'" \
  "hello\nhello\n" "" ""

Rob


[Toybox] Release 0.8.11 is out.

2024-04-08 Thread Rob Landley
Yeah, a bit overdue. Lemme know if anything in the release notes isn't clear.

Still doing a texas linuxfest talk soonish. Hopefully they post a video
eventually...

Rob


Re: [Toybox] tail test failures?

2024-04-08 Thread Rob Landley
On 4/8/24 11:14, enh via Toybox wrote:
> looks like the github CI has been red for ubuntu and macOS since april 5th?
> 
> this revert fixes the current failing test:
> 
> [master 8368f8f9] Revert "Enforce min/max for % input type (time in
> seconds w/millisecond granularity)."
> 
> but that just gets me a different failing test, so it's obviously a
> bit more subtle than that :-)

Darn it, didn't get a release out on leap day, didn't get a release out during
the eclipse... Always one more thing.

(Pay no attention to the binaries I just uploaded, gotta rebuild them and do it
again. This is why I push the tag and update the news.html file on the website
LAST...)

Rob


Re: [Toybox] utf8towc(), stop being defective on null bytes

2024-04-08 Thread Rob Landley
On 4/8/24 11:53, Oliver Webb wrote:
> Still, U+0000 is a valid code point, and having a special case especially for it
> that isn’t mentioned but you have to watch out for is either a bug or a
> documentation error.

I say it's intentional, you reassert that I'm wrong.

I'll leave you to your opinion...

Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] utf8towc(), stop being defective on null bytes

2024-04-08 Thread Rob Landley
On 4/8/24 11:01, enh wrote:
>> > Returning length 0 means we hit a null terminator,
>>
>> Null bytes aren't always "terminators". You can embed null bytes into data and
>> still want to do utf8 processing with it.
> 
> that's questionable ... the desire to have ASCII NUL in utf-8
> sequences (without breaking the "utf-8 sequences are usable as c
> strings" property) is the main reason for the existence of "modified
> utf-8".

You don't need a conversion function to grab a nul byte, you can check if it's a
null byte.

That value _is_ a special case, the enclosing loop can deal with it easily
enough (there's nothing to convert, it's a NUL byte, check directly). I've got
functions like regexec0() that work over a range instead of using a NUL, and
those have to deal with libc's regex stopping at NUL so the enclosing loop
advances past it and restarts.

Rob


Re: [Toybox] utf8towc(), stop being defective on null bytes

2024-04-07 Thread Rob Landley
On 4/6/24 17:48, Oliver Webb via Toybox wrote:
> Heya, looking more at the utf8 code in toybox. The first thing I spotted is that
> utf8towc() and wctoutf8() are both in lib.c instead of utf8.c, why haven't they
> been moved yet, is it easier to track code that way?

The "yet" seems a bit presumptuous and accusatory, but given the title of the
post I suppose that's a given.

I have no current plans to move xstrtol() from lib.c to xwrap.c. And atolx() is
only called that instead of xatol() because it does suffixes.

The reason it had to go in lib.c back in the day was explained in the commit
that moved it to lib.c:

  https://github.com/landley/toybox/commit/6e766936396e

As for moving it again someday, unnecessarily moving files is churn that makes
the history harder to see, and lib/*.c has never been a strict division (more
"one giant file seems a bit much"). The basic conversion to/from utf8 is
different from caring about the characteristics of unicode code points (which
the rest of utf8.c does), so having it in lib.c makes a certain amount of sense,
and I'm not strongly motivated to change it without a good reason.

It might happen eventually because I'm still not happy with the general unicode
handling design "yet", but that's a larger story.

Way back when there was "interestingtimes.c" for all my not-curses code, but it
was too long to type and mixed together a couple different kinds of things, so I
split it into utf8.c and tty.c both of which were shorter and didn't screw up
"ls" columnization. (I probably should have called it unicode.c instead, but
unicode is icky, the name is longer, and half the unicode stuff is still in libc
anyway).

Unicode is icky because utf8 and unicode are not the same thing. Ken Thompson
came up with a very elegant utf8 design and microsoft crapped all over it (cap
the conversion range, don't add the base value covered by the previous range so
there are no duplicate encodings) for no apparent reason, and then unicode just
plain got nuts. (You had an ENORMOUS encoding space, the bottom bit could
totally have been combining vs physical characters so we don't need a function
to tell, and combining characters should 100% have gone BEFORE the physical
characters rather than after to avoid the whole problem of FLUSHING them, and
higher bits could indicate 1 column vs 2 column or upper/lower/numeric so you
don't have to test with special functions like that, just collage them into
LARGE BLOCKS which is LESS SILLY than the whole "skipping 0xd800" or whatever
that is for the legacy 16 bit windows encoding space that microsoft CRAPPED INTO
THE STANDARD... Ahem.)

But alas, microsoft bought control of the unicode committee, so you need
functions to say what each character is, and those functions are unnecessarily
complicated. In theory libc has code to do wide char conversions already, but
glibc refuses to enable it unless you've installed and selected a utf8-aware
locale (which is just nuts, but that's glibc for you).

I made some clean dependency-free functions to do the simple stuff that doesn't
care what locale you're in, but there's still wcwidth() and friends that depend
on libc's whims (hence the dance to try to find a utf8 locale in main.c, and the
repeated discussion on this list between me and Elliott and Rich Felker about
trying to come up with portable fontmetrics code. Well, column-metrics. Elliott
keeps trying to dissuade me, but bionic's code for this still didn't work static
linked last I checked...)

Moving stuff around between files when I'm not entirely satisfied with the
design (partly depending on libc state and partly _not_ depending on it) doesn't
seem helpful.

> Also, the documentation
> (header comment) should probably mention that they store stuff as unicode codepoints,

Because I consistently attach comments before the function _body_ explaining
what the function does, instead of putting long explanations in the .h files
included from every other file which the compiler would have to churn through
repeatedly. In this case:

  // Convert utf8 sequence to a unicode wide character
  // returns bytes consumed, or -1 if err, or -2 if need more data.
  int utf8towc(unsigned *wc, char *str, unsigned len)

> I spent a while scratching my head at the fact wide characters are 4 byte int's
> when the maximum utf8 single character length is 6 bytes.

Because Microsoft broke utf8 in multiple ways through the unicode consortium,
among other things making 4 bytes the max:

http://lists.landley.net/pipermail/toybox-landley.net/2017-September/017184.html

In addition to the mailing list threads, I thought I blogged about this rather a
lot at the time:

  https://landley.net/notes-2017.html#29-08-2017
  https://landley.net/notes-2017.html#01-09-2017
  https://landley.net/notes-2017.html#19-10-2017

Which was contemporaneous with the above git commit that added the function to
lib/lib.c. I generally find that stuff by going "when did this code show up
and/or get 

[Toybox] scripts/prereq/build.sh

2024-04-05 Thread Rob Landley
I recently added scripts/prereq/build.sh which runs a "cc -I dir *.c" style
build against canned headers. Theoretically a portable build not requiring a
system to have any command line utilities except "cc" and a shell. (Ok, you
still need bash to run scripts/make.sh and scripts/install.sh until toysh is
promoted. And until I replace kconfig, you still need gmake to run "make
defconfig", but I've got a design for that one now.)

Both that build.sh script and the saved scripts/prereq/generated headers are
created by scripts/recreate-prereq.sh which figures out what commands a toybox
build uses out of the $PATH (by doing a defconfig build under
mkroot/record-commands.sh), makes a .config file with just those commands
enabled and all dependencies switched off (and hardwires the two not-android
not-mmu symbols that get compiler probed), then strips down the resulting
headers to have just the symbols those commands need. (Well, I haven't stripped
down config.h yet but all the OTHERS are hit with sed/grep to remove stuff for
the commands that aren't enabled.)

Of course when I ran it on macos it went "boing":

toys/other/taskset.c:52:17: error: use of undeclared identifier
'__NR_sched_getaffinity'
toys/other/taskset.c:81:15: error: use of undeclared identifier
'__NR_sched_setaffinity'
toys/other/taskset.c:119:29: error: use of undeclared identifier
'__NR_sched_getaffinity'
3 warnings and 3 errors generated.

It's trying to build nproc, which scripts/make.sh uses out of the $PATH to query
available processors. And yes, nproc calls sched_getaffinity() on linux (even
the debian one, according to strace) which isn't really portable...

In theory, I've got some workaround code for nproc being unavailable in
scripts/portability.sh already:

# Probe number of available processors, and add one.
: ${CPUS:=$(($(nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null)+1))}

I'm uncomfortable leaning in to "linux else bsd/mac" because I was also thinking
about stuff like qnx and vxworks and so on with the new "canned" build, but if
all the probes fail that becomes CPUS=$((+1)) and thus sets it to 1, which
should still work if I filter out nproc and sysctl isn't there either?
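
One way to avoid hardwiring "linux else bsd/mac" is to loop over a list of
probes and take the first one that prints a number. This is only a sketch, not
what scripts/portability.sh does: probe_cpus is a hypothetical helper, and
"getconf _NPROCESSORS_ONLN" is an extra probe assumed here that not every
system provides.

```shell
# Hypothetical helper: print the first CPU count any probe reports, else 0.
probe_cpus()
{
  for probe in "nproc" "getconf _NPROCESSORS_ONLN" "sysctl -n hw.ncpu"
  do
    # A missing or failing command just moves on to the next probe.
    count="$($probe 2>/dev/null)" || continue
    # Ignore empty or non-numeric output.
    case "$count" in
      ''|*[!0-9]*) continue ;;
    esac
    echo "$count"
    return 0
  done
  echo 0
}

# Same "add one" policy as the existing line, keeping any preset $CPUS.
: "${CPUS:=$(($(probe_cpus)+1))}"
echo "CPUS=$CPUS"
```

If every probe fails the helper prints 0, so CPUS still ends up as 1, the same
result as the $((+1)) case described above.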

But I'd also like to build nproc for other targets if I could. Which sounds like
it turns into a portability.c mess pretty quickly...

Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


[Toybox] Mkroot talk at texas linuxfest on the 13th.

2024-04-03 Thread Rob Landley
They posted the description. It's basically "45 minutes about mkroot":

https://2024.texaslinuxfest.org/talks/mkroot-tiny-linux-system-builder/

Rob


Re: [Toybox] more gnu nonsense: cp -n

2024-04-03 Thread Rob Landley
On 4/2/24 01:35, Ryan Prichard wrote:
> Apparently upstream coreutils "cp -n" changed between 9.1 and 9.2, and the
> Debian maintainers reverted the change temporarily(?) and also added the
> "non-portable" error.
> 
> In coreutils 9.1 and older, "cp -n" quietly skipped a file if the
> destination existed, but as of 9.2, it instead prints an error and exits with
> non-zero at the end. (I saw some stuff about "immediately failing" on the
> Debian bug, but AFAICT, cp keeps going and fails at the end.) It does look
> like the new 9.2+ behavior matches "cp -n" on macOS (14.3.1) (and probably
> FreeBSD but I didn't test that).

In toybox, I tend to repeat an option to get that sort of behavior, so I'd do:

  cp -n thingy... - skip files, no error
  cp -nn thingy... - skip files, with error

That way the existing behavior doesn't change, and old versions that don't
understand the doubling still provide the old behavior (because cp -n -n = cp -n
by default) without erroring out on an unknown flag or consuming more namespace.
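
The repeat-to-escalate idea can be sketched in plain shell with getopts. This
is not toybox's option parser, and count_n is a hypothetical helper: it only
counts how many times -n appears, bundled (-nn) or separate (-n -n), so a
caller could treat 1 as "skip quietly" and 2 or more as "skip and report an
error".

```shell
# Hypothetical helper: print how many times -n was given before the operands.
count_n()
{
  OPTIND=1
  n=0
  while getopts "n" opt
  do
    case "$opt" in
      n) n=$((n+1)) ;;
      *) return 1 ;;
    esac
  done
  echo "$n"
}

count_n -n thingy
count_n -nn thingy
count_n -n -n thingy
```

The three calls print 1, 2 and 2. A parser that only checks "was -n seen at
all" treats all three the same, which is why the doubled form degrades to the
old behavior instead of failing on an unknown flag.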

See toybox's "ls -ll" (shows nanoseconds) or "lsusb -nn" (numeric AND
non-numeric output) for examples. And yes, debian handles "ls -ll" just fine. :)

Rob


Re: [Toybox] more gnu nonsense: cp -n

2024-04-01 Thread Rob Landley
On 4/1/24 10:31, enh via Toybox wrote:
> hadn't seen this one before...
> 
> cp: warning: behavior of -n is non-portable and may change in future;
> use --update=none instead
> 
> (consider me skeptical that a system without -n is going to have
> --update=none...)

Define non-portable? Freebsd 14 has -n, macos has -n, busybox cp has -n, and of
course toybox (and thus android) has -n.

Meanwhile:

$ ./busybox cp --update=none one two
cp: option '--update' doesn't allow an argument
root@freebsd:~ # cp --update=none one two
cp: illegal option -- -
root@freebsd:~ # cp --update=none one two
cp: illegal option -- -
$ toybox cp --update=none one two
cp: Unknown option 'update=none' (see "cp --help")

Those clowns are explicitly advocating for a LESS portable option.

This is why I'm not removing "egrep", which is a shell wrapper on my devuan
system by the way:

$ which egrep
/bin/egrep
$ cat /bin/egrep
#!/bin/sh
exec grep -E "$@"

At least THAT one is easy for distributions to keep doing regardless of 
gnu/stupid.

If the solution for cp -n isn't "distro patches out the stupid", then "install
busybox cp" or just "use alpine". Spurious warnings from gnu are just that:
spurious.

Rob


Re: [Toybox] Microsoft github took down the xz repo.

2024-03-30 Thread Rob Landley
On 3/30/24 15:16, Oliver Webb wrote:
> On Saturday, March 30th, 2024 at 15:06, Rob Landley  wrote:
>> FYI, Microsoft Github disabled the xz repository because it became
>> "controversial" (I.E. there was an exploit in the news).
>> 
>> https://social.coop/@eb/112182149429056593
>> 
>> https://github.com/tukaani-project/xz
> 
> They couldn't have removed commit access for the trojan horse and got on with 
> their lives?

Mastodon's been talking about this at length all day:

  https://mstdn.social/@rysiek/112184610302366603
  https://hachyderm.io/@dalias/112182128889536710
  https://cyberplace.social/@GossiTheDog/112184645230558304
  https://social.secret-wg.org/@julf/112184194797977290
  https://mastodon.social/@richlv/112180479433832095

And a lot of things the discussion was linking to went away. Oh well...

>> I'm assuming if toybox ever has a significant bug, microsoft would respond by
>> deleting the toybox repository. There's a reason that I have
>> https://landley.net/toybox/git on my website, and my send.sh script pushes to
>> that before pushing to microsoft github.
> 
> As much as it doesn't matter, I've wondered what git web frontend you use.
> The html source for the massive table of commits doesn't give a copyright
> notice.

https://github.com/landley/toybox/blob/master/scripts/git-static-index.sh

https://landley.net/notes-2022.html#22-12-2022

> Do you just have a script make a table out of "git log"? Furthermore, have
> you considered using cgit or gitea or another fancier git frontend for your
> own site?

I engaged with cgit at one point and found it overcomplicated and unpleasant.

I set up gitea for Jeff on a j-core internal server, and it was fine except it
used a BUNCH of memory and cpu for very few users. Running cgi on dreamhost's
servers is a bother at the best of times (I don't want to worry about exploits),
and the available memory/CPU there is wind-up toy levels.

My website is a bunch of static pages rsynced into place, some of which use
xbithack to enable a crude #include syntax, and that's about what the server can
handle.

> There is also the issue of you not being able to push commits to the github
> repo because github is forcing everyone to use 2FA.

I haven't been hit by that yet for some reason. I push from the command line
anyway (which is basically ssh), so if I lost website access I could presumably
still update the README to let people know where to go.

Rob


Re: [Toybox] Microsoft github took down the xz repo.

2024-03-30 Thread Rob Landley
On 3/30/24 15:11, Rob Landley wrote:
> upstream of the xz-embedded repo with the public domain code I cloned is:
> 
>   https://git.tukaani.org/xz-embedded.git
> 
> Which is still available.

Although now that I look at it, a5390fd368f8 in september is the last commit
that wasn't from the backdoor guy anyway, so nothing new of interest.

Rob


[Toybox] Microsoft github took down the xz repo.

2024-03-30 Thread Rob Landley
FYI, Microsoft Github disabled the xz repository because it became
"controversial" (I.E. there was an exploit in the news).

  https://social.coop/@eb/112182149429056593

  https://github.com/tukaani-project/xz

I'm assuming if toybox ever has a significant bug, microsoft would respond by
deleting the toybox repository. There's a reason that I have
https://landley.net/toybox/git on my website, and my send.sh script pushes to
that _before_ pushing to microsoft github.

Luckily the xz guys don't seem to trust microsoft github either, because the
upstream of the xz-embedded repo with the public domain code I cloned is:

  https://git.tukaani.org/xz-embedded.git

Which is still available.

Rob


Re: [Toybox] [PATCH] Clean up xz a good amount

2024-03-29 Thread Rob Landley
On 3/29/24 17:50, Oliver Webb wrote:
>> > ah, crap, that's another thing to put on the riscv64 to-do list...
>> > (thanks for bringing that to light!)
>> 
>> so, TIL that upstream already added a risc-v bcj implementation...
> 
> I always thought that the xz decompressor we use in toybox ("xz-embedded")
> and the main one (the one with the CVE) were different projects (separate git
> repos, one is much slower than the other, etc).

The exploit was somebody checked a "test case" into the build system that hacked
the rest of the build with an x86-64 binary blob that linked before the other
functions?

https://youtu.be/jqjtNDtbDNI

I was only halfway paying attention once I was sure it didn't affect toybox. My
systems here use dropbear for ssh anyway, yes including my laptop. :)

> That being said, There are 0BSD licensed parts in the xz repo
> (one of SIX different licenses).

Huh, really? Cool...

>> (rob will of course be delighted to hear of systemd's involvement in
>> the exploit chain :-) )
> 
> Who would've known that an over-complicated, extremely large hairball with a
> massive dependency chain that tries to consume _everything_ makes it easy to
> perform exploits.

Deleted long grumbling about adding complexity probably means you're _reducing_
security because the system is less auditable: a signing chain of custody is
still GIGO it just means it was delivered to you by TIVO with a mandatory EULA
so you can't personally FIX it...

Ahem. Tangent. Not going there.

Rob


Re: [Toybox] [PATCH] Clean up xz a good amount

2024-03-29 Thread Rob Landley
On 3/29/24 17:28, enh wrote:
> On Wed, Feb 28, 2024 at 9:13 AM enh  wrote:
>> > > @@ -639,6 +640,20 @@ enum xz_ret xz_dec_bcj_run(struct xz_dec_bcj *s, struct xz_dec_lzma2 *lzma2,
>> > >   */
>> > >  enum xz_ret xz_dec_bcj_reset(struct xz_dec_bcj *s, char id)
>> > >  {
>> > > +  switch (id) {
>> > > +  case BCJ_X86:
>> > > +  case BCJ_POWERPC:
>> > > +  case BCJ_IA64:
>> > > +  case BCJ_ARM:
>> > > +  case BCJ_ARMTHUMB:
>> > > +  case BCJ_SPARC:
>> > > +    break;
>> > > +
>> > > +  default:
>> > > +    /* Unsupported Filter ID */
>> > > +    return XZ_OPTIONS_ERROR;
>> > > +  }
>> > > +
>> > >    s->type = id;
>> > >    s->ret = XZ_OK;
>> > >    s->pos = 0;
>>
>> ah, crap, that's another thing to put on the riscv64 to-do list...
>> (thanks for bringing that to light!)
> 
> so, TIL that upstream already added a risc-v bcj implementation...

I'm happy to call the public domain repo our "upstream" for this, but there's
still some collation damage (they have many files and we want either one or
two), and a lot of cleanup that could be done in our code that moves it farther
from their code.

As for whether we want one file or two: one model is the engine in the command
ala toys/*/bzcat.c and the other is lib/deflate.c called by toys/*/gzip.c (but
also available for other things to pull in without having to fork a child
process and pipe data through it). But the real difference there is deflate has
half an inflate already that I REALLY SHOULD FINISH (dictionary selection and
resets, everything else is just a question of doing the work) and xz compression
seems a bit out of scope. (Being able to read everything: yay. Being able to
compress data: gzip is the 80/20.)

Modulo busybox refuses to build without bzip2 compression (I hit it until it
confessed in mkroot/packages/busybox.c but that broke all the help text), and I
did WRITE a cleaned up bzip2 engine many moons ago (reposted it here not too
long ago I think), so I _could_ have a lib/bzip2.c with a compression side if I
wanted to? Modulo the bzip2 compression side string sort logic never made sense
to me (what is the logic of falling back from one sort mechanism to the next,
why those in that order with those thresholds) so to test my engine I had to
block copy the original sort logic, which has licensing issues...

> ...but i only learned that because i was looking at
> https://www.openwall.com/lists/oss-security/2024/03/29/4 which was
> fascinating in many ways.
> 
> (rob will of course be delighted to hear of systemd's involvement in
> the exploit chain :-) )

I saw a youtube video on it, and it's been all over mastodon today. So much
unnecessary complexity. Adding layers to "solve" problems is painting over dry
rot. There are reasons I also want to simplify the build system itself, and care
so much about comparing the behavior across multiple platforms...

Rob


Re: [Toybox] Poke about the bc.c cleanup patches I submitted a while ago

2024-03-28 Thread Rob Landley
On 3/27/24 08:31, Rob Landley wrote:
>>> ipcrm, ipcs,
...
>> I don't know how I'm supposed to test resources I have no way to create,
>> we'll need ipcmk eventually. These seem more feasible to test, although
>> their tests will fail under mkroot until we
>> have ipcmk
...
> To be honest, I'm tempted to clean up and promote them to "examples". Leaving
> them "default n". There in case somebody needs it, but if so it would be nice
> if they could send us a note letting us know they exist...

I did a quick cleanup pass on ipcrm, but... yeah, I have no idea how to test 
this?

Also... what ARE keys vs IDs? I thought ID was a number and a key would be
arbitrary strings, because a key gets washed through a lookup function and an id
is just strtol(), but the code that's there does:

function(int key, char *name...)
{
...
  id = strtol(name, &c, 0);
  if (*c) {
    error_msg("invalid number :%s", name);
    return;
  }

  if (key) {
    if (id == IPC_PRIVATE) {
      error_msg("illegal key (%s)", name);
      return;

IPC_PRIVATE is zero. So even if you set "key" to 1, strtol() has to consume the
whole thing or you get "invalid number" error and an abort before it even checks
key. There's no !key test around that first bit. And then right afterwards it
checks if the strtol() it did returned zero (IPC_PRIVATE is zero) and barfs if
it did, so even if that first part WAS a thinko with a missing test, it still
wouldn't work for anything that didn't at least START with a nonzero number.

So what's a "key"?

I did a "git log */ipcrm.c" over in busybox and there hasn't been a patch to it
from an actual USER of this command since it was introduced.

It's all code size shrink, compiler flag damage, white space fixes, help text
style updates, annotating with size estimates, NOEXEC, "make GNU licensing
statement forms more regular", "use can't instead of cannot", using EXIT_SUCCESS
and EXIT_FAILURE macros (really???), whatever "strtoul() fixes" was, and so on.
Churn for being a busybox applet, global search and replace over the tree.

No actual _user_ of the code has touched it since it was added to the tree, and
it turns out that was MY fault:

  commit 6eb1e416743c597f8ecd3b595ddb00d3aa42c1f4
  Author: Rob Landley 
  Date:   Mon Jun 20 04:30:36 2005 +

Rodney Radford submitted ipcs and ipcrm (system V IPC stuff).  They could
use some more work to shrink them down.

And in my defense, I had no idea what they WERE back then. That whole mess
started with a poke from some Qualcomm developers from India:

  http://lists.busybox.net/pipermail/busybox/2005-June/048807.html

Which led to a newbie looking for something to do asking how you submit new
commands to the project:

  http://lists.busybox.net/pipermail/busybox/2005-June/048828.html

And then two other devs piping up to show interest:

  http://lists.busybox.net/pipermail/busybox/2005-June/048847.html
  http://lists.busybox.net/pipermail/busybox/2005-June/048848.html

Which led to the patch.

So three people showed interest in 2005, resulting in a new dev porting the
commands from util-linux-2.12a, but none of them actually submitted anything
like a test case:

  http://lists.busybox.net/pipermail/busybox/2005-June/048851.html

So I have what I think is a cleaned up version but can't prove I didn't break
it, and I have no idea if 19 years after it was added to busybox and then (as
far as I can tell) completely ignored... anyone still cares?

Rob


Re: [Toybox] [PATCH] modeprobe.c and last.c: Codeshare identical llist_add()

2024-03-27 Thread Rob Landley
On 3/26/24 15:05, Oliver Webb via Toybox wrote:
> 2 identical versions of the same function, variable names and everything
> 
> 31 bytes saved in bloatcheck

The problem being it moves code from pending/ to lib/ whose only users are in
pending.

I've generally just done singly linked list additions inline. When you don't
mind reversing the list order it's literally two assignments and a dereference;

  node->next = head;
  head = node;

Pushing two arguments onto the stack and making a function call is approximately
as much code. (When I want to preserve list order I tend to use the existing
doubly linked list functions.)

Rob


Re: [Toybox] [PATCH] toysh: fix -Wuse-after-free

2024-03-27 Thread Rob Landley
On 3/25/24 20:24, enh wrote:
> But "dpkg-query -S $(which $NAME)" is pretty easy to do the mapping yourself
> on debian...
> 
> 
> (yeah, though i suspect anyone trying to do this hypothetical "swap package $X
> for toybox" would want the _opposite_ mapping, from package name to all the
> commands. and i don't know of a way to ask apt that question?

  $ dpkg-query -L tar | grep bin/
  /bin/tar
  /usr/sbin/rmt-tar
  /usr/sbin/tarcat

> other than
> brute-forcing all of the executables in all of the directories in $PATH,
> anyway.)

Checking the $PATH would be clever but the above covers it for me.

There are some insane packages which crap binaries under /usr/lib, such as
/usr/lib/libreoffice/program/oosplash or /usr/lib/man-db/manconv and generally I
consider these packages to be maintained by madmen.

I mean honestly:

  $ cat /usr/bin/7z
  #! /bin/sh
  exec /usr/lib/p7zip/7z "$@"

Why would you do that? Why would ANYONE voluntarily DO that?

Rob


Re: [Toybox] [PATCH] toysh: fix -Wuse-after-free

2024-03-27 Thread Rob Landley
On 3/25/24 20:20, enh wrote:
> On Sun, Mar 24, 2024 at 12:45 AM Rob Landley  wrote:
> On 3/22/24 10:24, enh wrote:
> > On Thu, Mar 21, 2024 at 8:45 PM Rob Landley  wrote:
> >> Anyway, toys/android basically meant (to me), "commands that come from and
> >> are maintained by Elliott which I can't even test because they don't apply
> >> to a vanilla linux system that isn't running the full android environment".
> >> Although that's a personally idiosyncratic definition because I lumped
> >> selinux in with that;
> >
> > (heh. you beat me to it :-) )
> 
> If the new kconfig greyed out unavailable entries and had a status line
> saying "depends on TOYBOX_ON_ANDROID" or similar when you cursored over a
> greyed out entry...
> 
> ah, as the kind of lunatic who only ever edits these files by hand with vi,
> i'd actually just assumed that was kind of the whole point of the _existing_
> kconfig stuff?

To me half the point is it's the same UI as configuring the linux kernel,
busybox, and buildroot. Meaning A) a bunch of people out there are familiar with
it already, B) presumably the worst sharp edges have been filed off over the
past 15 years.

> (to be fair,  i did launch it once, but saw it was a ridiculously deeply 
> nested
> ui [and not expanded by default?], and thought "i don't understand the purpose
> of this", couldn't see how to search,

It literally has help text at the top of the screen.

Forward slash is search, cursor up and down, space to toggle the highlighted
thingy, enter to go into a menu, ESC to back out again, ESC from the top level
to exit (it prompts you whether or not you want to save), ESC twice from _that_
to abort the exit.

There's also a menu at the bottom, where if you cursor left and right it
highlights different things, and the ENTER will do that thing instead. (The
default is "select". I cursor right to "help" and hit enter because I never
remember that ? is the hotkey for that.)

Mostly I'm assuming "same UI as linux kernel" is like 2/3 of the userbase 
though.

> and immediately went back to editing by
> hand. at least that way i only need to know how to use my editor, which i need
> to know regardless :-) )

Dependency resolution comes to mind.

> If we really wanted to rush this, I could make a TOYBOX_UNFINISHED symbol
> that the pending stuff could depend on, and then the blocker is the kconfig
> replacement...
> 
> no, i've been cursing the broken tab-complete for -- wow, almost a decade now!
> -- so i think i can survive :-)

I admit I sometimes do "ls toys/*/skel*" when I can't remember whether I called
it "example" or "examples".

> Not THIS release though. Working on release notes! (And lowering my
> standards on the todo list.)
> 
> indeed... something that benefits the handful of folks working on toybox isn't
> worth much compared to something that benefits the users!

Working on it...

Rob


Re: [Toybox] [PATCH] toysh: fix -Wuse-after-free

2024-03-27 Thread Rob Landley
On 3/24/24 10:15, Oliver Webb wrote:
> On Sunday, March 24th, 2024 at 04:09, Rob Landley  wrote:
>> On 3/24/24 01:00, Oliver Webb wrote:
> 
>> This isn't the hard part. To me, the hard part is wanting to share lib/.c
>> code with this new binary, which implies it would live in toys/example/.c,
>> which means in the NEW design it would be a normal command that's "default
>> n"... and maybe depends on TOYBOX_BUILD or some such? Except moving stuff
>> from scripts/.c to toys/.c is conceptually ugly. But if we're getting rid of
>> the subdirectories... Maybe make.sh needs to be able to build commands that
>> DON'T live in toys/ but then...
> 
> There is a chicken and egg problem with the build infrastructure and kconfig 
> being a toy,

Yes, I know. That's why I've avoided it up until now.

> We need a .config file to build toys, and parsing the help text requires some
> kconfig parser, But we can't make a .config file until we have kconfig.

You don't need a .config file to build lib/*.c (policy, and why lib/lib.h is
separate from toys.h).

I've talked before about doing packaged minimal headers to build "sed" and "sh"
standalone, as part of toybox airlock stuff. (Possibly the full airlock command
list needed to build toybox.) Ones that assume all the config probes failed and
$LIBRARIES is empty and so on.

The EASY way to do that is to have a scripts/shipped/generated with handcrafted
header files, and then stick -I scripts/shipped at the start of $CFLAGS.

The hard way involves more cleanup so there are fewer header entry points, and I
could have a single "hairball.h".

The individual toys/*/*.c files only #include toys.h and generated/flags.h
(which could be a re-include of toys.h with a little #ifdef cleverness). The
rest are all included from toys.h and main.c.

The above #ifdef cleverness could wrap the generated/ includes in toys.h in a
__has_include("hairball.h") or similar, so I could provide a single replacement
file with the collated stuff I need for specific commands to build in a way that
assumes the host system has no brain, and then generated/build.sh it. The
problem is the includes are in two places: "generated/config.h" comes before
lib/portability.h (which comes before everything else so it can override
standard header #includes), and then the rest of the generated/*.h are after a
#define NEWTOY() and OLDTOY() so there's some reordering to do to combine them
all into one header.

(I don't think any of the generated/*.h files care about stuff in portability.h?
There would be various structs used before they were defined in
generated/globals.h if it got moved up, but that _should_ be ok? Nothing takes
the sizeof() them or similar that early. Eh, I should be able to work it out,
just haven't sat down to try yet. Far too many already open cans of worms...)

And then there's the #includes in main.c, the other half of the
declaration/definition pair for various global data: that needs newtoys.h,
help.h, and zhelp.h. One of which is already chopped out by a config option, the
second of which just needs a way to stub it to "", which leaves newtoys.h to
address...

> The solution I thought of was to use the infrastructure that we will have to
> have to remove bash and gsed dependencies to build kconfig as an early step
> in the process.

No.

>  But then we will
> still need to extract the help text.

config TOYBOX_HELP
bool "Help messages"
default y
help
  Include help text for each command.

You can configure help out entirely. (This isn't CONFIG_HELP the command, this
is "the help subsystem" in the toybox general settings menu.) This was
intentional.

> Do you plan on not keeping 2 different kconfig parsers or moving scripts/*.c
> to toys/example

Look up at the first paragraph of mine you quoted in this email.

It's an open question, but stripping down a "cc -I scripts/prebuilts main.c
lib/*.c toys/*/{abc,def,ghi,jkl}.c"  build so it could provide commands with
nothing but a compiler would be a step towards that.

Modulo that "cc *.c" doesn't parallelize across processors because C++
developers took over compiler development about 2 years after the Core Duo hit
the market and brought SMP to the cheap retail mainstream, at which point making
compilers better rather than merely more complicated hit a sudden brick wall.

And thus even on my ancient 4x laptop:

$ time make clean defconfig toybox
...
real    0m16.170s
$ time generated/build.sh
...
real    0m27.474s

I don't want to significantly slow down the build by compiling prerequisites? In
theory:

$ time gcc -I . main.c lib/*.c -o blah
...
real    0m1.780s

(Yeah exits with a link error but that's not the point.)

And I mean yeah, 2 seconds, not that big a deal. But I'd pr

Re: [Toybox] hexdump tests.

2024-03-27 Thread Rob Landley
On 3/25/24 10:42, enh wrote:
> On Sun, Mar 24, 2024 at 1:40 AM Rob Landley  wrote:
>>
>> On 3/22/24 15:02, enh wrote:
>> >> > CANONICALIZE_SPACE_IF_RUNNING_HOST_VERSION=1? so we trust ourselves but
>> >> > no-one else? :-)
>> >>
>> >> I _don't_ trust myself, and I'm not special. (That's policy.)
>> >
>> > yeah, but that's why i suggested
>> > CANONICALIZE_SPACE_IF_RUNNING_HOST_VERSION --- that way we can say "we
>> > can't make hard assertions about the _host's_ whitespace, but we can
>> > still make hard assertions about _ours_". if we just canonicalize all
>> > the whitespace all the time, we can't (say) ensure that columns line
>> > up or whatever.
>>
>> Or we could just "NOSPACE=1 TEST_HOST=1 make tests" if that's the test we
>> want to run...?
> 
> it's not though. that's my point. there are several cases:
> 
> 1. testing toybox --- we know what whitespace we're expecting to
> produce, and want tests to protect against regressions.
> 
> 2. testing host tools --- we _don't_ have control over what whitespace
> the host produces.
>   a) in some cases we manually mark individual tests to show "we don't
> care about host whitespace for this test case".
>   b) sometimes this applies to _all_ the tests for a toy.
> 
> we're talking about case 2b here, which is currently the
> least-well-supported variant.

You can NOSPACE=1 in an individual tests/command.test and it should last until
the end of the file? That's why scripts/test.sh does:

  # Run command.test in a subshell
  (. "$1"; cd "$TESTDIR"; echo "$FAILCOUNT" > continue)

So the variables and functions and so on defined in one test don't leak into
others. I spent like 3 commits getting that to work properly, the last of which
was commit 07bbc1f61280 and mentions the previous 2.

> i think we're talking at cross purposes because _i'm_ talking about
> variables set _within the tests, by the tests themselves_ and you're
> talking about variables set on the command-line, which i don't think
> make any sense here, because we're talking about properties of the
> individual tests/commands.

There are three scopes:

1) Variables exported into all tests

POTATO=1 make tests

2) Variables set for a single test:

POTATO=1 testcmd "thingy" "-x woo" "expected\n" "file" "stdin"

3) Variables set for the current test file.

[ -n "$TEST_HOST" ] && NOSPACE=1

Which is just a normal assignment (or export) in a tests/file.test, they go away
at the end of the current file (because of the above parenthetical subshell
calling it), and which was the new thing I added in 2022.
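
The file-level scope (3) falls out of sourcing each test file inside a
subshell. A minimal sketch, using a throwaway temp file as a hypothetical
stand-in for a tests/command.test (the real harness does more than this):

```shell
# Hypothetical stand-in for a tests/command.test file that sets NOSPACE.
testfile="$(mktemp)"
echo 'NOSPACE=1; echo "inside: NOSPACE=$NOSPACE"' > "$testfile"

unset NOSPACE
# Source it in a subshell, like the "(. "$1"; ...)" line quoted above.
(. "$testfile")
# The assignment died with the subshell, so the parent never sees it.
echo "after: NOSPACE=${NOSPACE:-unset}"
rm -f "$testfile"
```

The sourced file sees NOSPACE=1, but the parent shell still reports it as
unset afterwards, so the next test file starts clean.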

I remember my first attempt at this years ago ctrl-c didn't work reliably, but
the fix to that was just a trap at the top of scripts/test.sh:

  # Kill child processes when we exit
  trap 'kill $(jobs -p) 2>/dev/null; exit 1' INT

> (unless you really do want to say "there's absolutely nothing we can
> do about host whitespace, so give up completely", which i think has
> yet to be proven that it's _that_ bad. but there are commands where
> having a test that says "this whitespace -- that toybox produces -- is
> reasonable [but as long as the non-whitespace matches, and there's
> _some_ whitespace everywhere we have whitespace, we'll accept any
> whitespace from the host tool]".)

I think per-command [ -n "$TEST_HOST" ] && NOSPACE=1 might be reasonable. I'd
rather not blanket do it for all commands.

Rob


Re: [Toybox] Poke about the bc.c cleanup patches I submitted a while ago

2024-03-27 Thread Rob Landley
Yesterday I did NOT spend all my energy reading email, and instead got
https://landley.net/bin/toolchains updated with a musl 1.2.5 and or1k and riscv
in the list, and that seems to have fixed the sh2eb build break as well
(although I haven't tried booting it on a Turtle board yet, haven't unpacked any
here in Minneapolis...) and rebuilt all the mkroot targets against the 6.8
kernel (the tmpfs patch went upstream-ish but the rest all still apply, none of
those issues will ever voluntarily be fixed by the kernel clique), and the tests
told me I need kernel/qemu configs for armv4l armv7m microblaze mips64 riscv32
riscv64 sh4eb, which reminded me of my "make the fdpic loader work on sh4 with
mmu work" which should become another patch and get finished now that I've got
updated toolchains with the sh4 longjmp bug fixed...

But today I'm being good and back to spending my energy responding to email 
instead.

On 3/24/24 21:45, Oliver Webb wrote:
> On Sunday, March 24th, 2024 at 18:27, Rob Landley  wrote:
> 
>> > I've been looking to do a cleanup pass on bc because there are a lot of
>> > very obvious things that can be removed (typedefed structs as far as the
>> > eye can see, all the "posixError" garbage,
>>
>> Agreed. I still haven't decided whether to throw it out and start over, but
>> you can't make it worse. (Your cleanup patch broke xzcat, but I can't tell
>> if this one is right or wrong outside of its test suite already, and only
>> really care about the kernel timeconst.bc use case anyway, so...)
> 
> Permission to remove the annoying signal handling that only really matters
> (gets in the way of exiting) on interactive sessions?

"You can't make it worse."

>> Why typecast at all? You're assigning to a variable of that size, shouldn't
>> the typecast do the assignment? (Does this suppress a warning or something?)
> 
> I did ":%s/uchar/char/g" instead of going over every individual use of
> "uchar". This patch (attached) removes a lot of those unnecessary typecasts,
> and cleans up the code formatting a lot, among other things like getting rid
> of the posixError stuff, about 350 lines removed
> 
>> Is sizeof(char) ever not 1?
> 
> There is support for multi-byte chars in gcc (i.e. "char x = 'ABCD';")

That's a character literal (which has type int), not a char variable.
Assigning it to a char will give you... I'm going to guess 'D'.

> but no one uses that terrible extension, to my knowledge

It seems to warn about using it by default, even:

$ cat test2.c
#include <stdio.h>

int main(int argc, char *argv[])
{
  char c = 'ABCD';

  printf("%d\n", c);
}
$ gcc test2.c
test2.c: In function ‘main’:
test2.c:5:12: warning: multi-character character constant [-Wmultichar]
   char c = 'ABCD';
            ^~
test2.c:5:12: warning: overflow in conversion from ‘int’ to ‘char’ changes value
from ‘1094861636’ to ‘68’ [-Woverflow]
$ ./a.out
68
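
The two numbers in that warning can be checked with plain shell arithmetic. This is only a sketch, assuming gcc's (implementation-defined) habit of packing a multi-character constant one byte per character with the first character ending up in the highest byte:

```shell
# Pack 'ABCD' the way the gcc warning above reports it (assumption: each new
# character shifts the earlier ones up by 8 bits, so 'A' lands in the top byte).
packed=$(( (65 << 24) | (66 << 16) | (67 << 8) | 68 ))   # A=65 B=66 C=67 D=68

# Storing that int into an 8-bit char keeps only the low byte.
low=$(( packed & 255 ))

echo "$packed $low"
```

That prints 1094861636 68, which are the "from" and "to" values in the -Woverflow message, and 68 is ASCII 'D'.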

>> > or the xz stuff,
>>
>> If you want to peel out individual upstream public domain xz patches and 
>> adapt
>> them (one at a time) to apply to toybox's xzcat, I'd be very interested in
>> reading and applying the results.
> 
> The main problem is that it takes a lot of work to patch upstream stuff and 
> not break everything,
> I'll see what I can do, but I can't guarantee that I'll be able to get the 
> bigger blocks of code
> like the ARM64 decoder in.
> 
>> > nor the csplit regressions I started to patch out,
>>
>> What were the csplit regressions?
> 
> A lot of things since I was testing the command manually when I first wrote 
> it,

A test suite that TEST_HOST passes would be nice. I have the start of one, but
csplit is such an utterly terrible command (a half-assed sed that only wants to
write to files), I can't wrap my head around what anybody would ever WANT to use
it for.

I mean why have "prefix" and "suffix" when suffix is an arbitrary sprintf
string? Prefix on WHAT, it's not adding in the input filename, and you can't if
you try:

$ seq 1 10 | csplit - 2 %4% 7 -b '%s'
csplit: invalid conversion specifier in suffix: s

I checked busybox to see if they had tests, but the only mention of csplit in
the entire git tree there is docs/posix_conformance.txt under "Tools not 
supported".

>> Glancing at pending, I don't have a test environment for
>> arp, arping,
> 
> Networking administration stuff for ARP caches that can manipulate kernel 
> ARP table entries,
> would probably require mkroot to test safely.

Yes, I know.

>> bootchartd,
> 
> A command with no standard; Described as "bootchartd is commonly used to 
> profile the boot process."

Re: [Toybox] test.sh: Don't override "C" command path in TEST_HOST if it's set

2024-03-24 Thread Rob Landley
On 3/24/24 18:40, Rob Landley wrote:
>> Also, different command names, there's a dozen different vi implementations 
>> and 
>> only a few have the name "vi". This is true for some other commands as well
> 
> I've been doing:
> 
>   mkdir sub
>   ln -s $(which potato) sub/vi
>   PATH=$PWD/sub:$PATH make tests
> 
> Comes up a bit already, such as testing toybox tar --xform which requires 
> toybox
> sed, and thus even the standalone test skips those unless you put toybox sed 
> in
> the $PATH.
> 
> In theory you could PATH=$PWD/sub:$PATH TEST_HOST=1 make test_vi above, in 
> which
> case "C" should wind up pointing into sub...

P.S. I don't want to commit to there still BEING a "C" a year from now. That's
an internal implementation detail, not an API.

Rob


Re: [Toybox] test.sh: Don't override "C" command path in TEST_HOST if it's set

2024-03-24 Thread Rob Landley
On 3/22/24 16:10, Oliver Webb wrote:
>> On 3/21/24 21:38, Oliver Webb via Toybox wrote:
>> 
>> > A mildly annoying issue if you are trying to test with different 
>> > implementations of commands
>> > such as plan9 ones or sbase or busybox ones, things with different 
>> > conflicting implementations
>> > of things like xxd or vi. With this patch you can do "make test_cmd 
>> > TEST_HOST=1 C=/path/to/other/cmd"
>> > and have it work
>> 
>> I've been doing "PATH=/path/to/thingy:$PATH TEST_HOST=1 make test_cmd" for
>> years, I didn't know that needed to be documented...
> 
> plan9 has an incompatible diff implementation, which means to test plan9 utils 
> I'd
> either need to separate diff from the rest of the binaries or have some way 
> of overriding "C".
> 
> Also, different command names, there's a dozen different vi implementations 
> and 
> only a few have the name "vi". This is true for some other commands as well

I've been doing:

  mkdir sub
  ln -s $(which potato) sub/vi
  PATH=$PWD/sub:$PATH make tests

Comes up a bit already, such as testing toybox tar --xform which requires toybox
sed, and thus even the standalone test skips those unless you put toybox sed in
the $PATH.

In theory you could PATH=$PWD/sub:$PATH TEST_HOST=1 make test_vi above, in which
case "C" should wind up pointing into sub...
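
The same trick as a self-contained sketch. The stand-in command (cat) and the scratch directory are assumptions for the demo, standing in for "potato" and ./sub; the only point is which path the name "vi" resolves to once the directory is first in $PATH:

```shell
# Stand-in for the "potato" editor above: any host command works for the
# demonstration, so this borrows cat (an assumption, not a real vi).
impl=$(command -v cat)

sub=$(mktemp -d)              # scratch directory playing the role of ./sub
ln -s "$impl" "$sub/vi"       # expose it under the name the tests look for

# With sub first in $PATH, looking up "vi" finds the symlink before any
# host vi. The subshell keeps the PATH change from leaking out.
found=$(PATH="$sub:$PATH"; command -v vi)
echo "$found"

rm -rf "$sub"
```

Whatever the test script then derives from $PATH should pick up the same sub/vi entry.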

Rob


Re: [Toybox] [PATCH] toysh: fix -Wuse-after-free

2024-03-24 Thread Rob Landley
On 3/24/24 01:00, Oliver Webb wrote:
> I've done some research on this too, we have no "select" statements in any of 
> our config symbols,

for a definition of "we" that is "I have intentionally not merged any", since I
review and approve all the kconfig command sections in the headers and have been
tracking that. (At one point the config2help.c stuff was trying to stitch
dependencies together to merge help text, and didn't understand complicated 
syntax.)

That said, forking the kconfig language definition is not something I do
lightly. Ours has fallen way behind the kernel's, and thus looks like something
else but is only compatible with a subset of it. We are about to _shrink_ that
subset. This needs a FAQ entry at least.

> but we do have a fair amount of that ""SYMBOL && (SYMBOL||SYMBOL)"" 
> expression processing that's
> annoying to deal with.

I was referring to that, yes. I need to implement processing for it. I've
already implemented such processing in find, test, and twice in toysh (both
command && command and $((math && math)) ).
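
For the plain boolean case, the shape of the problem can be seen by letting the shell's own $(( )) do the evaluation. A rough sketch only: the symbol names are hypothetical, each is assumed to be already resolved to 1 (enabled) or 0 (disabled), and this is not what the toybox scripts currently do:

```shell
# Hypothetical symbols, already resolved to 1 (enabled) or 0 (disabled).
SYM_A=1
SYM_B=0
SYM_C=1

# "depends on SYM_A && (SYM_B || SYM_C)"
# $(( )) already understands &&, || and parentheses with C precedence.
visible=$(( $SYM_A && ($SYM_B || $SYM_C) ))

# Switch the first symbol off and the whole expression goes false.
SYM_A=0
hidden=$(( $SYM_A && ($SYM_B || $SYM_C) ))

echo "$visible $hidden"
```

A real parser still has to tokenize the symbol names and look each one up before anything like this can run, which is where the work is.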

> Also a "choice" block and a few number ranges in the main Config.in we will
> need to deal with in some way, the depends/selects stuff seems easy but with
> that expr evaluating probably isn't

Yes, I know.

> I tried to write a kconfig parser (As a toy to make the codesharing easier)

I've written a bunch, and mostly thrown them away again. There's a simple one
in scripts/config2help.c and I wrote one in python at
https://landley.net/hg/kdocs/file/tip/make/menuconfig2html.py which generated
https://landley.net/kdocs/menuconfig/ way back when. (Those are the only two
published ones that come to mind, but I've written more over the years.)

> and got absolutely nowhere. The approach I took to it was...

This isn't the hard part. To me, the hard part is wanting to share lib/*.c code
with this new binary, which implies it would live in toys/example/*.c, which
means in the NEW design it would be a normal command that's "default n"... and
maybe depends on TOYBOX_BUILD or some such? Except moving stuff from scripts/*.c
to toys/*.c is conceptually ugly. But if we're getting rid of the
subdirectories... Maybe make.sh needs to be able to build commands that DON'T
live in toys/ but then...

Unanswered design questions looming here, have not been jigsawed into an elegant
picture yet. (How much of that is assembling pieces and how much is SAWING THEM
UP I don't know yet...)

Anyway, it seems like config2help.c should also share this plumbing if it's
parsing the kconfig input anyway, which is convenient since I've been meaning to
rewrite all that too (and yes THAT has a motivating "somebody is waiting for me
to fix this", ala https://github.com/landley/toybox/issues/458 ), but there's
also the usage: line regularization
(https://landley.net/notes-2023.html#06-11-2023) and fixing the remaining
sub-options with maybe some sort of help text include syntax for inserting other
help texts at controllable points (as either blogged about or mentioned here on
the list, I'd have to check my notes to see where I left off on that)...

Once I've got the design worked out, coding it is usually the easy part.

Rob


Re: [Toybox] hexdump tests.

2024-03-24 Thread Rob Landley
On 3/22/24 15:02, enh wrote:
>> > CANONICALIZE_SPACE_IF_RUNNING_HOST_VERSION=1? so we trust ourselves but 
>> > no-one
>> > else? :-)
>>
>> I _don't_ trust myself, and I'm not special. (That's policy.)
> 
> yeah, but that's why i suggested
> CANONICALIZE_SPACE_IF_RUNNING_HOST_VERSION --- that way we can say "we
> can't make hard assertions about the _host's_ whitespace, but we can
> still make hard assertions about _ours_". if we just canonicalize all
> the whitespace all the time, we can't (say) ensure that columns line
> up or whatever.

Or we could just "NOSPACE=1 TEST_HOST=1 make tests" if that's the test we want
to run...?

>> Erik did lash (lame-ass shell) to be tiny, Ash was the bigass lump of 
>> complexity
>> copied out of debian or some such and nailed to the side of the project by 
>> that
>> insane Russian developer who never did learn English and communicated 
>> entirely
>> through a terrible translator program (so any conversation longer than 2
>> sentences turned into TL;DR in EITHER direction, he was also hugely 
>> territorial
>> about anybody else touching "his" code), and msh was the minix shell mostly 
>> used
>> on nommu systems.
> 
> did lash _stay_ tiny?

Yes, but it was also borderline unusable.

> i feel like the trouble with projects like that
> is usually that no-one can agree on what's necessary versus bloat, so
> you trend towards just being a bad implementation of whatever. iirc
> inferno had _two_ different "tiny" shells.

Erik implemented something tiny for his own personal use, and ignored everybody
else who tried to add stuff to it.

When Erik moved on, I studied it. When I moved on, Bernhard removed it:

  https://git.busybox.net/busybox/commit/?id=96702ca945a8

>> > because, to be fair to the confused, in english
>> > "pending" _can_ legitimately mean "almost there". whereas your whole point 
>> > with
>> > pending is "i actually have _no_ idea how close this is yet".
>>
>> Linux has drivers/staging but I didn't like that.
> 
> yeah, "staging" also sounds very much like "nearly there!".

The problem is motivated reasoning. We could call the directory
instant_death_do_not_touch and people would still enable stuff in it to see if
it worked for them. (And then ship it when it Worked For Them.)

Rob


Re: [Toybox] [PATCH] toysh: fix -Wuse-after-free

2024-03-24 Thread Rob Landley
On 3/22/24 10:26, enh wrote:
> On Fri, Mar 22, 2024 at 8:24 AM enh  wrote:
>> (tbh, just merging "lsb" into "other" would be a step forwards. wtf
>> is/was "lsb" anyway? and while i can _usually_ guess "POSIX or not?"
>> correctly, "lsb or other" is impossible by virtue of being
>> meaningless.)
> 
> (and to be clear, although "lsb" is particularly obscure, i think this
> is the same problem busybox's organization has: why do i have to care
> whether something is in coreutils or linux-utils or procps? how is
> that relevant to me?

There's a reason I didn't use that as an organizing method. Although I did try
to map them at the end of the roadmap, and need to redo that analysis now since
it's been a while...

> the best answer i can think of is "because i want
> to only use toybox/busybox to replace _that_ package", but i don't
> think the _directory structure_ helps there, right? that hypothetical
> person actually wants more metadata in the kconfig part of the comment
> inside each file?)

That's the theoretical use, yes. So distros (and system builders like gentoo,
buildroot, yocto, etc) can annotate package alternatives so if you want to
install busybox's tar instead of gnu tar your package management system could
cope. In practice, making something like dpkg handle that was near impossible,
and buildroot only did it because the maintainer of busybox created buildroot. I
tried to add toybox to buildroot years ago and...

https://lists.buildroot.org/pipermail/buildroot/2014-September/409298.html

People still try from time to time:

https://lists.buildroot.org/pipermail/buildroot/2017-January/181960.html
http://lists.busybox.net/pipermail/buildroot/2022-September/652474.html

But even a build system that ALREADY lets you swap in/out busybox vs gnu
versions of packages accomplished that by hardwiring busybox support deep into
its build system.

Getting something like debian to do that on the fly is... it's not really
designed for it.

I can think of better ways to do it (and am studying debian's build system in my
copious free time), but I've been busy with other things and most people aren't
motivated to try...

I note that I did it by hand back when creating aboriginal linux, which is what
led me to maintaining busybox in the first place, ala:

https://landley.net/aboriginal/old/

> When the Firmware Linux project started, busybox applets like sed and sort
> weren't powerful enough to handle the "./configure; make; make install" of
> packages like binutils or gcc. Busybox was usable in an embedded router or
> rescue floppy, but trying to get real work done with it revealed numerous
> bugs and limitations.
> 
> Busybox has now been fixed, and in Firmware Linux Busybox functions as an
> effective replacement for bzip2, coreutils, e2fsprogs, file, findutils, gawk,
> grep, inetutils, less, modutils, net-tools, patch, procps, sed, shadow,
> sysklogd, sysvinit, tar, util-linux, and vim. (Eventually, it should be
> capable of replacing bash and diffutils as well, but it's not there yet.)

That's the old page from before I restarted the project and renamed it
Aboriginal Linux (based on QEMU instead of User Mode Linux, ala
https://landley.net/notes-2005.html#27-10-2005). Before that I was going through
the Linux From Scratch package list and _disposing_ of gnu packages, one by one,
as I got busybox to replace them.

But doing the mapping yourself with "dpkg-query -S $(which $NAME)" is pretty
easy on debian...

Rob


Re: [Toybox] [PATCH] toysh: fix -Wuse-after-free

2024-03-24 Thread Rob Landley
On 3/22/24 10:24, enh wrote:
> On Thu, Mar 21, 2024 at 8:45 PM Rob Landley  wrote:
>> Anyway, toys/android basically meant (to me), "commands that come from and 
>> are
>> maintained by Elliott which I can't even test because they don't apply to a
>> vanilla linux system that isn't running the full android environment". 
>> Although
>> that's a personally idiosyncratic definition because I lumped selinux in with
>> that;
> 
> (heh. you beat me to it :-) )

If the new kconfig greyed out unavailable entries and had a status line saying
"depends on TOYBOX_ON_ANDROID" or similar when you cursored over a greyed out
entry...

There _is_ a way to collapse everything together into one directory and make it
manage-ish-able. But there are currently 52 command files in pending, and "ip.c"
alone is 6 commands and 3000 lines of "we already have route and ifconfig and
iptables and so on as separate commands, why did they do it again?"

>> It's been the status quo for a dozen years now (commit 3a9241add947 in 2012) 
>> and
>> moving everything AGAIN would have costs, so I'd want a reason and assurance
>> that we're not going to change our minds again.
> 
> for me the holy grail is "tab complete works and i don't have to think
> about arbitrary partitions".

It's a good point.

> i think "not yet default 'y'" is pretty
> defensible (though the reason we're having this discussion is because
> people _don't_ read "pending" as "danger, keep out!"), but the rest
> seem so arbitrary.

I'd like there to not BE "danger, keep out" in the tree, but a certain large
Korean company wanted their contributions checked in, I fell behind, and it
snowballed from there.

>> Collapsing the directories
>> together when the last command is promoted (or deleted) out of pending might
>> make sense, figuring out what to do about example/ (trusting to the demo_ 
>> prefix
>> to annotate the example commands is nice, but hello.c hostid.c logpath.c and
>> skeleton.c would need... something).
> 
> no, i think example/ is defensible too. (i'd argue you're only ever
> going to look in there if you have a _reason_ to. or you've done a
> `grep -r` for something you're changing/checking all references to.
> the reason i completely forgot about example/ is that it never causes
> me the "where the fuck is _mount_?!" annoyance :-) )

Right now everything is at the same level. Having files at two different levels
is not a simplification.

Designing a way to have toys/*.c with no subdirectories and make it manageable
seems a reasonable goal, if tricky to get to. Having toys/*.c _and_ toys/*/*.c
does not smell like an improvement?

We've got: android  example  lsb  net  other  pending  posix

Pending needs everything cleaned up and promoted or deleted. Posix can be a
defconfig file. Example can be commands that "default n". Android isn't
necessary if a kconfig replacement greys things out instead of hiding them and
displays WHY they're greyed out when you cursor over them (and the rewrite is
needed to address pull request 332). Other, net, and lsb aren't sufficient
distinction to persist in the absence of other directories.

And that's all of them, I think?

If we really wanted to rush this, I could make a TOYBOX_UNFINISHED symbol that
the pending stuff could depend on, and then the blocker is the kconfig
replacement...

Not THIS release though. Working on release notes! (And lowering my standards on
the todo list.)

Rob


Re: [Toybox] [PATCH] more.c: More stuff, down cursor key scrolls down. Also stuff about less

2024-03-24 Thread Rob Landley
On 3/21/24 06:52, Jarno Mäkipää wrote:
> On Thu, Mar 21, 2024 at 1:08 AM Rob Landley  wrote:
>> >> > There is also a testing problem. vi.c doesn't do TEST_HOST because it 
>> >> > needs a -s option
>> >> > to pass in scripts to test with.
>> >>
>> >> Which is an issue I need to figure out how to address. What does a test 
>> >> that
>> >> only toybox passes actually prove? (That it hasn't changed since we last
>> >> looked at it?)
>> >
>> > There is vi -c which performs an ex command which we could implement
> 
> I took -s from vim, so toybox vi could be tested comparing to vim,
> since vi itself does not have -s. And I was not interested in -c since
> ex was out of the scope of implementation at that time.

I'm not saying it's bad, I'm saying it's not sufficient. (The toysh tests have
_both_ "testcmd" and "shxpect" tests.)

Also, I'm not UPSET that someone's been making vi usable. Something is better
than nothing and I'm thankful. I'm just really annoyed at myself for not having
been able to get to it myself in a reasonable amount of time.

The vi that's there has users, and at some point I _do_ need to go through and
digest it all and wrap my head around it and take ownership of the thing, but I
haven't even managed to reboot my laptop for months to install the new devuan
version and put the 16 gig memory sticks back in, because I've been opening tabs
as fast as I've been closing them and trying to close them turns into "let me
fix this one thing real quick"... (It's like trying to pack bookshelves and
winding up reading books, which I also spent too much of last month doing.)

>> I leave vi to the people who are maintaining that vi. I got out of way for 
>> that
>> command.
>>
> 
> Well, I'm not sure who is "maintaining" vi.c at this point. I wrote the base
> implementation years ago, and Elliott extended it with a few commands
> because he had some use case for it. But mostly development has been
> dormant for a few years, with a few segfault bugfixes here and there. It's not
> a very pleasant experience to maintain it, since everything leads to huge
> bikeshedding: there is no particular standard to follow, and
> everyone wants different things.

Indeed. I taught an "intro to unix" course at Austin Community College many
moons ago which had like 20 vi keys on the syllabus (half of which were new to
me, and most of which I've forgotten again). And every time I install a fresh
debian I have to go through my checklist including:

  sudo ln -sf vimrc /etc/vim/vimrc.tiny && echo export EDITOR=vi >> ~/.profile

Because going into "insert" mode and having the cursor keys crap capital letters
all over your text is stupid (this vimrc.tiny mode STILL RUNS THE SAME BIG
EXECUTABLE), and as with dash and upstart and mir and unity I suspect Mark
Shuttleworth was behind it:

  https://mstdn.jp/@landley/112119853431329313

And no I'm not typing "vim" any more than gsed, gawk, or gmake...

> Also from what I understand reading
> your postings, you have never been very satisfied with it. And that is
> understandable.

The thing is, I'm not a vi expert any more than I was a sed expert before I
wrote my own sed (twice). At some point, I have to learn enough awk to write an
awk that can replace gawk in every package build in LFS and BLFS (and hopefully
someday AOSP), and I'm not looking forward to that. I know I _need_ to, but I'm
currently overwhelmed with half-finished stuff and am trying to dig out.

I'm somewhat familiar with the subset busybox chose for its vi, although that
was always missing several things I use, so good point of reference but not a
standard. And I need to read the posix standard for vi. And then I was going to
implement some low-hanging fruit and have people tell me what they missed...

>> >> I have been planning one all along, yes. The crunch_str() stuff I did was 
>> >> a
>> >> first pass at general line handling stuff that could be used by less and 
>> >> by
>> >> shell line editing and by vi and so on, but people wrote a vi that does 
>> >> not and
>> >> never will share code with the rest of those so that's off the table
>> >> permanently.
> 
> vi.c uses crunch_str from lib for utf8 handling, there were just a few
> corner cases where it needs to use the vi-only crunch_nstr, since it can't spit
> up text until nul all the time. vi.c tried to use some other
> functionality from lib also, but some of it got removed from lib and
> some functionality has probably been added way after vi.c was written
> in 2018-2020.

I tend to do passes over the whole tree from time to time cleaning stuff up and
modernizing it. (I re-review commands I hadn't seen 

Re: [Toybox] [PATCH] toysh: fix -Wuse-after-free > FYI lsb

2024-03-23 Thread Rob Landley



On 3/22/24 18:09, scsijon wrote:
> Date: Fri, 22 Mar 2024 08:24:18 -0700
> 
>> From: enh 
>> To: Rob Landley 
>> Cc: Oliver Webb , toybox
>>  
>> Subject: Re: [Toybox] [PATCH] toysh: fix -Wuse-after-free
>> Message-ID:
>>  
>> Content-Type: text/plain; charset="UTF-8"
>>
>> On Thu, Mar 21, 2024 at 8:45 PM Rob Landley  wrote:
>>> On 3/17/24 14:52, Oliver Webb wrote:
>>>> On Thursday, March 14th, 2024 at 12:04, enh  wrote:
>>>>> at a high level, it does seem like many/most people interpret "pending" 
>>>>> as "almost done" (he says, being part of the problem himself, having 
>>>>> several pending things building and shipping on all Android devices) 
>>>>> whereas in actual fact it can mean anything from "yeah, actually pretty 
>>>>> much done" to "will be completely rewritten" via "still just trying 
>>>>> random experiments trying to work out _how_ this should be rewritten".
>>>>> sadly i don't have a better suggestion...
>>>> pending/experimental and pending/functional maybe, or something along that 
>>>> gist?
>>> That would be my "not adding more complexity to manage transient clutter 
>>> that
>>> should instead go away" objection, already made.
>>>
>>>> Then again it'd make it harder to track the history of pending commands, 
>>>> adding only new ones
>>>> to those 2 directories would fix that, but would make the organizational 
>>>> problem for the old
>>>> ones worse.
>>> https://en.wikipedia.org/wiki/Fundamental_theorem_of_software_engineering
>>>
>>> Stop. No. Halt. Wait. Hold it. Woah. Cease. Desist. Caution severe tire 
>>> damage.
>>> Klaatu barada nikto. Subcalifragilisticexpialidocious.
>>>
>>>>> a branch would be the usual git option, but that would probably mean "no 
>>>>> pending stuff in the main branch"
>>>> Also a problem if you want to switch Version Control systems or distribute 
>>>> tarballs without a .git/ directory.
>>> I already DID switch version control systems (from mercurial to git), and I
>>> already distribute release tarballs. Why do you think these are new issues?
>>>
>>>> It'd hide these commands too,
>>> I want to close tabs. I am not creating additional scaffolding for clutter
>>> management:
>>>
>>> $ ls -d */toys
>>> clean3/toys  clean8/toys     github/toys  kl4/toys  kl9/toys      toybox/toys
>>> clean5/toys  clean.old/toys  kl10/toys    kl6/toys  kleen/toys
>>> clean6/toys  clean/toys      kl2/toys     kl7/toys  kl/toys
>>> clean7/toys  debian/toys     kl3/toys     kl8/toys  release/toys
>>>
>>> I already try not to publish quite as much clutter as accumulates locally.
>>>
>>> There's some real fossils checked into the tree. I started work on gene2fs 
>>> back
>>> under busybox, checked in what I had to the toybox repo in 055cfcbe5b05 in 
>>> 2007
>>> and haven't LOOKED at it this decade because I just haven't gotten back 
>>> around
>>> to it. Since then they INVENTED EXT4. (I still hope to get back to it, but 
>>> at
>>> the moment I'm answering email.)
>>>
>>>> For the first time I checked if there were any special branches in the 
>>>> repo because
>>>> I didn't bother to think about that in the months I spent working on it.
>>>>
>>>>> i still struggle between "other" and "lsb" in particular.
>>>> Same here, I can remember the posix commands.
>>> Can you? I still have to check some from time to time, and the definition of
>>> whether "tar" is a posix command or not is outright eldritch bordering on 
>>> quantum.
>>>
>>>> But I don't care about LSB enough to
>>> memorize everything it wants. And keeping all completed commands that 
>>> aren't in posix,
>>>> lsb, networking or android
>>> The "example" directory is important because it's the only other directory 
>>> of
>>> commands that should not "default y" in defconfig. It has a policy 
>>> distinction.
>>>
>>> Back in 2012, when the number of commands was growing fast and having one 
>>> big
>>> directory of them all was getting a bit busy, the alternative of sorting 
>>> them

Re: [Toybox] [PATCH] toysh: fix -Wuse-after-free

2024-03-23 Thread Rob Landley
On 3/21/24 23:59, Oliver Webb wrote:
> On Thursday, March 21st, 2024 at 22:45, Rob Landley  wrote:
>> On 3/17/24 14:52, Oliver Webb wrote:
>> > Same here, I can remember the posix commands.
>> 
>> Can you? I still have to check some from time to time, and the definition of
>> whether "tar" is a posix command or not is outright eldritch bordering on 
>> quantum.
> 
> I can certainly remember them better than the LSB commands. Most of the time 
> I can
> remember if a command is in posix, which is what matters when trying to find 
> it usually.

Congratulations?

>> Collapsing the directories together when the last command is
>> promoted (or deleted) out of pending might make sense,
> 
> What would happen when a new command shows up and we need to evaluate it then?

Presumably once caught up there wouldn't usually be a dozen of them submitted
the same month, so I wouldn't fall far enough behind to need a dedicated waiting
room.

> Or glibc does a new release and yet another thing breaks we need to demote and
> re-promote eventually?

I don't de-promote commands because glibc does something stupid each new
release. That's just normal gnu/braindamage:

https://github.com/landley/toybox/issues/450
https://github.com/landley/toybox/pull/364
https://github.com/landley/toybox/issues/362

I de-promoted a command since last release because I rewrote lib/password.c in a
way that broke stuff and didn't want people poking me about it, which was me
being lazy/whelmed. Not having the option to do that is fine too, and would have
made that stay higher on the todo list. (I could also have "default n" it
without moving it, I do that locally all the time when in-progress changes break
stuff. The difference this time was I'd checked IN the stuff that broke a
command, and didn't want to revert it.)

>> I also note I think I've figured out how to replace kconfig: I can just make 
>> a
>> list that scrolls up and down with a highlighted entry you hit space on, 
>> handle
>> help text, search, exit/save, resolve selects and depends and have "menus" 
>> be a
>> label line with its contents nested two spaces further to the right.
> 
> [Some paragraphs bikeshedding about kconfig used to be here, may they rest in 
> a text file
> until we get around to doing the kconfig rewrite]

Technically a project's maintainer explaining upcoming design issues he actually
plans to implement isn't "bikeshedding".

Bikeshedding is vaguely related to the Dunning-Kruger effect, in which the
question "how hard can it be?" requiring some expertise to actually answer gets
people in trouble.

Cyril Parkinson is mostly known for Parkinson's Law (work expands to fill
available time) but he also came up with the bike shed example, where a
committee approving plans for a nuclear reactor defers to the experts enough
that at least its budget approval gets discussed quickly, but a committee
approving plans for a bike shed will argue far longer about every detail because
they think they could do it themselves and have strongly held opinions.

Everybody has an opinion on building the bike shed, and thinks their opinion is
equally valid as everyone else's with no deference to authority, experience, or
expertise. But the thing about a committee approving plans is they STARTED with
a viable plan for the thing, which they then ignore because they know better.

If you feel like I'm "bikeshedding" about a kconfig replacement when I was
involved in https://lkml.indiana.edu/hypermail/linux/kernel/0202.1/2037.html and
argued at length with Roman Zippel about https://lwn.net/Articles/160497/ and
dug rather a lot through busybox's fork of it back around
https://git.busybox.net/busybox/log/scripts/config?id=7a43bd07e64e and already
implemented scroll up/down/left/right list logic like I'm describing in the
"top" command... I think we have a different definition of the term.

>> > A possible solution is to...
>> 
>> ...
>> 
>> > Then again...
>> 
>> I need to stop checking email every time I sit down at my laptop, because
>> bikeshedding can eat an endless amount of time and I've got other stuff to 
>> do.
>> 
>> For one thing, I promised to look at
>> https://github.com/landley/toybox/issues/486 tonight.
> 
> Sorry for getting in the way of that, the technical discussion about it was
> interesting enough to me to respond to. Recently found something to run off to
> and do while still benefiting toybox, so I'll stop bikeshedding about stuff 
> like this.

I'm complaining about my own insufficient time management skills, I'm not trying
to discourage people from taking an interest in the project.

I do find "why is it like this" easier to deal with than "l

Re: [Toybox] test.sh: Don't override "C" command path in TEST_HOST if it's set

2024-03-22 Thread Rob Landley
On 3/22/24 16:11, Rob Landley wrote:
> On 3/21/24 21:38, Oliver Webb via Toybox wrote:
>> A mildly annoying issue if you are trying to test with different 
>> implementations of commands
>> such as plan9 ones or sbase or busybox ones, things with different 
>> conflicting implementations 
>> of things like xxd or vi. With this patch you can do "make test_cmd 
>> TEST_HOST=1 C=/path/to/other/cmd"
>> and have it work
> 
> I've been doing "PATH=/path/to/thingy:$PATH TEST_HOST=1 make test_cmd" for
> years, I didn't know that needed to be documented...

P.S. The point of C= being a path is otherwise shell builtins tend to get called
(so you're not necessarily testing what you think you are), and last I checked I
hadn't found a portable mechanism for disabling a specific shell builtin other
than providing a path to the command to run. (If you disable _all_ shell
builtins the test script could break due to missing commands on some systems.)
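
A rough sketch of why the path matters, in plain sh. The helper name is made up
for illustration and this is not what test.sh does internally: "command -v"
prints a bare word for builtins and a full path for files found via $PATH, so
only a name containing a slash is sure to skip the builtin.

```shell
# Hypothetical helper: report how the shell would resolve a command word.
resolves_to() {
  case "$(command -v "$1" 2>/dev/null)" in
    "") echo missing ;;   # not found at all
    /*) echo file ;;      # an executable found via $PATH (or given as a path)
    *)  echo shell ;;     # builtin, function, alias, or keyword
  esac
}

resolves_to echo       # the shell's own echo wins over any echo in $PATH
resolves_to /bin/sh    # a path always names a file
```

Which is why C=/path/to/cmd tests the file you meant, while a bare C=echo may
quietly test the shell instead.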

Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] test.sh: Don't override "C" command path in TEST_HOST if it's set

2024-03-22 Thread Rob Landley
On 3/21/24 21:38, Oliver Webb via Toybox wrote:
> A mildly annoying issue if you are trying to test with different
> implementations of commands such as plan9 ones or sbase or busybox ones,
> things with different conflicting implementations of things like xxd or vi.
> With this patch you can do "make test_cmd TEST_HOST=1 C=/path/to/other/cmd"
> and have it work

I've been doing "PATH=/path/to/thingy:$PATH TEST_HOST=1 make test_cmd" for
years, I didn't know that needed to be documented...
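
For the record, a minimal sketch of the $PATH trick on its own, outside the
toybox makefile. The helper and the "fakecmd" stand-in are invented for the
demo; the only assumption is that mktemp hands back a directory you can execute
files from.

```shell
# Hypothetical helper: run a command with DIR searched ahead of the usual $PATH.
with_path_first() {
  dir=$1
  shift
  # Subshell, so the caller's PATH is left alone afterwards.
  ( PATH="$dir:$PATH"; export PATH; "$@" )
}

# Demo with a throwaway stand-in command (the name "fakecmd" is invented).
demo=$(mktemp -d)
printf '#!/bin/sh\necho "stand-in version"\n' > "$demo/fakecmd"
chmod +x "$demo/fakecmd"
with_path_first "$demo" fakecmd
rm -rf "$demo"
```

Anything the test scripts look up by bare name gets found in the prepended
directory first, which is all the PATH= prefix on the make command line does.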

Rob


Re: [Toybox] [PATCH] toysh: fix -Wuse-after-free

2024-03-21 Thread Rob Landley
On 3/17/24 14:52, Oliver Webb wrote:
> On Thursday, March 14th, 2024 at 12:04, enh  wrote:
>> at a high level, it does seem like many/most people interpret "pending" as 
>> "almost done" (he says, being part of the problem himself, having several 
>> pending things building and shipping on all Android devices) whereas in 
>> actual fact it can mean anything from "yeah, actually pretty much done" to 
>> "will be completely rewritten" via "still just trying random experiments 
>> trying to work out _how_ this should be rewritten".
>> sadly i don't have a better suggestion...
> 
> pending/experimental and pending/functional maybe, or something along that 
> gist?

That would be my "not adding more complexity to manage transient clutter that
should instead go away" objection, already made.

> Then again it'd make it harder to track the history of pending commands,
> adding only new ones to those 2 directories would fix that, but would make
> the organizational problem for the old ones worse.

https://en.wikipedia.org/wiki/Fundamental_theorem_of_software_engineering

Stop. No. Halt. Wait. Hold it. Woah. Cease. Desist. Caution severe tire damage.
Klaatu barada nikto. Subcalifragilisticexpialidocious.

>> a branch would be the usual git option, but that would probably mean "no 
>> pending stuff in the main branch"
> 
> Also a problem if you want to switch Version Control systems or distribute 
> tarballs without a .git/ directory.

I already DID switch version control systems (from mercurial to git), and I
already distribute release tarballs. Why do you think these are new issues?

> It'd hide these commands too,

I want to close tabs. I am not creating additional scaffolding for clutter
management:

$ ls -d */toys
clean3/toys  clean8/toys     github/toys  kl4/toys  kl9/toys    toybox/toys
clean5/toys  clean.old/toys  kl10/toys    kl6/toys  kleen/toys
clean6/toys  clean/toys      kl2/toys     kl7/toys  kl/toys
clean7/toys  debian/toys     kl3/toys     kl8/toys  release/toys

I already try not to publish quite as much clutter as accumulates locally.

There's some real fossils checked into the tree. I started work on gene2fs back
under busybox, checked in what I had to the toybox repo in 055cfcbe5b05 in 2007
and haven't LOOKED at it this decade because I just haven't gotten back around
to it. Since then they INVENTED EXT4. (I still hope to get back to it, but at
the moment I'm answering email.)

> For the first time I checked if there were any special branches in the
> repo because I didn't bother to think about that in the months I spent
> working on it.
> 
>> i still struggle between "other" and "lsb" in particular.
> 
> Same here, I can remember the posix commands.

Can you? I still have to check some from time to time, and the definition of
whether "tar" is a posix command or not is outright eldritch bordering on
quantum.

> But I don't care about LSB enough to
> memorize everything it wants. And keeping all completed commands that aren't
> in posix, lsb, networking or android

The "example" directory is important because it's the only other directory of
commands that should not "default y" in defconfig. It has a policy distinction.

Back in 2012, when the number of commands was growing fast and having one big
directory of them all was getting a bit busy, the alternative to sorting them
into directories was annotating them with tags, and THAT was a nightmare (of the
"this command has three tags" variety). It also implied future pressure to
extend the existing kconfig implementation to USE the tags, which would be
worse.

Moving them into subdirectories, with each command in ONE directory, and a
README explaining what the directory was for, with kconfig automatically
displaying them in menus and using the first line of the README as the menu's
title, seemed the least bad crowd control option at the time.

> in a massive "other" folder sorta defeats
> the purpose of these directories which are supposed to reduce clutter.

It wasn't really about reducing clutter. I mean yes, back then some web viewers
wouldn't display more than 250 files in a directory, the way github truncates at
1000 today:

https://github.com/landley/linux/tree/master/arch/arm/boot/dts

But the goal was annotating command categories. Posix and pending are obvious,
and I mentioned example. Back when I split them up, LSB was still a viable
standard (the Linux Foundation hadn't destroyed it yet), and it STILL kind of
means "this command existed back in Y2K and was considered part of the base
system back then, even if posix never caught up". Several commands in pending
get promoted into LSB (such as most of the password stuff, although oddly
mkpasswd is NOT in lsb 4.1).

Hmmm, possibly instead of a dead standard the linux foundation killed, I should
instead check the $PATH of my old red hat 9 install from the dawn of time...
Hah, it's still on busybox's website:
https://busybox.net/downloads/qemu/rh-9-shrike.img.bz2 Login as user 

Re: [Toybox] [PATCH] toysh: Shut up TEST_HOST, correct 3 test cases

2024-03-21 Thread Rob Landley


On 3/17/24 10:23, Oliver Webb wrote:
> On Fri, Mar 8, 2024 at 19:46, Rob Landley wrote:
>> On 3/7/24 19:39, Oliver Webb via Toybox wrote:
>> > Looking at toysh again since the toybox test suite should run under it
>> > (in mkroot or under a chroot) A problem seems to be that there is no
>> > return command, which breaks runtest.sh to its core. Don't know how to
>> > add one in yet
>> >
>> > On my version of bash (5.2.26) TEST_HOST fails on 3 test cases,
>> > and toysh also fails on those cases (Even tho toysh is doing the right
>> > thing, the same as bash) The attached patch changes the test file
>> > so that 3 test cases are resolved. And TEST_HOST works
>>
>> Because Chet changed stuff I asked him about, making bash a moving target.
> This does bring up the question of what to do with specific edge cases. Since
> bash can’t even be consistent with itself, most bash scripts don’t rely on
> them, at least the ones I’ve seen.
> 
> Should we set out to implement every specific edge case, and if so what
> version are we conforming with? Or should we pick what’s most sensible/easiest
> to deal with and toyonly the test cases for them.

I've been studying the problem space since 2006, have read the bash manual all
the way through more than once, read some subset of the "Advanced Bash-Scripting
Guide", and was basically making judgement calls. Then Elliott got me talking
directly to the bash maintainer, which from my perspective made a lot of those
corner cases a moving target when they weren't before.

In fact my FIRST pass at this was matching the bash 2.04b behavior from like
1999 that I used in aboriginal linux, until gentoo's portage scripts needed
newer bash features, specifically =~ and some quoting corner case behavior...

"What should all those judgement calls be ahead of time, I demand preemptive
policy" does not personally strike me as helpful. I was mostly trying to
implement what seemed good to me (which still involves asking a LOT of questions
and turning them into test cases to see what bash's behavior actually IS), then
run the Linux From Scratch and Beyond Linux From Scratch package builds through
it to see what broke, then wait for people to complain and take it on a case by
case basis.

Rob


Re: [Toybox] [PATCH] more.c: More stuff, down cursor key scrolls down. Also stuff about less

2024-03-21 Thread Rob Landley
On 3/21/24 16:13, Oliver Webb wrote:
> On Thursday, March 21st, 2024 at 15:53, Rob Landley  wrote:
> 
>> I note that "more" is from the days of daisy wheel teletypes, and was thus
>> designed to work ok without a tty or interaction through cursor keys (you can
>> export $COLUMNS and $LINES or just let it guess 80x25), and "less" requires a
>> tty and cursor keys. This might make "more" a better fit for on-screen
>> keyboards that don't provide cursor keys. (Or not...)
> 
> less supports vi keys (hjkl), and all the keybindings of more. less doesn't
> require cursor keys in the same way vi doesn't, it's just how it's more
> commonly used.

Piping data through more doesn't allocate memory. Piping data through less
continues to allocate memory as data is accumulated. I don't know if there's a
backscroll limit, so I don't know if there's a limit on the amount of memory it
allocates.

>> I would like to have one implementation sharing code. Implementing "less -R"
>> cuts the behavior delta between the two, and having an option to let ctrl-c
>> exit less (instead of just killing the rest of the pipeline) probably gets us
>> close enough to handwave the rest?
> 
> There is less -K and less -E (exit on C-c and exit at EOF respectively),

Good to know.

> so more_main would look something like:
> 
> void more_main(void)
> {
>   toys.optflags |= FLAG_E|FLAG_K|FLAG_R;
>   less_main();
> }
> 
> Once we have a good enough less.

We could implement a big thing and have it pretend to be a small simple
thing, yes.

Rob


Re: [Toybox] [PATCH] more.c: More stuff, down cursor key scrolls down. Also stuff about less

2024-03-21 Thread Rob Landley
On 3/20/24 11:47, enh wrote:
> On Wed, Mar 20, 2024 at 9:38 AM Rob Landley wrote:
> 
> On 3/20/24 00:02, Oliver Webb via Toybox wrote:
> > I spotted the more implementation in pending. Looking at it, it's missing
> > quite a lot of stuff,
> > Such as the ability to go back in a file.
> 
> More never had the ability to go backwards, less did. Different command.
> 
> 
> (...but there's a lot of confusion because many modern systems have more just
> a symlink to less.)

Ooh, there's a fun edge case.

A failure mode of busybox is that if you symlink an unknown name to an existing
command, busybox says the unknown name is an unknown command. But in toybox, if
it doesn't recognize the name, toybox_main loops resolving symlinks until it runs
out of them or hits a recognized name:

  // fast path: try to exec immediately.
  // (Leave toys.which null to disable suid return logic.)
  // Try dereferencing symlinks until we hit a recognized name
  while (s) {
    char *ss = basename(s);
    struct toy_list *tl = toy_find(ss);

    if (tl==toy_list && s!=toys.argv[1]) unknown(ss);
    toy_exec_which(tl, toys.argv+1);
s = (0 less -> toybox would act
like more, not like less. (Unless you configured more out but left less in, then
it should behave like less.)
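
The same idea as a shell sketch, for anyone who wants to poke at it without
building anything. This is not the C above: known_cmd is a made-up stand-in for
the command table, the 40 hop cap is arbitrary, and it assumes a readlink
command is installed.

```shell
# Made-up stand-in for the table of built-in command names.
known_cmd() {
  case "$1" in
    more|less|ls) return 0 ;;
  esac
  return 1
}

# Follow symlinks starting at $1 until the basename is a known command.
# Prints the first recognized name; fails if the chain ends without one.
resolve_cmd() {
  path=$1
  hops=0
  while [ "$hops" -lt 40 ]; do          # arbitrary cap so symlink loops end
    name=${path##*/}
    if known_cmd "$name"; then
      echo "$name"
      return 0
    fi
    [ -L "$path" ] || return 1          # not a symlink: nothing left to try
    target=$(readlink "$path") || return 1
    case "$target" in
      /*) path=$target ;;                        # absolute target
      *)  path=$(dirname "$path")/$target ;;     # relative to the link's dir
    esac
    hops=$((hops + 1))
  done
  return 1
}
```

With a chain like pager -> more -> less, resolve_cmd stops at "more" because
that is the first name along the way that the table recognizes.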

I note that "more" is from the days of daisy wheel teletypes, and was thus
designed to work ok without a tty or interaction through cursor keys (you can
export $COLUMNS and $LINES or just let it guess 80x25), and "less" requires a
tty and cursor keys. This might make "more" a better fit for on-screen keyboards
that don't provide cursor keys. (Or not...)
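
A sketch of that fallback, assuming the environment variables are simply
trusted when they look numeric. The function names are invented here and this
is not lifted from any existing more implementation.

```shell
# Print $1 if it is a non-empty string of digits, otherwise the fallback $2.
num_or() {
  case "$1" in
    ''|*[!0-9]*) echo "$2" ;;
    *) echo "$1" ;;
  esac
}

# Screen size without asking a tty: environment first, then the 80x25 guess.
screen_size() {
  echo "$(num_or "${COLUMNS-}" 80)x$(num_or "${LINES-}" 25)"
}

screen_size
```

No ioctl and no escape sequences, so it works the same through a pipe or on a
serial console that never answers a size query.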

I would _like_ to have one implementation sharing code. Implementing "less -R"
cuts the behavior delta between the two, and having an option to let ctrl-c exit
less (instead of just killing the rest of the pipeline) probably gets us close
enough to handwave the rest?

I need to genericize my watch.c code to share the cursor tracking with less.
Possibly keep a scrollback buffer. Except there's still some extension because
watch.c doesn't let you cursor left and right...

Rob


Re: [Toybox] [PATCH] mount: avoid deferencing NULL.

2024-03-20 Thread Rob Landley
On 3/20/24 16:07, enh via Toybox wrote:
> I don't know why I wasn't seeing this yesterday

Because /sys was mounted, so readfile() returned a string with its contents.
(And/or race condition of the mount going away between reading /proc/mounts and
asking for follow-up data about a specific mount point from sysfs.)

Sigh, I initialized ss to "" so I could just printf("%s", ss) without testing,
but readfile() returns NULL when the file doesn't exist and I overwrite it in
place because I didn't want to juggle through a THIRD variable (mostly because
I'm out of convenient names for them), and I missed an else setting it BACK to
"" in the NULL case.

Adding the one test doesn't fix passing NULL to printf(), which segfaults on some
libcs. Lemme put the else in...

The real design failure here is that if the readfile() returns an empty string
we won't free it, but that should never happen, the amount of memory leaked
would be trivial and the command exits at the end of the list.

Hmmm... well, I COULD move the s = xabspath(mm->device, 0) down to the end of
the if (*s == '/') and then use THAT as my third variable...

Ok, I rewrote the code to use three variables and thus leave the "" in ss when
it doesn't have reason to change it. (Single Point of truth, setting it BACK to
"" and thus having two "" constants was icky. Yeah, tiny flaw but _I_ saw it.)

Commit d298747580c7 and once again I've only tested the "file exists" path, I'm
not unmounting sysfs on my work laptop and haven't got a convenient test vm I
can loopback mount a filesystem image in at the moment. (The devuan install iso
image I've been using is a bit big to stick in initramfs...)

Rob


Re: [Toybox] [PATCH] chattr.test: awk -> cut so mkroot can run it

2024-03-20 Thread Rob Landley
On 3/20/24 15:20, Oliver Webb via Toybox wrote:
> Patch does what it says on the tin. First thing I caught while doing a test
> of all commands in mkroot. chattr fails all tests on my system (A ton of
> "Operation not permitted" errors, on ext4), but the failures are consistent
> with TEST_HOST so I guess chattr doing what it's supposed to? (Yes, I ran it
> as root) The .test file will need a rewrite eventually but right now I'm just
> trying to get all tests to run under mkroot

Elliott keeps sending me patches to remove bashisms from the test suite so it
works under mksh, which I was intentionally leaving in because I intend to
implement that before 1.0 and wanted to dogfood it. I have a file of the ones
that were removed so I can put them BACK at some point...

Rob


Re: [Toybox] [RFC] mkroot: Possible solution to running tests in a vacum: Use the host bash in a chroot

2024-03-20 Thread Rob Landley
On 3/20/24 12:38, Oliver Webb via Toybox wrote:
> A target for the 0.9 release is the test suite running under mkroot,

On all the architectures mkroot supports (endianness, word size, kernel
version), under qemu with a known kernel environment so we can test things like
insmod with known modules, or test ifconfig and hostname without destabilizing
my development laptop.

> Which is also required
> for passwd to be re-promoted (We need to test it in a vacuum).

Eh, I can test that manually for one release. My problem is I keep getting
distracted by tangents. The "create changelog" todo item made it as far as
commit 40e73a387329 which has a pending TODO item I tried to fix (refill toybuf
to try to span EXIF data when file is identifying JPEG files), I need to instead
WRITE THAT DOWN (and leave it unfixed for now) and continue to the end of the
list so I _have_ a current changelog and CAN cut a release... but haven't yet.

> The main downside of this is that you have to look for the dynamic libraries
> bash wants and copy them into the fs directory,

The code I wrote to do that way back when was something like:

  https://git.busybox.net/busybox/commit/?id=3a324754f88b

I.E. recursively call ldd to see what its dependencies are, repeat until you run
out of dependencies.
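
Roughly this shape, as a sketch rather than the code from that commit. The
function names are invented, ldd_deps assumes glibc-style ldd output
("libfoo.so => /path/libfoo.so (0x...)"), and the loop assumes library paths
contain no whitespace or glob characters.

```shell
# Direct dependencies of a binary, one absolute path per line.
# Assumption: the first field starting with "/" on each ldd line is the path.
ldd_deps() {
  ldd "$1" 2>/dev/null |
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^\//) { print $i; break } }'
}

# dep_closure LISTER FILE: print FILE's dependencies, then their dependencies,
# and so on, each path once, until nothing new turns up.
dep_closure() {
  lister=$1
  seen=" "
  set -- "$2"                    # the positional parameters are the work queue
  while [ $# -gt 0 ]; do
    item=$1
    shift
    for dep in $("$lister" "$item"); do
      case "$seen" in
        *" $dep "*) ;;           # already visited
        *) seen="$seen$dep "
           echo "$dep"
           set -- "$@" "$dep" ;;
      esac
    done
  done
}
```

Something like "dep_closure ldd_deps /bin/bash" would then give the list of
files to copy into the chroot. Passing the lister in as an argument just makes
it possible to try the loop with a fake dependency table.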

I _can_ make this work. It's just not the direction I wanted to go in.

> and doing a chroot requires root permissions. Also it is very clearly not a
> permanent fix (None of this is needed once toysh is ready), just enough to get
> tests for commands like passwd and chsh running. Another downside of
> chroot-ing is you can't emulate things that depend on drivers or nommu.
> 
> Attached is a mkroot package (Not a patch), that sets up an environment to run
> the test suite under a chroot in.
> (./mkroot/mkroot.sh testwhost && sudo chroot root/host/fs /test command_name).
> It's not something I'm actually expecting to be merged, but that doesn't mean
> it's not potentially useful for testing the commands that modify /etc/passwd
> and friends.

I was setting up a debootstrap to test it under, since that's presumably
isolated enough, but last time I sat down to poke at that I got distracted into
the Orange Pi 3b server setup which is the _other_ consumer of a debootstrap I
have lying around, and then I went "too much for now but I can at least do the
testing under a qemu-system-arm64 with devuan arm64 debootstrap" and hit the
fact that trying to marshall a tarball into mkroot using "wget | tar xpv" spat
out endless unexpected EOF files because the tarball autodetection logic had a
regression and the child process thinks it's the parent process.

Still have a tab open for that, trying to dig back down to fix it, been
distracted by external pokes instead.

> Also when making this I spotted some things in the build infrastructure we
> will need to work around in an airlock-ed test suite, test.sh needs configure,

Only for single builds, not for testing all of toybox.

And presumably I should add a TEST_EXISTING=1 to skip the single build and just
grab the command out of the current directory and/or $PATH. (There's always more
work to do on the test suite...)

> and portability.sh needs something for CC or
> else it will throw a fit.

I know, one of my trees has a partial patch for it, but there's some design
work...

Rob

