Two weeks since the release and Wikipedia[citation needed] hasn't noticed yet, which is fine and normal (I just don't want them to be actively _wrong_), but their "Project progress" section explains that "in 2015"... which was 5 years ago now? And they still link to http://www.landley.net/toybox/todo.txt from 2011 which is PURELY HISTORICAL...
Right. I can at least do a current 2020 analysis. Pulling up https://landley.net/toybox/status.html which I updated for 0.8.3, here are the lists. Of the 343 proposed commands it lists:

--- completed (204 commands)

acpi arch ascii base64 basename blkid blockdev bunzip2 bzcat cal cat catv chattr chgrp chmod chown chroot chrt chvt cksum clear cmp comm count cp cpio crc32 cut date devmem df dirname dmesg dnsdomainname dos2unix du echo egrep eject env expand factor fallocate false fgrep file find flock fmt free freeramdisk fsfreeze fstype fsync ftpget ftpput getconf grep groups gunzip halt head help hexedit hostname hwclock i2cdetect i2cdump i2cget i2cset iconv id ifconfig inotifyd insmod install ionice iorenice iotop kill killall killall5 link ln logger login logname losetup ls lsattr lsmod lspci lsusb makedevs mcookie md5sum microcom mix mkdir mkfifo mknod mkpasswd mkswap mktemp modinfo mount mountpoint mv nbd-client nc netcat netstat nice nl nohup nproc nsenter od oneit partprobe passwd paste patch pgrep pidof ping ping6 pivot_root pkill pmap poweroff printenv printf prlimit ps pwd pwdx readahead readlink realpath reboot renice reset rev rfkill rm rmdir rmmod sed seq setfattr setsid sha1sum shred sleep sntp sort split stat strings su swapoff swapon switch_root sync sysctl tac tail tar taskset tee test time timeout top touch true truncate tty tunctl ulimit umount uname uniq unix2dos unlink unshare uptime usleep uudecode uuencode uuidgen vconfig vmstat w watch wc which who whoami xargs xxd yes zcat

--- pending (68 commands)

addgroup adduser arp arping bash bc bootchartd brctl cd crond crontab dd deallocvt delgroup deluser dhcp dhcp6 dhcpd diff dumpleases exit expr fdisk fold fsck getfattr getty groupadd groupdel host init ip ipaddr ipcrm ipcs iplink iproute iprule iptunnel klogd last lsof man mdev mke2fs modprobe more openvt route sh stty sulogin syslogd tcpsvd telnet telnetd tftp tftpd toysh tr traceroute traceroute6 udpsvd useradd userdel vi wget xzcat

--- todo (71 commands)

ar at awk chfn chsh cols compress csplit diff3 dig dir dosfslabel ed fsck.ext2 fsck.vfat ftpd fuser genext2fs getevent groupmod gzip hexdump hostid ipconfig iwconfig iwlist join kexec kinit less mkfs.vfat newfs_msdos newgrp nfsmount ntpd pathchk pinky rdate resize2fs resume rpm2cpio rsync runcon sdiff sendmail sfdisk sha224sum sha256sum sha384sum sha3sum sha512sum shutdown stdbuf sudo sum tabs tput tracepath tune2fs unexpand unzip usermod users vdir zcmp zdiff zegrep zfgrep zip zless zmore

--- which means

So a first guess would be more like 70% done, because if you just take 204 plus half of 68, divided by 343 you get just under 70%.

But that's not right for a BUNCH of reasons. The first of which is those three lists aren't quite everything, when I run "make distclean defconfig toybox; scripts/mkstatus.py" it also says:

  uncategorized: blkdiscard rtcwake getopt readelf eval exec export shift unset -sh -toysh -bash

But the uncategorized stuff actually means I need to add a few things I've already DONE to the roadmap, but half of it's shell builtins (eval, exec, export, shift, unset, -sh, -toysh, -bash) that don't actually count as separate commands.

blkdiscard and rtcwake I already promoted, they'd go in the "done" category if they were listed in the roadmap so the script could find them.

Readelf is in pending but isn't hard, it's just "spend an hour reading 600 lines closely" and probably staring at specs and test files. I've mostly been waiting because it's a recent addition and Elliott was still poking at it.

That leaves getopt, which is legitimately stuck in pending due to being a design-level nightmare: the one that's there works fine but pulls in a whole second set of option parsing logic from lib.c that nothing else uses. Do I want to accept that or try to adapt it to use lib/args.c which may change the resulting semantics in ways that require a close reading of the spec?
(Note that I've never used this command because it sucks, and the shell's builtin "getopts" is an UNRELATED command, and collectively I'd really rather not. But alas, it's in posix, and somebody's script is using it...)

--- mkroot

Remember my whole "self-hosting build" goal? Well I integrated a system builder into toybox, which Wikipedia[citation needed] will probably never notice. There are only two more commands missing from toybox defconfig that are needed to create a usable basic standalone system: "sh" and "route", both of which are being worked on.

The sh that's there is already semi-usable, I'd guess it's 2/3 done maybe? (Hard to tell, it's difficult to scope work you haven't done yet.) Route is still in pending because I'm ambitious and want one that does multiple routing tables, which the existing implementation wasn't designed for (but neither was debian's; I want to do _better_ than net-tools).

As for creating a self-hosting build environment, the list I have written down for building the old Linux From Scratch version I was testing with was (according to scripts/install.sh):

  PENDING="dd diff expr ftpd less tr vi wget awk sh sha512sum sha256sum unxz xzcat bc bison flex make nm ar gzip"

The commands "diff, ftpd, less, vi" are there because humans log into a build system to debug stuff sometimes, and it's nice to have basic amenities. Not STRICTLY required (you can use network mounts instead of ftpd, it's just my old build system used ftpd), but eh. There's been quite a lot of work on vi in pending, and there's a diff there I'm told works. I did "watch" already and that's the basic plumbing "less" uses, to be honest that one's held up by "more" not sharing code with it (and quite possibly at a design/conceptual level, I need to frown at it some more). diff sucks in diff3 and sdiff and my main complaint is I wanted to use Bram Cohen's patience algorithm.
:P

The commands "bc, bison, flex" are only there because modern linux kernel development has gone off the deep end into crazytown. Grrr. There's a very large bc in pending I need to slim down, and I've never used yacc/lex for anything and need to learn to in order to write replacements.

gzip: I have the start of a deflate implementation, I need to get back to it. I was trying to be binary identical with what other gzips produced which hit the "when do you flush the dictionary" question, which doesn't seem to have an official answer. Test the debian output and match that, I suppose.

ar would be trivial if it wasn't for -s mode, the format of which is completely undocumented. (Not hard, just annoying.)

nm is basically "readelf with a different output format", it's pending on the readelf cleanup. (Which is half done, I just got distracted from it.)

I've started to clean up dd something like 5 times and gotten distracted, it's not _hard_ it's just long and not something I want to start again without a block of uninterrupted time laid out for it. (At least a week, it's constructed entirely out of corner cases.)

Last time I tried to clean up expr I hit the problem that posix had suffered a regression (its html renderer lost grouping information), which is long since fixed, I just never got back to it...

It would be trivial to clean up tr except I want to teach it utf8 support, which makes everybody who knows about this scream in pain, AND YET...

awk is probably the largest remaining can of worms I haven't opened. Like most people I only use it to "cherry pick the 4th word from this list", but I need to implement the full language. The busybox one was 2800 lines when I was maintaining it. (It's probably longer now.)

sha256sum and sha512sum are tangled into the "sha-3" todo item (see also sha224sum sha384sum and sha3sum).
I did my own md5sum and sha1sum implementations way back when (which are merged and share code) and I want to see if I can fit the rest in there, but haven't sat down to really focus on it yet. (It's mathy, there's research. Like the compression algorithms. I do NOT wanna be distracted from chewing on it halfway through, so big chunka time.)

unxz and xzcat: see "mathy", above. I found public domain implementations of the decompressor way back, glued them together, and stuck them in pending. I should make sure there haven't been security fixes or obvious format changes, and then do to that what I did to bunzip2 ages ago (https://git.busybox.net/busybox/commit/?id=0d6d88a2058d).

make is a post-1.0 todo item, and really properly belongs in qcc.

--- the rest of the pending/todo items

As for the remaining pending and todo commands, a chunk of them are shell aliases and shell builtins that are part of the "toysh" todo item:

  bash sh toysh cd exit

The "user management" commands (addgroup adduser delgroup deluser groupadd groupdel useradd userdel chfn chsh groupmod newgrp usermod) have been low priority because android doesn't use normal user accounts and never will (each app is installed as a different UID, that's a legacy decision from before containers happened). I'd like to convince android to create a "posix container" within which you can have multiple users and run binaries you build, but that's an ongoing discussion. None of this is hard to clean up and promote, I just haven't bothered because I didn't have an immediate use case. Now that mkroot's there using a simple /etc/passwd I should cycle back to them.

bootchartd was an external submission: I've never used it. It's not hard to clean up, I just have to learn how to use it in order to _test_ it...

I've never been a big user of cron, so "crond crontab" are more "easy to clean up, but I don't have tests for them". I've used "at" before, but I think it hooks into crond somehow? (There _is_ a server component...)
Really I just haven't looked at those yet.

I haven't prioritized "deallocvt openvt" because VGA hardware virtual terminals aren't really a thing anymore, but I should get them out of the way. (It's _just_ rebooty enough that I've been reluctant to try it on my laptop for fear of it doing a wobbly and me having to reboot to get my screen back, losing all my open windows on 8 desktops, but... gotta bite the bullet sometime. It's not as scary as testing rm -rf for the first time... Ok, done, and email fired off to see if the original submitter can test it.)

People submitted "tcpsvd udpsvd" (which I don't use) after I already implemented netcat -l (which I do use, and it's got a UDP mode because of google guys wanting that, possibly for netconsole) and I'm going "this should be merged somehow"?

ip ipaddr ipcrm ipcs iplink iproute iprule iptunnel: I don't use the "ip" command, and its existence annoys me. Refusing to update ifconfig and route to new APIs and instead throwing them out and replacing them with a giant hairball that works like git with subcommands is sad. I'm updating the original commands to do the right thing, and implementing standalone versions of anything that ONLY this does. I do not consider these part of the 1.0 release, when I'm done they should be aliases for the standalone commands (because other people prefer that UI due to familiarity). This is there now because it was an external contribution and I didn't want to stand in the way.

I started cleaning up "man" once and other people were changing it while I was changing it, so I deleted my changes and merged theirs, and haven't looked at it again since. Possibly it's quiet enough now, but I haven't cycled back to it.

There's a pile of network stuff (arp arping brctl dig host traceroute traceroute6) that's not hard, I just haven't needed it yet. It's all behind "route" on the todo list, which is the next one I _do_ need.
I note that traceroute/traceroute6/tracepath are elaborate variants of "ping", and host/dig are the same command with different UI and output.

There's also network client/server stuff (telnet telnetd tftp tftpd wget httpd dhcp dhcp6 dhcpd dumpleases) which is more "elaborate but not hard". More time consuming than difficult. (Modulo I need to test ph7 with httpd, and that and wget need https integration via the command line stuff, see also http://lists.landley.net/pipermail/toybox-landley.net/2017-September/009158.html and http://lists.landley.net/pipermail/toybox-landley.net/2016-March/004865.html which means I need to install bearssl into mkroot for testing which sounds like SO much fun... Sigh. But that's foisting a difficult bit off onto an external program. It's a pity dropbear never provided https, but I can see wanting to avoid the key management part of that.)

The group "mkfs.vfat, newfs_msdos, dosfslabel, fsck.vfat" I've done some work on. I also want genfatfs and mtools support, but am not adding them to the roadmap just now. Got distracted, haven't gotten back to it. The transitions between fat12, fat16, and fat32 are kind of magic/evil but other than that it's pretty straightforward...

Similarly, I did about the first half of mkext2fs and genext2fs back in the day (like 2006) and got REALLY distracted by that whole "leaving busybox" thing, still haven't gotten back to it and in the mean time ext4 happened which I do _not_ understand. That impacts fsck.ext2 (what exactly can go _wrong_?) and then there's tune2fs (trivial once the others are in) and resize2fs (kind of an fsck variant almost). It's all a group, if I _just_ want ext3 I can probably do it in a couple months? If I wasn't doing anything else at the time...

fsck itself is just a wrapper around the filesystem-specific fsck commands, I've got one in pending but until there are other fscks to test it with... (I mean I _can_ clean it up.)
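
To make the "fsck is just a wrapper" point concrete, here's a minimal sketch in Python. This is NOT the code in pending, just an illustration: the function names are made up for this example, and it assumes the caller already knows the filesystem type (a real fsck would probe the device or read /etc/fstab to find that out).

```python
import shutil

def helper_name(fstype):
    """Name of the filesystem-specific checker, e.g. 'fsck.ext2' for 'ext2'."""
    return "fsck." + fstype

def find_helper(fstype):
    """Full path of the fsck.TYPE helper found in $PATH, or None if absent."""
    return shutil.which(helper_name(fstype))

def build_command(fstype, device, extra_args=()):
    """Build the argv the wrapper would exec (hypothetical helper).

    Raises FileNotFoundError when no filesystem-specific checker exists,
    which is the "until there are other fscks to test it with" case.
    """
    path = find_helper(fstype)
    if path is None:
        raise FileNotFoundError("no %s in $PATH" % helper_name(fstype))
    return [path, *extra_args, device]
```

All the wrapper has to do after that is exec the resulting argv and pass the exit code through, so the hard parts all live in the per-filesystem commands.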

fdisk grew a new format when disks hit 2TB: MBR I understood, GPT I do not. It's on the todo list. I sort of want sfdisk at the same time (scriptable!) but sfdisk turns out to be really ugly from a UI standpoint and conventional fdisk is scriptable via "echo | fdisk" so...

Screw sendmail, why is it on this list? (Because some other package had it. I'm not doing it, that whole ecosystem is crypto all the way down now.) And compress is obsolete (due to historical patents, it got killed by gzip), I don't care what posix says. The .Z file format is like supporting arj.

iwconfig and iwlist are sort of an ecosystem with wpa_supplicant, I'd declare it out of scope except the hardware it controls is pretty widely used these days. I need a wrapper script to associate my laptop with an access point from the command line, which I've never quite managed to do on _debian_, so... (Because access keys. There's a tool to turn a wpa_passphrase into what the hardware consumes and it's crotchety and all handled by magic GUI wrappers people make that work VERY HARD to hide the details. I know the theory but am always missing a corner case somewhere. Closest I got was trying to associate with my phone at UT when I first got this laptop, ala https://landley.net/notes-2019.html#17-04-2019 and it did not fill me with confidence...)

lsof and fuser are similar and tricky. They should share code, the PROBLEM is lsof's command line is horrific and has 8 gazillion weird little corner cases and mostly I don't care but the one that was submitted doesn't support +D which is one of the most useful things it does (recursively show all open files under directory), so... (Plus merging the ipv4 and ipv6 plumbing in its internet listing. Again, me being picky and insisting on things I don't TECHNICALLY need to do...)

hexdump is trivial to do, except I've already got od and hexedit and they're not sharing code and I'm reluctant to ADD A THIRD. Except "hd blah" is the one I personally use, so...
:)

ed is vi, only less so. I really don't want it, but apparently implementing vi makes it not that bad? Or something? (I implemented _sed_ which should be able to do everything you need from ed, but no. There are some serious geezers out there who insist.)

sudo is easy to implement and hard to prove correct. :P

rsync isn't actually that hard (if you only implement -e ssh and ignore that server nonsense) and is well documented (https://rsync.samba.org/how-rsync-works.html). They even migrated from md4 to md5 (for the look of it, it never had a security implication because it wasn't used for that) so I've already _got_ most of the plumbing. I just haven't sat down to do it yet.

ntpd shouldn't be in there, I implemented sntp and that covers it. (Removed.) I'm not sure if that covers rdate or not (I _can_ do an rdate, but when I put that in there I thought that qemu had an rdate server built in, and it turned out it was passing through 10.0.2.2 to inetd on the host that had it built in, and that was a previous distro and devuan hasn't got it. Hmmm...)

pinky is a trivial finger, it's an afternoon's work sometime. Hard part's finding a finger server to test it with. :)

getfattr: android sent me one, it works, I don't use xattrs.

rpm2cpio is trivial but I'm not sure it's the right approach (there was better rpm and deb support in busybox 15 years ago, and the trick was instead of "database" just "directory where the header with the metadata from each installed package was copied to, under the original package's name", which worked surprisingly well. Listing installed packages was just an ls of the directory, for one thing.) Can of worms I have yet to open...

stty: I need to clean this up. It's fiddly. Not hard, just elaborate and has no tests. :(

modprobe: I don't use modules much myself. I know the theory, and insmod's already promoted...
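
Going back to the rpm2cpio note above, the "directory instead of database" trick is simple enough to sketch. This is a hypothetical Python illustration, not the old busybox code: the function names and the example header contents are invented, and the header is treated as an opaque blob of bytes.

```python
import os
import tempfile

def record_install(dbdir, package, header):
    """Copy a package's metadata header into dbdir under the package's name."""
    os.makedirs(dbdir, exist_ok=True)
    with open(os.path.join(dbdir, package), "wb") as f:
        f.write(header)

def record_remove(dbdir, package):
    """Uninstalling just deletes that package's header file."""
    os.remove(os.path.join(dbdir, package))

def list_installed(dbdir):
    """Listing installed packages is just an ls of the directory."""
    return sorted(os.listdir(dbdir))

def read_header(dbdir, package):
    """Querying a package means reading its saved header back."""
    with open(os.path.join(dbdir, package), "rb") as f:
        return f.read()

if __name__ == "__main__":
    # Throwaway directory so the demo doesn't touch any real package state.
    with tempfile.TemporaryDirectory() as dbdir:
        record_install(dbdir, "hello-1.0", b"made-up header bytes")
        record_install(dbdir, "world-2.3", b"another made-up header")
        print(list_installed(dbdir))
```

No locking, no index, no binary database format to corrupt: the filesystem does all of that already.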

kexec: simple, requires rebooting to test, I was holding off until I had qemu builds going and now that mkroot's in I should circle back around to this. (Modulo qemu doesn't always reboot cleanly with the -kernel command line argument used as a builtin bootloader. I THINK kexec won't care because it doesn't go back through the bios?)

kinit/init: some horrible klibc thing that was half-assing another command, I just need a proper init. (now THAT is a can of worms, I need to dig up my notes from https://landley.net/notes-2015.html#03-06-2015 which are at http://landley.net/systemd-notes.txt and make sense of them...) A "shutdown" command is also part of init. (As is resume, more or less.)

getty and sulogin are more early system bringup stuff my systems don't use. I know how to do it, just haven't had reason to. "users" is kind of in that bucket too (list logged in users. From the minicomputer days when we all shared computers!) And "last".

But klogd and syslogd are useful-ish (I mostly just dmesg myself which the kernel does for you), but now that I have mkroot I really _should_ get to these. (Android has its own and won't care, of course...)

mdev is a thing I created way back when, which other people added a lot of stuff to later because they were using it, but then devtmpfs was invented and half its reason for being went away, but if you collapse hotplug into it I guess it's still useful? And there's also "notify and take action when a device shows up", which is still useful. Needs design work to figure out what it should look like now, starting with researching other people's use patterns of it.

dir and vdir are just ls with a different output format, haven't bothered because I last used dos in something like 1992. On the one hand, it looks trivial (it's ls flag mapping), on the other... I've never used either one and would have to read the man page closely to figure out WHAT flag mappings this is? (And then there's the question of why either SHOULD be there?
We have a perfectly good ls. Both come from tizen, which I haven't heard from in years, and although both are in coreutils... why?) Come to think of it, dir and vdir should probably be shell aliases of ls...

I have a hostid already, it's in toys/example because a 32 bit identifier is no longer globally unique, so the command's reason for being stopped working. But that's done, it's just in the pending list because it's not in defconfig because "example" isn't part of defconfig either.

nfsmount is a trivialish wrapper around mount, it supplies the password so you don't have to -o it on the command line (making it visible to other users on the machine while mount runs, of which there should be none). There's an smbmount too. I haven't gotten to it because I don't use nfs or samba (although I keep meaning to write a samba server in toybox).

sum is obsolete, I don't care what posix says. It predates sha1sum, md5sum, and for that matter crc32. Not doing it, it's in the list because some other package toybox has in the "when we do all these we replace that" list had it and I had it in the "maybe" list for that package. I should take it out... it was in sash.

unexpand "converts spaces to tabs". Haven't gotten around to it yet. :)

tabs sets tabstops on a terminal. Is that even still supported? Oh goddess, yes it is. That's horrifying. Leave it there for now, figure out what to do about it later. (I mean, it looks like it's just a wrapper for an ioctl, but _ew_.)

tput does things with the terminfo database that were last relevant in 1976, because we had physical teletypes, then glass ttys with hardcoded behavior and buying an IBM TN3270 vs a DEC VT100 made a difference. And NOBODY HAS DONE THAT FOR 40 YEARS! These devices NO LONGER EXIST (outside of museums), we can STOP EMULATING THEM ALREADY. Windows never did. The mac doesn't.
You have a terminal with a monospaced font in it and can use bog standard ANSI escape codes (as implemented in DOS ansi.sys in 1986) and these days EVERYTHING DOES UTF8 ENCODED UNICODE. (Or should.) But maybe some scripts say "tput clear" instead of "clear" or "reset"? Then again, those scripts should be easy to adjust shouldn't they? Sigh, ok "tput cup X Y" becomes the ansi escape code to jump to that location, fine... wait, they do Y before X? Why? But yeah, I can do a really simple version of this for some common cases. But HONESTLY. Yeesh. Why is that still in posix?

fold, cols, csplit, and join are text manipulation commands that take rows of lines and do things to them. There's a fold in pending, the rest aren't hard. (Legacy of unix's early history as a typesetting system for the AT&T patent and licensing department, I expect....) No, "cols" is from something called "suckless" which contradicts itself in the name, made up lots of random new crap that didn't take off, and which I last looked at in November 2015. Yanking that...

I've looked at pathchk before and gone "I'm not sure I agree with the premise". It's in posix, and devuan has it in the standard install, but... why? Define "portable"? Linux accepts 255 character path components with any char but NUL and / in them, period. And hasn't got a max length limit otherwise. Portable to _what_? Why? No, I don't want this one.

stdbuf is sort of a wrapper that intercepts a command's stdin/stdout and does reads/writes of different sizes at it. I keep meaning to give it a closer look to see if it's worth bothering.

runcon is also already implemented. It's more selinux nonsense and I haven't got the selinux libraries installed on my system so the compile time probes make it drop out, that's why it's in the list. (It's not in _my_ defconfig, but that's a false negative.
:)

getevent is an android thing: on the one hand elliott hasn't removed it from the android part of the roadmap (which he maintains at this point), on the other there isn't one in toybox. *shrug*?

zip and unzip are the next logical step after tar (which is in now), but I'd like to finish gzip deflate-side first.

The "zcmp zdiff zegrep zfgrep zless zmore" family is just "gzip | command" and there should be some way to genericize it. Again, optional: you can just "zcat | grep" yourself, that's all just a convenience...

And that's the list, triaged and explained.

Anyway, taking aaaaaaaall that into account, I think I'm probably somewhere around 80% of the way towards the 1.0 release? It's not an exact thing, but a lot of what's left is simple or optional. There's a bunch of stuff there I'm not sure I _should_ do, and could easily trim them from the list to make a 1.0 release. A lot of other stuff is present in pending, what's in pending works, and I could in a pinch lower my standards to just accept it. (Not that I'm likely to, but the amount of cleanup work I need to do on them can vary. There's a lot more reading code than writing for those, which isn't _easier_ but means the resulting changes should be smaller.)

The hardest and biggest and most important remaining thing is the shell, which I'm working on. Beyond that... vi is big but somebody's working on it, bc is enormous and would take a long time to clean up but I can also trivially patch it out of the kernel and nothing else uses it anywhere that I've found (you can thank Peter Anvin for objecting to the removal of perl by adding another gratuitous build dependency that Linux From Scratch and Gentoo had to add because they weren't previously building it because nothing anywhere used it and nobody had asked for a 40 year old desk calculator since most systems just run python and such if you need to get fancy...)
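
For reference, here's the naive first-guess arithmetic from the status page counts, to compare against the ~80% gut feeling. It's just a worked version of "204 plus half of 68, divided by 343"; the function name and the pending_weight knob are made up for this sketch, and counting each pending command as half done is the same rough assumption as before.

```python
# Counts from status.html as of 0.8.3, as quoted in this message.
completed, pending, todo = 204, 68, 71
total = completed + pending + todo  # the 343 proposed commands

def naive_done(completed, pending, total, pending_weight=0.5):
    """Fraction done if each pending command counts as pending_weight done."""
    return (completed + pending_weight * pending) / total

estimate = naive_done(completed, pending, total)
print("%.1f%%" % (100 * estimate))  # prints 69.4%
```

The gap between that number and 80% is everything the triage above turns up: commands already done but miscounted, shell builtins listed as separate commands, and entries getting yanked from the roadmap entirely.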

The biggest lumps of work left for ME in the 1.0 roadmap sound like:

  rest of the shell
  awk
  that pile of networking commands and servers
  the init/login/mdev/syslogd pile. (Which includes useradd and friends.)
  gzip compression side and zip, plus cleaning up xzcat/lzma.
  opening the sha3 can of worms (it's currently building them by linking the crypto code out of openssl but I don't want the dependency).
  everything else

Altogether it's a lot less than what I've already done. :)

(Of course "make" isn't in the 1.0 roadmap, that's qcc. Getting android self-hosting through a minimal binary-auditable native build environment is the NEXT big can of worms. It would be nice if AOSP pulled the NDK as a build prerequisite, but they're not there yet...)

Rob
_______________________________________________
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net