BSD tar (was Re: Making pkg_XXX tools smarter about file types...)
On Friday 28 March 2003 22:47, Tim Kientzle wrote:
> P.S. It's galled me for a while that pkg_add has to fork 'tar' to extract the archive.

Me too, me too.

> I've started piecing together a library that reads/writes tarfiles.

Excellent. A general design goal in userland should be to implement functionality in libraries and then wrap small driver programs around them to export the basic functionality to userland. I guess this partly violates the original UNIX tools philosophy, but all it really does is move it from the original pipes interfaces into the dynamic linker.

> With this, it should be possible to make pkg_add considerably more efficient. In particular, rather than extracting to a temp directory, then parsing important information, then moving the files, it should be possible using this library to read the initial entries (+CONTENTS, in particular) directly into memory, process the information there, then extract the remainder of the package files directly into their final locations.

I'd much rather see the metadata moved outside the file archives, but that's a separate argument and in no way detracts from your proposed work. ;^)

> So far, I have a library API outlined, and functional read support implemented. Next step is to hack up a minimal tar implementation that uses it to make sure everything's working correctly. So far, the library automatically detects compression formats (using techniques like those in my pkg_install patch) and has some rough support for detecting the archive format as well. (One goal of mine: support for 'pax extended archives', which I understand can handle ACLs.)

I have wondered out loud before if pax might be a suitable starting place for such a hacking expedition. Others who've worked on pax assure me it is not. ;^( Sigh. So much code, so few programmers.

> Of course, such a library could also form the basis for a BSD-licensed tar to replace GNU tar. I understand a few people have wanted such a thing.

Why yes, yes they have.
-- Where am I, and what am I doing in this handbasket? Wes Peters [EMAIL PROTECTED] ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Making pkg_XXX tools smarter about file types...
Brandon D. Valentine wrote:
> On Fri, Mar 28, 2003 at 10:47:43PM -0800, Tim Kientzle wrote:
> > P.S. It's galled me for a while that pkg_add has to fork 'tar' to extract the archive. I've started piecing together a library that reads/writes tarfiles.
>
> FYI, libtar[0] is BSD-licensed and might be useful to such a project.
>
> [0] - http://www-dev.cites.uiuc.edu/libtar/

Thanks for the pointer. I took a look, and there are some good ideas there, although my current libtarfile work has a few features that libtar lacks. Cribbing from libtar could be useful, though. (I'm also using John Gilmore's old PD tar as a source of ideas.)

Tim
Re: Making pkg_XXX tools smarter about file types...
If memory serves me right, Tim Kientzle wrote:
> The attached patch modifies the pkg_install tools to inspect the file contents--rather than the filename extension--to determine the compression method in use. It then feeds the data into the correct invocation of 'tar'. I've also modified exec.c/lib.h to factor out and expose some common code that formats shell command lines.
>
> This approach makes it possible, for instance, to fix a single file extension (e.g. '.freebsd' or '.package') that would not have to change even if the internal format of a package were to change (as has already occurred once, with the transition from gzip to bzip2 compression).
>
> Note that this could also be fairly easily extended to support a variety of alternative archive types. (E.g., the pkg_XXX tools could be modified to support 'zip' or 'tar' archives transparently to the user.)

(A month and a half passes... I meant to get back to you earlier but didn't have any time to play with this before. Actually I still don't, but...)

The concept is good, and it's something we've needed for a while. I suspect you followed the various adventures of pkg_add and sysinstall when we tried supporting both bzip2 and gzip packages for various releases and developer previews, before we settled on the current bzip2 for 5.X and gzip for 4.X as something that actually worked.

A little feedback on the patch itself (functionality only):

Basically, it works great for the case of a package coming in on stdin. If the package comes from a file, then pkg_add wants to make two passes over the package, first to get the +CONTENTS file and second to actually unpack everything. When the first tar process finishes reading the +CONTENTS file, it closes its pipe (due to the --fast-read argument). However, pkg_add still seems to be writing to the pipe... this seems to be bad.

An example with pkg_add built with CFLAGS=-DDEBUG:

    tomcat:add# cat ~/tmp/bash.pkg | ./pkg_add -
    Piping package '-' to cmd 'tar -xpjf - '
    updating /etc/shells
    Executing /usr/sbin/mtree -U -f +MTREE_DIRS -d -e -p /usr/local >/dev/null
    Executing mkdir /var/db/pkg/bash-2.05b.004
    Executing chmod a+rx /var/db/pkg/bash-2.05b.004
    Executing mv ./+DESC /var/db/pkg/bash-2.05b.004
    Executing mv ./+COMMENT /var/db/pkg/bash-2.05b.004
    Executing mv ./+MTREE_DIRS /var/db/pkg/bash-2.05b.004
    Executing rm -rf /var/tmp/instmp.BGdXjm
    tomcat:add# ./pkg_add ~/tmp/bash.pkg
    Piping package '/usr/users/bmah/tmp/bash.pkg' to cmd 'tar -xpjf - --fast-read - +CONTENTS'
    Broken pipe

It works if I remove the --fast-read flag from the tar, but that's not the right answer.

Bruce.
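The content-sniffing idea discussed in this message can be sketched in C as follows. This is only a minimal illustration, not the code from the attached patch: the names `pkg_compression`, `detect_compression`, and `tar_decompress_flag` are hypothetical. It relies on two well-known facts: gzip streams begin with the bytes 0x1f 0x8b, and bzip2 streams begin with the ASCII characters "BZh".

```c
#include <stddef.h>
#include <string.h>

/* Compression types this sketch knows about (hypothetical names). */
enum pkg_compression { PKG_COMP_NONE, PKG_COMP_GZIP, PKG_COMP_BZIP2 };

/* Classify a buffer holding the first few bytes of a package file. */
static enum pkg_compression
detect_compression(const unsigned char *buf, size_t len)
{
    /* gzip streams start with the two bytes 0x1f 0x8b. */
    if (len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b)
        return PKG_COMP_GZIP;
    /* bzip2 streams start with the ASCII characters "BZh". */
    if (len >= 3 && memcmp(buf, "BZh", 3) == 0)
        return PKG_COMP_BZIP2;
    /* Anything else is treated as uncompressed. */
    return PKG_COMP_NONE;
}

/* Map the detected type to the tar option letter selecting the decompressor. */
static const char *
tar_decompress_flag(enum pkg_compression c)
{
    switch (c) {
    case PKG_COMP_GZIP:
        return "z";
    case PKG_COMP_BZIP2:
        return "j";
    default:
        return "";
    }
}
```

A caller would read the first few bytes of the package, call `detect_compression()`, and splice the returned letter into the tar command line (for example `tar -xpjf -` for bzip2, as in the debug output above), so the filename extension never has to be consulted.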
Re: Making pkg_XXX tools smarter about file types...
Bruce A. Mah wrote:
> If memory serves me right, Tim Kientzle wrote:
> > The attached patch modifies the pkg_install tools to inspect the file contents--rather than the filename extension--to determine the compression method in use.
>
> The concept is good, and it's something we've needed for a while. I suspect you followed the various adventures of pkg_add and sysinstall when we tried supporting both bzip2 and gzip packages ...

Yes, those adventures were a large part of what motivated me to implement this auto-detect logic.

> ... If the package comes from a file, then pkg_add wants to make two passes over the package, first to get the +CONTENTS file and second to actually unpack everything. When the first tar process finishes reading the +CONTENTS file, it closes its pipe (due to the --fast-read argument). However, pkg_add still seems to be writing to the pipe... this seems to be bad.
>
> It works if I remove the --fast-read flag from the tar, but that's not the right answer.

No. That's clearly not the right answer. Seems I forgot to check for an error return from fwrite(). Easy enough to fix; near the end of unpack() in lib/file.c, change the read/write loop to the following:

    while (buff_size > 0) {
        if (buff_size > fwrite(buff, 1, buff_size, out_pipe))
            break;
        buff_size = fread(buff, 1, buff_allocation, pkg_file);
    }

This aborts the passthrough if the pipe is closed. Modified diff attached.

Tim

P.S. It's galled me for a while that pkg_add has to fork 'tar' to extract the archive. I've started piecing together a library that reads/writes tarfiles. With this, it should be possible to make pkg_add considerably more efficient. In particular, rather than extracting to a temp directory, then parsing important information, then moving the files, it should be possible using this library to read the initial entries (+CONTENTS, in particular) directly into memory, process the information there, then extract the remainder of the package files directly into their final locations.

So far, I have a library API outlined, and functional read support implemented. Next step is to hack up a minimal tar implementation that uses it to make sure everything's working correctly. So far, the library automatically detects compression formats (using techniques like those in my pkg_install patch) and has some rough support for detecting the archive format as well. (One goal of mine: support for 'pax extended archives', which I understand can handle ACLs.)

Of course, such a library could also form the basis for a BSD-licensed tar to replace GNU tar. I understand a few people have wanted such a thing.

Index: lib/exec.c
===================================================================
RCS file: /usr/src/cvs/src/usr.sbin/pkg_install/lib/exec.c,v
retrieving revision 1.10
diff -c -r1.10 exec.c
*** lib/exec.c 1 Apr 2002 09:39:07 -0000 1.10
--- lib/exec.c 29 Mar 2003 06:00:36 -0000
***************
*** 25,59 ****
  #include <err.h>
  /*
!  * Unusual system() substitute. Accepts format string and args,
!  * builds and executes command. Returns exit code.
   */
!
! int
! vsystem(const char *fmt, ...)
  {
!     va_list args;
      char *cmd;
-     int ret, maxargs;
      maxargs = sysconf(_SC_ARG_MAX);
      maxargs -= 32; /* some slop for the sh -c */
      cmd = malloc(maxargs);
      if (!cmd) {
!         warnx("vsystem can't alloc arg space");
!         return 1;
      }
-
-     va_start(args, fmt);
      if (vsnprintf(cmd, maxargs, fmt, args) > maxargs) {
!         warnx("vsystem args are too long");
!         return 1;
      }
  #ifdef DEBUG
  printf("Executing %s\n", cmd);
  #endif
      ret = system(cmd);
-     va_end(args);
      free(cmd);
      return ret;
  }
--- 25,83 ----
  #include <err.h>
  /*
!  * Format a command, allocating a buffer along the way.
   */
! static char *
! va_system_cmd(const char *fmt, va_list args)
  {
!     int maxargs;
      char *cmd;
      maxargs = sysconf(_SC_ARG_MAX);
      maxargs -= 32; /* some slop for the sh -c */
      cmd = malloc(maxargs);
      if (!cmd) {
!         warnx("can't allocate memory to format program command line");
!         return NULL;
      }
      if (vsnprintf(cmd, maxargs, fmt, args) > maxargs) {
!         warnx("argument list is too long");
!         return NULL;
      }
+     return cmd;
+ }
+
+ char *
+ system_cmd(const char *fmt, ...)
+ {
+     va_list args;
+     char *cmd;
+
+     va_start(args, fmt);
+     cmd = va_system_cmd(fmt,args);
+     va_end(args);
+     return cmd;
+ }
+
+ /*
+  * Unusual system() substitute. Accepts format string and args,
+  * builds and executes command. Returns exit code.
+  */
+ int
+ vsystem(const char *fmt, ...)
+ {
+     va_list args;
+     char *cmd;
+     int ret;
+
+     va_start(args, fmt);
+     cmd = va_system_cmd(fmt,args);
+     va_end(args);
+     if(cmd == NULL) return 1;
  #ifdef DEBUG
  printf("Executing %s\n", cmd);
  #endif
      ret = system(cmd);
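The short-write check behind the read/write loop fix in this message can be shown as a self-contained helper. This is a hedged sketch, not the actual unpack() code from lib/file.c: the function name `copy_stream` and the fixed buffer size are assumptions made for illustration. One caveat the thread does not address: on POSIX systems, writing to a pipe whose reader has exited raises SIGPIPE, which by default terminates the writer (the shell then prints "Broken pipe"). The sketch therefore assumes the caller ignores or handles SIGPIPE, so that the failed write shows up as a short count instead.

```c
#include <stdio.h>

/*
 * Copy everything from `in` to `out`.
 * Returns 0 on success, -1 if a read failed or a write came up short
 * (for example because the process reading the pipe has gone away).
 */
static int
copy_stream(FILE *in, FILE *out)
{
    char buff[4096];
    size_t n;

    while ((n = fread(buff, 1, sizeof(buff), in)) > 0) {
        /* fwrite() returns fewer items than requested on error. */
        if (fwrite(buff, 1, n, out) < n)
            return -1;
    }
    if (ferror(in))
        return -1;
    /* stdio may have buffered the data, so a failure can surface here. */
    return fflush(out) == 0 ? 0 : -1;
}
```

Checking `fflush()` matters because a small package fits entirely in the stdio buffer, in which case every `fwrite()` call succeeds and the error is only reported when the buffer is pushed to the pipe.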
Re: Making pkg_XXX tools smarter about file types...
Bruce A. Mah wrote:
> > The attached patch modifies the pkg_install tools to inspect the file contents--rather than the filename extension--to determine the compression method in use.
>
> ... it works great for the case of a package coming in on stdin. If the package comes from a file, ...

Oh, bloody hell. I overlooked 'pkg_add -r', too, didn't I? Shouldn't take too long to add the auto-detect logic to the remote fetch handling, as well. <sigh>

Tim
Re: Making pkg_XXX tools smarter about file types...
On Fri, Mar 28, 2003 at 10:47:43PM -0800, Tim Kientzle wrote:
> P.S. It's galled me for a while that pkg_add has to fork 'tar' to extract the archive. I've started piecing together a library that reads/writes tarfiles. With this, it should be possible to make pkg_add considerably more efficient. In particular, rather than extracting to a temp directory, then parsing important information, then moving the files, it should be possible using this library to read the initial entries (+CONTENTS, in particular) directly into memory, process the information there, then extract the remainder of the package files directly into their final locations.
>
> So far, I have a library API outlined, and functional read support implemented. Next step is to hack up a minimal tar implementation that uses it to make sure everything's working correctly. So far, the library automatically detects compression formats (using techniques like those in my pkg_install patch) and has some rough support for detecting the archive format as well. (One goal of mine: support for 'pax extended archives', which I understand can handle ACLs.)
>
> Of course, such a library could also form the basis for a BSD-licensed tar to replace GNU tar. I understand a few people have wanted such a thing.

FYI, libtar[0] is BSD-licensed and might be useful to such a project.

[0] - http://www-dev.cites.uiuc.edu/libtar/

Brandon D. Valentine
--
[EMAIL PROTECTED]
http://www.geekpunk.net
Pseudo-Random Googlism: valentine is a champion of the true small online business
Re: Making pkg_XXX tools smarter about file types...
Yury Tarasievich [EMAIL PROTECTED]:
> ...then, Tim Kientzle wrote:
> > A better approach might be to simply fob it off on the user, i.e.,
> >
> >     # pkg_install foo-1.5
> >     Warning: foo-1.5 requires bar-2.3, you have bar-1.7 installed.
> >     Proceed? [Y/n]

i think this is the best approach.

> In my opinion, user should be bothered with choices *only* when, like in this example, when dependency isn't *at* *all* satisfied. User definitely should *not* be bothered when differences are irrelevant to the functionality. E.g., ask only when bar-1.7 is installed and 2.3+ required, not when bar-1.7 is installed and say 1.4.1+ is required.

but i've seen libraries/interfaces changed dramatically. i faintly remember a package which would not link to Qt-2, but insisted on Qt-1 being used.

> I think dependencies could / should also have *upper* revision limit (library interface change, e.g.). And there could also be functionality of system-wide dependencies updating (isn't there one?)

but you cannot possibly know at what version number in the future some API will change. anything like this could only make sense if an API is described in terms of functionality needed, at a much finer grained level than version numbers.

> I've seen interesting concept of version number processing by D.J.Bernstein (called slashpackage, I believe).

slashpackage doesn't really solve this problem, it is just a more rigorous framework. but since many of DJB's followers make programs as small as possible, with functionality spread over several programs where others make one big program, the granularity is in fact smaller.

  Title: slashpackage.html
  URL: http://cr.yp.to/slashpackage.html
  Last Modified: Mon Jul 16 00:24:39 2001

  Title: Google Search: link:http://cr.yp.to/slashpackage.html
  URL: http://www.google.com/search?q=link:http://cr.yp.to/slashpackage.html

especially:

  Title: idtools
  URL: http://multivac.cwru.edu/idtools/
  Last Modified: Mon Jan 13 23:03:02 2003

(there's another packaging system for /package which i can't remember the name of, unfortunately.)

the installation of every package in /package is always the same:

  Create /package if necessary, unpack the tarball there, and run package/build:

    # mkdir -p /usr/local/package   # Any filesystem will do...
    # ln -s /usr/local/package /    # as long as it's visible as /package
    # chmod 01755 /package/.
    # cd /package
    # bunzip2 < /path/to/admin_idtools-VERSION.tar.bz2 | tar -xpf -
    # cd admin/idtools-VERSION
    # package/build
    # sp-version /package/admin idtools VERSION
    # sp-links /package/admin/idtools/command /command /usr/local/bin

  Read package/README and package/INSTALL for more detailed instructions.

the last two commands are missing in packages not relying on Paul Jarc's idtools, and most packages provide a comprehensive script called ./package/install containing the building and installation.

so far i have installed dozens of packages from different authors for quite diverse purposes, and the mechanism rarely failed, and if it did, it was for reasons like a forgotten include or somesuch. note that none of the packages use autoconf, AFAIR.

clemens

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: Making pkg_XXX tools smarter about file types...
clemens fischer wrote:
> Yury Tarasievich [EMAIL PROTECTED]:
> > ...then, Tim Kientzle wrote:
> > > A better approach might be to simply fob it off on the user, i.e.,
> > >
> > >     # pkg_install foo-1.5
> > >     Warning: foo-1.5 requires bar-2.3, you have bar-1.7 installed.
> > >     Proceed? [Y/n]
>
> i think this is the best approach.

I still disagree with that partially (as in quote below).

> > In my opinion, user should be bothered with choices *only* when, like in this example, when dependency isn't *at* *all* satisfied. User definitely should *not* be bothered when differences are irrelevant to the functionality. E.g., ask only when bar-1.7 is installed and 2.3+ required, not when bar-1.7 is installed and say 1.4.1+ is required.
>
> but i've seen libraries/interfaces changed dramatically. i faintly remember a package which would not link to Qt-2, but insisted on Qt-1 being used.

So have dependency list insist on having available qt-1no-higher. Not the most appropriate example, either (I would have thought of dividing line like method changed behaviour over some revision number). QT's have different naming schemes (IIRC) and different functionalities, effectively being 3 different packages.

> > I think dependencies could / should also have *upper* revision limit (library interface change, e.g.). And there could also be functionality of system-wide dependencies updating (isn't there one?)
>
> but you cannot possibly know at what version number in the future some API will change. anything like this could only make sense if an API is described in terms of functionality needed, at a much finer grained level than version numbers.

I agree with that but then I don't see what is *the* problem? I believe it *is* known what functionality gets changed and how, when package goes through revision change?

> > I've seen interesting concept of version number processing by D.J.Bernstein (called slashpackage, I believe).
>
> slashpackage doesn't really solve this problem, it is just a more rigorous framework. but since many of DJB's followers make programs as [...]

No, I was thinking about that (citing cr.yp.to/slashpackages/versions.html):

    Which version is newer?

    Version numbers are required to start with digits, but they aren't required to follow any particular numbering system. In particular, they aren't required to increase lexicographically. One package might have versions 2, 2.01, ..., 2.09, 2.1, 2.11, ..., 2.89, 2.9, etc., while another package has versions 2.0, 2.1, 2.2, ..., 2.9, 2.10, 2.11, etc.; 2.11 may be before or after 2.9.

    If you're building a package, you can include a file ./package/versions that lists all the version numbers you've used so far, one per line, in order. Then scripts can compare two versions to see which one is newer. For example, if /package/admin/daemontools-0.80/package/versions says

        0.75
        0.76
        0.80

    while /package/admin/daemontools-0.92/package/versions says

        0.75
        0.76
        0.80
        0.81
        0.90
        0.91
        0.92

    then 0.92 is newer. This file also makes it possible to reliably handle dependencies such as ``you must have version 0.81 or newer.''
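The comparison scheme quoted above, where a version is newer if it appears later in the package/versions list, can be sketched in C. This is an illustration only: it assumes the file has already been read into an array of strings, and the names `version_index` and `version_compare` are hypothetical, not part of slashpackage.

```c
#include <stddef.h>
#include <string.h>

/* Return the position of `v` in the ordered list, or -1 if it is absent. */
static int
version_index(const char *const *versions, size_t n, const char *v)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(versions[i], v) == 0)
            return (int)i;
    return -1;
}

/*
 * Compare two versions by their position in the ordered list.
 * Returns <0, 0 or >0 in the manner of strcmp(). Sets *ok to 0 (and
 * returns 0) when either version is not listed, because no ordering
 * can be derived in that case.
 */
static int
version_compare(const char *const *versions, size_t n,
    const char *a, const char *b, int *ok)
{
    int ia = version_index(versions, n, a);
    int ib = version_index(versions, n, b);

    *ok = (ia >= 0 && ib >= 0);
    return *ok ? ia - ib : 0;
}
```

Because the ordering comes from the list rather than from parsing the strings, the ambiguity between 2.9 and 2.11 never arises; the cost is that a version missing from the list cannot be ranked at all.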
Re: Making pkg_XXX tools smarter about file types...
...first, clemens fischer wrote:
> Yury Tarasievich [EMAIL PROTECTED]:
> > I'd like to see in dependencies not only like "was built with -1.9_2abc, so wants it", but also something like -1.5+ (obviously 1.5.0 and newer), -* (any version will do). Perhaps something else. At least to have possibility of specifying that, if this can't go into official ports. Does it seem reasonable?
>
> this problem has been annoying me for ages, but he who implements this should consider dependencies specified too liberally. sometimes newer versions aren't backwards compatible, which you can't know back in the past.

Well, someone *should* pay *some* attention to what he's porting, right? And I've seen some ports even aren't compliant with hier(7), too.

...then, Tim Kientzle wrote:
> A better approach might be to simply fob it off on the user, i.e.,
>
>     # pkg_install foo-1.5
>     Warning: foo-1.5 requires bar-2.3, you have bar-1.7 installed.
>     Proceed? [Y/n]

In my opinion, user should be bothered with choices *only* when, like in this example, when dependency isn't *at* *all* satisfied. User definitely should *not* be bothered when differences are irrelevant to the functionality. E.g., ask only when bar-1.7 is installed and 2.3+ required, not when bar-1.7 is installed and say 1.4.1+ is required.

I think dependencies could / should also have *upper* revision limit (library interface change, e.g.). And there could also be functionality of system-wide dependencies updating (isn't there one?)

I've seen interesting concept of version number processing by D.J.Bernstein (called slashpackage, I believe).
Re: Making pkg_XXX tools smarter about file types...
Yury Tarasievich [EMAIL PROTECTED]:
> I'd like to see in dependencies not only like "was built with -1.9_2abc, so wants it", but also something like -1.5+ (obviously 1.5.0 and newer), -* (any version will do). Perhaps something else. At least to have possibility of specifying that, if this can't go into official ports. Does it seem reasonable?

this problem has been annoying me for ages, but he who implements this should consider dependencies specified too liberally. sometimes newer versions aren't backwards compatible, which you can't know back in the past.

clemens
Re: Making pkg_XXX tools smarter about file types...
clemens fischer wrote:
> Yury Tarasievich [EMAIL PROTECTED]:
> > I'd like to see in dependencies not only like "was built with -1.9_2abc, so wants it", but also something like -1.5+ (obviously 1.5.0 and newer), -* (any version will do).
>
> ... sometimes newer versions aren't backwards compatible, which you can't know back in the past.

A better approach might be to simply fob it off on the user, i.e.,

    # pkg_install foo-1.5
    Warning: foo-1.5 requires bar-2.3, you have bar-1.7 installed.
    Proceed? [Y/n]

Eventually, someone should teach pkg_install about fetch, so it could offer the option of downloading and installing the dependency. IIRC, Debian's package system handled this pretty elegantly---it even upgraded installed packages automatically---although their curses-based UI was ... erm quirky. ;-)

Tim Kientzle
Re: Making pkg_XXX tools smarter about file types...
On Sun, Feb 09, 2003 at 10:42:21AM -0800, Tim Kientzle wrote:
> Eventually, someone should teach pkg_install about fetch, so it could offer the option of downloading and installing the dependency. IIRC, Debian's package system handled this pretty elegantly---it even upgraded installed packages automatically---although their curses-based UI was ... erm quirky. ;-)

you mean like ''pkg_add -r'' already does?

--
- bill fumerola / [EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED]
Re: Making pkg_XXX tools smarter about file types...
If memory serves me right, Tim Kientzle wrote:
> The attached patch modifies the pkg_install tools to inspect the file contents--rather than the filename extension--to determine the compression method in use. It then feeds the data into the correct invocation of 'tar'. I've also modified exec.c/lib.h to factor out and expose some common code that formats shell command lines.

Cool. I ran into problems that would have been solved by this when trying to put together some of the RC snapshots for both 4.7 and 5.0 (it sucked to put together a test release and package split only to discover that none of it worked).

I'm not sure when I'll get a chance to look at your diff in detail (or if there might be other people planning on reviewing this) but I'm definitely interested in seeing the functionality. Thanks!

Bruce.
Re: Making pkg_XXX tools smarter about file types...
I have yet another suggestion regarding the packaging subsystem -- could it be possible to extend pkg_* functionality (and, in fact, ports functionality) to recognize a modest set of wildcards in dependency names?

It seems pretty unreasonable to have various subrevisions of, say, libiconv, pulled in as package dependencies, ending with libiconv-1.7_5, libiconv-1.8, libiconv-1.8_1 all installed. And still another package would complain about having no -1.8_2.

I'd like to see in dependencies not only like "was built with -1.9_2abc, so wants it", but also something like -1.5+ (obviously 1.5.0 and newer), -* (any version will do). Perhaps something else. At least to have possibility of specifying that, if this can't go into official ports. Does it seem reasonable?

Tim Kientzle wrote:
> The attached patch modifies the pkg_install tools to inspect the file contents--rather than the [...]
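The wildcard proposal above ("1.5+" for a minimum version, "*" for any version, otherwise an exact match) could be checked along the following lines. This is a hedged sketch and not how the pkg_* tools work: `version_cmp` and `dep_satisfied` are hypothetical names, and the comparison assumes versions are runs of digits separated by punctuation such as '.' or '_', with letter suffixes ignored. As noted elsewhere in the thread, not every project numbers its releases this way.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Compare two version strings component by component. Runs of digits
 * are compared as numbers; every other character acts as a separator.
 * A missing component counts as 0, so "1.8" sorts before "1.8_1".
 */
static int
version_cmp(const char *a, const char *b)
{
    while (*a != '\0' || *b != '\0') {
        unsigned long na = 0, nb = 0;
        char *end;

        /* Skip separators such as '.', '_' or a trailing '+'. */
        while (*a != '\0' && !isdigit((unsigned char)*a))
            a++;
        while (*b != '\0' && !isdigit((unsigned char)*b))
            b++;
        if (*a != '\0') {
            na = strtoul(a, &end, 10);
            a = end;
        }
        if (*b != '\0') {
            nb = strtoul(b, &end, 10);
            b = end;
        }
        if (na != nb)
            return na < nb ? -1 : 1;
    }
    return 0;
}

/*
 * Check an installed version against a dependency spec:
 *   "*"     any version will do
 *   "1.5+"  installed version must be 1.5 or newer
 *   other   exact string match
 */
static int
dep_satisfied(const char *spec, const char *installed)
{
    size_t len = strlen(spec);

    if (strcmp(spec, "*") == 0)
        return 1;
    if (len > 1 && spec[len - 1] == '+')
        /* The trailing '+' is skipped as a separator by version_cmp(). */
        return version_cmp(installed, spec) >= 0;
    return strcmp(spec, installed) == 0;
}
```

With this scheme, bar-1.7 satisfies a "1.4.1+" dependency without prompting the user, while a "2.3+" dependency fails and the tool can fall back to asking. An upper bound, as suggested in the follow-ups, would need a second spec form that this sketch does not cover.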