Re: [gentoo-portage-dev] Re: [RFC] gpkg format proposal v2
On Tue, 2018-11-13 at 00:45 +0100, Ulrich Mueller wrote:
> On Mon, 12 Nov 2018, Michał Górny wrote:
>
> > Once tar is used for the inner archive format, it is also a natural choice
> > for the outer format. If you believe we should use another format, that
> > is, introduce a second distinct archive format and depend on a second
> > tool, you need to have a good justification for it.
>
> Right, that's a better reason. :)
>
> > So yes, ar is an option, as well as cpio. In both cases the format is
> > simpler (yet obscure), and the files are smaller. But does that justify
> > using a second tool that serves the same purpose as tar, given that tar
> > works and we need to use it anyway? Even if we skip the fact that ar is
> > bundled as part of binutils rather than as a stand-alone archiver, we're
> > introducing the unnecessary complexity of learning a second tool.
> > And both ar(1) and cpio(1) have a weird CLI compared to tar(1).
>
> cpio is not feasible because of file size limitations (4 GiB IIRC).

FWICS, ar has a limit of 10 decimal digits, so around 9.3 GiB.

-- 
Best regards,
Michał Górny
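
The two limits mentioned above follow from the width of the size field in each member header. The following is a minimal sketch of that arithmetic, not part of the original thread. It assumes the cpio variant meant is the "newc" format, whose size field holds 8 hexadecimal digits; the constant names are made up for illustration.

```python
# Largest member size each header's size field can represent.
AR_SIZE_DIGITS = 10        # ar: size stored as 10 decimal ASCII digits
CPIO_NEWC_HEX_DIGITS = 8   # cpio "newc": 8 hex digits (assumed variant)

GIB = 1024 ** 3

ar_max = 10 ** AR_SIZE_DIGITS - 1
cpio_max = 16 ** CPIO_NEWC_HEX_DIGITS - 1

print(f"ar:   {ar_max} bytes = {ar_max / GIB:.2f} GiB")
print(f"cpio: {cpio_max} bytes = {cpio_max / GIB:.2f} GiB")
```

Under these assumptions the ar figure comes out at about 9.31 GiB, which matches the "around 9.3 GiB" estimate, and the cpio figure at one byte short of 4 GiB.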
Re: [gentoo-portage-dev] Re: [RFC] gpkg format proposal v2
On Mon, 12 Nov 2018, Michał Górny wrote:

> Once tar is used for the inner archive format, it is also a natural choice
> for the outer format. If you believe we should use another format, that
> is, introduce a second distinct archive format and depend on a second
> tool, you need to have a good justification for it.

Right, that's a better reason. :)

> So yes, ar is an option, as well as cpio. In both cases the format is
> simpler (yet obscure), and the files are smaller. But does that justify
> using a second tool that serves the same purpose as tar, given that tar
> works and we need to use it anyway? Even if we skip the fact that ar is
> bundled as part of binutils rather than as a stand-alone archiver, we're
> introducing the unnecessary complexity of learning a second tool.
> And both ar(1) and cpio(1) have a weird CLI compared to tar(1).

cpio is not feasible because of file size limitations (4 GiB IIRC).

Ulrich
Re: [gentoo-portage-dev] Re: [RFC] gpkg format proposal v2
On Mon, 2018-11-12 at 21:23 +0100, Ulrich Mueller wrote:
> On Mon, 12 Nov 2018, Michał Górny wrote:
>
> > > Also, what would be wrong with ar? It's a standard POSIX tool, and
> > > should be available everywhere.
>
> > The original post says what's wrong with ar. Please be more specific
> > if you disagree with it.
>
> AFAICS, the arguments are that ar would be obscure, and that the LSB
> considers it deprecated. I don't find either of them convincing.
> Since when do we care about the LSB?

Do you have any convincing arguments for using ar?

I think it's quite obvious that tar is the only sane choice for the
inner archive format, since we need to preserve permissions, ownership
etc. ar can't do it.

Once tar is used for the inner archive format, it is also a natural choice
for the outer format. If you believe we should use another format, that
is, introduce a second distinct archive format and depend on a second
tool, you need to have a good justification for it.

So yes, ar is an option, as well as cpio. In both cases the format is
simpler (yet obscure), and the files are smaller. But does that justify
using a second tool that serves the same purpose as tar, given that tar
works and we need to use it anyway? Even if we skip the fact that ar is
bundled as part of binutils rather than as a stand-alone archiver, we're
introducing the unnecessary complexity of learning a second tool.
And both ar(1) and cpio(1) have a weird CLI compared to tar(1).

Plus, ar apparently doesn't support directories, so we end up adding
extra complexity to get it unpacked sanely.

For the record, I did a little experiment and here are the results:

  -rw-r--r-- 1 mgorny mgorny 112928836 11-12 22:13 wine-any-3.20-1.gpkg.ar
  -rw-r--r-- 1 mgorny mgorny 112929280 11-12 22:21 wine-any-3.20-1.gpkg.cpio
  -rw-r--r-- 1 mgorny mgorny 112936960 11-12 22:11 wine-any-3.20-1.gpkg.tar

So yes, we are saving around 8 KiB... out of 108 MiB.

Of course, the savings may become relevant in the case of tiny archives,
but do we really need to be concerned about that? The whole point of
the proposal is to make the format simpler, easier to introspect and to
modify. I believe limiting the number of formats in use certainly
serves that purpose, while starting to depend on obscure tools in order
to save 8 KiB is a case of premature optimization.

-- 
Best regards,
Michał Górny
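
The "around 8 KiB out of 108 MiB" figure can be checked directly from the three sizes in the listing above. This is only a small sketch of that arithmetic; the dictionary and variable names are made up for illustration, and the byte counts are copied from the message.

```python
# File sizes in bytes, copied from the listing in the message above.
sizes = {"ar": 112928836, "cpio": 112929280, "tar": 112936960}

tar_size = sizes["tar"]
# Bytes saved by each alternative relative to the tar variant.
savings = {fmt: tar_size - size for fmt, size in sizes.items()}

for fmt, saved in savings.items():
    share = 100 * saved / tar_size
    print(f"{fmt:4} saves {saved:5d} bytes "
          f"({saved / 1024:.1f} KiB, {share:.4f}% of the tar size)")

print(f"tar variant: {tar_size / 1024 ** 2:.1f} MiB")
```

The ar variant saves 8124 bytes and the cpio variant 7680 bytes, so both are below 0.01% of the package size.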
Re: [gentoo-portage-dev] Re: [RFC] gpkg format proposal v2
On Mon, Nov 12, 2018 at 3:24 PM Ulrich Mueller wrote:
>
> On Mon, 12 Nov 2018, Michał Górny wrote:
>
> >> Also, what would be wrong with ar? It's a standard POSIX tool, and
> >> should be available everywhere.
>
> > The original post says what's wrong with ar. Please be more specific
> > if you disagree with it.
>
> AFAICS, the arguments are that ar would be obscure, and that the LSB
> considers it deprecated. I don't find either of them convincing.
> Since when do we care about the LSB?
>

I assert that it doesn't matter which tool we pick, so we have
arbitrarily chosen tar because we like it. If you have a basis for
preferring ar over tar, I'd love to hear it. I only brought it up
because I know Debian uses it.

-A

>
> Ulrich
>
Re: [gentoo-portage-dev] Re: [RFC] gpkg format proposal v2
On Mon, 12 Nov 2018, Michał Górny wrote:

>> Also, what would be wrong with ar? It's a standard POSIX tool, and
>> should be available everywhere.

> The original post says what's wrong with ar. Please be more specific
> if you disagree with it.

AFAICS, the arguments are that ar would be obscure, and that the LSB
considers it deprecated. I don't find either of them convincing.
Since when do we care about the LSB?

Ulrich
Re: [gentoo-portage-dev] Re: [RFC] gpkg format proposal v2
On Mon, 2018-11-12 at 18:33 +0100, Ulrich Mueller wrote:
> On Mon, 12 Nov 2018, Michał Górny wrote:
>
> > On Mon, 2018-11-12 at 17:51 +0100, Fabian Groffen wrote:
> > > I'm wondering here, how much sense does it make to compress 2., 3.
> > > and/or 4. if you compress the whole gpkg? I have the impression
> > > compression on compression isn't beneficial here. Shouldn't just
> > > compressing of the gpkg tar be sufficient?
>
> > Please read the spec again. It explicitly says it's not compressed.
>
> Isn't that the wrong way around? The tar format contains a lot of
> padding, so using uncompressed tar for the outer archive would be
> somewhat wasteful. Why not leave the inner tar files uncompressed, but
> compress the whole binpkg instead?

Uncompressed tar is mostly suitable for random access. Compressed tar
isn't suitable for random access at all.

With uncompressed tar, it's trivial to access one of the members. With
compressed tar, you always end up decompressing everything.

With uncompressed tar, it's easy to rewrite the metadata (read: apply
package updates) without updating the rest. With compressed tar, you'd
have to recompress all the huge packages in order to apply updates.

> Also, what would be wrong with ar? It's a standard POSIX tool, and
> should be available everywhere.
>

The original post says what's wrong with ar. Please be more specific if
you disagree with it.

-- 
Best regards,
Michał Górny
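
The random-access argument above can be illustrated with Python's standard tarfile module. This is a rough sketch, not code from the proposal: the member names "metadata.tar" and "image.tar" and the helper build_outer() are hypothetical, and the outer archive is built in memory just to keep the example self-contained.

```python
import io
import tarfile


def build_outer(members):
    """Build an uncompressed outer tar in memory from a name -> bytes mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Hypothetical member names; the payloads stand in for the inner archives.
archive = build_outer({
    "metadata.tar": b"small metadata payload",
    "image.tar": b"x" * 100_000,
})

# Only the 512-byte headers are parsed here; member data is skipped by seeking.
# Mode "r:" accepts an uncompressed archive only.
with tarfile.open(fileobj=io.BytesIO(archive), mode="r:") as tar:
    info = tar.getmember("metadata.tar")
    offset, size = info.offset_data, info.size

# With the offset known, the member is a plain byte range of the file.
payload = archive[offset:offset + size]
print(offset, size, payload)
```

With a compressed outer archive there is no such byte range to seek to: the stream has to be decompressed from the start to reach any member, which is the point made in the message.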
[gentoo-portage-dev] Re: [RFC] gpkg format proposal v2
On Mon, 12 Nov 2018, Michał Górny wrote:

> On Mon, 2018-11-12 at 17:51 +0100, Fabian Groffen wrote:
>> I'm wondering here, how much sense does it make to compress 2., 3.
>> and/or 4. if you compress the whole gpkg? I have the impression
>> compression on compression isn't beneficial here. Shouldn't just
>> compressing of the gpkg tar be sufficient?

> Please read the spec again. It explicitly says it's not compressed.

Isn't that the wrong way around? The tar format contains a lot of
padding, so using uncompressed tar for the outer archive would be
somewhat wasteful. Why not leave the inner tar files uncompressed, but
compress the whole binpkg instead?

Also, what would be wrong with ar? It's a standard POSIX tool, and
should be available everywhere.

Ulrich
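
The padding referred to above comes from tar's fixed 512-byte blocks. The sketch below estimates the size of an uncompressed tar and compares it with what Python's tarfile module writes. It assumes plain ustar headers with short member names (no extended headers) and a 20-block record size, which is what Python's tarfile pads to; the function names are made up for illustration.

```python
import io
import tarfile

BLOCK = 512          # tar block size
RECORD = 20 * BLOCK  # record size the finished archive is padded to (assumed)


def padded(n, unit):
    """Round n up to the next multiple of unit."""
    return -(-n // unit) * unit


def expected_tar_size(member_sizes):
    """One header block per member, data padded to whole blocks,
    two zero blocks as end marker, then padding to a full record."""
    body = sum(BLOCK + padded(size, BLOCK) for size in member_sizes)
    return padded(body + 2 * BLOCK, RECORD)


def actual_tar_size(member_sizes):
    """Write a real archive with tarfile and measure it."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for i, size in enumerate(member_sizes):
            info = tarfile.TarInfo(f"member{i}")
            info.size = size
            tar.addfile(info, io.BytesIO(b"\0" * size))
    return len(buf.getvalue())


member_sizes = [1, 1000, 5000]
print(sum(member_sizes), expected_tar_size(member_sizes),
      actual_tar_size(member_sizes))
```

For these three small members the payload is 6001 bytes while the archive takes 10240 bytes. The overhead per member is bounded by a header block plus less than one block of padding, so it matters for tiny archives and becomes negligible for large ones.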