Re: [PATCH][UPDATED] support for xz compression format
Szymon Janc wrote: > Dnia wtorek 16 luty 2010 o 22:11:50 richardvo...@gmail.com napisał(a): > > >> Since gzip format allows decompression in pipeline mode, it's a >> virtual certainty that nothing from the footer is required for >> processing. >> >> And of course this is true of the xz format as well. I quote from the >> documentation: >> > > The only reason for checking footer is to get uncompressed data size to keep > grub_file_read() happy. > > Possible sollutions to avoid this: > - add size field in stream header and break compatibility with upstream > xz/gz, > will require forking upstream compression tools or create special tool for > crafting upstream created files > > - increase size while consuming blocks (possible with xz, don't know if with > gz), this leaves possibility to get grub_file_read() unhappy > > - try to guess uncompressed data size based on compressed size > > Solution number 4: make grub_file_read less exigent. If brainRAM serves well file size is used only for offset checking and grub_file_size. In first case offset checking can be effectivily disabled by setting size to 0x. Main use of grub_file_size is to allocate a buffer to load whole file at once. For these cases we may want to make a grub_file_malloc_read which would arbitrate between allocation strategies e.g. if fs->load_malloc is NULL use standard allocate/read, otherwise call load_malloc. It is possible that some other code uses grub_file_size (or file->size which is much uglier) in a slightly different way but I think this solution should cover most if not all cases. Specialised loaders may use guestimated compressed ratio, when it's overflown (unlikely if estimation method is good), save overflow separately by blocks and concatenate when done -- Regards Vladimir 'φ-coder/phcoder' Serbinenko signature.asc Description: OpenPGP digital signature ___ Grub-devel mailing list Grub-devel@gnu.org http://lists.gnu.org/mailman/listinfo/grub-devel
Re: [PATCH][UPDATED] support for xz compression format
Dnia wtorek 16 luty 2010 o 22:11:50 richardvo...@gmail.com napisał(a): > Since gzip format allows decompression in pipeline mode, it's a > virtual certainty that nothing from the footer is required for > processing. > > And of course this is true of the xz format as well. I quote from the > documentation: The only reason for checking footer is to get uncompressed data size to keep grub_file_read() happy. Possible sollutions to avoid this: - add size field in stream header and break compatibility with upstream xz/gz, will require forking upstream compression tools or create special tool for crafting upstream created files - increase size while consuming blocks (possible with xz, don't know if with gz), this leaves possibility to get grub_file_read() unhappy - try to guess uncompressed data size based on compressed size -- Szymon K. Janc szy...@janc.net.pl // GG: 1383435 ___ Grub-devel mailing list Grub-devel@gnu.org http://lists.gnu.org/mailman/listinfo/grub-devel
Re: [PATCH][UPDATED] support for xz compression format
On Tue, Feb 16, 2010 at 3:28 PM, Seth Goldberg wrote: > y > > Quoting richardvo...@gmail.com, who wrote the following on Tue, 16 Feb 2010: > >> On Tue, Feb 16, 2010 at 12:58 PM, Szymon Janc wrote: >>> >>> Dnia wtorek 16 luty 2010 o 14:44:45 Vladimir 'φ-coder/phcoder' Serbinenko >>> napisał(a): >>> + #define VLI_MAX_DIGITS 9 Are you sure it's 9? It gives only 63 and not 64 bits >>> >>> It is a limitation of xz format. >>> + if (! test_header(file) || ! test_footer(file)) + { Seeking to the end of file is very expensive on pxe. Can it be skipped? >>> >>> xz stores uncompressed data size at the end of file, without it file size >>> can't >>> be set. >>> >>> gzio has same issue, it reads last 4 bytes of file to determine >>> uncompressed >>> data size. >> >> Then maybe use of a more sensible format should be strongly >> encouraged. For sequentially accessed devices like tape, it may be >> expensive to return to the header to add the size. For flash memory, >> it may require an overwrite (and flash memory has fast random access, >> this is pretty much the only case to prefer a footer). And formats >> which support pipelining of course can't rewind to update a header. >> But in general the archive will be read many times for each time it is >> created, so it's best to put the size in the place which is best for >> the decompressor. >> >> Since gzip format allows decompression in pipeline mode, it's a >> virtual certainty that nothing from the footer is required for >> processing. > > I know I've said this before, but it bears repeating: Don't forget about > network booting! This is precisely the problem for which we came up with a > different solution for loading large compressed archives via tftp when PXE > booting. The assumption, of course, is that you can size a buffer maximally > (with some sane maximum) then shrink it once the archive has been > decompressed. That's what we do and it's served us will with Legacy GRUB > for years now. In the case of .xz, there appear to be block headers, where the uncompressed size is stored up front. So as long as separate blocks can go into independent memory allocations, there'd be no need to oversize the buffer. > > --S > > ___ > Grub-devel mailing list > Grub-devel@gnu.org > http://lists.gnu.org/mailman/listinfo/grub-devel > > ___ Grub-devel mailing list Grub-devel@gnu.org http://lists.gnu.org/mailman/listinfo/grub-devel
Re: [PATCH][UPDATED] support for xz compression format
y Quoting richardvo...@gmail.com, who wrote the following on Tue, 16 Feb 2010: On Tue, Feb 16, 2010 at 12:58 PM, Szymon Janc wrote: Dnia wtorek 16 luty 2010 o 14:44:45 Vladimir 'φ-coder/phcoder' Serbinenko napisał(a): + #define VLI_MAX_DIGITS 9 Are you sure it's 9? It gives only 63 and not 64 bits It is a limitation of xz format. + if (! test_header(file) || ! test_footer(file)) + { Seeking to the end of file is very expensive on pxe. Can it be skipped? xz stores uncompressed data size at the end of file, without it file size can't be set. gzio has same issue, it reads last 4 bytes of file to determine uncompressed data size. Then maybe use of a more sensible format should be strongly encouraged. For sequentially accessed devices like tape, it may be expensive to return to the header to add the size. For flash memory, it may require an overwrite (and flash memory has fast random access, this is pretty much the only case to prefer a footer). And formats which support pipelining of course can't rewind to update a header. But in general the archive will be read many times for each time it is created, so it's best to put the size in the place which is best for the decompressor. Since gzip format allows decompression in pipeline mode, it's a virtual certainty that nothing from the footer is required for processing. I know I've said this before, but it bears repeating: Don't forget about network booting! This is precisely the problem for which we came up with a different solution for loading large compressed archives via tftp when PXE booting. The assumption, of course, is that you can size a buffer maximally (with some sane maximum) then shrink it once the archive has been decompressed. That's what we do and it's served us will with Legacy GRUB for years now. --S ___ Grub-devel mailing list Grub-devel@gnu.org http://lists.gnu.org/mailman/listinfo/grub-devel
Re: [PATCH][UPDATED] support for xz compression format
On Tue, Feb 16, 2010 at 12:58 PM, Szymon Janc wrote: > Dnia wtorek 16 luty 2010 o 14:44:45 Vladimir 'φ-coder/phcoder' Serbinenko > napisał(a): > >> + #define VLI_MAX_DIGITS 9 >> Are you sure it's 9? It gives only 63 and not 64 bits > It is a limitation of xz format. > >> + if (! test_header(file) || ! test_footer(file)) >> + { >> Seeking to the end of file is very expensive on pxe. Can it be skipped? > xz stores uncompressed data size at the end of file, without it file size > can't > be set. > > gzio has same issue, it reads last 4 bytes of file to determine uncompressed > data size. Then maybe use of a more sensible format should be strongly encouraged. For sequentially accessed devices like tape, it may be expensive to return to the header to add the size. For flash memory, it may require an overwrite (and flash memory has fast random access, this is pretty much the only case to prefer a footer). And formats which support pipelining of course can't rewind to update a header. But in general the archive will be read many times for each time it is created, so it's best to put the size in the place which is best for the decompressor. Since gzip format allows decompression in pipeline mode, it's a virtual certainty that nothing from the footer is required for processing. And of course this is true of the xz format as well. I quote from the documentation: "Streamable: It is always possible to create and decompress .xz files in a pipe; no seeking is required." http://tukaani.org/xz/format.html ___ Grub-devel mailing list Grub-devel@gnu.org http://lists.gnu.org/mailman/listinfo/grub-devel
Re: [PATCH][UPDATED] support for xz compression format
Dnia wtorek 16 luty 2010 o 14:44:45 Vladimir 'φ-coder/phcoder' Serbinenko napisał(a): > + #define VLI_MAX_DIGITS 9 > Are you sure it's 9? It gives only 63 and not 64 bits It is a limitation of xz format. > + if (! test_header(file) || ! test_footer(file)) > + { > Seeking to the end of file is very expensive on pxe. Can it be skipped? xz stores uncompressed data size at the end of file, without it file size can't be set. gzio has same issue, it reads last 4 bytes of file to determine uncompressed data size. > grub_xzio_read seems to ignore file->offset. Have you tested seeking? Yes, seeking is not implemented yet. It's on my TODO list. -- Szymon K. Janc szy...@janc.net.pl // GG: 1383435 ___ Grub-devel mailing list Grub-devel@gnu.org http://lists.gnu.org/mailman/listinfo/grub-devel
Re: [PATCH][UPDATED] support for xz compression format
Szymon Janc wrote: > On Tue, 16 Feb 2010 14:12:04 +0100 Vladimir 'φ-coder/phcoder' Serbinenko > wrote > > >> First of all: could you use unified diff? (-u option) >> > > Sure, will use that in future. > > >> grub2-1.98~experimental.20100120/conf/xzembed.rmk >> I don't see a need for either separate .rmk or separate module >> >> + static grub_uint8_t inbuf[XZBUFSIZ]; >> + static grub_uint8_t outbuf[XZBUFSIZ]; >> Avoid static variables. It will fail if user e.g. mounts xz file as a >> loopback then opens xz'ed file on this loopback. Just put buffers in >> grub_xzio >> + if (! file) >> + { >> + grub_error (GRUB_ERR_OUT_OF_MEMORY, "out of memory"); >> + return 0; >> grub_error is already issued by grub_malloc, no need to put the same >> code here. >> > > I think You have reviewed old patch, not the updated one :-) > > Right. Many comments still apply though. Reviewing new one (in addition to old comments): + #define VLI_MAX_DIGITS 9 Are you sure it's 9? It gives only 63 and not 64 bits + if (! test_header(file) || ! test_footer(file)) + { Seeking to the end of file is very expensive on pxe. Can it be skipped? grub_xzio_read seems to ignore file->offset. Have you tested seeking? > > > ___ > Grub-devel mailing list > Grub-devel@gnu.org > http://lists.gnu.org/mailman/listinfo/grub-devel > -- Regards Vladimir 'φ-coder/phcoder' Serbinenko signature.asc Description: OpenPGP digital signature ___ Grub-devel mailing list Grub-devel@gnu.org http://lists.gnu.org/mailman/listinfo/grub-devel
Re: [PATCH][UPDATED] support for xz compression format
On Tue, 16 Feb 2010 14:12:04 +0100 Vladimir 'φ-coder/phcoder' Serbinenko wrote > First of all: could you use unified diff? (-u option) Sure, will use that in future. > grub2-1.98~experimental.20100120/conf/xzembed.rmk > I don't see a need for either separate .rmk or separate module > > + static grub_uint8_t inbuf[XZBUFSIZ]; > + static grub_uint8_t outbuf[XZBUFSIZ]; > Avoid static variables. It will fail if user e.g. mounts xz file as a > loopback then opens xz'ed file on this loopback. Just put buffers in > grub_xzio > + if (! file) > + { > + grub_error (GRUB_ERR_OUT_OF_MEMORY, "out of memory"); > + return 0; > grub_error is already issued by grub_malloc, no need to put the same > code here. I think You have reviewed old patch, not the updated one :-) -- Szymon Janc ___ Grub-devel mailing list Grub-devel@gnu.org http://lists.gnu.org/mailman/listinfo/grub-devel
Re: [PATCH][UPDATED] support for xz compression format
Szymon Janc wrote: > Hello, > > New version of xz compression patch. > Changes since then: > - it works now :-) > - decoder dictionary can be enlarge up to DICT_BIT_SIZE defined in xz_lzma2.h > (currently set to 30 = 128MiB) > - xz uses crc32 from libgcrypt-grub, internal crc implementation removed > - removed linux kernel related code from xz > - get rid of some not needed memcpy() calls > - xzio.mod stuff made according to Vladimir's comments > - simplified wraper (only 1 file - xz_wrap.h) > - number of other improvments and tons of bugfixes > - files licence changed from public domain to gpl3+ > > This patch also changes gnulib-wrap.h true/false definitions to make them > truly > constants. > > TODO and other questions: > - lack of file seek support, if files is compressed with small block size, it > is > possible to implement pseudo-random access > - what is optimal i/o buffer size? like BUFSIZ macro in glibc > - default dictionary size should be chosen for files compression > - still need to do performance tests > - introduce some common layer for xzio/gzio etc ? > > Comments are welcome > > First of all: could you use unified diff? (-u option) # Misc. ! pkglib_MODULES += gzio.mod bufio.mod elf.mod xzio.mod Please put pkglib_MODULES += xzio.mod right before other variables grub2-1.98~experimental.20100120/conf/xzembed.rmk I don't see a need for either separate .rmk or separate module + static grub_uint8_t inbuf[XZBUFSIZ]; + static grub_uint8_t outbuf[XZBUFSIZ]; Avoid static variables. It will fail if user e.g. mounts xz file as a loopback then opens xz'ed file on this loopback. Just put buffers in grub_xzio + if (! file) + { + grub_error (GRUB_ERR_OUT_OF_MEMORY, "out of memory"); + return 0; grub_error is already issued by grub_malloc, no need to put the same code here. + grub_memset (xzio, 0, sizeof (*xzio)); Use grub_zalloc instead of malloc + memset + xzio->buf.out_size = len >= XZBUFSIZ ? XZBUFSIZ : len; I suggest adding parenthesis around condition + /* feed input */ Please capitalise first letter and terminate the comments with a full stop and 2 spaces. + readret = grub_file_read(xzio->file,inbuf,XZBUFSIZ); Missing spaces before opening bracket and after commas. I recommend running indent on new files. > > > ___ > Grub-devel mailing list > Grub-devel@gnu.org > http://lists.gnu.org/mailman/listinfo/grub-devel > -- Regards Vladimir 'φ-coder/phcoder' Serbinenko signature.asc Description: OpenPGP digital signature ___ Grub-devel mailing list Grub-devel@gnu.org http://lists.gnu.org/mailman/listinfo/grub-devel
[PATCH][UPDATED] support for xz compression format
Hello, New version of xz compression patch. Changes since then: - it works now :-) - decoder dictionary can be enlarge up to DICT_BIT_SIZE defined in xz_lzma2.h (currently set to 30 = 128MiB) - xz uses crc32 from libgcrypt-grub, internal crc implementation removed - removed linux kernel related code from xz - get rid of some not needed memcpy() calls - xzio.mod stuff made according to Vladimir's comments - simplified wraper (only 1 file - xz_wrap.h) - number of other improvments and tons of bugfixes - files licence changed from public domain to gpl3+ This patch also changes gnulib-wrap.h true/false definitions to make them truly constants. TODO and other questions: - lack of file seek support, if files is compressed with small block size, it is possible to implement pseudo-random access - what is optimal i/o buffer size? like BUFSIZ macro in glibc - default dictionary size should be chosen for files compression - still need to do performance tests - introduce some common layer for xzio/gzio etc ? Comments are welcome -- Szymon K. Janc szy...@janc.net.pl // GG: 1383435 xz.diff.gz Description: GNU Zip compressed data ___ Grub-devel mailing list Grub-devel@gnu.org http://lists.gnu.org/mailman/listinfo/grub-devel