Re: [PATCH][UPDATED] support for xz compression format

2010-02-16 Thread Vladimir 'φ-coder/phcoder' Serbinenko
Szymon Janc wrote:
> Dnia wtorek 16 luty 2010 o 22:11:50 richardvo...@gmail.com napisał(a):
>
>   
>> Since gzip format allows decompression in pipeline mode, it's a
>> virtual certainty that nothing from the footer is required for
>> processing.
>>
>> And of course this is true of the xz format as well.  I quote from the
>> documentation:
>> 
>
> The only reason for checking footer is to get uncompressed data size to keep 
> grub_file_read() happy.
>
> Possible sollutions to avoid this:
> - add size field in stream header and break compatibility with upstream 
> xz/gz, 
> will require forking upstream compression tools or create special tool for 
> crafting upstream created files
>   
> - increase size while consuming blocks (possible with xz, don't know if with 
> gz), this leaves possibility to get grub_file_read() unhappy
>
> - try to guess uncompressed data size based on compressed size
>
>   
Solution number 4: make grub_file_read less exigent. If brainRAM serves
well file size is used only for offset checking and grub_file_size. In
first case offset checking can be effectivily disabled by setting size
to 0x. Main use of grub_file_size is to allocate a
buffer to load whole file at once. For these cases we may want to make a
grub_file_malloc_read which would arbitrate between allocation
strategies e.g. if fs->load_malloc is NULL use standard allocate/read,
otherwise call load_malloc. It is possible that some other code uses
grub_file_size (or file->size which is much uglier) in a slightly
different way but I think this solution should cover most if not all cases.
Specialised loaders may use guestimated compressed ratio, when it's
overflown (unlikely if estimation method is good), save overflow
separately by blocks and concatenate when done


-- 
Regards
Vladimir 'φ-coder/phcoder' Serbinenko




signature.asc
Description: OpenPGP digital signature
___
Grub-devel mailing list
Grub-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/grub-devel


Re: [PATCH][UPDATED] support for xz compression format

2010-02-16 Thread Szymon Janc
Dnia wtorek 16 luty 2010 o 22:11:50 richardvo...@gmail.com napisał(a):

> Since gzip format allows decompression in pipeline mode, it's a
> virtual certainty that nothing from the footer is required for
> processing.
> 
> And of course this is true of the xz format as well.  I quote from the
> documentation:

The only reason for checking footer is to get uncompressed data size to keep 
grub_file_read() happy.

Possible sollutions to avoid this:
- add size field in stream header and break compatibility with upstream xz/gz, 
will require forking upstream compression tools or create special tool for 
crafting upstream created files
  
- increase size while consuming blocks (possible with xz, don't know if with 
gz), this leaves possibility to get grub_file_read() unhappy

- try to guess uncompressed data size based on compressed size

-- 
Szymon K. Janc
szy...@janc.net.pl // GG: 1383435


___
Grub-devel mailing list
Grub-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/grub-devel


Re: [PATCH][UPDATED] support for xz compression format

2010-02-16 Thread richardvo...@gmail.com
On Tue, Feb 16, 2010 at 3:28 PM, Seth Goldberg  wrote:
> y
>
> Quoting richardvo...@gmail.com, who wrote the following on Tue, 16 Feb 2010:
>
>> On Tue, Feb 16, 2010 at 12:58 PM, Szymon Janc  wrote:
>>>
>>> Dnia wtorek 16 luty 2010 o 14:44:45 Vladimir 'φ-coder/phcoder' Serbinenko
>>> napisał(a):
>>>
 + #define VLI_MAX_DIGITS 9
 Are you sure it's 9? It gives only 63 and not 64 bits
>>>
>>> It is a limitation of xz format.
>>>
 +   if (! test_header(file) || ! test_footer(file))
 +   {
 Seeking to the end of file is very expensive on pxe. Can it be skipped?
>>>
>>> xz stores uncompressed data size at the end of file, without it file size
>>> can't
>>> be set.
>>>
>>> gzio has same issue, it reads last 4 bytes of file to determine
>>> uncompressed
>>> data size.
>>
>> Then maybe use of a more sensible format should be strongly
>> encouraged.  For sequentially accessed devices like tape, it may be
>> expensive to return to the header to add the size.  For flash memory,
>> it may require an overwrite (and flash memory has fast random access,
>> this is pretty much the only case to prefer a footer).  And formats
>> which support pipelining of course can't rewind to update a header.
>> But in general the archive will be read many times for each time it is
>> created, so it's best to put the size in the place which is best for
>> the decompressor.
>>
>> Since gzip format allows decompression in pipeline mode, it's a
>> virtual certainty that nothing from the footer is required for
>> processing.
>
>  I know I've said this before, but it bears repeating: Don't forget about
> network booting!  This is precisely the problem for which we came up with a
> different solution for loading large compressed archives via tftp when PXE
> booting.  The assumption, of course, is that you can size a buffer maximally
> (with some sane maximum)  then shrink it once the archive has been
> decompressed.  That's what we do and it's served us will with Legacy GRUB
> for years now.

In the case of .xz, there appear to be block headers, where the
uncompressed size is stored up front.  So as long as separate blocks
can go into independent memory allocations, there'd be no need to
oversize the buffer.

>
>  --S
>
> ___
> Grub-devel mailing list
> Grub-devel@gnu.org
> http://lists.gnu.org/mailman/listinfo/grub-devel
>
>


___
Grub-devel mailing list
Grub-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/grub-devel


Re: [PATCH][UPDATED] support for xz compression format

2010-02-16 Thread Seth Goldberg

y

Quoting richardvo...@gmail.com, who wrote the following on Tue, 16 Feb 2010:


On Tue, Feb 16, 2010 at 12:58 PM, Szymon Janc  wrote:

Dnia wtorek 16 luty 2010 o 14:44:45 Vladimir 'φ-coder/phcoder' Serbinenko
napisał(a):


+ #define VLI_MAX_DIGITS 9
Are you sure it's 9? It gives only 63 and not 64 bits

It is a limitation of xz format.


+   if (! test_header(file) || ! test_footer(file))
+   {
Seeking to the end of file is very expensive on pxe. Can it be skipped?

xz stores uncompressed data size at the end of file, without it file size can't
be set.

gzio has same issue, it reads last 4 bytes of file to determine uncompressed
data size.


Then maybe use of a more sensible format should be strongly
encouraged.  For sequentially accessed devices like tape, it may be
expensive to return to the header to add the size.  For flash memory,
it may require an overwrite (and flash memory has fast random access,
this is pretty much the only case to prefer a footer).  And formats
which support pipelining of course can't rewind to update a header.
But in general the archive will be read many times for each time it is
created, so it's best to put the size in the place which is best for
the decompressor.

Since gzip format allows decompression in pipeline mode, it's a
virtual certainty that nothing from the footer is required for
processing.


  I know I've said this before, but it bears repeating: Don't forget about 
network booting!  This is precisely the problem for which we came up with a 
different solution for loading large compressed archives via tftp when PXE 
booting.  The assumption, of course, is that you can size a buffer maximally 
(with some sane maximum)  then shrink it once the archive has been 
decompressed.  That's what we do and it's served us will with Legacy GRUB for 
years now.


 --S
___
Grub-devel mailing list
Grub-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/grub-devel


Re: [PATCH][UPDATED] support for xz compression format

2010-02-16 Thread richardvo...@gmail.com
On Tue, Feb 16, 2010 at 12:58 PM, Szymon Janc  wrote:
> Dnia wtorek 16 luty 2010 o 14:44:45 Vladimir 'φ-coder/phcoder' Serbinenko
> napisał(a):
>
>> + #define VLI_MAX_DIGITS 9
>> Are you sure it's 9? It gives only 63 and not 64 bits
> It is a limitation of xz format.
>
>> +   if (! test_header(file) || ! test_footer(file))
>> +   {
>> Seeking to the end of file is very expensive on pxe. Can it be skipped?
> xz stores uncompressed data size at the end of file, without it file size 
> can't
> be set.
>
> gzio has same issue, it reads last 4 bytes of file to determine uncompressed
> data size.

Then maybe use of a more sensible format should be strongly
encouraged.  For sequentially accessed devices like tape, it may be
expensive to return to the header to add the size.  For flash memory,
it may require an overwrite (and flash memory has fast random access,
this is pretty much the only case to prefer a footer).  And formats
which support pipelining of course can't rewind to update a header.
But in general the archive will be read many times for each time it is
created, so it's best to put the size in the place which is best for
the decompressor.

Since gzip format allows decompression in pipeline mode, it's a
virtual certainty that nothing from the footer is required for
processing.

And of course this is true of the xz format as well.  I quote from the
documentation:

"Streamable: It is always possible to create and decompress .xz files
in a pipe; no seeking is required."
http://tukaani.org/xz/format.html


___
Grub-devel mailing list
Grub-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/grub-devel


Re: [PATCH][UPDATED] support for xz compression format

2010-02-16 Thread Szymon Janc
Dnia wtorek 16 luty 2010 o 14:44:45 Vladimir 'φ-coder/phcoder' Serbinenko 
napisał(a):

> + #define VLI_MAX_DIGITS 9
> Are you sure it's 9? It gives only 63 and not 64 bits
It is a limitation of xz format.

> +   if (! test_header(file) || ! test_footer(file))
> +   {
> Seeking to the end of file is very expensive on pxe. Can it be skipped?
xz stores uncompressed data size at the end of file, without it file size can't 
be set.

gzio has same issue, it reads last 4 bytes of file to determine uncompressed 
data size.

> grub_xzio_read seems to ignore file->offset. Have you tested seeking?
Yes, seeking is not implemented yet. It's on my TODO list.


-- 
Szymon K. Janc
szy...@janc.net.pl // GG: 1383435


___
Grub-devel mailing list
Grub-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/grub-devel


Re: [PATCH][UPDATED] support for xz compression format

2010-02-16 Thread Vladimir 'φ-coder/phcoder' Serbinenko
Szymon Janc wrote:
> On Tue, 16 Feb 2010 14:12:04 +0100  Vladimir 'φ-coder/phcoder' Serbinenko 
>  wrote
>
>   
>> First of all: could you use unified diff? (-u option)
>> 
>
> Sure, will use that in future.
>
>   
>> grub2-1.98~experimental.20100120/conf/xzembed.rmk
>> I don't see a need for either separate .rmk or separate module
>>
>> + static grub_uint8_t inbuf[XZBUFSIZ];
>> + static grub_uint8_t outbuf[XZBUFSIZ];
>> Avoid static variables. It will fail if user e.g. mounts xz file as a
>> loopback then opens xz'ed file on this loopback. Just put buffers in
>> grub_xzio
>> +   if (! file)
>> +   {
>> +   grub_error (GRUB_ERR_OUT_OF_MEMORY, "out of memory");
>> + return 0;
>> grub_error is already issued by grub_malloc, no need to put the same
>> code here.
>> 
>
> I think You have reviewed old patch, not the updated one :-)
>
>   
Right. Many comments still apply though. Reviewing new one (in addition
to old comments):
+ #define VLI_MAX_DIGITS 9
Are you sure it's 9? It gives only 63 and not 64 bits
+   if (! test_header(file) || ! test_footer(file))
+   {
Seeking to the end of file is very expensive on pxe. Can it be skipped?
grub_xzio_read seems to ignore file->offset. Have you tested seeking?

> 
>
> ___
> Grub-devel mailing list
> Grub-devel@gnu.org
> http://lists.gnu.org/mailman/listinfo/grub-devel
>   


-- 
Regards
Vladimir 'φ-coder/phcoder' Serbinenko




signature.asc
Description: OpenPGP digital signature
___
Grub-devel mailing list
Grub-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/grub-devel


Re: [PATCH][UPDATED] support for xz compression format

2010-02-16 Thread Szymon Janc
On Tue, 16 Feb 2010 14:12:04 +0100  Vladimir 'φ-coder/phcoder' Serbinenko 
 wrote

> First of all: could you use unified diff? (-u option)

Sure, will use that in future.

> grub2-1.98~experimental.20100120/conf/xzembed.rmk
> I don't see a need for either separate .rmk or separate module
> 
> + static grub_uint8_t inbuf[XZBUFSIZ];
> + static grub_uint8_t outbuf[XZBUFSIZ];
> Avoid static variables. It will fail if user e.g. mounts xz file as a
> loopback then opens xz'ed file on this loopback. Just put buffers in
> grub_xzio
> +   if (! file)
> +   {
> +   grub_error (GRUB_ERR_OUT_OF_MEMORY, "out of memory");
> + return 0;
> grub_error is already issued by grub_malloc, no need to put the same
> code here.

I think You have reviewed old patch, not the updated one :-)

-- 
Szymon Janc
___
Grub-devel mailing list
Grub-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/grub-devel


Re: [PATCH][UPDATED] support for xz compression format

2010-02-16 Thread Vladimir 'φ-coder/phcoder' Serbinenko
Szymon Janc wrote:
> Hello,
>
> New version of xz compression patch.
> Changes since then:
> - it works now :-)
> - decoder dictionary can be enlarge up to DICT_BIT_SIZE defined in xz_lzma2.h
>  (currently set to 30 = 128MiB)
> - xz uses crc32 from libgcrypt-grub, internal crc implementation removed
> - removed linux kernel related code from xz
> - get rid of some not needed memcpy() calls
> - xzio.mod stuff made according to Vladimir's comments
> - simplified wraper (only 1 file - xz_wrap.h)
> - number of other improvments and tons of bugfixes
> - files licence changed from public domain to gpl3+
>
> This patch also changes gnulib-wrap.h true/false definitions to make them 
> truly
> constants.
>
> TODO and other questions:
> - lack of file seek support, if files is compressed with small block size, it 
> is 
>   possible to implement pseudo-random access
> - what is optimal i/o buffer size? like BUFSIZ macro in glibc
> - default dictionary size should be chosen for files compression
> - still need to do performance tests
> - introduce some common layer for xzio/gzio etc ?
>
> Comments are welcome
>
>   
First of all: could you use unified diff? (-u option)
  # Misc.
! pkglib_MODULES += gzio.mod bufio.mod elf.mod xzio.mod
Please put pkglib_MODULES += xzio.mod right before other variables

grub2-1.98~experimental.20100120/conf/xzembed.rmk
I don't see a need for either separate .rmk or separate module

+ static grub_uint8_t inbuf[XZBUFSIZ];
+ static grub_uint8_t outbuf[XZBUFSIZ];
Avoid static variables. It will fail if user e.g. mounts xz file as a
loopback then opens xz'ed file on this loopback. Just put buffers in
grub_xzio
+   if (! file)
+   {
+   grub_error (GRUB_ERR_OUT_OF_MEMORY, "out of memory");
+ return 0;
grub_error is already issued by grub_malloc, no need to put the same
code here.
+   grub_memset (xzio, 0, sizeof (*xzio));
Use grub_zalloc instead of malloc + memset
+   xzio->buf.out_size = len >= XZBUFSIZ ? XZBUFSIZ : len;
I suggest adding parenthesis around condition
+ /* feed input */
Please capitalise first letter and terminate the comments with a full
stop and 2 spaces.
+ readret = grub_file_read(xzio->file,inbuf,XZBUFSIZ);
Missing spaces before opening bracket and after commas. I recommend
running indent on new files.
> 
>
> ___
> Grub-devel mailing list
> Grub-devel@gnu.org
> http://lists.gnu.org/mailman/listinfo/grub-devel
>   


-- 
Regards
Vladimir 'φ-coder/phcoder' Serbinenko




signature.asc
Description: OpenPGP digital signature
___
Grub-devel mailing list
Grub-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/grub-devel


[PATCH][UPDATED] support for xz compression format

2010-02-04 Thread Szymon Janc
Hello,

New version of xz compression patch.
Changes since then:
- it works now :-)
- decoder dictionary can be enlarge up to DICT_BIT_SIZE defined in xz_lzma2.h
 (currently set to 30 = 128MiB)
- xz uses crc32 from libgcrypt-grub, internal crc implementation removed
- removed linux kernel related code from xz
- get rid of some not needed memcpy() calls
- xzio.mod stuff made according to Vladimir's comments
- simplified wraper (only 1 file - xz_wrap.h)
- number of other improvments and tons of bugfixes
- files licence changed from public domain to gpl3+

This patch also changes gnulib-wrap.h true/false definitions to make them truly
constants.

TODO and other questions:
- lack of file seek support, if files is compressed with small block size, it 
is 
  possible to implement pseudo-random access
- what is optimal i/o buffer size? like BUFSIZ macro in glibc
- default dictionary size should be chosen for files compression
- still need to do performance tests
- introduce some common layer for xzio/gzio etc ?

Comments are welcome

-- 
Szymon K. Janc
szy...@janc.net.pl // GG: 1383435


xz.diff.gz
Description: GNU Zip compressed data
___
Grub-devel mailing list
Grub-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/grub-devel