Re: [Libreoffice] Windows installer size reduction effort - the compression theory

2011-03-07 Thread Fridrich Strba
Hi, all,

On Thu, 2011-03-03 at 09:41 +0000, Michael Meeks wrote:
>   Perhaps one thing you could do - would be to help dung out the
> instsetoo_native/util dmake file, and check the tooling - such that we
> can build several install sets in parallel - that would help
> particularly wrt. help-packs etc. where we have 100 or so installer
> files to package concurrently - and we have ~7 CPUs sitting idle while
> it happens. Any chance of that ? until we fix that, I'd prefer to have
> this only on master really - debugging the packaging phase is rather a
> nightmare as it is without an extra two hour wait each time to see if it
> succeeded :-)

I was finally able to set up a tinderbox doing a release configuration
(modulo binfilter, which I could not get to build for the time being).

So, the binfilter-free all-lang installer is about 180 MB, which means
a win of more than 60 MB compared to what we had before. So it
will be under 200 MB even when I manage to make binfilter build. I
think that at this point it becomes reasonable to stop packaging one
build of 104 languages and another of 56, and to distribute just the big
one, which will anyway be smaller than the "little" one we have now.

Once again, I am impressed by this one.

Cheers

F.


___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice


Re: [Libreoffice] Windows installer size reduction effort - the compression theory

2011-03-03 Thread Michael Meeks
Hi Kami,

On Thu, 2011-03-03 at 00:14 +0100, Kálmán „KAMI” Szalai wrote:
> Compression time (in the tinderbox) was:
> runtime: 1087.37 (minutes)
> now:
> runtime: 1232.32 (minutes)

So it takes another 2:30 or so extra ? sounds like quite a lot.

Perhaps one thing you could do - would be to help dung out the
instsetoo_native/util dmake file, and check the tooling - such that we
can build several install sets in parallel - that would help
particularly wrt. help-packs etc. where we have 100 or so installer
files to package concurrently - and we have ~7 CPUs sitting idle while
it happens. Any chance of that ? until we fix that, I'd prefer to have
this only on master really - debugging the packaging phase is rather a
nightmare as it is without an extra two hour wait each time to see if it
succeeded :-)
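
The parallel packaging idea above could look roughly like this. It is only
a sketch in Python, not the dmake/Perl tooling in instsetoo_native:
package_install_set is a hypothetical stand-in for whatever builds one
installer, and the language codes and worker count are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def package_install_set(lang):
    """Hypothetical stand-in: build one help-pack installer, return its name."""
    # A real build would launch the packaging tools here; this only
    # fabricates a name so the sketch stays self-contained.
    return "helppack_" + lang


def package_all(langs, workers=8):
    """Package every language, running up to `workers` jobs at once."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps the input order, so results line up with langs.
        return list(pool.map(package_install_set, langs))


if __name__ == "__main__":
    print(package_all(["de", "fr", "hu"]))
```

Threads are enough here on the assumption that each job mostly waits on an
external packaging process; CPU-bound Python work would want processes instead.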

> Is it okay for us? Or it is too big penalty? I asked Fridrich to make
> available the latest tb Windows build for download. I would like to
> check the installer size and performance.

Sure; it'd be great to validate the win.

Thanks,

Michael.

-- 
 michael.me...@novell.com  <><, Pseudo Engineer, itinerant idiot




Re: [Libreoffice] Windows installer size reduction effort - the compression theory

2011-03-02 Thread Kálmán „KAMI” Szalai
Compression time (in the tinderbox) was:
runtime: 1087.37 (minutes)
now:
runtime: 1232.32 (minutes)

Is it okay for us? Or it is too big penalty? I asked Fridrich to make
available the latest tb Windows build for download. I would like to
check the installer size and performance.

KAMI
-- 

Best regards,

Kálmán „KAMI” Szalai | 神 | kami911 [at] gmail [dot] com


My favorite projects:

OxygenOffice Professional  - office suite - for everybody 
| Magyarul  - In Hungarian

Blog  | Support  

Follow me , if you can 





Re: [Libreoffice] Windows installer size reduction effort - the compression theory

2011-03-01 Thread Kálmán „KAMI” Szalai
Hi Michael,

2011-02-28 13:16 keltezéssel, Michael Meeks írta:
> Hi Kami,
>
>   Wow - I'm so sorry, I missed your sexy mail for a week; that sucks -
> over-much merging on another branch :-) Tor has been on FTO, and
> Fridrich with his head down releasing 3.3.1 (and the Novell LibreOffice
> product) - which in part explains the lack of response to you (and
> Steven) - which sucks - sorry guys.
>
> On Tue, 2011-02-22 at 17:32 +0100, Kálmán „KAMI” Szalai wrote:
>> I ran few test to figure out what is the best method for installset
>   These look great, nice spreadsheet.
Thanks, I just wanted to follow your helpful spreadsheet :o).
>> So I went to other direction, whatif I increase the efficiency of LZM
>> compression of makecab. I found that we can use .Set CompressionMemory=
>> 21 setting. This setting produces 83,91% of original installer size and
>   Oh - nice :-) that solves the install-space problem as well.
\o/
>> In the gray section I tried a special way when I uncompressed every
>> zip container (ODF, JAR, ZIP, etc) in the installset and every file
>> contains only stored data without compression. In this way I was able
>> to gain more 15 MB, but this require zip recompressing at the end of
>> installation process that may make it to complex and time consuming.
>> Please check the attached document, and if you want to go with it
>> apply the attached patches.
>   Oh ! that sounds fun :-) 15Mb for that. Actually on master - we could
> use flat ODF files and get the same result (for some on-disk size
> growth) without having to re-compress, and they're dead fast in master.
>
>   So - personally, I find it amazing that ZIP is better than LZMA - ever,
> it is such an older compression algorithm, and this flies in the face of
> everything I've seen in practise with it [ eg. we use LZMA for our RPMs
> in openSUSE these days AFAIR, better than bzip2, which in turn was
> better than .tgz (essentially zip) ].
I think the story is a bit different. Recompressing already
compressed data is simply a waste of time and horsepower. It is always
better to compress uncompressed data. If you compare zip and LZMA on the
whole uncompressed installer, I am sure LZMA will provide a much better
compression ratio.
>   It'd be great to commit the first patch but can we instead hack:
>
>   solenv/bin/modules/installer/globals.pm to patch:
>
> $cabfilecompressionlevel = 7;
>
>   rather than hard-coding that and leaving the variable there.
>
>   The second - can you really confirm that LZMA is worse than zip ?
As I described above, I think recompressing already-compressed files
always carries a penalty in terms of compression ratio.
>   Anyhow - great work :-)
I will provide a patch for it, thanks.
>   Michael.
>
KAMI





Re: [Libreoffice] Windows installer size reduction effort - the compression theory

2011-03-01 Thread Michael Meeks
Hi Wols,

On Tue, 2011-03-01 at 00:21 +0000, Wols Lists wrote:
> I checked on wikipedia after I posted and Huffman is actually SIXTY
> years old! Set as a class project, and published in 1952.

I tried to gently correct your thesis that Huffman is -the- optimal
compression algorithm for all things; it may be for a specific
symbol-by-symbol case, but in real-world use cases it can clearly be
improved on. In my (contrived) case the "save as a perl script" gives a
massive compression advantage, QED.
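
A small sketch of why "optimal" only holds symbol by symbol: a Huffman code
spends at least one bit on every symbol, so a null-heavy input stays far
above the order-0 entropy that an arithmetic coder can approach. The code
below is an illustration in Python with made-up test data, not anything
zip or makecab actually runs.

```python
import heapq
import math
from collections import Counter


def huffman_bits(data):
    """Total bits needed to code `data` with an optimal per-symbol Huffman code."""
    freq = Counter(data)
    if len(freq) == 1:
        # A lone symbol still costs one bit per occurrence in a prefix code.
        return len(data)
    # Heap entries: (weight, tie_breaker, symbols_in_subtree).
    heap = [(count, i, [sym]) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    depth = dict.fromkeys(freq, 0)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        for sym in s1 + s2:
            depth[sym] += 1  # each merge pushes these symbols one level down
        heapq.heappush(heap, (w1 + w2, tie, s1 + s2))
        tie += 1
    return sum(freq[sym] * depth[sym] for sym in freq)


def entropy_bits(data):
    """Order-0 Shannon bound in bits, which arithmetic coding can approach."""
    n = len(data)
    return -sum(c * math.log2(c / n) for c in Counter(data).values())


if __name__ == "__main__":
    mostly_nulls = b"\x00" * 4095 + b"\x01"
    print(huffman_bits(mostly_nulls), round(entropy_bits(mostly_nulls), 1))
```

On the 4096-byte sample the Huffman code needs one bit per byte, while the
entropy bound is only a handful of bits, which is the gap a modelling stage
(Lempel-Ziv, RLE, or a smarter model) is there to close.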

> So zip should be tricky to beat for compression if it actually uses
> Huffman, although I would expect it to take ages.

Again - wikipedia states so, my experience says so - why don't you
think it does ? :-)

ATB,

Michael.

-- 
 michael.me...@novell.com  <><, Pseudo Engineer, itinerant idiot



Re: [Libreoffice] Windows installer size reduction effort - the compression theory

2011-02-28 Thread Wols Lists
On 28/02/11 21:51, Michael Meeks wrote:
>> Why the surprise? How long has Huffman been around - 30 years? 40? And
>> > you CANNOT do better than Huffman (there are other equally as good
>> > algorithms, though).

>   Ho hum; glad to know Huffman compression is optimal ;-) (personally I'd
> go for arithmetic compression with a good "guess a libreoffice" model
> alogorithm to get it down to a handful of bits ;-). Notice - that ZIP is
> in fact two layered compression algorithm: Lempel Ziv, and then Huffman,
> it is not clear that this makes it obviously non-optimal but it is
> certainly somewhat weaker in that regard.

Interesting ...

I checked on wikipedia after I posted and Huffman is actually SIXTY
years old! Set as a class project, and published in 1952.

So zip should be tricky to beat for compression if it actually uses
Huffman, although I would expect it to take ages.

Your figures imply either (a) zip doesn't do Huffman, or (b) bzip2 and
lzma use a much larger input block size.

Was the wink to say you knew it was optimal already? I thought that was
common knowledge :-) although reading wikipedia it reminded me that
there is a pathological case - lots of identical elements. Given that I
gather Windows files contain large sections of nulls, Huffman won't like
that! That's probably the main time you *do* want double-compression -
some form of RLE will suppress any pathological input into Huffman.

The main reason I suggested adding times to KAMI's table was that it
makes explicit the time/compression tradeoff.

Cheers,
Wol


Re: [Libreoffice] Windows installer size reduction effort - the compression theory

2011-02-28 Thread Michael Meeks
Hi Wols,

On Mon, 2011-02-28 at 18:04 +0000, Wols Lists wrote:
> > So - personally, I find it amazing that ZIP is better than LZMA - ever,
> > it is such an older compression algorithm, and this flies in the face of
> > everything I've seen in practise with it [ eg. we use LZMA for our RPMs
> > in openSUSE these days AFAIR, better than bzip2, which in turn was
> > better than .tgz (essentially zip) ].
>  
> Why the surprise? How long has Huffman been around - 30 years? 40? And
> you CANNOT do better than Huffman (there are other equally as good
> algorithms, though).

Ho hum; glad to know Huffman compression is optimal ;-) (personally I'd
go for arithmetic compression with a good "guess a libreoffice" model
algorithm to get it down to a handful of bits ;-). Notice that ZIP is
in fact a two-layered compression algorithm: Lempel-Ziv, and then Huffman;
it is not clear that this makes it obviously non-optimal, but it is
certainly somewhat weaker in that regard.

#!/usr/bin/perl -w
for (my $i = 0; $i < 10000; $i++) { print "$i\n"; }

type    size
plain   48k
zip     23k
bzip2   11k
lzma    3k

That is not a very representative input (of course ;-), but perhaps not
so far away from the rampant duplication we face. I would (personally)
expect to see lzma do better than plain deflate. Of course, you can
easily imagine a compression algorithm, that turns the file into just
those two lines of perl above ;-)
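
For anyone wanting to repeat the comparison without the command-line tools,
here is a rough Python equivalent. It assumes the script prints the integers
0 to 9999, one per line, which matches the 48k plain size. zlib's deflate
stands in for zip, so the byte counts will not match the table exactly.

```python
import bz2
import lzma
import zlib


def counting_lines(n=10000):
    """The same output as the perl loop: "0\\n1\\n2\\n..." up to n - 1."""
    return "".join("%d\n" % i for i in range(n)).encode("ascii")


def compressed_sizes(data):
    """Size in bytes of `data` under each stdlib compressor."""
    return {
        "plain": len(data),
        "zlib": len(zlib.compress(data, 9)),   # deflate, as used by zip
        "bzip2": len(bz2.compress(data, 9)),
        "lzma": len(lzma.compress(data)),
    }


if __name__ == "__main__":
    for name, size in compressed_sizes(counting_lines()).items():
        print("%-6s %6d bytes" % (name, size))
```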

> Having tried to back up a hard disk over ethernet (dd if=/dev/sda
> of=//samba/archive.hdi), it was direly slow over a 10Mb card. So I tried
> to pipe it through bzip2 to speed up the transfer - it was *even*
> *slower* !

bzip2 is not fast; that is certainly so, OTOH it was far better space
wise than the deflate it replaced; this is why much source code is still
packaged as .bz2 - if you want to bend your mind, read up on the
Burrows-Wheeler transform [ FWIW ;-].

>  Why do you think LZMA is better than zip - because it's faster? 

More efficient. Slowness of compression we can cope with easily, as
long as decompression is fast: and for most algorithms this balance is
easyish to achieve.

> (I think we need to add compression time to KAMI's nice table :-)

As long as it is within five or so minutes of CPU time on a fast
machine, and/or preferably is multi-thread-able [ though our choice of
tools and platforms strongly limits what we can use at the cab level
anyway ] - compression time is fairly irrelevant.

ATB,

Michael.

-- 
 michael.me...@novell.com  <><, Pseudo Engineer, itinerant idiot




Re: [Libreoffice] Windows installer size reduction effort - the compression theory

2011-02-28 Thread Wols Lists
On 28/02/11 12:16, Michael Meeks wrote:
> Hi Kami,
> 
>   Wow - I'm so sorry, I missed your sexy mail for a week; that sucks -
> over-much merging on another branch :-) Tor has been on FTO, and
> Fridrich with his head down releasing 3.3.1 (and the Novell LibreOffice
> product) - which in part explains the lack of response to you (and
> Steven) - which sucks - sorry guys.

Finally found a few minutes to try and catch up - so jumping in late
myself ...
> 
> On Tue, 2011-02-22 at 17:32 +0100, Kálmán „KAMI” Szalai wrote:
>> I ran few test to figure out what is the best method for installset
> 
>   These look great, nice spreadsheet.
> 
>> So I went to other direction, whatif I increase the efficiency of LZM
>> compression of makecab. I found that we can use .Set CompressionMemory=
>> 21 setting. This setting produces 83,91% of original installer size and
> 
>   Oh - nice :-) that solves the install-space problem as well.

If I understand correctly, imho this is the best approach too.

Note that it's a bad idea to try to compress an already-compressed file.
Depending on the relative efficiency of the two algorithms, it's very
easy for the second compression to actually *increase* the file size. So
compressing the cabs, and just packing them into the download with no
compression makes most theoretical sense.

Plus, in return for minimal or negative compression, the second
compression will also take a LOT of time.
> 
>> In the gray section I tried a special way when I uncompressed every
>> zip container (ODF, JAR, ZIP, etc) in the installset and every file
>> contains only stored data without compression. In this way I was able
>> to gain more 15 MB, but this require zip recompressing at the end of
>> installation process that may make it to complex and time consuming.
>> Please check the attached document, and if you want to go with it
>> apply the attached patches.
> 
>   Oh ! that sounds fun :-) 15Mb for that. Actually on master - we could
> use flat ODF files and get the same result (for some on-disk size
> growth) without having to re-compress, and they're dead fast in master.
> 
>   So - personally, I find it amazing that ZIP is better than LZMA - ever,
> it is such an older compression algorithm, and this flies in the face of
> everything I've seen in practise with it [ eg. we use LZMA for our RPMs
> in openSUSE these days AFAIR, better than bzip2, which in turn was
> better than .tgz (essentially zip) ].
> 
Why the surprise? How long has Huffman been around - 30 years? 40? And
you CANNOT do better than Huffman (there are other equally as good
algorithms, though). The holy grail is an algorithm that is as efficient
as Huffman, very quick to decompress, and fairly quick to compress.

Having tried to back up a hard disk over ethernet (dd if=/dev/sda
of=//samba/archive.hdi), it was direly slow over a 10Mb card. So I tried
to pipe it through bzip2 to speed up the transfer - it was *even*
*slower*! Why do you think LZMA is better than zip - because it's
faster? Or because it's more efficient? I don't know much about the
details of either, but I'd be surprised if zip *wasn't* capable of very
good compression. It's just that as you crank up the compression
efficiency, you also crank up the time taken to do the compression -
witness bz2's inability to flood my 10Mbit network card.

(I think we need to add compression time to KAMI's nice table :-)

Cheers,
Wol


Re: [Libreoffice] Windows installer size reduction effort - the compression theory

2011-02-28 Thread Michael Meeks
Hi Kami,

Wow - I'm so sorry, I missed your sexy mail for a week; that sucks -
over-much merging on another branch :-) Tor has been on FTO, and
Fridrich with his head down releasing 3.3.1 (and the Novell LibreOffice
product) - which in part explains the lack of response to you (and
Steven) - which sucks - sorry guys.

On Tue, 2011-02-22 at 17:32 +0100, Kálmán „KAMI” Szalai wrote:
> I ran few test to figure out what is the best method for installset

These look great, nice spreadsheet.

> So I went to other direction, whatif I increase the efficiency of LZM
> compression of makecab. I found that we can use .Set CompressionMemory=
> 21 setting. This setting produces 83,91% of original installer size and

Oh - nice :-) that solves the install-space problem as well.

> In the gray section I tried a special way when I uncompressed every
> zip container (ODF, JAR, ZIP, etc) in the installset and every file
> contains only stored data without compression. In this way I was able
> to gain more 15 MB, but this require zip recompressing at the end of
> installation process that may make it to complex and time consuming.
> Please check the attached document, and if you want to go with it
> apply the attached patches.

Oh ! that sounds fun :-) 15Mb for that. Actually on master - we could
use flat ODF files and get the same result (for some on-disk size
growth) without having to re-compress, and they're dead fast in master.

So - personally, I find it amazing that ZIP is better than LZMA - ever,
it is such an older compression algorithm, and this flies in the face of
everything I've seen in practise with it [ eg. we use LZMA for our RPMs
in openSUSE these days AFAIR, better than bzip2, which in turn was
better than .tgz (essentially zip) ].

It'd be great to commit the first patch but can we instead hack:

solenv/bin/modules/installer/globals.pm to patch:

$cabfilecompressionlevel = 7;

rather than hard-coding that and leaving the variable there.

The second - can you really confirm that LZMA is worse than zip ?

Anyhow - great work :-)

Michael.

-- 
 michael.me...@novell.com  <><, Pseudo Engineer, itinerant idiot




Re: [Libreoffice] Windows installer size reduction effort - the compression theory

2011-02-22 Thread Kálmán „KAMI” Szalai
Hi Michael and Friends of LibreOffice,

I ran a few tests to figure out the best method for compressing the
installset (see the attached document and diagram). I cheated a bit,
because I only used the tools (7z, zip, makecab) and did not modify the
build environment. So I downloaded LibO_3.3.1rc2_Win_x86_install_multi.exe
and extracted it. I found that an uncompressed cab file repacked with LZMA
gives the best result, but it is unbalanced because the preinstalled
installation kit is then more than 700 MB.

So I went in another direction: what if I increase the efficiency of the
LZX compression of makecab? I found that we can use the
.Set CompressionMemory=21 setting. This setting produces 83.91% of the
original installer size, and if we combine it with a simple zip compression
the download size drops to 83.54%. This scenario is shown in blue, and the
gain was 36 MB! I think the compression time is not much higher and the
installation time does not increase critically. The preinstallation time
decreased, so I think this is an interesting opportunity to reduce the
Windows installer's size. What is your opinion?

Of course I tried several other methods, but this combination of
compression algorithms gives the best overall result. In the gray section I
tried a special approach where I uncompressed every zip container (ODF, JAR,
ZIP, etc.) in the installset, so that every file contains only stored data
without compression. In this way I was able to gain a further 15 MB, but
this requires recompressing the zip containers at the end of the
installation process, which may make it too complex and time consuming.
Please check the attached document, and if you want to go with it, apply
the attached patches.

All the best,
KAMI

2011-02-21 10:42 keltezéssel, Michael Meeks írta:
> Hi Kalman,
>
> On Sun, 2011-02-20 at 15:15 +0100, Kálmán „KAMI” Szalai wrote:
>> I am sure we can shrink the installer more with better compression
>   Hah - so, of course Fridrich and Tor have looked into this - and
> naturally you are right :-) there is a lot we can do. Clearly
> compressing things badly first, and then well later (eg. zip, then lzma)
> doesn't give the best results. That is particularly so when there are
> lots of similarities in the eg. un-compressed ODF files we have
> internally for eg. templates but minor differences that will have
> hard-to-compress knock-on-effects in the compressed stream.
>
>   The ideal would be to use only one level of compression - using the
> best algorithm (LZMA) ie. NSIS, and nothing in the .cab file (which is
> limited to various lame algorithms, and perhaps per-contained-file
> compression, rather than per-whole-cab-file).
>
>   Unfortunately, with the currently level of eg. template duplication,
> this would give us a vast .cab file that would chew lots of space on the
> target machine - though it might shrink our download nicely :-)
>
>> What is your opinion? Can somebody test it on a real Windows build
>> system? 
>   The current balance is based on testing; quite possibly there is a
> better set of compression options - but we need to work out what it is.
> Our current settings are optimised for a multi-lang install as we ship
> it. To change that I'd like to see a table:
>
> New options   Download/kb Install/kb
>
>   with the relevant sizes of both of these guys. That is just a matter of
> running lots of long builds and comparing the output I guess, and I
> suspect we will get something like a trade-off between these two values.
>
>   Clearly, the 'real' solution is engineering to stop us having so much
> pointless duplication ;-)
>
>   HTH,
>
>   Michael.
>


LibO - Compression tests.ods
Description: application/vnd.oasis.opendocument.spreadsheet
From 068adb8f3900f8612092b6cbfd8c2ea8c2403988 Mon Sep 17 00:00:00 2001
From: Kalman Szalai - KAMI 
Date: Tue, 22 Feb 2011 17:28:37 +0100
Subject: [PATCH 92/92] Windows installer compression optimization

---
 solenv/bin/modules/installer/windows/msiglobal.pm |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/solenv/bin/modules/installer/windows/msiglobal.pm b/solenv/bin/modules/installer/windows/msiglobal.pm
index 735c6e5..19ce8f3 100644
--- a/solenv/bin/modules/installer/windows/msiglobal.pm
+++ b/solenv/bin/modules/installer/windows/msiglobal.pm
@@ -64,7 +64,7 @@ sub write_ddf_file_header
 push(@{$ddffileref} ,$oneline);
 $oneline = ".Set Compress=ON\n";
 push(@{$ddffileref} ,$oneline);
-$oneline = ".Set CompressionLevel=$installer::globals::cabfilecompressionlevel\n";
+$oneline = ".Set CompressionMemory= 21\n";
 push(@{$ddffileref} ,$oneline);
 $oneline = ".Set Cabinet=ON\n";
 push(@{$ddffileref} ,$oneline);
-- 
1.7.1

From 5a7dd93332ac29e4c37dead92d7fd5bc5e31adbc Mon Sep 17 00:00:00 2001
From: Kalman Szalai - KAMI 
Date: Tue, 22 Feb 2011 17:21:53 +0100
Subject: [PATCH] Windows installer compression optimization

---
 .../source/win32/nsis/downloadtemplate.nsi |5 +++--
 1 files changed, 3 

Re: [Libreoffice] Windows installer size reduction effort - the compression theory

2011-02-21 Thread Michael Meeks
Hi Kalman,

On Sun, 2011-02-20 at 15:15 +0100, Kálmán „KAMI” Szalai wrote:
> I am sure we can shrink the installer more with better compression

Hah - so, of course Fridrich and Tor have looked into this - and
naturally you are right :-) there is a lot we can do. Clearly
compressing things badly first, and then well later (eg. zip, then lzma)
doesn't give the best results. That is particularly so when there are
lots of similarities in the eg. un-compressed ODF files we have
internally for eg. templates but minor differences that will have
hard-to-compress knock-on-effects in the compressed stream.
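
The knock-on effect can be reproduced with a toy experiment. This sketch in
Python builds a set of near-identical "templates" (the vocabulary, counts
and edits are all invented for illustration), then compares LZMA over the
raw files with LZMA over files that were each deflated first, as the
members of an ODF or zip container would be.

```python
import lzma
import random
import zlib

WORDS = ["style", "font", "margin", "page", "table", "text",
         "color", "frame", "header", "footer", "list", "cell"]


def make_templates(count=40, words_per_doc=4000, seed=1):
    """Near-duplicate documents: one shared body plus a few edits each."""
    rng = random.Random(seed)
    base = [rng.choice(WORDS) for _ in range(words_per_doc)]
    docs = []
    for _ in range(count):
        doc = list(base)
        for _ in range(10):  # the "minor differences" between templates
            doc[rng.randrange(len(doc))] = "variant%d" % rng.randrange(1000)
        docs.append(" ".join(doc).encode("ascii"))
    return docs


def one_stage(docs):
    """LZMA sees the raw data and can exploit the shared body directly."""
    return len(lzma.compress(b"".join(docs)))


def two_stage(docs):
    """Each file is deflated first, then the result is fed to LZMA."""
    deflated = b"".join(zlib.compress(doc) for doc in docs)
    return len(lzma.compress(deflated))


if __name__ == "__main__":
    docs = make_templates()
    print("lzma only:      %d bytes" % one_stage(docs))
    print("zlib then lzma: %d bytes" % two_stage(docs))
```

The deflated streams diverge after the first edit in each file, so the
second compressor no longer sees the duplication that is obvious in the
raw data.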

The ideal would be to use only one level of compression - using the
best algorithm (LZMA) ie. NSIS, and nothing in the .cab file (which is
limited to various lame algorithms, and perhaps per-contained-file
compression, rather than per-whole-cab-file).

Unfortunately, with the currently level of eg. template duplication,
this would give us a vast .cab file that would chew lots of space on the
target machine - though it might shrink our download nicely :-)

> What is your opinion? Can somebody test it on a real Windows build
> system? 

The current balance is based on testing; quite possibly there is a
better set of compression options - but we need to work out what it is.
Our current settings are optimised for a multi-lang install as we ship
it. To change that I'd like to see a table:

New options Download/kb Install/kb

with the relevant sizes of both of these guys. That is just a matter of
running lots of long builds and comparing the output I guess, and I
suspect we will get something like a trade-off between these two values.

Clearly, the 'real' solution is engineering to stop us having so much
pointless duplication ;-)

HTH,

Michael.

-- 
 michael.me...@novell.com  <><, Pseudo Engineer, itinerant idiot




[Libreoffice] Windows installer size reduction effort - the compression theory

2011-02-20 Thread Kálmán „KAMI” Szalai
Hi,

After my last mail I created the three required patches for an already
built LibO. They are diffed against 3-3. Please test them if possible.

Please check the attached patches.

--- solenv/bin/modules/installer/windows/msiglobal.pm
+++ solenv/bin/modules/installer/windows/msiglobal.pm
@@ -64,7 +64,7 @@ sub write_ddf_file_header
 push(@{$ddffileref} ,$oneline);
 $oneline = ".Set Compress=ON\n";
 push(@{$ddffileref} ,$oneline);
-$oneline = ".Set CompressionLevel=$installer::globals::cabfilecompressionlevel\n";
+$oneline = ".Set CompressionMemory= 21\n";
 push(@{$ddffileref} ,$oneline);
 $oneline = ".Set Cabinet=ON\n";
 push(@{$ddffileref} ,$oneline);
--- setup_native/source/win32/nsis/downloadtemplate.nsi
+++ setup_native/source/win32/nsis/downloadtemplate.nsi
@@ -3,7 +3,8 @@
 !define PRODUCT_PUBLISHER "PUBLISHERPLACEHOLDER"
 !define PRODUCT_WEB_SITE "WEBSITEPLACEHOLDER"
 
-SetCompressor lzma
+SetCompress Off
+SetDatablockOptimize On
 ; SetCompressor zlib
 ; Helpful for debugging, disable for products
 ; RequestExecutionLevel user

--- solenv/bin/modules/installer/windows/msiglobal.pm
+++ solenv/bin/modules/installer/windows/msiglobal.pm
@@ -64,7 +64,7 @@ sub write_ddf_file_header
 push(@{$ddffileref} ,$oneline);
 $oneline = ".Set Compress=ON\n";
 push(@{$ddffileref} ,$oneline);
-$oneline = ".Set CompressionLevel=$installer::globals::cabfilecompressionlevel\n";
+$oneline = ".Set CompressionMemory= 21\n";
 push(@{$ddffileref} ,$oneline);
 $oneline = ".Set Cabinet=ON\n";
 push(@{$ddffileref} ,$oneline);
--- setup_native/source/win32/nsis/downloadtemplate.nsi
+++ setup_native/source/win32/nsis/downloadtemplate.nsi
@@ -3,7 +3,9 @@
 !define PRODUCT_PUBLISHER "PUBLISHERPLACEHOLDER"
 !define PRODUCT_WEB_SITE "WEBSITEPLACEHOLDER"
 
-SetCompressor lzma
+SetCompress Force
+SetCompressor zlib
+SetDatablockOptimize On
 ; SetCompressor zlib
 ; Helpful for debugging, disable for products
 ; RequestExecutionLevel user
--- solenv/bin/modules/installer/windows/msiglobal.pm
+++ solenv/bin/modules/installer/windows/msiglobal.pm
@@ -60,11 +60,7 @@ sub write_ddf_file_header
 push(@{$ddffileref} ,$oneline);
 $oneline = ".Set MaxDiskSize=2147483648\n";# This allows the .cab file to get a size of 2 GB.
 push(@{$ddffileref} ,$oneline);
-$oneline = ".Set CompressionType=LZX\n";
-push(@{$ddffileref} ,$oneline);
-$oneline = ".Set Compress=ON\n";
-push(@{$ddffileref} ,$oneline);
-$oneline = ".Set CompressionLevel=$installer::globals::cabfilecompressionlevel\n";
+$oneline = ".Set Compress=OFF\n";
 push(@{$ddffileref} ,$oneline);
 $oneline = ".Set Cabinet=ON\n";
 push(@{$ddffileref} ,$oneline);
--- setup_native/source/win32/nsis/downloadtemplate.nsi
+++ setup_native/source/win32/nsis/downloadtemplate.nsi
@@ -3,7 +3,10 @@
 !define PRODUCT_PUBLISHER "PUBLISHERPLACEHOLDER"
 !define PRODUCT_WEB_SITE "WEBSITEPLACEHOLDER"
 
-SetCompressor lzma
+SetCompress Auto
+SetCompressor /SOLID lzma
+SetCompressorDictSize 64
+SetDatablockOptimize On
 ; SetCompressor zlib
 ; Helpful for debugging, disable for products
 ; RequestExecutionLevel user




[Libreoffice] Windows installer size reduction effort - the compression theory

2011-02-20 Thread Kálmán „KAMI” Szalai
Hi,

I am sure we can shrink the installer more with better compression
settings. Unfortunately I am not able to test this theory now because I
have problems with my Windows-based build system. So any participation
related to this topic is welcome.

The current situation:
We are using double compression for creating LibreOffice install kit for
Windows:

1) Compressed CAB file contains the installable files
2) The installation system is packed into the NSIS based preinstaller.

According to
solenv/bin/modules/installer/globals.pm
solenv/bin/modules/installer/windows/msiglobal.pm
setup_native/source/win32/nsis/downloadtemplate.nsi

For cab file compression we are using:
globals.pm
cabfilecompressionlevel = 7

and

msiglobal.pm
.Set CompressionType=LZX
.Set Compress=ON
.Set CompressionLevel=cabfilecompressionlevel=7
options.

For the NSIS-based preinstaller we are using:
SetCompressor lzma
option.

So currently we use high compression for cab file generation, which
produces a small file but takes more time. Then we recompress it with lzma,
which may not be able to compress it much further but takes a long time.
However, we use NSIS in SetCompress auto mode, which may run a compression
test and, if the result is not good enough, drop the compressed data and
use the original cab file.

My idea is to use only one tight compressor, with the other archiver
used in store or zip mode. So there are two ways to rethink the
compression practice:

1) Maximize the compression of CAB files and minimize NSIS compression:
This produces a smaller cab file, but if the installer requests files
in a different order from the one they were compressed in, we may pay a
penalty in decompression time.

For this we can use CAB compression settings:
.Set CompressionType=LZX
.Set Compress=ON
.Set CompressionMemory= 21

(We can probably drop .Set CompressionLevel=cabfilecompressionlevel=7;
it may only be relevant for MSZIP compression, or not at all.)

For NSIS we use simple store or small compression:

a)
SetCompress Off
SetDatablockOptimize On

b)
SetCompress Force
SetCompressor zlib
SetDatablockOptimize On

OR

2) The CAB file only stores the installer's files and NSIS compression is
set to high:
This may produce the smallest installer, so I would try this
first. NSIS compression takes time, but decompression is fast and
so is the installation. The preinstallation may use memory equivalent
to the dictionary size, so we should not set a dictionary size
larger than 32-64 MB. The only negative effect is that the preinstalled
(unpacked) installer uses the same disk space as the fully installed package
(for installation we need at least twice the space of the full installation).

For this we can use CAB store setting:
.Set Compress=OFF

For NSIS we use solid LZMA compression:

SetCompress Auto
SetCompressor /SOLID lzma
SetCompressorDictSize 64
SetDatablockOptimize On
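
To keep the two options side by side, here is a small sketch that emits the
settings for each. It is plain Python for illustration only: the function
names and the "cab"/"nsis" strategy labels are hypothetical, and only the
directives already quoted in this mail are used. The real values live in
msiglobal.pm and downloadtemplate.nsi.

```python
def ddf_compression_lines(strategy, compression_memory=21):
    """makecab .ddf directives for one strategy.

    "cab":  compress inside the cabinet (option 1).
    "nsis": only store files in the cabinet (option 2).
    """
    if strategy == "cab":
        return [
            ".Set CompressionType=LZX",
            ".Set Compress=ON",
            ".Set CompressionMemory=%d" % compression_memory,
        ]
    if strategy == "nsis":
        return [".Set Compress=OFF"]
    raise ValueError("unknown strategy: %r" % (strategy,))


def nsis_compression_lines(strategy, dict_size_mb=64):
    """NSIS script lines matching the same strategy."""
    if strategy == "cab":
        # Option 1a: the cab is already compressed, so NSIS only stores it.
        return ["SetCompress Off", "SetDatablockOptimize On"]
    if strategy == "nsis":
        # Option 2: NSIS does all the compression with solid LZMA.
        return [
            "SetCompress Auto",
            "SetCompressor /SOLID lzma",
            "SetCompressorDictSize %d" % dict_size_mb,
            "SetDatablockOptimize On",
        ]
    raise ValueError("unknown strategy: %r" % (strategy,))


if __name__ == "__main__":
    for strategy in ("cab", "nsis"):
        print("\n".join(ddf_compression_lines(strategy)))
        print("\n".join(nsis_compression_lines(strategy)))
```

Keeping the memory and dictionary sizes as parameters rather than
hard-coding them makes it easier to run the comparison builds with
different values.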


References:

http://msdn.microsoft.com/en-us/library/bb417343.aspx

http://www.symantec.com/connect/articles/advanced-nsis-scripting-part-1

"The SetCompressorDictSize accepts one setting, which is the number of
megabytes to use for the dictionary size. The greater this number, the
more memory is required to execute the installation. The memory
requirement to run the installer will be the dictionary size plus about
4 megabytes or so. The default dictionary size is 8 megabytes. You can
get a slightly better compression ratio by changing this to 16, 32, or
64. You can set this up to 128 or possibly higher, but you won't notice
much difference when you use a dictionary size greater than 64.

With that in mind, I suggest that you use "SetCompressorDictSize 64",
but you can experiment with this value to see what works best for your
particular installer."

http://nsis.sourceforge.net/Docs/Chapter4.html#4.8.2.3

What is your opinion? Can somebody test it on a real Windows build system?
