Re: Solving the compression dilemma when rsync-ing Debian versions

2001-01-10 Thread Andrew Lenharth
  No, this won't work with very many compression algorithms.  Most
  algorithms update their dictionaries/probability tables dynamically based
  on input.  There isn't just one static table that could be used for
  another file, since the table is automatically updated after every (or
  nearly every) transmitted or decoded symbol.  Further, the algorithms start
  with blank tables on both ends (compression and decompression); the
  algorithm doesn't transmit the tables (which can be quite large for
  higher-order statistical models).
  
 Well, the table is perfectly static when the compression ends. Even if
 the table isn't transmitted itself, its information is contained in the
 compressed file; otherwise the file couldn't be decompressed at all. 

But the tables you have at the end of the compression are NOT what you
want to use for the entire process.  The point of a dynamic table is to
allow the probabilities of different symbols to change as the
compression happens.  The tables in use by the end of the file may be very
different from those used early in the file, to the point where they are
useless for the early part of the file.

Without fear of sounding redundant, EACH symbol is encoded with a
different set of tables.  That is, the probability tables or dictionaries
are different for EACH and EVERY character of the file.  And as I said
before, the tables from later in the compression (which you propose
using all the time) will not even generate the same compressed file as the
one they are based on, nor will they be anywhere near optimal for the file.
That is, gzip --compress-like=foo.gz foo would generate an entirely
different foo.gz than gzip foo would.
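To make the per-symbol table updates concrete, here is a minimal LZW-style
encoder sketch in Python (an illustration of adaptive-dictionary compression
in general, not of gzip's actual DEFLATE format): the dictionary gains an
entry on nearly every symbol consumed, so the table that exists at
end-of-stream is not the one that produced the early output.

```python
def lzw_encode(data: bytes):
    # Start with a dictionary of all single bytes; both ends can
    # rebuild this, so it never needs to be transmitted.
    table = {bytes([i]): i for i in range(256)}
    out, cur = [], b""
    for byte in data:
        nxt = cur + bytes([byte])
        if nxt in table:
            cur = nxt
        else:
            out.append(table[cur])
            table[nxt] = len(table)  # dictionary grows mid-stream
            cur = bytes([byte])
    if cur:
        out.append(table[cur])
    return out, table

codes, final_table = lzw_encode(b"abababab")
```

Running it on b"abababab" yields the codes [97, 98, 256, 258, 98]; entries
like b"ab" and b"aba" were created partway through the stream, so the final
table could not have been used as a fixed up-front table for the same input.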

I really suggest you investigate LZW-based algorithms.  You would find
they do not behave as you think.  Only incredibly simple static
compression algorithms have the properties you desire.

Andrew Lenharth




Re: Solving the compression dilemma when rsync-ing Debian versions

2001-01-08 Thread Andrew Lenharth
 No, I want rsync not even to be mentioned. All I want is something
 similar to
 
 gzip --compress-like=old-foo foo
 
 where foo will be compressed as old-foo was, or as equivalently as
 possible. Gzip does not need to know anything about foo except how it
 was compressed. The switch --compress-like could be added to any
 compression algorithm (bzip?) as long as it's easy to retrieve the
 compression scheme. Besides, the following is completely legal but
 probably not very sensible

No, this won't work with very many compression algorithms.  Most
algorithms update their dictionaries/probability tables dynamically based
on input.  There isn't just one static table that could be used for
another file, since the table is automatically updated after every (or
nearly every) transmitted or decoded symbol.  Further, the algorithms start
with blank tables on both ends (compression and decompression); the
algorithm doesn't transmit the tables (which can be quite large for
higher-order statistical models).

I suggest you read about LZW and arithmetic encoding with higher-order
statistical models.  Try The Data Compression Book by Nelson (?) for a
fairly good overview of how these work.

What is better and easier is to ensure that the compression is
deterministic (gzip by default is not, bzip2 seems to be), so that rsync
can decompress, rsync, compress, and get the exact file back on the other
side.
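The round trip above can be sketched with Python's standard gzip module
(an illustration of the idea, not of rsync's actual implementation). What
normally makes gzip output non-deterministic is the timestamp embedded in
the gzip header; pinning it makes compression reproducible, so a receiver
can decompress, sync the raw data, and recompress to the identical file.

```python
import gzip

data = b"example payload " * 100

# gzip embeds an mtime in the header, so two runs over identical input
# normally differ; fixing mtime=0 makes the output reproducible.
a = gzip.compress(data, mtime=0)
b = gzip.compress(data, mtime=0)
assert a == b

# Round trip: decompress, then recompress with the same settings, and
# the receiver reconstructs the exact original compressed file.
assert gzip.compress(gzip.decompress(a), mtime=0) == a
```

The same property is what makes decompress/rsync/recompress viable: as long
as both sides use identical settings, the compressed bytes are recoverable.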

Andrew Lenharth




Re: Bochs / VGA-Bios license question / freebios anyone?

2000-08-14 Thread Andrew Lenharth
I originally ITPed bochs.  Unfortunately it would have to go in non-free:
the VGA-BIOS included is licensed only for use and distribution with
bochs.  It therefore cannot be separated into a separate package from
bochs (and if bochs is packaged, it should be removed from the source
archive).

Andrew Lenharth

Remember, never ask a geek why;
   just nod your head and back away slowly... 

--

Given infinite time, 100 monkeys could type out the complete works of
Shakespeare.
Win 98 source code? Eight monkeys, five minutes.

--

On 14 Aug 2000, Peter Makholm wrote:

 Adrian Bunk [EMAIL PROTECTED] writes:
 
  That sounds as if non-free is the right place. Or best would be to make
  two packages: a vga-bios package in non-free and a bochs package in contrib.
 
 But why would this be better than a complete package in contrib? Is
 bochs usable without the vga-bios, or is the vga-bios usable for
 other things than bochs?
 
 I was about to package bochs a couple of months ago but this was one
 of the things stopping me.
 
 
 
 And then it booted neither my QNX, Eros, nor Plan9 boot disks.
 
 -- 
 Peter
 
 
 




how about a real unstable?

2000-03-28 Thread Andrew Lenharth
I know others have expressed this, but a big reason we wind up with slower
release cycles is that we have a stable unstable, i.e. unstable is rather
stable.  Most of the other distributions start with the software that will
be released by the time they release and start working with it early.

What I really mean: unstable should (as soon as work on potato is
finished) have the new perl, xfree, apache, kernel, etc., even if they
are still release candidates.  The sooner we have everything working with
the new packages, the sooner we can release.  For example, waiting until
perl 5.6 is out to try to integrate it could take longer than starting the
integration process with a perl release candidate.

It is the unstable branch; let's take advantage of it and make it unstable
to start out with.  The sooner we can find problems and fix them, the
shorter our release cycles will be, and the more up-to-date our main
packages will be.

Andrew Lenharth
