Re: MD: ATRAC lossless compression techniques...

2001-01-12 Thread Stainless Steel Rat


* Anthony Lalande [EMAIL PROTECTED]  on Thu, 11 Jan 2001
| Does any lossless compression algorithm require read access to the entire
| data set before it begins compression?

No.  In fact none do.  Conventional compression algorithms operate on
fixed-size blocks of data.  Real-time compression of an audio stream is
easily possible with a bit of buffering.  The issue is not access to the
whole stream but compressing fast enough that the buffer is not overrun.
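
A minimal Python sketch of that buffering approach, with zlib standing in
for a generic lossless codec (the block size, the callables, and their names
are illustrative assumptions, not anything MD- or ATRAC-specific):

    import zlib

    BLOCK_SIZE = 64 * 1024  # compress the incoming stream in fixed-size blocks

    def compress_stream(read_block, write_block):
        # Pull fixed-size blocks from the input, push compressed bytes out.
        compressor = zlib.compressobj(6)
        while True:
            block = read_block(BLOCK_SIZE)
            if not block:
                break
            out = compressor.compress(block)
            if out:
                write_block(out)
        write_block(compressor.flush())  # emit whatever is still buffered

Hooking read_block up to a sound card buffer and write_block to storage is
conceptually all a real-time recorder needs, provided the loop keeps up with
the incoming data rate.
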
-- 
Rat [EMAIL PROTECTED]\ When not in use, Happy Fun Ball should be
Minion of Nathan - Nathan says Hi! \ returned to its special container and
PGP Key: at a key server near you!  \ kept under refrigeration.




Re: MD: ATRAC lossless compression techniques...

2001-01-12 Thread Anthony Lalande


> No.  In fact none do.  Conventional compression algorithms operate on
> fixed-size blocks of data.  Real-time compression of an audio stream is
> easily possible with a bit of buffering.  The issue is not access to the
> whole stream but compressing fast enough that the buffer is not overrun.

Well, in effect, the answer is yes. It does require a whole set of data
before compression, but to get around this, the data is split into blocks and
each block is compressed individually from a buffer.

I'm wondering whether you would get better compression by treating the whole
stream as one block and compressing that, or by compressing it as many smaller
blocks. I guess it all depends on the compression used.
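
A rough way to see the difference, using zlib as a stand-in codec (the 4K
chunk size and the synthetic sample data are arbitrary assumptions):

    import zlib

    data = b"some fairly repetitive sample audio-ish data " * 4000

    # whole stream compressed as one block
    one_block = len(zlib.compress(data, 9))

    # the same stream compressed as independent 4K blocks
    chunk = 4 * 1024
    many_blocks = sum(
        len(zlib.compress(data[i:i + chunk], 9))
        for i in range(0, len(data), chunk)
    )

    print("one block:         ", one_block, "bytes")
    print("independent blocks:", many_blocks, "bytes")

The single block usually wins, because matches can reach back across block
boundaries instead of starting from scratch every 4K.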




Re: MD: ATRAC lossless compression techniques...

2001-01-12 Thread Stainless Steel Rat


* Anthony Lalande [EMAIL PROTECTED]  on Fri, 12 Jan 2001
| I'm wondering whether you would get better compression by treating the whole
| stream as one block and compressing that, or by compressing it as many smaller
| blocks. I guess it all depends on the compression used.

All data compression programs work roughly like this: a block of data is
put into a buffer.  The algorithm scans the buffer for redundant data and
patterns and builds hash tables in additional buffers.  Scanning often
requires several passes over the buffer.  Many compression utilities use
several different algorithms to obtain maximum compression, and each
algorithm requires one or more hash buffers of its own.  And, of course,
each pass requires processing power.  A final encoding for the block is chosen
and written out.
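
A toy Python illustration of that scanning step (this is not any real codec's
match finder, just a sketch of the hash-table idea):

    def find_matches(buf, min_len=4):
        # Hash every min_len-byte pattern and note where it reappears.
        table = {}       # pattern -> last position seen
        matches = []     # (position, earlier position) pairs
        for i in range(len(buf) - min_len + 1):
            key = buf[i:i + min_len]
            if key in table:
                matches.append((i, table[key]))
            table[key] = i
        return matches

    print(len(find_matches(b"abcdefabcdefabcdef")))  # 9 repeats found

A real compressor would turn those (position, earlier position) pairs into
back-references and keep whichever encoding of the block comes out smallest.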

Increasing the block size increases the quantity of redundant data and
patterns in the stream, which usually means greater compression ratios.
Bigger blocks require more memory for the buffers and contain more complex
patterns, which means more processing power and time are required.
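
bzip2 happens to expose this trade-off directly: its compression level sets
the block size in 100K steps. A quick Python sketch (the test data, a random
200K segment repeated ten times, is an artificial assumption chosen so the
redundancy only becomes visible once a block spans more than one copy):

    import bz2, os, time

    segment = os.urandom(200 * 1024)
    data = segment * 10          # ~2MB with long-range repetition

    for level in (1, 9):         # 100K blocks vs 900K blocks
        start = time.perf_counter()
        size = len(bz2.compress(data, compresslevel=level))
        elapsed = time.perf_counter() - start
        print("level", level, ":", size, "bytes in", round(elapsed, 2), "s")

At level 1 each 100K block holds only a fragment of one random segment and
should compress poorly; at level 9 a block spans several copies, so the
repetition pays off, at the cost of more memory and time.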

Block sizes in the tens to hundreds of kilobytes are the norm for high-level
compression these days.  gzip matches against a 32K window; bzip2 works on
100-900K blocks, and boy is it slow even on a fast Pentium-III.  One minute
of linear PCM is ~10MB.  You would need a supercomputer the size of a
refrigerator to utilize a block size that large.
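
The arithmetic behind that figure, for the record (44.1kHz, 16-bit, stereo,
the same linear PCM a MiniDisc recorder digitizes):

    sample_rate = 44100        # samples per second, per channel
    bytes_per_sample = 2       # 16-bit
    channels = 2               # stereo
    seconds = 60

    total = sample_rate * bytes_per_sample * channels * seconds
    print(total, "bytes =", round(total / 2**20, 2), "MiB")   # ~10.09 MiB
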
-- 
Rat [EMAIL PROTECTED]\ Happy Fun Ball contains a liquid core,
Minion of Nathan - Nathan says Hi! \ which, if exposed due to rupture, should
PGP Key: at a key server near you!  \ not be touched, inhaled, or looked at.




Re: MD: ATRAC lossless compression techniques...

2001-01-12 Thread Anthony Lalande


> Block sizes in the tens to hundreds of kilobytes are the norm for high-level
> compression these days.  gzip matches against a 32K window; bzip2 works on
> 100-900K blocks, and boy is it slow even on a fast Pentium-III.  One minute
> of linear PCM is ~10MB.  You would need a supercomputer the size of a
> refrigerator to utilize a block size that large.

Well, I can go to sleep tonight feeling that much smarter. Large
pattern-matches and combination-matches are the promise of quantum
supercomputers, but that's another forum altogether.




MD: ATRAC lossless compression techniques...

2001-01-11 Thread Anthony Lalande


> Lossless compression is what people generally call programs like WinZip.
> When you compress a file with WinZip, it takes up less space and when you
> decompress it you get the exact same data that you compressed.  In other
> words, it doesn't lose any data in the compression and decompression
> process.

Right. Your analogy to WinZip gets me wonderin'...

Does any lossless compression algorithm require read access to the entire
data set before it begins compression? If you wanted to encode audio with
lossless compression, could you do it in real time, or would you need to
wait until the entire recording is complete and then compress afterwards?
Would the results be as good in real time as in a post-process?
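
For what the WinZip comparison means in code, here is a quick round-trip
check with zlib standing in for the codec (the fake audio buffer is just an
illustrative assumption):

    import zlib

    original = bytes(range(256)) * 1000      # pretend this is raw PCM
    packed = zlib.compress(original, 9)
    restored = zlib.decompress(packed)

    assert restored == original              # nothing was lost
    print(len(original), "->", len(packed), "bytes")

A lossy codec would fail that assert by design, which is the whole
distinction being drawn here.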

> This is, coincidentally, why audio MD equipment would be very poor for
> data storage.  I believe this has been discussed on-list a few times.

...and if I understand correctly, data would have to be encoded into some
sort of audio stream designed to be completely lossless when converted with
ATRAC, right? ...or maybe embed some sort of error-correction mechanism...?




Re: MD: ATRAC lossless compression techniques...

2001-01-11 Thread Dave Kimmel


> Does any lossless compression algorithm require read access to the entire
> data set before it begins compression? If you wanted to encode audio with
> lossless compression, could you do it in real time, or would you need to
> wait until the entire recording is complete and then compress afterwards?
> Would the results be as good in real time as in a post-process?

I'm sure that there are some algorithms that require access to the entire
data set before they can compress it.  I can't name any off the top of my
head, though.

Compress, gzip, and bzip2 (all from the Unix world, although Windows
implementations exist) are able to compress a stream of data in real time
(this is actually their normal way of handling things).  These all use
lossless compression algorithms with varying degrees of speed and
compression.
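
A sketch of that filter behaviour in Python, reading a stream from stdin and
writing gzip output to stdout without ever holding the whole input in memory
(the chunk size is an arbitrary assumption):

    import sys, gzip

    CHUNK = 64 * 1024

    with gzip.GzipFile(fileobj=sys.stdout.buffer, mode="wb") as out:
        while True:
            chunk = sys.stdin.buffer.read(CHUNK)
            if not chunk:
                break
            out.write(chunk)

Run as `python filter.py < audio.wav > audio.wav.gz`; the real gzip utility
is doing essentially the same thing in C.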

One advantage of a non-realtime compression algorithm is that it can be
much more complex than a realtime one.  If a realtime algorithm is too
complex it won't be able to keep up with the input data and will lose
data.  This isn't an issue with non-realtime compression: since it doesn't
have to keep up with an input, it can work through the data at its leisure.
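
A rough Python check of that keep-up requirement, comparing a codec's
throughput against the 44,100 x 2 x 2 = 176,400 bytes per second of
CD-quality PCM (the incompressible test buffer is a worst-case assumption):

    import zlib, os, time

    pcm_rate = 44100 * 2 * 2             # bytes of 16-bit stereo PCM per second
    data = os.urandom(4 * 1024 * 1024)   # worst case: incompressible input

    start = time.perf_counter()
    zlib.compress(data, 9)
    elapsed = time.perf_counter() - start

    print(round(len(data) / elapsed / pcm_rate, 1), "x real time at level 9")

Anything below 1x means the recorder's buffer eventually overruns and samples
get dropped, which is exactly the failure mode described above.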

As for quality of the results, that's dependent on the specific algorithms
in question.
 
> > This is, coincidentally, why audio MD equipment would be very poor for
> > data storage.  I believe this has been discussed on-list a few times.
>
> ...and if I understand correctly, data would have to be encoded into some
> sort of audio stream designed to be completely lossless when converted with
> ATRAC, right? ...or maybe embed some sort of error-correction mechanism...?

As I said, these have been hashed out on the list before.  The gist of it
all seemed to be that you would fit a very small amount of data onto a disc,
and it would take a long time (74 minutes for a full disc) to read and write
the data.

This isn't to say that it can't be done, just that it would be fairly
impractical when compared to using something designed for storing data.

-- Dave Kimmel
   [EMAIL PROTECTED] 
   ICQ: 5615049 


