In addition:

-          "mapred.output.compression.type" is now replaced with
"mapred.map.output.compression.type"

-          the old implementation of the Java interface
setMapOutputCompressorClass() used to turn map output compression on
automatically as a side effect; the 0.15 one doesn't. It looks like one
has to call setCompressMapOutput() separately.
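
One way to guard against the second change is a small check on the job settings before submission. The sketch below is only an illustration, not Hadoop code: a plain Map<String, String> stands in for the job configuration, the class name MapCompressionCheck is made up, and the two property names are assumptions that should be verified against the hadoop-default.xml of the release in use.

```java
import java.util.HashMap;
import java.util.Map;

public class MapCompressionCheck {
    // Assumed property names; verify against your release's hadoop-default.xml.
    public static final String ENABLE_KEY = "mapred.compress.map.output";
    public static final String CODEC_KEY = "mapred.map.output.compression.codec";

    /** True when a codec is configured but map output compression is not enabled. */
    public static boolean codecSetButCompressionOff(Map<String, String> conf) {
        boolean codecSet = conf.containsKey(CODEC_KEY);
        boolean enabled = Boolean.parseBoolean(conf.getOrDefault(ENABLE_KEY, "false"));
        return codecSet && !enabled;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Only the codec is set, mimicking an app that relied on the old side effect.
        conf.put(CODEC_KEY, "org.apache.hadoop.io.compress.DefaultCodec");
        System.out.println(codecSetButCompressionOff(conf)); // prints "true"

        // Enabling compression explicitly clears the warning condition.
        conf.put(ENABLE_KEY, "true");
        System.out.println(codecSetButCompressionOff(conf)); // prints "false"
    }
}
```

In a real job the same condition would be checked against the actual job configuration object rather than a map.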

 

Aargh.

 

________________________________

From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Joydeep Sen Sarma
Sent: Wednesday, February 20, 2008 5:06 PM
To: core-user@hadoop.apache.org
Subject: changes to compression interfaces in 0.15?

 

Hi developers,

 

In migrating to 0.15, I am noticing that the compression interfaces
have changed:

 

-          compression type for sequencefile outputs used to be set by:
SequenceFile.setCompressionType()

-          now it seems to be set using:
SequenceFileOutputFormat.setOutputCompressionType()

 

The change is for the better - but would it be possible to:

 

-          remove old/dead interfaces. That would have been a
straightforward hint for applications to look for new interfaces.
(hadoop-default.xml also still has a setting for the old conf variable:
io.seqfile.compression.type)

-          if possible - document changed interfaces in the release
notes (there's no way we can find this out by looking at the long list
of Jiras).

 

As you can imagine, this causes a very subtle and harmful regression in
the behavior of existing apps. It does not cause failures; in our case
the output silently switched from BLOCK to RECORD compression, which
means there's pretty much no compression at all. I caught this by *pure*
chance, and now I am living in absolute fear of what else lurks out there.
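
Because the job keeps running with the wrong setting, a fail-fast check at job setup is one way to surface this kind of change. The sketch below is hypothetical and self-contained: a Map<String, String> stands in for the job configuration, CompressionGuard and requireType are made-up names, "some.new.compression.key" is a placeholder rather than a real Hadoop property, and the RECORD default is an assumption based on the fallback described above.

```java
import java.util.HashMap;
import java.util.Map;

public class CompressionGuard {
    // Assumed value when the key is unset, matching the fallback described in this thread.
    public static final String ASSUMED_DEFAULT = "RECORD";

    /** Returns the value a job would effectively see for the given key. */
    public static String effectiveType(Map<String, String> conf, String key) {
        return conf.getOrDefault(key, ASSUMED_DEFAULT);
    }

    /** Throws if the effective compression type differs from what the application expects. */
    public static void requireType(Map<String, String> conf, String key, String expected) {
        String actual = effectiveType(conf, key);
        if (!expected.equals(actual)) {
            throw new IllegalStateException(
                key + " is " + actual + " but the job expects " + expected);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // The application still sets only the old key named in this thread.
        conf.put("io.seqfile.compression.type", "BLOCK");
        try {
            // Placeholder key standing in for whatever the new release actually reads.
            requireType(conf, "some.new.compression.key", "BLOCK");
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

The point of the design is that the job refuses to start instead of quietly writing less-compressed output.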

 

I am not sure how up to date the wiki is on the compression stuff (my
responsibility to update it), but please do consider the impact of
changing interfaces on existing applications. (Maybe we should have a
JIRA tag to mark bugs that change interfaces.)

 

As always - thanks for all the fish (err .. working code),

 

Joydeep

 
