subject:"CompressionCodec in MapReduce"

Re: CompressionCodec in MapReduce

2012-04-11 Thread Arun C Murthy

You can write your own InputFormat (IF) which extends FileInputFormat. In your IF you get the InputSplit which has the filename during the call to getRecordReader. That is the hook you are looking for. More details here: http://hadoop.apache.org/common/docs/r1.0.2/mapred_tutorial.html#Job+Input
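The hook described above can be illustrated without any Hadoop classes. The following is a minimal sketch in plain Java, assuming the file name is already known (in a real job it would come from the InputSplit passed to getRecordReader). The class `PerFileStreamSketch`, its methods, and the XOR "codec" are hypothetical placeholders chosen for illustration; the XOR step is not real encryption and is not part of Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: a reader that knows the file name can build a
// per-file decoding stream itself instead of relying on a shared codec.
class PerFileStreamSketch {

    // Derives a one-byte key from the file name (placeholder, NOT real crypto).
    static int keyFor(String fileName) {
        return fileName.hashCode() & 0xFF;
    }

    // Produces the "stored" form of the data for the given file name.
    static byte[] encode(String fileName, byte[] plain) {
        int key = keyFor(fileName);
        byte[] out = new byte[plain.length];
        for (int i = 0; i < plain.length; i++) {
            out[i] = (byte) (plain[i] ^ key);
        }
        return out;
    }

    // Wraps the raw stream in a decoder that is specific to this file name.
    static InputStream openDecoded(String fileName, InputStream raw) {
        final int key = keyFor(fileName);
        return new FilterInputStream(raw) {
            @Override
            public int read() throws IOException {
                int b = super.read();
                return b < 0 ? b : (b ^ key);
            }

            @Override
            public int read(byte[] buf, int off, int len) throws IOException {
                int n = super.read(buf, off, len);
                for (int i = 0; i < n; i++) {
                    buf[off + i] ^= (byte) key;
                }
                return n;
            }
        };
    }

    // Convenience helper: decodes stored bytes back into a UTF-8 string.
    static String decodeToString(String fileName, byte[] stored) {
        try (InputStream in = openDecoded(fileName, new ByteArrayInputStream(stored))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] stored = encode("part-00000.enc", "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(decodeToString("part-00000.enc", stored));
    }
}
```

In an actual custom InputFormat, the equivalent of `openDecoded` would presumably be called by the RecordReader once it has the split's path, which is where the per-file information becomes available.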

Re: CompressionCodec in MapReduce

2012-04-11 Thread Zizon Qiu

It is possible, but a little tricky. As I mentioned before, write a custom InputFormat and the associated RecordReader.

On Wed, Apr 11, 2012 at 5:23 PM, Grzegorz Gunia wrote:
> I think we misunderstood each other here.
>
> I'll base my question upon an example:
> Let's say I want each of the files stored on my

Re: CompressionCodec in MapReduce

2012-04-11 Thread Grzegorz Gunia

I think we misunderstood each other here. I'll base my question upon an example: Let's say I want each of the files stored on my HDFS to be encrypted prior to being physically stored on the cluster. For that I'll write a custom CompressionCodec that performs the encryption, and use it during any edits/cre

Re: CompressionCodec in MapReduce

2012-04-11 Thread Zizon Qiu

If:
1. you are using TextInputFormat,
2. all input files end with a certain suffix such as ".gz", and
3. the custom CompressionCodec is already registered in the configuration and its getDefaultExtension returns the same suffix as described in 2,
then there is nothing else you need to do. Hadoop will deal with it automati

RE: CompressionCodec in MapReduce

2012-04-11 Thread Devaraj k

...@student.agh.edu.pl]
Sent: Wednesday, April 11, 2012 1:46 PM
To: mapreduce-user@hadoop.apache.org
Subject: Re: CompressionCodec in MapReduce

Thanks for your reply! That clears some things up. There is but one problem... My CompressionCodec has to be instantiated on a per-file basis, meaning it needs

Re: CompressionCodec in MapReduce

2012-04-11 Thread Grzegorz Gunia

Thanks for your reply! That clears some things up. There is but one problem... My CompressionCodec has to be instantiated on a per-file basis, meaning it needs to know the name of the file it is to compress/decompress. I'm guessing that would not be possible with the current implementation? Or i

Re: CompressionCodec in MapReduce

2012-04-11 Thread Zizon Qiu

Append your custom codec's full class name to "io.compression.codecs", either in mapred-site.xml or in the Configuration object passed to the Job constructor. The MapReduce framework will try to guess the compression algorithm from the input file's suffix. If any CompressionCodec.getDefaultExtension() regis
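The suffix matching described above can be modeled in a few lines of plain Java. This is a simplified sketch of the idea only, not Hadoop's actual lookup code: the class `SuffixCodecLookup` and its methods are hypothetical, as are the codec name `com.example.MyCryptoCodec` and the ".enc" extension. Only `org.apache.hadoop.io.compress.GzipCodec` with ".gz" refers to a real Hadoop codec.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model of choosing a codec by file-name suffix.
// Hypothetical illustration; not the framework's implementation.
class SuffixCodecLookup {

    // Maps a codec's default extension (e.g. ".gz") to its class name.
    private final Map<String, String> codecsByExtension = new LinkedHashMap<>();

    // Registers a codec under the extension it reports as its default.
    void register(String defaultExtension, String codecClassName) {
        codecsByExtension.put(defaultExtension, codecClassName);
    }

    // Returns the class name of the first registered codec whose extension
    // ends the file name, or null when no registered codec matches.
    String findCodec(String fileName) {
        for (Map.Entry<String, String> entry : codecsByExtension.entrySet()) {
            if (fileName.endsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        SuffixCodecLookup lookup = new SuffixCodecLookup();
        lookup.register(".gz", "org.apache.hadoop.io.compress.GzipCodec");
        lookup.register(".enc", "com.example.MyCryptoCodec"); // hypothetical codec
        System.out.println(lookup.findCodec("part-00000.enc"));
        System.out.println(lookup.findCodec("part-00000.txt"));
    }
}
```

Under this model, a file whose name matches no registered extension gets no codec and would be read as-is, which is why the suffix of the input files and the extension reported by the codec need to agree.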

CompressionCodec in MapReduce

2012-04-11 Thread Grzegorz Gunia

Hello, I am trying to apply a custom CompressionCodec to work with MapReduce jobs, but I haven't found a way to inject it during the reading of input data, or during the write of the job results. Am I missing something, or is there no support for compressed files in the filesystem? I am well

8 matches
