A MapFile.Reader automatically detects compression and decompresses the data; you don't need to tell it anything special. In general you don't have to worry about decompressing files yourself in Apache Hadoop - the framework handles it transparently as long as you use the proper APIs.
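One way to picture why the reader needs no hint: a MapFile's data and index files are SequenceFiles, and a SequenceFile header records whether compression is on and which codec class was used. The sketch below is a toy analogue in plain Java, not Hadoop code. The class name `SelfDescribingFile`, the one-byte header, and the use of gzip are all assumptions made purely for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Toy format, NOT Hadoop's on-disk layout: a single header byte says
// whether the body that follows is gzip-compressed.
class SelfDescribingFile {

    // Writer side: the caller chooses compression once, at write time.
    static byte[] write(String text, boolean compress) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream header = new DataOutputStream(bytes);
        header.writeBoolean(compress); // record the choice in the header
        header.flush();
        OutputStream body = compress ? new GZIPOutputStream(bytes) : bytes;
        body.write(text.getBytes(StandardCharsets.UTF_8));
        body.close(); // writes the gzip trailer when compressing
        return bytes.toByteArray();
    }

    // Reader side: takes no compression argument; the header decides.
    static String read(byte[] file) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(file));
        boolean compressed = in.readBoolean();
        InputStream body = compressed ? new GZIPInputStream(in) : in;
        return new String(body.readAllBytes(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String record = "key1\tvalue1";
        // Both calls use the same read(); only the writer was told anything.
        System.out.println(read(write(record, true)));
        System.out.println(read(write(record, false)));
    }
}
```

The real MapFile.Reader plays the role of `read` here: the compression type and codec are supplied only to MapFile.Writer, and the reader picks them up from the files themselves.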

On Sun, Aug 11, 2013 at 8:49 PM, Abhijit Sarkar <[email protected]> wrote:
> Thanks Harsh. However, if I compress the MapFile using the MapFile.Writer
> Constructor option and then put it in a DistributedCache, how do I
> uncompress it in the Map/Reduce? There isn't any API method to do that
> apparently.
>
> Regards,
> Abhijit
>
>> From: [email protected]
>> Date: Sun, 11 Aug 2013 12:56:43 +0530
>> Subject: Re: How to compress MapFile programmatically
>> To: [email protected]
>>
>> A MapFile isn't a directory. It is a directory _containing_ two files.
>> You cannot "open" a directory for reading.
>>
>> The MapFile API is documented at
>> http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/io/MapFile.html
>> and thats what you're to be using for reading/writing them.
>>
>> Compression is a simple option you need to provide when invoking the
>> writer:
>> http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/io/MapFile.Writer.html#MapFile.Writer(org.apache.hadoop.conf.Configuration,%20org.apache.hadoop.fs.FileSystem,%20java.lang.String,%20org.apache.hadoop.io.WritableComparator,%20java.lang.Class,%20org.apache.hadoop.io.SequenceFile.CompressionType,%20org.apache.hadoop.io.compress.CompressionCodec,%20org.apache.hadoop.util.Progressable)
>>
>> On Sun, Aug 11, 2013 at 1:46 AM, Abhijit Sarkar
>> <[email protected]> wrote:
>> > Hi,
>> > I'm a Hadoop newbie. This is my first question to this mailing list,
>> > hoping for a good start :)
>> >
>> > MapFile is a directory so when I try to open an InputStream to it, it
>> > fails with FileNotFoundException. How do I compress MapFile
>> > programmatically?
>> >
>> > Code snippet:
>> > final FileSystem fs = FileSystem.get(conf);
>> > final InputStream inputStream = fs.open(new Path(uncompressedStr));
>> >
>> > Exception:
>> > java.io.FileNotFoundException: /some/directory (No such file or directory)
>> > at java.io.FileInputStream.open(Native Method)
>> > at java.io.FileInputStream.<init>(FileInputStream.java:120)
>> > at org.apache.hadoop.fs.RawLocalFileSystem$TrackingFileInputStream.<init>(RawLocalFileSystem.java:71)
>> > at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.<init>(RawLocalFileSystem.java:107)
>> > at org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:177)
>> > at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:126)
>> > at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:283)
>> > at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
>> > at name.abhijitsarkar.learning.hadoop.io.IOUtils.compress(IOUtils.java:104)
>> >
>> > Regards,
>> > Abhijit
>>
>> --
>> Harsh J

--
Harsh J
