Currently I can retrieve entries if I use MapFileOutputFormat via conf.setOutputFormat with no compression specified. But I was trying to do this in my mapper, to use SequenceFile block compression:

```java
public void configure(JobConf jobConf) {
    ...
    this.writer = new MapFile.Writer(jobConf, fileSys, dirName,
        Text.class, Text.class, SequenceFile.CompressionType.BLOCK);
    ...
}

public void map(WritableComparable key, Writable value,
                OutputCollector output, Reporter reporter) throws IOException {
    ...
    writer.append(newkey, newvalue);
    ...
}
```

Then later, in a separate class, I try to retrieve the output values:

```java
public static void main(String[] args) throws Exception {
    ...
    conf.setInputFormat(org.apache.hadoop.mapred.SequenceFileInputFormat.class);
    ...
    MapFile.Reader[] readers =
        MapFileOutputFormat.getReaders(fileSys, inDataPath, defaults);
    Partitioner part =
        (Partitioner) ReflectionUtils.newInstance(conf.getPartitionerClass(), conf);
    Text entryValue = (Text) MapFileOutputFormat.getEntry(
        readers, part, new Text("mykey"), new Text());
    if (entryValue != null) {
        System.out.println("My Entry's Value: ");
        System.out.println(entryValue.toString());
    }
    for (MapFile.Reader reader : readers) {
        if (reader != null) {
            reader.close();
        }
    }
}
```

But when I use block compression, I no longer get a result back from MapFileOutputFormat.getEntry. What am I doing wrong? And/or is there a way to make this work using conf.setOutputFormat(MapFileOutputFormat.class) together with conf.setMapOutputCompressionType(SequenceFile.CompressionType.BLOCK)?
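In other words, instead of creating the MapFile.Writer by hand in configure(), I would prefer to let the framework write the MapFiles and just turn on block compression at the job level. Here is a rough, untested sketch of the kind of driver configuration I have in mind (the class name `BlockCompressedMapFileJob` is a placeholder, and I am assuming from the old mapred API that FileOutputFormat.setCompressOutput and SequenceFileOutputFormat.setOutputCompressionType are the right knobs for the final output, as opposed to the map-output compression settings):

```java
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapFileOutputFormat;
import org.apache.hadoop.mapred.SequenceFileOutputFormat;

// Hypothetical driver class; shown only to illustrate the configuration I mean.
public class BlockCompressedMapFileJob {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(BlockCompressedMapFileJob.class);

        // Let MapFileOutputFormat create the MapFiles for me.
        conf.setOutputFormat(MapFileOutputFormat.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);

        // Compress the final job output with BLOCK compression.
        // (My assumption: these affect the MapFile data files, whereas
        // setMapOutputCompressionType only affects intermediate map output.)
        FileOutputFormat.setCompressOutput(conf, true);
        SequenceFileOutputFormat.setOutputCompressionType(conf,
            SequenceFile.CompressionType.BLOCK);

        // ... input/output paths, mapper/reducer classes, then run the job.
    }
}
```

If that is a workable configuration, would MapFileOutputFormat.getEntry then handle the block-compressed output transparently, or is there something else I would need to do on the read side?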